Problem background
------------------
The LXC driver has support for the filesystem types "file" and "block"
that allow a disk image to be mounted in the guest (container). [1]
However, when user namespaces are enabled (uid/gid mapping is used),
mounting the root filesystem block device fails. [2]
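According to "man 7 user_namespaces":

    Mounting block-based filesystems can be done only by a process that
    holds CAP_SYS_ADMIN in the initial user namespace.

With idmap configured, the container setup code runs after clone() in a
new user namespace, so it does not meet this requirement.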

Suggested approach
------------------
Mount the root filesystem block device before the clone() call, while the
controller still runs in the initial user namespace, then set the
filesystem type to VIR_DOMAIN_FS_TYPE_MOUNT and the filesystem source to
the directory where it was mounted.
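
The idea can be sketched in a few lines of C. This is only an
illustration under stated assumptions, not the code from this series:
"struct fs_def", "enum fs_type" and premount_root() are hypothetical
stand-ins for the libvirt filesystem definition and the controller helper,
and the mount(2) call is made with no flags or options.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/* Hypothetical, simplified stand-in for a libvirt filesystem definition. */
enum fs_type { FS_TYPE_BLOCK, FS_TYPE_MOUNT };

struct fs_def {
    enum fs_type type;
    char src[4096];    /* block device path, or a directory once mounted */
};

/*
 * Mount the block device named by fs->src on 'dir' while the caller is
 * still in the initial user namespace, then rewrite the definition so
 * that later code sees an already mounted directory.
 *
 * Returns 0 on success (or if there is nothing to do), -1 on error
 * with errno set. On error the definition is left unchanged.
 */
int premount_root(struct fs_def *fs, const char *dir, const char *fstype)
{
    if (fs->type != FS_TYPE_BLOCK)
        return 0;

    if (strlen(dir) >= sizeof(fs->src)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    if (mount(fs->src, dir, fstype, 0, NULL) < 0)
        return -1;

    fs->type = FS_TYPE_MOUNT;
    snprintf(fs->src, sizeof(fs->src), "%s", dir);
    return 0;
}
```

A caller would invoke premount_root() first and call clone() only after it
returns 0, so the child in the new user namespace never has to mount a
block device itself.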

Issues encountered
------------------
This patch series implements the basic idea of the approach above.
As a result, a container configured with idmap and an NBD filesystem is
able to start.

However, on guest shutdown a kernel error [3] occurs. Similar messages [4]
occur on shutdown when an NBD filesystem is used with an LXC container
without idmap.

One possible cause is that on guest shutdown the LXC driver kills the
qemu-nbd process without first sending a disconnect for the device.
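
For reference, a disconnect can be requested directly from the kernel
with the NBD ioctls from <linux/nbd.h>. The sketch below is an assumption
about what a fix might look like, not something this series does:
nbd_disconnect() is a hypothetical helper, and it assumes the filesystem
on the device has already been unmounted.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/nbd.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Ask the kernel to disconnect an NBD device such as /dev/nbd0 and
 * clear its socket.
 *
 * Returns 0 on success, -1 on error with errno set.
 */
int nbd_disconnect(const char *dev)
{
    int fd = open(dev, O_RDWR);
    int rc;
    int saved_errno;

    if (fd < 0)
        return -1;

    /* Send the disconnect request, then drop the kernel's socket. */
    rc = ioctl(fd, NBD_DISCONNECT);
    if (rc == 0)
        rc = ioctl(fd, NBD_CLEAR_SOCK);

    /* close() may overwrite errno, so preserve the ioctl result. */
    saved_errno = errno;
    close(fd);
    errno = saved_errno;

    return rc < 0 ? -1 : 0;
}
```

Whether the driver should do this itself or run "qemu-nbd --disconnect"
before killing the process is left open here.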

References
----------
[1] https://libvirt.org/formatdomain.html#elementsFilesystems
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1328946
[3] https://pastebin.com/raw/jMBk5mtG
[4] https://pastebin.com/raw/wTKbuRP9

Radostin Stoyanov (3):
  lxc: Make lxcContainerMountFSBlock non static
  lxc: Move up virLXCControllerAppendNBDPids
  lxc: Mount NBD devices before clone

 src/lxc/lxc_container.c  |  58 +------------------
 src/lxc/lxc_container.h  |   4 ++
 src/lxc/lxc_controller.c | 145 +++++++++++++++++++++++++++--------------------
 3 files changed, 87 insertions(+), 120 deletions(-)

--
2.14.3