[libvirt] LXC broken on Linux >= 3.15

Hi!

Kernel commit 23adbe12 ("fs,userns: Change inode_capable to capable_wrt_inode_uidgid") uncovered a libvirt-lxc issue. Starting with that commit, the kernel also correctly checks the gid of an inode.

Sadly, this change breaks libvirt-lxc such that openpty() always fails with -EPERM within a container. As a result, ssh and other programs are no longer usable.

Libvirt's virLXCControllerSetupDevPTS() has a hardcoded mount string for mounting devpts, namely "newinstance,ptmxmode=0666,mode=0620,gid=5". devpts correctly translates the uid and gid while mounting, but libvirt mounts devpts _before_ setting up the uid/gid mappings. Therefore the internal gid for the new devpts instance is still 5 instead of the mapped gid, and the new check in the kernel always fails.

We have two options to fix that:
a) virLXCControllerSetupDevPTS() translates the gid (5) by hand and passes the correct value to devpts. (IMHO hacky)
b) We set up devpts, and therefore also the consoles, after installing the mappings. This may need a bit of work. At first I thought a trivial patch like the appended one would do it, but with it libvirt fails to start a guest with no further explanation. Maybe I'll have time to investigate further later.

What do you think?
Thanks,
//richard

diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c
index 2d220eb..3435f42 100644
--- a/src/lxc/lxc_controller.c
+++ b/src/lxc/lxc_controller.c
@@ -2157,9 +2157,6 @@ virLXCControllerRun(virLXCControllerPtr ctrl)
     if (virLXCControllerSetupResourceLimits(ctrl) < 0)
         goto cleanup;
 
-    if (virLXCControllerSetupDevPTS(ctrl) < 0)
-        goto cleanup;
-
     if (virLXCControllerPopulateDevices(ctrl) < 0)
         goto cleanup;
 
@@ -2172,9 +2169,6 @@ virLXCControllerRun(virLXCControllerPtr ctrl)
     if (virLXCControllerSetupFuse(ctrl) < 0)
         goto cleanup;
 
-    if (virLXCControllerSetupConsoles(ctrl, containerTTYPaths) < 0)
-        goto cleanup;
-
     if (lxcSetPersonality(ctrl->def) < 0)
         goto cleanup;
 
@@ -2198,6 +2192,12 @@ virLXCControllerRun(virLXCControllerPtr ctrl)
     if (virLXCControllerSetupUserns(ctrl) < 0)
         goto cleanup;
 
+    if (virLXCControllerSetupDevPTS(ctrl) < 0)
+        goto cleanup;
+
+    if (virLXCControllerSetupConsoles(ctrl, containerTTYPaths) < 0)
+        goto cleanup;
+
     if (virLXCControllerMoveInterfaces(ctrl) < 0)
         goto cleanup;

On Mon, Jul 28, 2014 at 04:25:56PM +0200, Richard Weinberger wrote:
> Hi!
>
> Kernel commit 23adbe12 ("fs,userns: Change inode_capable to capable_wrt_inode_uidgid") uncovered a libvirt-lxc issue. Starting with that commit, the kernel also correctly checks the gid of an inode.
>
> Sadly, this change breaks libvirt-lxc such that openpty() always fails with -EPERM within a container. As a result, ssh and other programs are no longer usable.
>
> Libvirt's virLXCControllerSetupDevPTS() has a hardcoded mount string for mounting devpts, namely "newinstance,ptmxmode=0666,mode=0620,gid=5". devpts correctly translates the uid and gid while mounting, but libvirt mounts devpts _before_ setting up the uid/gid mappings. Therefore the internal gid for the new devpts instance is still 5 instead of the mapped gid, and the new check in the kernel always fails.
>
> We have two options to fix that:
> a) virLXCControllerSetupDevPTS() translates the gid (5) by hand and passes the correct value to devpts. (IMHO hacky)

You mean that instead of passing the value '5', if the guest GIDs had been remapped to start at 1000, we would pass in '1005' to mount? I don't think that's hacky - it seems like a perfectly sensible fix to do.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Am 28.07.2014 16:37, schrieb Daniel P. Berrange:
> On Mon, Jul 28, 2014 at 04:25:56PM +0200, Richard Weinberger wrote:
>> Hi!
>>
>> Kernel commit 23adbe12 ("fs,userns: Change inode_capable to capable_wrt_inode_uidgid") uncovered a libvirt-lxc issue. Starting with that commit, the kernel also correctly checks the gid of an inode.
>>
>> Sadly, this change breaks libvirt-lxc such that openpty() always fails with -EPERM within a container. As a result, ssh and other programs are no longer usable.
>>
>> Libvirt's virLXCControllerSetupDevPTS() has a hardcoded mount string for mounting devpts, namely "newinstance,ptmxmode=0666,mode=0620,gid=5". devpts correctly translates the uid and gid while mounting, but libvirt mounts devpts _before_ setting up the uid/gid mappings. Therefore the internal gid for the new devpts instance is still 5 instead of the mapped gid, and the new check in the kernel always fails.
>>
>> We have two options to fix that:
>> a) virLXCControllerSetupDevPTS() translates the gid (5) by hand and passes the correct value to devpts. (IMHO hacky)
>
> You mean that instead of passing the value '5', if the guest GIDs had been remapped to start at 1000, we would pass in '1005' to mount? I don't think that's hacky - it seems like a perfectly sensible fix to do.

Correct. If you're fine with that I'll happily submit a patch.

Thanks,
//richard
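
For illustration, option (a) could be sketched roughly as below. This is not the patch that was submitted and not libvirt code: `struct idmap_entry`, `translate_id()` and `format_devpts_opts()` are hypothetical names invented for this sketch, and the map layout (container start, host target, count) is an assumption modelled on how user namespace id ranges are usually described. The idea is to look up which range covers the container-side gid 5 and put the corresponding host-side gid into the devpts mount string.

```c
#include <stddef.h>
#include <stdio.h>

/* One id range: `count` ids starting at `start` inside the container
 * correspond to ids starting at `target` on the host.
 * Hypothetical type for this sketch, not libvirt's own structure. */
struct idmap_entry {
    unsigned int start;
    unsigned int target;
    unsigned int count;
};

/* Translate a container-side id to the host-side id.
 * Returns 0 and stores the result in *out, or -1 if no range covers id. */
int
translate_id(const struct idmap_entry *map, size_t nmap,
             unsigned int id, unsigned int *out)
{
    for (size_t i = 0; i < nmap; i++) {
        if (id >= map[i].start && id - map[i].start < map[i].count) {
            *out = map[i].target + (id - map[i].start);
            return 0;
        }
    }
    return -1;
}

/* Build the devpts mount options, replacing the hardcoded gid=5 with the
 * translated value. With no mappings configured (nmap == 0) the gid is
 * left at 5. Returns 0 on success, -1 on an unmapped gid or a short buffer. */
int
format_devpts_opts(char *buf, size_t buflen,
                   const struct idmap_entry *map, size_t nmap)
{
    unsigned int gid = 5; /* gid used in the original mount string */
    int n;

    if (nmap > 0 && translate_id(map, nmap, 5, &gid) < 0)
        return -1;

    n = snprintf(buf, buflen,
                 "newinstance,ptmxmode=0666,mode=0620,gid=%u", gid);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With a single range mapping container gid 0 to host gid 1000, the gid 5 translates to 1005, which matches the example given in the reply above.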

participants (2)
- Daniel P. Berrange
- Richard Weinberger