
On Wed, Dec 07, 2016 at 09:36:15AM +0100, Michal Privoznik wrote:
Prime time. When it comes to spawning qemu process and relabelling all the devices it's going to touch, there's inherent race with other applications in the system (e.g. udev). Instead of trying convincing udev to not touch libvirt managed devices, we can create a separate mount namespace for the qemu, and mount our own /dev there. Of course this puts more work onto us as we have to maintain /dev files on each domain start and device hot(un-)plug. On the other hand, this enhances security also.
From technical POV, on domain startup process the parent (libvirtd) creates:
/var/lib/libvirt/qemu/$domain.dev /var/lib/libvirt/qemu/$domain.devpts
The child (which is going to be qemu eventually) calls unshare() to create new mount namespace. From now on anything that child does is invisible to the parent. Child then mounts tmpfs on $domain.dev (so that it still sees original /dev from the host) and creates some devices (as explained in one of the previous patches). The devices have to be created exactly as they are in the host (including perms, seclabels, ACLs, ...). After that it moves $domain.dev mount to /dev.
What's the $domain.devpts mount there for then you ask? QEMU can create PTYs for some chardevs. And historically we exposed the host ends in our domain XML allowing users to connect to them. Therefore we must preserve devpts mount to be shared with the host's one.
To make this patch as small as possible, creating of devices configured for domain in question is implemented in next patches.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 377 +++++++++++++++++++++++++++++++++++++++++++++++- src/qemu/qemu_domain.h | 20 +++ src/qemu/qemu_process.c | 13 ++ 3 files changed, 408 insertions(+), 2 deletions(-)
+int +qemuDomainBuildNamespace(virQEMUDriverPtr driver, + virDomainObjPtr vm) +{ + virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); + const unsigned long mount_flags = MS_MOVE; + char *devPath = NULL; + char *devptsPath = NULL; + int ret = -1; + + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) { + ret = 0; + goto cleanup; + } + + if (virAsprintf(&devPath, "%s/%s.dev", + cfg->stateDir, vm->def->name) < 0 || + virAsprintf(&devptsPath, "%s/%s.devpts", + cfg->stateDir, vm->def->name) < 0) + goto cleanup; + + if (qemuDomainSetupDev(driver, vm, devPath) < 0) + goto cleanup; + + /* Save /dev/pts mount point because /dev/pts/NNN is exposed in our live + * XML and other applications are supposed to be able to use it. */ + if (mount("/dev/pts", devptsPath, NULL, mount_flags, NULL) < 0) { + virReportSystemError(errno, "%s", + _("Unable to move /dev/pts mount")); + goto cleanup; + } + + if (mount(devPath, "/dev", NULL, mount_flags, NULL) < 0) { + virReportSystemError(errno, + _("Failed to mount %s on /dev"), + devPath); + goto cleanup; + } + + if (virFileMakePath("/dev/pts") < 0) { + virReportSystemError(errno, "%s", + _("Cannot create /dev/pts")); + goto cleanup; + } + + if (mount(devptsPath, "/dev/pts", NULL, mount_flags, NULL) < 0) { + virReportSystemError(errno, + _("Failed to mount %s on /dev/pts"), + devptsPath); + goto cleanup; + } + + if (virFileBindMountDevice("/dev/pts/ptmx", "/dev/ptmx") < 0) + goto cleanup;
Kill this line, since it breaks Fedora with a restrictive /dev/pts/ptmx. Unlike wth the LXC driver, we don't need this symlink, since we are using the original /dev/pts, not creating a new instance. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|