[PATCH v1 00/34] Rework building of domain's namespaces

There are couple of things happening in this series. The first patch fixes a bug. We are leaking /dev/mapper/control in QEMU. See the patch for detailed info. The second patch then cleans up code a bit. The third patch moves namespace handling code into a separate file. Patches 4 - 15 then prepare the code for switch to different approach in populating the namespace. Patches 16 - 26 then do the switch and finally, the rest of the patches drops and deduplicates some code. You can fetch the patches from here too: https://gitlab.com/MichalPrivoznik/libvirt/-/commits/dm_namespace/ Michal Prívozník (34): virDevMapperGetTargetsImpl: Close /dev/mapper/control in the end virDevMapperGetTargets: Don't ignore EBADF qemu: Separate out namespace handling code qemu_domain_namespace: Rename qemuDomainCreateNamespace() qemu_domain_namespace: Drop unused @cfg argument qemu_domain_namespace: Check for namespace enablement earlier qemuDomainNamespaceSetupHostdev: Create paths in one go qemuDomainAttachDeviceMknodHelper: Don't leak data->target qemu_domain_namespace.c: Rename qemuDomainAttachDeviceMknodData qemuDomainAttachDeviceMknodRecursive: Isolate bind mounted devices condition qemuDomainAttachDeviceMknodHelper: Create more files in a single go qemuDomainNamespaceMknodPaths: Create more files in one go qemuDomainNamespaceMknodPaths: Turn @paths into string list qemuDomainSetupDisk: Accept @src qemu_domain_namespace: Repurpose qemuDomainBuildNamespace() qemuDomainBuildNamespace: Populate basic /dev from daemon's namespace qemuDomainBuildNamespace: Populate disks from daemon's namespace qemuDomainBuildNamespace: Populate hostdevs from daemon's namespace qemuDomainBuildNamespace: Populate memory from daemon's namespace qemuDomainBuildNamespace: Populate chardevs from daemon's namespace qemuDomainBuildNamespace: Populate TPM from daemon's namespace qemuDomainBuildNamespace: Populate graphics from daemon's namespace qemuDomainBuildNamespace: Populate inputs from daemon's namespace qemuDomainBuildNamespace: Populate RNGs from daemon's namespace qemuDomainBuildNamespace: Populate loader from daemon's namespace qemuDomainBuildNamespace: Populate SEV from daemon's namespace qemu_domain_namespace: Drop unused functions qemuDomainDetachDeviceUnlink: Unlink paths in one go qemuDomainNamespaceUnlinkPaths: Turn @paths into string list qemuDomainNamespaceTeardownHostdev: Unlink paths in one go qemuDomainNamespaceTeardownMemory: Deduplicate code qemuDomainNamespaceTeardownChardev: Deduplicate code qemuDomainNamespaceTeardownRNG: Deduplicate code qemuDomainNamespaceTeardownInput: Deduplicate code po/POTFILES.in | 1 + src/qemu/Makefile.inc.am | 2 + src/qemu/qemu_cgroup.c | 2 +- src/qemu/qemu_conf.c | 1 + src/qemu/qemu_domain.c | 1848 +----------------------------- src/qemu/qemu_domain.h | 57 - src/qemu/qemu_domain_namespace.c | 1610 ++++++++++++++++++++++++++ src/qemu/qemu_domain_namespace.h | 86 ++ src/qemu/qemu_driver.c | 1 + src/qemu/qemu_hotplug.c | 1 + src/qemu/qemu_process.c | 23 +- src/qemu/qemu_security.c | 1 + src/util/virdevmapper.c | 4 +- 13 files changed, 1727 insertions(+), 1910 deletions(-) create mode 100644 src/qemu/qemu_domain_namespace.c create mode 100644 src/qemu/qemu_domain_namespace.h -- 2.26.2

When building domain's private /dev in a namespace, libdevmapper is consulted for getting full dependency tree of domain's disks. The reason is that for a multipath devices all dependent devices must be created in the namespace and allowed in CGroups. However, this approach is very fragile as building of namespace happens in the forked off child process, after mass close of FDs and just before dropping privileges and execing QEMU. And it so happens that when calling libdevmapper APIs, one of them opens /dev/mapper/control and saves the FD into a global variable. The FD is kept open until the lib is unlinked or dm_lib_release() is called explicitly. We are doing neither. This is not a problem when calling the function from libvirtd (when setting up CGroups), but it is a problem when called from the pre-exec hook because we leak the FD into QEMU. Fixes: a30078cb832646177defd256e77c632905f1e6d0 Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1858260 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virdevmapper.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index 40a82285f9..1c216fb6c1 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -156,6 +156,7 @@ virDevMapperGetTargetsImpl(const char *path, virStringListFree(recursiveDevPaths); virStringListFree(devPaths); dm_task_destroy(dmt); + dm_lib_release(); return ret; } -- 2.26.2

On Wed, Jul 22, 2020 at 11:39:55AM +0200, Michal Privoznik wrote:
When building domain's private /dev in a namespace, libdevmapper is consulted for getting full dependency tree of domain's disks. The reason is that for a multipath devices all dependent devices must be created in the namespace and allowed in CGroups.
However, this approach is very fragile as building of namespace happens in the forked off child process, after mass close of FDs and just before dropping privileges and execing QEMU. And it so happens that when calling libdevmapper APIs, one of them opens /dev/mapper/control and saves the FD into a global variable. The FD is kept open until the lib is unlinked or dm_lib_release() is called explicitly. We are doing neither.
This is not a problem when calling the function from libvirtd (when setting up CGroups), but it is a problem when called from the pre-exec hook because we leak the FD into QEMU.
Fixes: a30078cb832646177defd256e77c632905f1e6d0 Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1858260
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virdevmapper.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index 40a82285f9..1c216fb6c1 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -156,6 +156,7 @@ virDevMapperGetTargetsImpl(const char *path, virStringListFree(recursiveDevPaths); virStringListFree(devPaths); dm_task_destroy(dmt); + dm_lib_release(); return ret; }
Hmm, this function isn't threadsafe though, so I'm kind of worried about us breaking parallel callers. For that matter dm_task_run isn't thread safe when it opens the FD in the first place either AFAICT. This libdevice-mapper.so looks like a bit of a disaster in general, as I can't see mutexes acquired anywhere in the public APIs and it has a load of static global variables including this FD :-( We could put mutex locking around our own virDevMapperGetTargetsImpl if we blindly assume nothing else we call may secretly be using libdevice-mapper too. Makes me wonder though if there's any way we can obtain the info we need without touching libdevice-mapper.so at all. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 7/22/20 12:46 PM, Daniel P. Berrangé wrote:
On Wed, Jul 22, 2020 at 11:39:55AM +0200, Michal Privoznik wrote:
When building domain's private /dev in a namespace, libdevmapper is consulted for getting full dependency tree of domain's disks. The reason is that for a multipath devices all dependent devices must be created in the namespace and allowed in CGroups.
However, this approach is very fragile as building of namespace happens in the forked off child process, after mass close of FDs and just before dropping privileges and execing QEMU. And it so happens that when calling libdevmapper APIs, one of them opens /dev/mapper/control and saves the FD into a global variable. The FD is kept open until the lib is unlinked or dm_lib_release() is called explicitly. We are doing neither.
This is not a problem when calling the function from libvirtd (when setting up CGroups), but it is a problem when called from the pre-exec hook because we leak the FD into QEMU.
Fixes: a30078cb832646177defd256e77c632905f1e6d0 Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1858260
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virdevmapper.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index 40a82285f9..1c216fb6c1 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -156,6 +156,7 @@ virDevMapperGetTargetsImpl(const char *path, virStringListFree(recursiveDevPaths); virStringListFree(devPaths); dm_task_destroy(dmt); + dm_lib_release(); return ret; }
Hmm, this function isn't threadsafe though, so I'm kind of worried about us breaking parallel callers.
For that matter dm_task_run isn't thread safe when it opens the FD in the first place either AFAICT.
This libdevice-mapper.so looks like a bit of a disaster in general, as I can't see mutexes acquired anywhere in the public APIs and it has a load of static global variables including this FD :-(
We could put mutex locking around our own virDevMapperGetTargetsImpl if we blindly assume nothing else we call may secretly be using libdevice-mapper too.
Makes me wonder though if there's any way we can obtain the info we need without touching libdevice-mapper.so at all.
This is sort of resolved in the rest of the series. If we drop this patch then after patch 17 which moves populating /dev with domain disks into libvirtd rather than forked child, we don't need to call dm_lib_release(). But opening will still be unsafe, yes. Well, looks like devmapper is doing a single ioctl (if I don't count VERSION): ioctl(3</dev/mapper/control>, DM_VERSION, {version=4.0.0, data_size=16384, flags=DM_EXISTS_FLAG} => {version=4.42.0, data_size=16384, flags=DM_EXISTS_FLAG}) = 0 ioctl(3</dev/mapper/control>, DM_TABLE_DEPS, {version=4.0.0, data_size=16384, data_start=312, name="myusb", flags=DM_EXISTS_FLAG} => {version=4.42.0, data_size=328, data_start=312, dev=makedev(0xfd, 0x1), name="myusb", uuid="CRYPT-LUKS1-12fa3b2b646d43eb82ffd8f671515f93-myusb", target_count=1, open_count=0, event_nr=0, flags=DM_EXISTS_FLAG|DM_ACTIVE_PRESENT_FLAG, ...}) = 0 so maybe we can use ioctl() ourselves and drop libdevmapper completely? Michal

On Wed, Jul 22, 2020 at 02:14:38PM +0200, Michal Privoznik wrote:
On 7/22/20 12:46 PM, Daniel P. Berrangé wrote:
On Wed, Jul 22, 2020 at 11:39:55AM +0200, Michal Privoznik wrote:
When building domain's private /dev in a namespace, libdevmapper is consulted for getting full dependency tree of domain's disks. The reason is that for a multipath devices all dependent devices must be created in the namespace and allowed in CGroups.
However, this approach is very fragile as building of namespace happens in the forked off child process, after mass close of FDs and just before dropping privileges and execing QEMU. And it so happens that when calling libdevmapper APIs, one of them opens /dev/mapper/control and saves the FD into a global variable. The FD is kept open until the lib is unlinked or dm_lib_release() is called explicitly. We are doing neither.
This is not a problem when calling the function from libvirtd (when setting up CGroups), but it is a problem when called from the pre-exec hook because we leak the FD into QEMU.
Fixes: a30078cb832646177defd256e77c632905f1e6d0 Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1858260
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virdevmapper.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index 40a82285f9..1c216fb6c1 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -156,6 +156,7 @@ virDevMapperGetTargetsImpl(const char *path, virStringListFree(recursiveDevPaths); virStringListFree(devPaths); dm_task_destroy(dmt); + dm_lib_release(); return ret; }
Hmm, this function isn't threadsafe though, so I'm kind of worried about us breaking parallel callers.
For that matter dm_task_run isn't thread safe when it opens the FD in the first place either AFAICT.
This libdevice-mapper.so looks like a bit of a disaster in general, as I can't see mutexes acquired anywhere in the public APIs and it has a load of static global variables including this FD :-(
We could put mutex locking around our own virDevMapperGetTargetsImpl if we blindly assume nothing else we call may secretly be using libdevice-mapper too.
Makes me wonder though if there's any way we can obtain the info we need without touching libdevice-mapper.so at all.
This is sort of resolved in the rest of the series. If we drop this patch then after patch 17 which moves populating /dev with domain disks into libvirtd rather than forked child, we don't need to call dm_lib_release(). But opening will still be unsafe, yes.
Yeah, I'm a little concerned that the full patch series is too big for downstreams to be able to backport, so most likely they'll do a targetted fix of just this patch.
Well, looks like devmapper is doing a single ioctl (if I don't count VERSION):
ioctl(3</dev/mapper/control>, DM_VERSION, {version=4.0.0, data_size=16384, flags=DM_EXISTS_FLAG} => {version=4.42.0, data_size=16384, flags=DM_EXISTS_FLAG}) = 0 ioctl(3</dev/mapper/control>, DM_TABLE_DEPS, {version=4.0.0, data_size=16384, data_start=312, name="myusb", flags=DM_EXISTS_FLAG} => {version=4.42.0, data_size=328, data_start=312, dev=makedev(0xfd, 0x1), name="myusb", uuid="CRYPT-LUKS1-12fa3b2b646d43eb82ffd8f671515f93-myusb", target_count=1, open_count=0, event_nr=0, flags=DM_EXISTS_FLAG|DM_ACTIVE_PRESENT_FLAG, ...}) = 0
so maybe we can use ioctl() ourselves and drop libdevmapper completely?
Given that the kernel promises stable ABI, this should be OK, since we're only after one very small piece of information. The dm_task_run method seems quite simple if we exclude the bits that are not relevant for DM_TABLE_DEPS. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

One of the symptoms of the bug [1] is that on the second start of a domain we get EBADF when talking to libdevmapper. The reason is that libdevmapper opens /dev/mapper/control to talk to kernel and saves the FD into a global variable. This works well when starting a domain for the first time: the pre-exec hook (which is a separate process) gets info it needs; then the daemon sets up CGroups (where it will open the file again, because it's a different process). Now, libdevmapper won't close this FD until library is unloaded (in destructor) or dm_lib_release() is called. We were not doing any of that, hence, when starting a domain (any domain, even a different one), we forked off a child process (which will eventually become QEMU), mass close all FDs (including the libdevmapper's one), and run pre-exec hook. Since we closed the FD, libdevmapper will pass closed FD into an ioctl() and thus we got EBADF. After previous patch, this approach is history and thus we should not see EBADF anymore. 1: https://bugzilla.redhat.com/show_bug.cgi?id=1858260 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_cgroup.c | 2 +- src/qemu/qemu_domain.c | 4 ++-- src/util/virdevmapper.c | 3 --- 3 files changed, 3 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index 914bf640ca..e88da02341 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -87,7 +87,7 @@ qemuSetupImagePathCgroup(virDomainObjPtr vm, } if (virDevMapperGetTargets(path, &targetPaths) < 0 && - errno != ENOSYS && errno != EBADF) { + errno != ENOSYS) { virReportSystemError(errno, _("Unable to get devmapper targets for %s"), path); diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 5b22eb2eaa..2058290870 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -10264,7 +10264,7 @@ qemuDomainSetupDisk(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, return -1; if (virDevMapperGetTargets(next->path, &targetPaths) < 0 && - errno != ENOSYS && errno != EBADF) { + errno != ENOSYS) { virReportSystemError(errno, _("Unable to get devmapper targets for %s"), next->path); @@ -11328,7 +11328,7 @@ qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, tmpPath = g_strdup(next->path); if (virDevMapperGetTargets(next->path, &targetPaths) < 0 && - errno != ENOSYS && errno != EBADF) { + errno != ENOSYS) { virReportSystemError(errno, _("Unable to get devmapper targets for %s"), next->path); diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index 1c216fb6c1..139d70b854 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -176,9 +176,6 @@ virDevMapperGetTargetsImpl(const char *path, * If @path consists of yet another devmapper targets these are * consulted recursively. * - * If we don't have permissions to talk to kernel, -1 is returned - * and errno is set to EBADF. - * * Returns 0 on success, * -1 otherwise (with errno set, no libvirt error is * reported) -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
One of the symptoms of the bug [1] is that on the second start of a domain we get EBADF when talking to libdevmapper. The reason is that libdevmapper opens /dev/mapper/control to talk to kernel and saves the FD into a global variable. This works well when starting a domain for the first time: the pre-exec hook (which is a separate process) gets info it needs; then the daemon sets up CGroups (where it will open the file again, because it's a different process). Now, libdevmapper won't close this FD until library is unloaded (in destructor) or dm_lib_release() is called. We were not doing any of that, hence, when starting a domain (any domain, even a different one), we forked off a child process (which will eventually become QEMU), mass close all FDs (including the libdevmapper's one), and run pre-exec hook. Since we closed the FD, libdevmapper will pass closed FD into an ioctl() and thus we got EBADF.
After previous patch, this approach is history and thus we should not see EBADF anymore.
That bug is private.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_cgroup.c | 2 +- src/qemu/qemu_domain.c | 4 ++-- src/util/virdevmapper.c | 3 --- 3 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index 914bf640ca..e88da02341 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -87,7 +87,7 @@ qemuSetupImagePathCgroup(virDomainObjPtr vm, }
if (virDevMapperGetTargets(path, &targetPaths) < 0 && - errno != ENOSYS && errno != EBADF) { + errno != ENOSYS) { virReportSystemError(errno, _("Unable to get devmapper targets for %s"), path); diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 5b22eb2eaa..2058290870 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -10264,7 +10264,7 @@ qemuDomainSetupDisk(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, return -1;
if (virDevMapperGetTargets(next->path, &targetPaths) < 0 && - errno != ENOSYS && errno != EBADF) { + errno != ENOSYS) { virReportSystemError(errno, _("Unable to get devmapper targets for %s"), next->path); @@ -11328,7 +11328,7 @@ qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, tmpPath = g_strdup(next->path);
if (virDevMapperGetTargets(next->path, &targetPaths) < 0 && - errno != ENOSYS && errno != EBADF) { + errno != ENOSYS) { virReportSystemError(errno, _("Unable to get devmapper targets for %s"), next->path);
This changes the functions that are moved in the very next patch. Doing changes after the movement makes 'git blame' easier to use. Jano

The qemu_domain.c file is big as is and we should split it into separate semantic blocks. Start with code that handles domain namespaces. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- po/POTFILES.in | 1 + src/qemu/Makefile.inc.am | 2 + src/qemu/qemu_conf.c | 1 + src/qemu/qemu_domain.c | 1848 +---------------------------- src/qemu/qemu_domain.h | 57 - src/qemu/qemu_domain_namespace.c | 1885 ++++++++++++++++++++++++++++++ src/qemu/qemu_domain_namespace.h | 86 ++ src/qemu/qemu_driver.c | 1 + src/qemu/qemu_hotplug.c | 1 + src/qemu/qemu_process.c | 1 + src/qemu/qemu_security.c | 1 + 11 files changed, 1980 insertions(+), 1904 deletions(-) create mode 100644 src/qemu/qemu_domain_namespace.c create mode 100644 src/qemu/qemu_domain_namespace.h diff --git a/po/POTFILES.in b/po/POTFILES.in index b10008ae3d..de4fb172d2 100644 --- a/po/POTFILES.in +++ b/po/POTFILES.in @@ -153,6 +153,7 @@ @SRCDIR@src/qemu/qemu_dbus.c @SRCDIR@src/qemu/qemu_domain.c @SRCDIR@src/qemu/qemu_domain_address.c +@SRCDIR@src/qemu/qemu_domain_namespace.c @SRCDIR@src/qemu/qemu_domainjob.c @SRCDIR@src/qemu/qemu_driver.c @SRCDIR@src/qemu/qemu_extdevice.c diff --git a/src/qemu/Makefile.inc.am b/src/qemu/Makefile.inc.am index 9e1d6192f5..01aa734597 100644 --- a/src/qemu/Makefile.inc.am +++ b/src/qemu/Makefile.inc.am @@ -21,6 +21,8 @@ QEMU_DRIVER_SOURCES = \ qemu/qemu_domainjob.h \ qemu/qemu_domain_address.c \ qemu/qemu_domain_address.h \ + qemu/qemu_domain_namespace.c \ + qemu/qemu_domain_namespace.h \ qemu/qemu_cgroup.c \ qemu/qemu_cgroup.h \ qemu/qemu_extdevice.c \ diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index 4762f2a88a..bc418082f7 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -30,6 +30,7 @@ #include "qemu_conf.h" #include "qemu_capabilities.h" #include "qemu_domain.h" +#include "qemu_domain_namespace.h" #include "qemu_firmware.h" #include "qemu_security.h" #include "viruuid.h" diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 2058290870..92dc69ce39 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -37,6 +37,7 @@ #include "qemu_blockjob.h" #include "qemu_checkpoint.h" #include "qemu_validate.h" +#include "qemu_domain_namespace.h" #include "viralloc.h" #include "virlog.h" #include "virerror.h" @@ -65,17 +66,8 @@ #include "virutil.h" #include "virdevmapper.h" -#ifdef __linux__ -# include <sys/sysmacros.h> -#endif #include <sys/time.h> #include <fcntl.h> -#if defined(HAVE_SYS_MOUNT_H) -# include <sys/mount.h> -#endif -#ifdef WITH_SELINUX -# include <selinux/selinux.h> -#endif #define QEMU_QXL_VGAMEM_DEFAULT 16 * 1024 @@ -83,11 +75,6 @@ VIR_LOG_INIT("qemu.qemu_domain"); -VIR_ENUM_IMPL(qemuDomainNamespace, - QEMU_DOMAIN_NS_LAST, - "mount", -); - static void * qemuJobAllocPrivate(void) @@ -239,54 +226,6 @@ qemuDomainLogContextFinalize(GObject *object) G_OBJECT_CLASS(qemu_domain_log_context_parent_class)->finalize(object); } - -bool -qemuDomainNamespaceEnabled(virDomainObjPtr vm, - qemuDomainNamespace ns) -{ - qemuDomainObjPrivatePtr priv = vm->privateData; - - return priv->namespaces && - virBitmapIsBitSet(priv->namespaces, ns); -} - - -static int -qemuDomainEnableNamespace(virDomainObjPtr vm, - qemuDomainNamespace ns) -{ - qemuDomainObjPrivatePtr priv = vm->privateData; - - if (!priv->namespaces && - !(priv->namespaces = virBitmapNew(QEMU_DOMAIN_NS_LAST))) - return -1; - - if (virBitmapSetBit(priv->namespaces, ns) < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to enable namespace: %s"), - qemuDomainNamespaceTypeToString(ns)); - return -1; - } - - return 0; -} - - -static void -qemuDomainDisableNamespace(virDomainObjPtr vm, - qemuDomainNamespace ns) -{ - qemuDomainObjPrivatePtr priv = vm->privateData; - - if (priv->namespaces) { - ignore_value(virBitmapClearBit(priv->namespaces, ns)); - if (virBitmapIsAllClear(priv->namespaces)) { - virBitmapFree(priv->namespaces); - priv->namespaces = NULL; - } - } -} - /* qemuDomainGetMasterKeyFilePath: * @libDir: Directory path to domain lib files * @@ -9799,1791 +9738,6 @@ qemuDomainGetHostdevPath(virDomainHostdevDefPtr dev, } -/** - * qemuDomainGetPreservedMountPath: - * @cfg: driver configuration data - * @vm: domain object - * @mountpoint: mount point path to convert - * - * For given @mountpoint return new path where the mount point - * should be moved temporarily whilst building the namespace. - * - * Returns: allocated string on success which the caller must free, - * NULL on failure. - */ -static char * -qemuDomainGetPreservedMountPath(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, - const char *mountpoint) -{ - char *path = NULL; - char *tmp; - const char *suffix = mountpoint + strlen(QEMU_DEVPREFIX); - g_autofree char *domname = virDomainDefGetShortName(vm->def); - size_t off; - - if (!domname) - return NULL; - - if (STREQ(mountpoint, "/dev")) - suffix = "dev"; - - path = g_strdup_printf("%s/%s.%s", cfg->stateDir, domname, suffix); - - /* Now consider that @mountpoint is "/dev/blah/blah2". - * @suffix then points to "blah/blah2". However, caller - * expects all the @paths to be the same depth. The - * caller doesn't always do `mkdir -p` but sometimes bare - * `touch`. Therefore fix all the suffixes. */ - off = strlen(path) - strlen(suffix); - - tmp = path + off; - while (*tmp) { - if (*tmp == '/') - *tmp = '.'; - tmp++; - } - - return path; -} - - -/** - * qemuDomainGetPreservedMounts: - * - * Process list of mounted filesystems and: - * a) save all FSs mounted under /dev to @devPath - * b) generate backup path for all the entries in a) - * - * Any of the return pointers can be NULL. - * - * Returns 0 on success, -1 otherwise (with error reported) - */ -static int -qemuDomainGetPreservedMounts(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, - char ***devPath, - char ***devSavePath, - size_t *ndevPath) -{ - char **paths = NULL, **mounts = NULL; - size_t i, j, nmounts; - - if (virFileGetMountSubtree(QEMU_PROC_MOUNTS, "/dev", - &mounts, &nmounts) < 0) - goto error; - - if (!nmounts) { - if (ndevPath) - *ndevPath = 0; - return 0; - } - - /* There can be nested mount points. For instance - * /dev/shm/blah can be a mount point and /dev/shm too. It - * doesn't make much sense to return the former path because - * caller preserves the latter (and with that the former - * too). Therefore prune nested mount points. - * NB mounts[0] is "/dev". Should we start the outer loop - * from the beginning of the array all we'd be left with is - * just the first element. Think about it. - */ - for (i = 1; i < nmounts; i++) { - j = i + 1; - while (j < nmounts) { - char *c = STRSKIP(mounts[j], mounts[i]); - - if (c && (*c == '/' || *c == '\0')) { - VIR_DEBUG("Dropping path %s because of %s", mounts[j], mounts[i]); - VIR_DELETE_ELEMENT(mounts, j, nmounts); - } else { - j++; - } - } - } - - if (VIR_ALLOC_N(paths, nmounts) < 0) - goto error; - - for (i = 0; i < nmounts; i++) { - if (!(paths[i] = qemuDomainGetPreservedMountPath(cfg, vm, mounts[i]))) - goto error; - } - - if (devPath) - *devPath = mounts; - else - virStringListFreeCount(mounts, nmounts); - - if (devSavePath) - *devSavePath = paths; - else - virStringListFreeCount(paths, nmounts); - - if (ndevPath) - *ndevPath = nmounts; - - return 0; - - error: - virStringListFreeCount(mounts, nmounts); - virStringListFreeCount(paths, nmounts); - return -1; -} - - -struct qemuDomainCreateDeviceData { - const char *path; /* Path to temp new /dev location */ - char * const *devMountsPath; - size_t ndevMountsPath; -}; - - -static int -qemuDomainCreateDeviceRecursive(const char *device, - const struct qemuDomainCreateDeviceData *data, - bool allow_noent, - unsigned int ttl) -{ - g_autofree char *devicePath = NULL; - g_autofree char *target = NULL; - GStatBuf sb; - int ret = -1; - bool isLink = false; - bool isDev = false; - bool isReg = false; - bool isDir = false; - bool create = false; -#ifdef WITH_SELINUX - char *tcon = NULL; -#endif - - if (!ttl) { - virReportSystemError(ELOOP, - _("Too many levels of symbolic links: %s"), - device); - return ret; - } - - if (g_lstat(device, &sb) < 0) { - if (errno == ENOENT && allow_noent) { - /* Ignore non-existent device. */ - return 0; - } - virReportSystemError(errno, _("Unable to stat %s"), device); - return ret; - } - - isLink = S_ISLNK(sb.st_mode); - isDev = S_ISCHR(sb.st_mode) || S_ISBLK(sb.st_mode); - isReg = S_ISREG(sb.st_mode) || S_ISFIFO(sb.st_mode) || S_ISSOCK(sb.st_mode); - isDir = S_ISDIR(sb.st_mode); - - /* Here, @device might be whatever path in the system. We - * should create the path in the namespace iff it's "/dev" - * prefixed. However, if it is a symlink, we need to traverse - * it too (it might point to something in "/dev"). Just - * consider: - * - * /var/sym1 -> /var/sym2 -> /dev/sda (because users can) - * - * This means, "/var/sym1" is not created (it's shared with - * the parent namespace), nor "/var/sym2", but "/dev/sda". - * - * TODO Remove all `.' and `..' from the @device path. - * Otherwise we might get fooled with `/dev/../var/my_image'. - * For now, lets hope callers play nice. - */ - if (STRPREFIX(device, QEMU_DEVPREFIX)) { - size_t i; - - for (i = 0; i < data->ndevMountsPath; i++) { - if (STREQ(data->devMountsPath[i], "/dev")) - continue; - if (STRPREFIX(device, data->devMountsPath[i])) - break; - } - - if (i == data->ndevMountsPath) { - /* Okay, @device is in /dev but not in any mount point under /dev. - * Create it. */ - devicePath = g_strdup_printf("%s/%s", data->path, - device + strlen(QEMU_DEVPREFIX)); - - if (virFileMakeParentPath(devicePath) < 0) { - virReportSystemError(errno, - _("Unable to create %s"), - devicePath); - goto cleanup; - } - VIR_DEBUG("Creating dev %s", device); - create = true; - } else { - VIR_DEBUG("Skipping dev %s because of %s mount point", - device, data->devMountsPath[i]); - } - } - - if (isLink) { - g_autoptr(GError) gerr = NULL; - - /* We are dealing with a symlink. Create a dangling symlink and descend - * down one level which hopefully creates the symlink's target. */ - if (!(target = g_file_read_link(device, &gerr))) { - virReportError(VIR_ERR_SYSTEM_ERROR, - _("failed to resolve symlink %s: %s"), device, gerr->message); - goto cleanup; - } - - if (create && - symlink(target, devicePath) < 0) { - if (errno == EEXIST) { - ret = 0; - } else { - virReportSystemError(errno, - _("unable to create symlink %s"), - devicePath); - } - goto cleanup; - } - - /* Tricky part. If the target starts with a slash then we need to take - * it as it is. Otherwise we need to replace the last component in the - * original path with the link target: - * /dev/rtc -> rtc0 (want /dev/rtc0) - * /dev/disk/by-id/ata-SanDisk_SDSSDXPS480G_161101402485 -> ../../sda - * (want /dev/disk/by-id/../../sda) - * /dev/stdout -> /proc/self/fd/1 (no change needed) - */ - if (!g_path_is_absolute(target)) { - g_autofree char *devTmp = g_strdup(device); - char *c = NULL, *tmp = NULL; - - if ((c = strrchr(devTmp, '/'))) - *(c + 1) = '\0'; - - tmp = g_strdup_printf("%s%s", devTmp, target); - VIR_FREE(target); - target = g_steal_pointer(&tmp); - } - - if (qemuDomainCreateDeviceRecursive(target, data, - allow_noent, ttl - 1) < 0) - goto cleanup; - } else if (isDev) { - if (create) { - unlink(devicePath); - if (mknod(devicePath, sb.st_mode, sb.st_rdev) < 0) { - virReportSystemError(errno, - _("Failed to make device %s"), - devicePath); - goto cleanup; - } - } - } else if (isReg) { - if (create && - virFileTouch(devicePath, sb.st_mode) < 0) - goto cleanup; - /* Just create the file here so that code below sets - * proper owner and mode. Bind mount only after that. */ - } else if (isDir) { - if (create && - virFileMakePathWithMode(devicePath, sb.st_mode) < 0) { - virReportSystemError(errno, - _("Unable to make dir %s"), - devicePath); - goto cleanup; - } - } else { - virReportError(VIR_ERR_OPERATION_UNSUPPORTED, - _("unsupported device type %s 0%o"), - device, sb.st_mode); - goto cleanup; - } - - if (!create) { - ret = 0; - goto cleanup; - } - - if (lchown(devicePath, sb.st_uid, sb.st_gid) < 0) { - virReportSystemError(errno, - _("Failed to chown device %s"), - devicePath); - goto cleanup; - } - - /* Symlinks don't have mode */ - if (!isLink && - chmod(devicePath, sb.st_mode) < 0) { - virReportSystemError(errno, - _("Failed to set permissions for device %s"), - devicePath); - goto cleanup; - } - - /* Symlinks don't have ACLs. */ - if (!isLink && - virFileCopyACLs(device, devicePath) < 0 && - errno != ENOTSUP) { - virReportSystemError(errno, - _("Failed to copy ACLs on device %s"), - devicePath); - goto cleanup; - } - -#ifdef WITH_SELINUX - if (lgetfilecon_raw(device, &tcon) < 0 && - (errno != ENOTSUP && errno != ENODATA)) { - virReportSystemError(errno, - _("Unable to get SELinux label from %s"), - device); - goto cleanup; - } - - if (tcon && - lsetfilecon_raw(devicePath, (const char *)tcon) < 0) { - VIR_WARNINGS_NO_WLOGICALOP_EQUAL_EXPR - if (errno != EOPNOTSUPP && errno != ENOTSUP) { - VIR_WARNINGS_RESET - virReportSystemError(errno, - _("Unable to set SELinux label on %s"), - devicePath); - goto cleanup; - } - } -#endif - - /* Finish mount process started earlier. */ - if ((isReg || isDir) && - virFileBindMountDevice(device, devicePath) < 0) - goto cleanup; - - ret = 0; - cleanup: -#ifdef WITH_SELINUX - freecon(tcon); -#endif - return ret; -} - - -static int -qemuDomainCreateDevice(const char *device, - const struct qemuDomainCreateDeviceData *data, - bool allow_noent) -{ - long symloop_max = sysconf(_SC_SYMLOOP_MAX); - - return qemuDomainCreateDeviceRecursive(device, data, - allow_noent, symloop_max); -} - - -static int -qemuDomainPopulateDevices(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm G_GNUC_UNUSED, - const struct qemuDomainCreateDeviceData *data) -{ - const char *const *devices = (const char *const *) cfg->cgroupDeviceACL; - size_t i; - - if (!devices) - devices = defaultDeviceACL; - - for (i = 0; devices[i]; i++) { - if (qemuDomainCreateDevice(devices[i], data, true) < 0) - return -1; - } - - return 0; -} - - -static int -qemuDomainSetupDev(virQEMUDriverConfigPtr cfg, - virSecurityManagerPtr mgr, - virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) -{ - g_autofree char *mount_options = NULL; - g_autofree char *opts = NULL; - - VIR_DEBUG("Setting up /dev/ for domain %s", vm->def->name); - - mount_options = qemuSecurityGetMountOptions(mgr, vm->def); - - if (!mount_options) - mount_options = g_strdup(""); - - /* - * tmpfs is limited to 64kb, since we only have device nodes in there - * and don't want to DOS the entire OS RAM usage - */ - opts = g_strdup_printf("mode=755,size=65536%s", mount_options); - - if (virFileSetupDev(data->path, opts) < 0) - return -1; - - if (qemuDomainPopulateDevices(cfg, vm, data) < 0) - return -1; - - return 0; -} - - -static int -qemuDomainSetupDisk(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainDiskDefPtr disk, - const struct qemuDomainCreateDeviceData *data) -{ - virStorageSourcePtr next; - bool hasNVMe = false; - - for (next = disk->src; virStorageSourceIsBacking(next); next = next->backingStore) { - VIR_AUTOSTRINGLIST targetPaths = NULL; - size_t i; - - if (next->type == VIR_STORAGE_TYPE_NVME) { - g_autofree char *nvmePath = NULL; - - hasNVMe = true; - - if (!(nvmePath = virPCIDeviceAddressGetIOMMUGroupDev(&next->nvme->pciAddr))) - return -1; - - if (qemuDomainCreateDevice(nvmePath, data, false) < 0) - return -1; - } else { - if (!next->path || !virStorageSourceIsLocalStorage(next)) { - /* Not creating device. Just continue. */ - continue; - } - - if (qemuDomainCreateDevice(next->path, data, false) < 0) - return -1; - - if (virDevMapperGetTargets(next->path, &targetPaths) < 0 && - errno != ENOSYS) { - virReportSystemError(errno, - _("Unable to get devmapper targets for %s"), - next->path); - return -1; - } - - for (i = 0; targetPaths && targetPaths[i]; i++) { - if (qemuDomainCreateDevice(targetPaths[i], data, false) < 0) - return -1; - } - } - } - - /* qemu-pr-helper might require access to /dev/mapper/control. */ - if (disk->src->pr && - qemuDomainCreateDevice(QEMU_DEVICE_MAPPER_CONTROL_PATH, data, true) < 0) - return -1; - - if (hasNVMe && - qemuDomainCreateDevice(QEMU_DEV_VFIO, data, false) < 0) - return -1; - - return 0; -} - - -static int -qemuDomainSetupAllDisks(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) -{ - size_t i; - VIR_DEBUG("Setting up disks"); - - for (i = 0; i < vm->def->ndisks; i++) { - if (qemuDomainSetupDisk(cfg, - vm->def->disks[i], - data) < 0) - return -1; - } - - VIR_DEBUG("Setup all disks"); - return 0; -} - - -static int -qemuDomainSetupHostdev(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainHostdevDefPtr dev, - const struct qemuDomainCreateDeviceData *data) -{ - g_autofree char *path = NULL; - - if (qemuDomainGetHostdevPath(dev, &path, NULL) < 0) - return -1; - - if (path && qemuDomainCreateDevice(path, data, false) < 0) - return -1; - - if (qemuHostdevNeedsVFIO(dev) && - qemuDomainCreateDevice(QEMU_DEV_VFIO, data, false) < 0) - return -1; - - return 0; -} - - -static int -qemuDomainSetupAllHostdevs(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) -{ - size_t i; - - VIR_DEBUG("Setting up hostdevs"); - for (i = 0; i < vm->def->nhostdevs; i++) { - if (qemuDomainSetupHostdev(cfg, - vm->def->hostdevs[i], - data) < 0) - return -1; - } - VIR_DEBUG("Setup all hostdevs"); - return 0; -} - - -static int -qemuDomainSetupMemory(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainMemoryDefPtr mem, - const struct qemuDomainCreateDeviceData *data) -{ - if (mem->model != VIR_DOMAIN_MEMORY_MODEL_NVDIMM) - return 0; - - return qemuDomainCreateDevice(mem->nvdimmPath, data, false); -} - - -static int -qemuDomainSetupAllMemories(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) -{ - size_t i; - - VIR_DEBUG("Setting up memories"); - for (i = 0; i < vm->def->nmems; i++) { - if (qemuDomainSetupMemory(cfg, - vm->def->mems[i], - data) < 0) - return -1; - } - VIR_DEBUG("Setup all memories"); - return 0; -} - - -static int -qemuDomainSetupChardev(virDomainDefPtr def G_GNUC_UNUSED, - virDomainChrDefPtr dev, - void *opaque) -{ - const struct qemuDomainCreateDeviceData *data = opaque; - const char *path = NULL; - - if (!(path = virDomainChrSourceDefGetPath(dev->source))) - return 0; - - /* Socket created by qemu. It doesn't exist upfront. */ - if (dev->source->type == VIR_DOMAIN_CHR_TYPE_UNIX && - dev->source->data.nix.listen) - return 0; - - return qemuDomainCreateDevice(path, data, true); -} - - -static int -qemuDomainSetupAllChardevs(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) -{ - VIR_DEBUG("Setting up chardevs"); - - if (virDomainChrDefForeach(vm->def, - true, - qemuDomainSetupChardev, - (void *)data) < 0) - return -1; - - VIR_DEBUG("Setup all chardevs"); - return 0; -} - - -static int -qemuDomainSetupTPM(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainTPMDefPtr dev, - const struct qemuDomainCreateDeviceData *data) -{ - switch (dev->type) { - case VIR_DOMAIN_TPM_TYPE_PASSTHROUGH: - if (qemuDomainCreateDevice(dev->data.passthrough.source.data.file.path, - data, false) < 0) - return -1; - break; - - case VIR_DOMAIN_TPM_TYPE_EMULATOR: - case VIR_DOMAIN_TPM_TYPE_LAST: - /* nada */ - break; - } - - return 0; -} - - -static int -qemuDomainSetupAllTPMs(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) -{ - size_t i; - - VIR_DEBUG("Setting up TPMs"); - - for (i = 0; i < vm->def->ntpms; i++) { - if (qemuDomainSetupTPM(cfg, vm->def->tpms[i], data) < 0) - return -1; - } - - VIR_DEBUG("Setup all TPMs"); - return 0; -} - - -static int -qemuDomainSetupGraphics(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainGraphicsDefPtr gfx, - const struct qemuDomainCreateDeviceData *data) -{ - const char *rendernode = virDomainGraphicsGetRenderNode(gfx); - - if (!rendernode) - return 0; - - return qemuDomainCreateDevice(rendernode, data, false); -} - - -static int -qemuDomainSetupAllGraphics(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) -{ - size_t i; - - VIR_DEBUG("Setting up graphics"); - for (i = 0; i < vm->def->ngraphics; i++) { - if (qemuDomainSetupGraphics(cfg, - vm->def->graphics[i], - data) < 0) - return -1; - } - - VIR_DEBUG("Setup all graphics"); - return 0; -} - - -static int -qemuDomainSetupInput(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainInputDefPtr input, - const struct qemuDomainCreateDeviceData *data) -{ - const char *path = virDomainInputDefGetPath(input); - - if (path && qemuDomainCreateDevice(path, data, false) < 0) - return -1; - - return 0; -} - - -static int -qemuDomainSetupAllInputs(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) -{ - size_t i; - - VIR_DEBUG("Setting up inputs"); - for (i = 0; i < vm->def->ninputs; i++) { - if (qemuDomainSetupInput(cfg, - vm->def->inputs[i], - data) < 0) - return -1; - } - VIR_DEBUG("Setup all inputs"); - return 0; -} - - -static int -qemuDomainSetupRNG(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainRNGDefPtr rng, - const struct qemuDomainCreateDeviceData *data) -{ - switch ((virDomainRNGBackend) rng->backend) { - case VIR_DOMAIN_RNG_BACKEND_RANDOM: - if (qemuDomainCreateDevice(rng->source.file, data, false) < 0) - return -1; - break; - - case VIR_DOMAIN_RNG_BACKEND_EGD: - case VIR_DOMAIN_RNG_BACKEND_BUILTIN: - case VIR_DOMAIN_RNG_BACKEND_LAST: - /* nada */ - break; - } - - return 0; -} - - -static int -qemuDomainSetupAllRNGs(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) -{ - size_t i; - - VIR_DEBUG("Setting up RNGs"); - for (i = 0; i < vm->def->nrngs; i++) { - if (qemuDomainSetupRNG(cfg, - vm->def->rngs[i], - data) < 0) - return -1; - } - - VIR_DEBUG("Setup all RNGs"); - return 0; -} - - -static int -qemuDomainSetupLoader(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) -{ - virDomainLoaderDefPtr loader = vm->def->os.loader; - - VIR_DEBUG("Setting up loader"); - - if (loader) { - switch ((virDomainLoader) loader->type) { - case VIR_DOMAIN_LOADER_TYPE_ROM: - if (qemuDomainCreateDevice(loader->path, data, false) < 0) - return -1; - break; - - case VIR_DOMAIN_LOADER_TYPE_PFLASH: - if (qemuDomainCreateDevice(loader->path, data, false) < 0) - return -1; - - if (loader->nvram && - qemuDomainCreateDevice(loader->nvram, data, false) < 0) - return -1; - break; - - case VIR_DOMAIN_LOADER_TYPE_NONE: - case VIR_DOMAIN_LOADER_TYPE_LAST: - break; - } - } - - VIR_DEBUG("Setup loader"); - return 0; -} - - -static int -qemuDomainSetupLaunchSecurity(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) -{ - virDomainSEVDefPtr sev = vm->def->sev; - - if (!sev || sev->sectype != VIR_DOMAIN_LAUNCH_SECURITY_SEV) - return 0; - - VIR_DEBUG("Setting up launch security"); - - if (qemuDomainCreateDevice(QEMU_DEV_SEV, data, false) < 0) - return -1; - - VIR_DEBUG("Set up launch security"); - return 0; -} - - -int -qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, - virSecurityManagerPtr mgr, - virDomainObjPtr vm) -{ - struct qemuDomainCreateDeviceData data; - const char *devPath = NULL; - char **devMountsPath = NULL, **devMountsSavePath = NULL; - size_t ndevMountsPath = 0, i; - int ret = -1; - - if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) { - ret = 0; - goto cleanup; - } - - if (qemuDomainGetPreservedMounts(cfg, vm, - &devMountsPath, &devMountsSavePath, - &ndevMountsPath) < 0) - goto cleanup; - - for (i = 0; i < ndevMountsPath; i++) { - if (STREQ(devMountsPath[i], "/dev")) { - devPath = devMountsSavePath[i]; - break; - } - } - - if (!devPath) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Unable to find any /dev mount")); - goto cleanup; - } - - data.path = devPath; - data.devMountsPath = devMountsPath; - data.ndevMountsPath = ndevMountsPath; - - if (virProcessSetupPrivateMountNS() < 0) - goto cleanup; - - if (qemuDomainSetupDev(cfg, mgr, vm, &data) < 0) - goto cleanup; - - if (qemuDomainSetupAllDisks(cfg, vm, &data) < 0) - goto cleanup; - - if (qemuDomainSetupAllHostdevs(cfg, vm, &data) < 0) - goto cleanup; - - if (qemuDomainSetupAllMemories(cfg, vm, &data) < 0) - goto cleanup; - - if (qemuDomainSetupAllChardevs(cfg, vm, &data) < 0) - goto cleanup; - - if (qemuDomainSetupAllTPMs(cfg, vm, &data) < 0) - goto cleanup; - - if (qemuDomainSetupAllGraphics(cfg, vm, &data) < 0) - goto cleanup; - - if (qemuDomainSetupAllInputs(cfg, vm, &data) < 0) - goto cleanup; - - if (qemuDomainSetupAllRNGs(cfg, vm, &data) < 0) - goto cleanup; - - if (qemuDomainSetupLoader(cfg, vm, &data) < 0) - goto cleanup; - - if (qemuDomainSetupLaunchSecurity(cfg, vm, &data) < 0) - goto cleanup; - - /* Save some mount points because we want to share them with the host */ - for (i = 0; i < ndevMountsPath; i++) { - struct stat sb; - - if (devMountsSavePath[i] == devPath) - continue; - - if (stat(devMountsPath[i], &sb) < 0) { - virReportSystemError(errno, - _("Unable to stat: %s"), - devMountsPath[i]); - goto cleanup; - } - - /* At this point, devMountsPath is either: - * a file (regular or special), or - * a directory. */ - if ((S_ISDIR(sb.st_mode) && virFileMakePath(devMountsSavePath[i]) < 0) || - (!S_ISDIR(sb.st_mode) && virFileTouch(devMountsSavePath[i], sb.st_mode) < 0)) { - virReportSystemError(errno, - _("Failed to create %s"), - devMountsSavePath[i]); - goto cleanup; - } - - if (virFileMoveMount(devMountsPath[i], devMountsSavePath[i]) < 0) - goto cleanup; - } - - if (virFileMoveMount(devPath, "/dev") < 0) - goto cleanup; - - for (i = 0; i < ndevMountsPath; i++) { - struct stat sb; - - if (devMountsSavePath[i] == devPath) - continue; - - if (stat(devMountsSavePath[i], &sb) < 0) { - virReportSystemError(errno, - _("Unable to stat: %s"), - devMountsSavePath[i]); - goto cleanup; - } - - if (S_ISDIR(sb.st_mode)) { - if (virFileMakePath(devMountsPath[i]) < 0) { - virReportSystemError(errno, _("Cannot create %s"), - devMountsPath[i]); - goto cleanup; - } - } else { - if (virFileMakeParentPath(devMountsPath[i]) < 0 || - virFileTouch(devMountsPath[i], sb.st_mode) < 0) { - virReportSystemError(errno, _("Cannot create %s"), - devMountsPath[i]); - goto cleanup; - } - } - - if (virFileMoveMount(devMountsSavePath[i], devMountsPath[i]) < 0) - goto cleanup; - } - - ret = 0; - cleanup: - for (i = 0; i < ndevMountsPath; i++) { -#if defined(__linux__) - umount(devMountsSavePath[i]); -#endif /* defined(__linux__) */ - /* The path can be either a regular file or a dir. */ - if (virFileIsDir(devMountsSavePath[i])) - virFileDeleteTree(devMountsSavePath[i]); - else - unlink(devMountsSavePath[i]); - } - virStringListFreeCount(devMountsPath, ndevMountsPath); - virStringListFreeCount(devMountsSavePath, ndevMountsPath); - return ret; -} - - -int -qemuDomainCreateNamespace(virQEMUDriverPtr driver, - virDomainObjPtr vm) -{ - g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); - - if (virBitmapIsBitSet(cfg->namespaces, QEMU_DOMAIN_NS_MOUNT) && - qemuDomainEnableNamespace(vm, QEMU_DOMAIN_NS_MOUNT) < 0) - return -1; - - return 0; -} - - -void -qemuDomainDestroyNamespace(virQEMUDriverPtr driver G_GNUC_UNUSED, - virDomainObjPtr vm) -{ - if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) - qemuDomainDisableNamespace(vm, QEMU_DOMAIN_NS_MOUNT); -} - - -bool -qemuDomainNamespaceAvailable(qemuDomainNamespace ns G_GNUC_UNUSED) -{ -#if !defined(__linux__) - /* Namespaces are Linux specific. */ - return false; - -#else /* defined(__linux__) */ - - switch (ns) { - case QEMU_DOMAIN_NS_MOUNT: -# if !defined(HAVE_SYS_ACL_H) || !defined(WITH_SELINUX) - /* We can't create the exact copy of paths if either of - * these is not available. */ - return false; -# else - if (virProcessNamespaceAvailable(VIR_PROCESS_NAMESPACE_MNT) < 0) - return false; -# endif - break; - case QEMU_DOMAIN_NS_LAST: - break; - } - - return true; -#endif /* defined(__linux__) */ -} - - -struct qemuDomainAttachDeviceMknodData { - virQEMUDriverPtr driver; - virDomainObjPtr vm; - const char *file; - const char *target; - GStatBuf sb; - void *acl; -#ifdef WITH_SELINUX - char *tcon; -#endif -}; - - -/* Our way of creating devices is highly linux specific */ -#if defined(__linux__) -static int -qemuDomainAttachDeviceMknodHelper(pid_t pid G_GNUC_UNUSED, - void *opaque) -{ - struct qemuDomainAttachDeviceMknodData *data = opaque; - int ret = -1; - bool delDevice = false; - bool isLink = S_ISLNK(data->sb.st_mode); - bool isDev = S_ISCHR(data->sb.st_mode) || S_ISBLK(data->sb.st_mode); - bool isReg = S_ISREG(data->sb.st_mode) || S_ISFIFO(data->sb.st_mode) || S_ISSOCK(data->sb.st_mode); - bool isDir = S_ISDIR(data->sb.st_mode); - - qemuSecurityPostFork(data->driver->securityManager); - - if (virFileMakeParentPath(data->file) < 0) { - virReportSystemError(errno, - _("Unable to create %s"), data->file); - goto cleanup; - } - - if (isLink) { - VIR_DEBUG("Creating symlink %s -> %s", data->file, data->target); - - /* First, unlink the symlink target. Symlinks change and - * therefore we have no guarantees that pre-existing - * symlink is still valid. */ - if (unlink(data->file) < 0 && - errno != ENOENT) { - virReportSystemError(errno, - _("Unable to remove symlink %s"), - data->file); - goto cleanup; - } - - if (symlink(data->target, data->file) < 0) { - virReportSystemError(errno, - _("Unable to create symlink %s (pointing to %s)"), - data->file, data->target); - goto cleanup; - } else { - delDevice = true; - } - } else if (isDev) { - VIR_DEBUG("Creating dev %s (%d,%d)", - data->file, major(data->sb.st_rdev), minor(data->sb.st_rdev)); - unlink(data->file); - if (mknod(data->file, data->sb.st_mode, data->sb.st_rdev) < 0) { - virReportSystemError(errno, - _("Unable to create device %s"), - data->file); - goto cleanup; - } else { - delDevice = true; - } - } else if (isReg || isDir) { - /* We are not cleaning up disks on virDomainDetachDevice - * because disk might be still in use by different disk - * as its backing chain. This might however clash here. - * Therefore do the cleanup here. */ - if (umount(data->file) < 0 && - errno != ENOENT && errno != EINVAL) { - virReportSystemError(errno, - _("Unable to umount %s"), - data->file); - goto cleanup; - } - if ((isReg && virFileTouch(data->file, data->sb.st_mode) < 0) || - (isDir && virFileMakePathWithMode(data->file, data->sb.st_mode) < 0)) - goto cleanup; - delDevice = true; - /* Just create the file here so that code below sets - * proper owner and mode. Move the mount only after that. */ - } else { - virReportError(VIR_ERR_OPERATION_UNSUPPORTED, - _("unsupported device type %s 0%o"), - data->file, data->sb.st_mode); - goto cleanup; - } - - if (lchown(data->file, data->sb.st_uid, data->sb.st_gid) < 0) { - virReportSystemError(errno, - _("Failed to chown device %s"), - data->file); - goto cleanup; - } - - /* Symlinks don't have mode */ - if (!isLink && - chmod(data->file, data->sb.st_mode) < 0) { - virReportSystemError(errno, - _("Failed to set permissions for device %s"), - data->file); - goto cleanup; - } - - /* Symlinks don't have ACLs. */ - if (!isLink && - virFileSetACLs(data->file, data->acl) < 0 && - errno != ENOTSUP) { - virReportSystemError(errno, - _("Unable to set ACLs on %s"), data->file); - goto cleanup; - } - -# ifdef WITH_SELINUX - if (data->tcon && - lsetfilecon_raw(data->file, (const char *)data->tcon) < 0) { - VIR_WARNINGS_NO_WLOGICALOP_EQUAL_EXPR - if (errno != EOPNOTSUPP && errno != ENOTSUP) { - VIR_WARNINGS_RESET - virReportSystemError(errno, - _("Unable to set SELinux label on %s"), - data->file); - goto cleanup; - } - } -# endif - - /* Finish mount process started earlier. */ - if ((isReg || isDir) && - virFileMoveMount(data->target, data->file) < 0) - goto cleanup; - - ret = 0; - cleanup: - if (ret < 0 && delDevice) { - if (isDir) - virFileDeleteTree(data->file); - else - unlink(data->file); - } -# ifdef WITH_SELINUX - freecon(data->tcon); -# endif - virFileFreeACLs(&data->acl); - return ret; -} - - -static int -qemuDomainAttachDeviceMknodRecursive(virQEMUDriverPtr driver, - virDomainObjPtr vm, - const char *file, - char * const *devMountsPath, - size_t ndevMountsPath, - unsigned int ttl) -{ - g_autoptr(virQEMUDriverConfig) cfg = NULL; - struct qemuDomainAttachDeviceMknodData data; - int ret = -1; - g_autofree char *target = NULL; - bool isLink; - bool isReg; - bool isDir; - - if (!ttl) { - virReportSystemError(ELOOP, - _("Too many levels of symbolic links: %s"), - file); - return ret; - } - - memset(&data, 0, sizeof(data)); - - data.driver = driver; - data.vm = vm; - data.file = file; - - if (g_lstat(file, &data.sb) < 0) { - virReportSystemError(errno, - _("Unable to access %s"), file); - return ret; - } - - isLink = S_ISLNK(data.sb.st_mode); - isReg = S_ISREG(data.sb.st_mode) || S_ISFIFO(data.sb.st_mode) || S_ISSOCK(data.sb.st_mode); - isDir = S_ISDIR(data.sb.st_mode); - - if ((isReg || isDir) && STRPREFIX(file, QEMU_DEVPREFIX)) { - cfg = virQEMUDriverGetConfig(driver); - if (!(target = qemuDomainGetPreservedMountPath(cfg, vm, file))) - goto cleanup; - - if (virFileBindMountDevice(file, target) < 0) - goto cleanup; - - data.target = target; - } else if (isLink) { - g_autoptr(GError) gerr = NULL; - - if (!(target = g_file_read_link(file, &gerr))) { - virReportError(VIR_ERR_SYSTEM_ERROR, - _("failed to resolve symlink %s: %s"), file, gerr->message); - return ret; - } - - if (!g_path_is_absolute(target)) { - g_autofree char *fileTmp = g_strdup(file); - char *c = NULL, *tmp = NULL; - - if ((c = strrchr(fileTmp, '/'))) - *(c + 1) = '\0'; - - tmp = g_strdup_printf("%s%s", fileTmp, target); - VIR_FREE(target); - target = g_steal_pointer(&tmp); - } - - data.target = target; - } - - /* Symlinks don't have ACLs. */ - if (!isLink && - virFileGetACLs(file, &data.acl) < 0 && - errno != ENOTSUP) { - virReportSystemError(errno, - _("Unable to get ACLs on %s"), file); - goto cleanup; - } - -# ifdef WITH_SELINUX - if (lgetfilecon_raw(file, &data.tcon) < 0 && - (errno != ENOTSUP && errno != ENODATA)) { - virReportSystemError(errno, - _("Unable to get SELinux label from %s"), file); - goto cleanup; - } -# endif - - if (STRPREFIX(file, QEMU_DEVPREFIX)) { - size_t i; - - for (i = 0; i < ndevMountsPath; i++) { - if (STREQ(devMountsPath[i], "/dev")) - continue; - if (STRPREFIX(file, devMountsPath[i])) - break; - } - - if (i == ndevMountsPath) { - if (qemuSecurityPreFork(driver->securityManager) < 0) - goto cleanup; - - if (virProcessRunInMountNamespace(vm->pid, - qemuDomainAttachDeviceMknodHelper, - &data) < 0) { - qemuSecurityPostFork(driver->securityManager); - goto cleanup; - } - qemuSecurityPostFork(driver->securityManager); - } else { - VIR_DEBUG("Skipping dev %s because of %s mount point", - file, devMountsPath[i]); - } - } - - if (isLink && - qemuDomainAttachDeviceMknodRecursive(driver, vm, target, - devMountsPath, ndevMountsPath, - ttl -1) < 0) - goto cleanup; - - ret = 0; - cleanup: -# ifdef WITH_SELINUX - freecon(data.tcon); -# endif - virFileFreeACLs(&data.acl); - if (isReg && target) - umount(target); - return ret; -} - - -#else /* !defined(__linux__) */ - - -static int -qemuDomainAttachDeviceMknodRecursive(virQEMUDriverPtr driver G_GNUC_UNUSED, - virDomainObjPtr vm G_GNUC_UNUSED, - const char *file G_GNUC_UNUSED, - char * const *devMountsPath G_GNUC_UNUSED, - size_t ndevMountsPath G_GNUC_UNUSED, - unsigned int ttl G_GNUC_UNUSED) -{ - virReportSystemError(ENOSYS, "%s", - _("Namespaces are not supported on this platform.")); - return -1; -} - - -#endif /* !defined(__linux__) */ - - -static int -qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, - virDomainObjPtr vm, - const char *file, - char * const *devMountsPath, - size_t ndevMountsPath) -{ - long symloop_max = sysconf(_SC_SYMLOOP_MAX); - - return qemuDomainAttachDeviceMknodRecursive(driver, vm, file, - devMountsPath, ndevMountsPath, - symloop_max); -} - - -static int -qemuDomainDetachDeviceUnlinkHelper(pid_t pid G_GNUC_UNUSED, - void *opaque) -{ - const char *path = opaque; - - VIR_DEBUG("Unlinking %s", path); - if (unlink(path) < 0 && errno != ENOENT) { - virReportSystemError(errno, - _("Unable to remove device %s"), path); - return -1; - } - - return 0; -} - - -static int -qemuDomainDetachDeviceUnlink(virQEMUDriverPtr driver G_GNUC_UNUSED, - virDomainObjPtr vm, - const char *file, - char * const *devMountsPath, - size_t ndevMountsPath) -{ - size_t i; - - if (STRPREFIX(file, QEMU_DEVPREFIX)) { - for (i = 0; i < ndevMountsPath; i++) { - if (STREQ(devMountsPath[i], "/dev")) - continue; - if (STRPREFIX(file, devMountsPath[i])) - break; - } - - if (i == ndevMountsPath) { - if (virProcessRunInMountNamespace(vm->pid, - qemuDomainDetachDeviceUnlinkHelper, - (void *)file) < 0) - return -1; - } - } - - return 0; -} - - -static int -qemuDomainNamespaceMknodPaths(virDomainObjPtr vm, - const char **paths, - size_t npaths) -{ - qemuDomainObjPrivatePtr priv = vm->privateData; - virQEMUDriverPtr driver = priv->driver; - g_autoptr(virQEMUDriverConfig) cfg = NULL; - char **devMountsPath = NULL; - size_t ndevMountsPath = 0; - int ret = -1; - size_t i; - - if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) || - !npaths) - return 0; - - cfg = virQEMUDriverGetConfig(driver); - if (qemuDomainGetPreservedMounts(cfg, vm, - &devMountsPath, NULL, - &ndevMountsPath) < 0) - goto cleanup; - - for (i = 0; i < npaths; i++) { - if (qemuDomainAttachDeviceMknod(driver, - vm, - paths[i], - devMountsPath, ndevMountsPath) < 0) - goto cleanup; - } - - ret = 0; - cleanup: - virStringListFreeCount(devMountsPath, ndevMountsPath); - return ret; -} - - -static int -qemuDomainNamespaceMknodPath(virDomainObjPtr vm, - const char *path) -{ - const char *paths[] = { path }; - - return qemuDomainNamespaceMknodPaths(vm, paths, 1); -} - - -static int -qemuDomainNamespaceUnlinkPaths(virDomainObjPtr vm, - const char **paths, - size_t npaths) -{ - qemuDomainObjPrivatePtr priv = vm->privateData; - virQEMUDriverPtr driver = priv->driver; - g_autoptr(virQEMUDriverConfig) cfg = NULL; - char **devMountsPath = NULL; - size_t ndevMountsPath = 0; - size_t i; - int ret = -1; - - if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) || - !npaths) - return 0; - - cfg = virQEMUDriverGetConfig(driver); - - if (qemuDomainGetPreservedMounts(cfg, vm, - &devMountsPath, NULL, - &ndevMountsPath) < 0) - goto cleanup; - - for (i = 0; i < npaths; i++) { - if (qemuDomainDetachDeviceUnlink(driver, vm, paths[i], - devMountsPath, ndevMountsPath) < 0) - goto cleanup; - } - - ret = 0; - cleanup: - virStringListFreeCount(devMountsPath, ndevMountsPath); - return ret; -} - - -static int -qemuDomainNamespaceUnlinkPath(virDomainObjPtr vm, - const char *path) -{ - const char *paths[] = { path }; - - return qemuDomainNamespaceUnlinkPaths(vm, paths, 1); -} - - -int -qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, - virStorageSourcePtr src) -{ - virStorageSourcePtr next; - VIR_AUTOSTRINGLIST paths = NULL; - size_t npaths = 0; - bool hasNVMe = false; - - for (next = src; virStorageSourceIsBacking(next); next = next->backingStore) { - g_autofree char *tmpPath = NULL; - - if (next->type == VIR_STORAGE_TYPE_NVME) { - hasNVMe = true; - - if (!(tmpPath = virPCIDeviceAddressGetIOMMUGroupDev(&next->nvme->pciAddr))) - return -1; - } else { - VIR_AUTOSTRINGLIST targetPaths = NULL; - - if (virStorageSourceIsEmpty(next) || - !virStorageSourceIsLocalStorage(next)) { - /* Not creating device. Just continue. */ - continue; - } - - tmpPath = g_strdup(next->path); - - if (virDevMapperGetTargets(next->path, &targetPaths) < 0 && - errno != ENOSYS) { - virReportSystemError(errno, - _("Unable to get devmapper targets for %s"), - next->path); - return -1; - } - - if (virStringListMerge(&paths, &targetPaths) < 0) - return -1; - } - - if (virStringListAdd(&paths, tmpPath) < 0) - return -1; - } - - /* qemu-pr-helper might require access to /dev/mapper/control. */ - if (src->pr && - virStringListAdd(&paths, QEMU_DEVICE_MAPPER_CONTROL_PATH) < 0) - return -1; - - if (hasNVMe && - virStringListAdd(&paths, QEMU_DEV_VFIO) < 0) - return -1; - - npaths = virStringListLength((const char **) paths); - if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths, npaths) < 0) - return -1; - - return 0; -} - - -int -qemuDomainNamespaceTeardownDisk(virDomainObjPtr vm G_GNUC_UNUSED, - virStorageSourcePtr src G_GNUC_UNUSED) -{ - /* While in hotplug case we create the whole backing chain, - * here we must limit ourselves. The disk we want to remove - * might be a part of backing chain of another disk. - * If you are reading these lines and have some spare time - * you can come up with and algorithm that checks for that. - * I don't, therefore: */ - return 0; -} - - -/** - * qemuDomainNamespaceSetupHostdev: - * @vm: domain object - * @hostdev: hostdev to create in @vm's namespace - * - * For given @hostdev, create its devfs representation (if it has one) in - * domain namespace. Note, @hostdev must not be in @vm's definition. - * - * Returns: 0 on success, - * -1 otherwise. - */ -int -qemuDomainNamespaceSetupHostdev(virDomainObjPtr vm, - virDomainHostdevDefPtr hostdev) -{ - g_autofree char *path = NULL; - - if (qemuDomainGetHostdevPath(hostdev, &path, NULL) < 0) - return -1; - - if (path && qemuDomainNamespaceMknodPath(vm, path) < 0) - return -1; - - if (qemuHostdevNeedsVFIO(hostdev) && - !qemuDomainNeedsVFIO(vm->def) && - qemuDomainNamespaceMknodPath(vm, QEMU_DEV_VFIO) < 0) - return -1; - - return 0; -} - - -/** - * qemuDomainNamespaceTeardownHostdev: - * @vm: domain object - * @hostdev: hostdev to remove in @vm's namespace - * - * For given @hostdev, remove its devfs representation (if it has one) in - * domain namespace. Note, @hostdev must not be in @vm's definition. - * - * Returns: 0 on success, - * -1 otherwise. - */ -int -qemuDomainNamespaceTeardownHostdev(virDomainObjPtr vm, - virDomainHostdevDefPtr hostdev) -{ - g_autofree char *path = NULL; - - if (qemuDomainGetHostdevPath(hostdev, &path, NULL) < 0) - return -1; - - if (path && qemuDomainNamespaceUnlinkPath(vm, path) < 0) - return -1; - - if (qemuHostdevNeedsVFIO(hostdev) && - !qemuDomainNeedsVFIO(vm->def) && - qemuDomainNamespaceUnlinkPath(vm, QEMU_DEV_VFIO) < 0) - return -1; - - return 0; -} - - -int -qemuDomainNamespaceSetupMemory(virDomainObjPtr vm, - virDomainMemoryDefPtr mem) -{ - if (mem->model != VIR_DOMAIN_MEMORY_MODEL_NVDIMM) - return 0; - - if (qemuDomainNamespaceMknodPath(vm, mem->nvdimmPath) < 0) - return -1; - - return 0; -} - - -int -qemuDomainNamespaceTeardownMemory(virDomainObjPtr vm, - virDomainMemoryDefPtr mem) -{ - if (mem->model != VIR_DOMAIN_MEMORY_MODEL_NVDIMM) - return 0; - - if (qemuDomainNamespaceUnlinkPath(vm, mem->nvdimmPath) < 0) - return -1; - - return 0; -} - - -int -qemuDomainNamespaceSetupChardev(virDomainObjPtr vm, - virDomainChrDefPtr chr) -{ - const char *path; - - if (!(path = virDomainChrSourceDefGetPath(chr->source))) - return 0; - - /* Socket created by qemu. It doesn't exist upfront. */ - if (chr->source->type == VIR_DOMAIN_CHR_TYPE_UNIX && - chr->source->data.nix.listen) - return 0; - - if (qemuDomainNamespaceMknodPath(vm, path) < 0) - return -1; - - return 0; -} - - -int -qemuDomainNamespaceTeardownChardev(virDomainObjPtr vm, - virDomainChrDefPtr chr) -{ - const char *path = NULL; - - if (chr->source->type != VIR_DOMAIN_CHR_TYPE_DEV) - return 0; - - path = chr->source->data.file.path; - - if (qemuDomainNamespaceUnlinkPath(vm, path) < 0) - return -1; - - return 0; -} - - -int -qemuDomainNamespaceSetupRNG(virDomainObjPtr vm, - virDomainRNGDefPtr rng) -{ - const char *path = NULL; - - switch ((virDomainRNGBackend) rng->backend) { - case VIR_DOMAIN_RNG_BACKEND_RANDOM: - path = rng->source.file; - break; - - case VIR_DOMAIN_RNG_BACKEND_EGD: - case VIR_DOMAIN_RNG_BACKEND_BUILTIN: - case VIR_DOMAIN_RNG_BACKEND_LAST: - break; - } - - if (path && qemuDomainNamespaceMknodPath(vm, path) < 0) - return -1; - - return 0; -} - - -int -qemuDomainNamespaceTeardownRNG(virDomainObjPtr vm, - virDomainRNGDefPtr rng) -{ - const char *path = NULL; - - switch ((virDomainRNGBackend) rng->backend) { - case VIR_DOMAIN_RNG_BACKEND_RANDOM: - path = rng->source.file; - break; - - case VIR_DOMAIN_RNG_BACKEND_EGD: - case VIR_DOMAIN_RNG_BACKEND_BUILTIN: - case VIR_DOMAIN_RNG_BACKEND_LAST: - break; - } - - if (path && qemuDomainNamespaceUnlinkPath(vm, path) < 0) - return -1; - - return 0; -} - - -int -qemuDomainNamespaceSetupInput(virDomainObjPtr vm, - virDomainInputDefPtr input) -{ - const char *path = NULL; - - if (!(path = virDomainInputDefGetPath(input))) - return 0; - - if (path && qemuDomainNamespaceMknodPath(vm, path) < 0) - return -1; - return 0; -} - - -int -qemuDomainNamespaceTeardownInput(virDomainObjPtr vm, - virDomainInputDefPtr input) -{ - const char *path = NULL; - - if (!(path = virDomainInputDefGetPath(input))) - return 0; - - if (path && qemuDomainNamespaceUnlinkPath(vm, path) < 0) - return -1; - - return 0; -} - - /** * qemuDomainDiskLookupByNodename: * @def: domain definition to look for the disk diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 29849a7313..3a1bcbbfa3 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -91,15 +91,6 @@ struct _qemuDomainUnpluggingDevice { #define QEMU_DEVICE_MAPPER_CONTROL_PATH "/dev/mapper/control" -typedef enum { - QEMU_DOMAIN_NS_MOUNT = 0, - QEMU_DOMAIN_NS_LAST -} qemuDomainNamespace; -VIR_ENUM_DECL(qemuDomainNamespace); - -bool qemuDomainNamespaceEnabled(virDomainObjPtr vm, - qemuDomainNamespace ns); - /* Type of domain secret */ typedef enum { VIR_DOMAIN_SECRET_INFO_TYPE_PLAIN = 0, @@ -919,54 +910,6 @@ int qemuDomainGetHostdevPath(virDomainHostdevDefPtr dev, char **path, int *perms); -int qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, - virSecurityManagerPtr mgr, - virDomainObjPtr vm); - -int qemuDomainCreateNamespace(virQEMUDriverPtr driver, - virDomainObjPtr vm); - -void qemuDomainDestroyNamespace(virQEMUDriverPtr driver, - virDomainObjPtr vm); - -bool qemuDomainNamespaceAvailable(qemuDomainNamespace ns); - -int qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, - virStorageSourcePtr src); - -int qemuDomainNamespaceTeardownDisk(virDomainObjPtr vm, - virStorageSourcePtr src); - -int qemuDomainNamespaceSetupHostdev(virDomainObjPtr vm, - virDomainHostdevDefPtr hostdev); - -int qemuDomainNamespaceTeardownHostdev(virDomainObjPtr vm, - virDomainHostdevDefPtr hostdev); - -int qemuDomainNamespaceSetupMemory(virDomainObjPtr vm, - virDomainMemoryDefPtr memory); - -int qemuDomainNamespaceTeardownMemory(virDomainObjPtr vm, - virDomainMemoryDefPtr memory); - -int qemuDomainNamespaceSetupChardev(virDomainObjPtr vm, - virDomainChrDefPtr chr); - -int qemuDomainNamespaceTeardownChardev(virDomainObjPtr vm, - virDomainChrDefPtr chr); - -int qemuDomainNamespaceSetupRNG(virDomainObjPtr vm, - virDomainRNGDefPtr rng); - -int qemuDomainNamespaceTeardownRNG(virDomainObjPtr vm, - virDomainRNGDefPtr rng); - -int qemuDomainNamespaceSetupInput(virDomainObjPtr vm, - virDomainInputDefPtr input); - -int qemuDomainNamespaceTeardownInput(virDomainObjPtr vm, - virDomainInputDefPtr input); - virDomainDiskDefPtr qemuDomainDiskLookupByNodename(virDomainDefPtr def, const char *nodename, virStorageSourcePtr *src); diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c new file mode 100644 index 0000000000..1e54cb2153 --- /dev/null +++ b/src/qemu/qemu_domain_namespace.c @@ -0,0 +1,1885 @@ +/* + * qemu_domain_namespace.c: QEMU domain namespace helpers + * + * Copyright (C) 2006-2019 Red Hat, Inc. + * Copyright (C) 2006 Daniel P. Berrange + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#include <config.h> + +#ifdef __linux__ +# include <sys/sysmacros.h> +#endif +#if defined(HAVE_SYS_MOUNT_H) +# include <sys/mount.h> +#endif +#ifdef WITH_SELINUX +# include <selinux/selinux.h> +#endif + +#include "qemu_domain_namespace.h" +#include "qemu_domain.h" +#include "qemu_cgroup.h" +#include "qemu_security.h" +#include "qemu_hostdev.h" +#include "viralloc.h" +#include "virlog.h" +#include "virstring.h" +#include "virdevmapper.h" + +#define VIR_FROM_THIS VIR_FROM_QEMU + +VIR_LOG_INIT("qemu.qemu_domain"); + + +VIR_ENUM_IMPL(qemuDomainNamespace, + QEMU_DOMAIN_NS_LAST, + "mount", +); + + +/** + * qemuDomainGetPreservedMountPath: + * @cfg: driver configuration data + * @vm: domain object + * @mountpoint: mount point path to convert + * + * For given @mountpoint return new path where the mount point + * should be moved temporarily whilst building the namespace. + * + * Returns: allocated string on success which the caller must free, + * NULL on failure. + */ +static char * +qemuDomainGetPreservedMountPath(virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm, + const char *mountpoint) +{ + char *path = NULL; + char *tmp; + const char *suffix = mountpoint + strlen(QEMU_DEVPREFIX); + g_autofree char *domname = virDomainDefGetShortName(vm->def); + size_t off; + + if (!domname) + return NULL; + + if (STREQ(mountpoint, "/dev")) + suffix = "dev"; + + path = g_strdup_printf("%s/%s.%s", cfg->stateDir, domname, suffix); + + /* Now consider that @mountpoint is "/dev/blah/blah2". + * @suffix then points to "blah/blah2". However, caller + * expects all the @paths to be the same depth. The + * caller doesn't always do `mkdir -p` but sometimes bare + * `touch`. Therefore fix all the suffixes. */ + off = strlen(path) - strlen(suffix); + + tmp = path + off; + while (*tmp) { + if (*tmp == '/') + *tmp = '.'; + tmp++; + } + + return path; +} + + +/** + * qemuDomainGetPreservedMounts: + * + * Process list of mounted filesystems and: + * a) save all FSs mounted under /dev to @devPath + * b) generate backup path for all the entries in a) + * + * Any of the return pointers can be NULL. + * + * Returns 0 on success, -1 otherwise (with error reported) + */ +static int +qemuDomainGetPreservedMounts(virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm, + char ***devPath, + char ***devSavePath, + size_t *ndevPath) +{ + char **paths = NULL, **mounts = NULL; + size_t i, j, nmounts; + + if (virFileGetMountSubtree(QEMU_PROC_MOUNTS, "/dev", + &mounts, &nmounts) < 0) + goto error; + + if (!nmounts) { + if (ndevPath) + *ndevPath = 0; + return 0; + } + + /* There can be nested mount points. For instance + * /dev/shm/blah can be a mount point and /dev/shm too. It + * doesn't make much sense to return the former path because + * caller preserves the latter (and with that the former + * too). Therefore prune nested mount points. + * NB mounts[0] is "/dev". Should we start the outer loop + * from the beginning of the array all we'd be left with is + * just the first element. Think about it. + */ + for (i = 1; i < nmounts; i++) { + j = i + 1; + while (j < nmounts) { + char *c = STRSKIP(mounts[j], mounts[i]); + + if (c && (*c == '/' || *c == '\0')) { + VIR_DEBUG("Dropping path %s because of %s", mounts[j], mounts[i]); + VIR_DELETE_ELEMENT(mounts, j, nmounts); + } else { + j++; + } + } + } + + if (VIR_ALLOC_N(paths, nmounts) < 0) + goto error; + + for (i = 0; i < nmounts; i++) { + if (!(paths[i] = qemuDomainGetPreservedMountPath(cfg, vm, mounts[i]))) + goto error; + } + + if (devPath) + *devPath = mounts; + else + virStringListFreeCount(mounts, nmounts); + + if (devSavePath) + *devSavePath = paths; + else + virStringListFreeCount(paths, nmounts); + + if (ndevPath) + *ndevPath = nmounts; + + return 0; + + error: + virStringListFreeCount(mounts, nmounts); + virStringListFreeCount(paths, nmounts); + return -1; +} + + +struct qemuDomainCreateDeviceData { + const char *path; /* Path to temp new /dev location */ + char * const *devMountsPath; + size_t ndevMountsPath; +}; + + +static int +qemuDomainCreateDeviceRecursive(const char *device, + const struct qemuDomainCreateDeviceData *data, + bool allow_noent, + unsigned int ttl) +{ + g_autofree char *devicePath = NULL; + g_autofree char *target = NULL; + GStatBuf sb; + int ret = -1; + bool isLink = false; + bool isDev = false; + bool isReg = false; + bool isDir = false; + bool create = false; +#ifdef WITH_SELINUX + char *tcon = NULL; +#endif + + if (!ttl) { + virReportSystemError(ELOOP, + _("Too many levels of symbolic links: %s"), + device); + return ret; + } + + if (g_lstat(device, &sb) < 0) { + if (errno == ENOENT && allow_noent) { + /* Ignore non-existent device. */ + return 0; + } + virReportSystemError(errno, _("Unable to stat %s"), device); + return ret; + } + + isLink = S_ISLNK(sb.st_mode); + isDev = S_ISCHR(sb.st_mode) || S_ISBLK(sb.st_mode); + isReg = S_ISREG(sb.st_mode) || S_ISFIFO(sb.st_mode) || S_ISSOCK(sb.st_mode); + isDir = S_ISDIR(sb.st_mode); + + /* Here, @device might be whatever path in the system. We + * should create the path in the namespace iff it's "/dev" + * prefixed. However, if it is a symlink, we need to traverse + * it too (it might point to something in "/dev"). Just + * consider: + * + * /var/sym1 -> /var/sym2 -> /dev/sda (because users can) + * + * This means, "/var/sym1" is not created (it's shared with + * the parent namespace), nor "/var/sym2", but "/dev/sda". + * + * TODO Remove all `.' and `..' from the @device path. + * Otherwise we might get fooled with `/dev/../var/my_image'. + * For now, lets hope callers play nice. + */ + if (STRPREFIX(device, QEMU_DEVPREFIX)) { + size_t i; + + for (i = 0; i < data->ndevMountsPath; i++) { + if (STREQ(data->devMountsPath[i], "/dev")) + continue; + if (STRPREFIX(device, data->devMountsPath[i])) + break; + } + + if (i == data->ndevMountsPath) { + /* Okay, @device is in /dev but not in any mount point under /dev. + * Create it. */ + devicePath = g_strdup_printf("%s/%s", data->path, + device + strlen(QEMU_DEVPREFIX)); + + if (virFileMakeParentPath(devicePath) < 0) { + virReportSystemError(errno, + _("Unable to create %s"), + devicePath); + goto cleanup; + } + VIR_DEBUG("Creating dev %s", device); + create = true; + } else { + VIR_DEBUG("Skipping dev %s because of %s mount point", + device, data->devMountsPath[i]); + } + } + + if (isLink) { + g_autoptr(GError) gerr = NULL; + + /* We are dealing with a symlink. Create a dangling symlink and descend + * down one level which hopefully creates the symlink's target. */ + if (!(target = g_file_read_link(device, &gerr))) { + virReportError(VIR_ERR_SYSTEM_ERROR, + _("failed to resolve symlink %s: %s"), device, gerr->message); + goto cleanup; + } + + if (create && + symlink(target, devicePath) < 0) { + if (errno == EEXIST) { + ret = 0; + } else { + virReportSystemError(errno, + _("unable to create symlink %s"), + devicePath); + } + goto cleanup; + } + + /* Tricky part. If the target starts with a slash then we need to take + * it as it is. Otherwise we need to replace the last component in the + * original path with the link target: + * /dev/rtc -> rtc0 (want /dev/rtc0) + * /dev/disk/by-id/ata-SanDisk_SDSSDXPS480G_161101402485 -> ../../sda + * (want /dev/disk/by-id/../../sda) + * /dev/stdout -> /proc/self/fd/1 (no change needed) + */ + if (!g_path_is_absolute(target)) { + g_autofree char *devTmp = g_strdup(device); + char *c = NULL, *tmp = NULL; + + if ((c = strrchr(devTmp, '/'))) + *(c + 1) = '\0'; + + tmp = g_strdup_printf("%s%s", devTmp, target); + VIR_FREE(target); + target = g_steal_pointer(&tmp); + } + + if (qemuDomainCreateDeviceRecursive(target, data, + allow_noent, ttl - 1) < 0) + goto cleanup; + } else if (isDev) { + if (create) { + unlink(devicePath); + if (mknod(devicePath, sb.st_mode, sb.st_rdev) < 0) { + virReportSystemError(errno, + _("Failed to make device %s"), + devicePath); + goto cleanup; + } + } + } else if (isReg) { + if (create && + virFileTouch(devicePath, sb.st_mode) < 0) + goto cleanup; + /* Just create the file here so that code below sets + * proper owner and mode. Bind mount only after that. */ + } else if (isDir) { + if (create && + virFileMakePathWithMode(devicePath, sb.st_mode) < 0) { + virReportSystemError(errno, + _("Unable to make dir %s"), + devicePath); + goto cleanup; + } + } else { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, + _("unsupported device type %s 0%o"), + device, sb.st_mode); + goto cleanup; + } + + if (!create) { + ret = 0; + goto cleanup; + } + + if (lchown(devicePath, sb.st_uid, sb.st_gid) < 0) { + virReportSystemError(errno, + _("Failed to chown device %s"), + devicePath); + goto cleanup; + } + + /* Symlinks don't have mode */ + if (!isLink && + chmod(devicePath, sb.st_mode) < 0) { + virReportSystemError(errno, + _("Failed to set permissions for device %s"), + devicePath); + goto cleanup; + } + + /* Symlinks don't have ACLs. */ + if (!isLink && + virFileCopyACLs(device, devicePath) < 0 && + errno != ENOTSUP) { + virReportSystemError(errno, + _("Failed to copy ACLs on device %s"), + devicePath); + goto cleanup; + } + +#ifdef WITH_SELINUX + if (lgetfilecon_raw(device, &tcon) < 0 && + (errno != ENOTSUP && errno != ENODATA)) { + virReportSystemError(errno, + _("Unable to get SELinux label from %s"), + device); + goto cleanup; + } + + if (tcon && + lsetfilecon_raw(devicePath, (const char *)tcon) < 0) { + VIR_WARNINGS_NO_WLOGICALOP_EQUAL_EXPR + if (errno != EOPNOTSUPP && errno != ENOTSUP) { + VIR_WARNINGS_RESET + virReportSystemError(errno, + _("Unable to set SELinux label on %s"), + devicePath); + goto cleanup; + } + } +#endif + + /* Finish mount process started earlier. */ + if ((isReg || isDir) && + virFileBindMountDevice(device, devicePath) < 0) + goto cleanup; + + ret = 0; + cleanup: +#ifdef WITH_SELINUX + freecon(tcon); +#endif + return ret; +} + + +static int +qemuDomainCreateDevice(const char *device, + const struct qemuDomainCreateDeviceData *data, + bool allow_noent) +{ + long symloop_max = sysconf(_SC_SYMLOOP_MAX); + + return qemuDomainCreateDeviceRecursive(device, data, + allow_noent, symloop_max); +} + + +static int +qemuDomainPopulateDevices(virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm G_GNUC_UNUSED, + const struct qemuDomainCreateDeviceData *data) +{ + const char *const *devices = (const char *const *) cfg->cgroupDeviceACL; + size_t i; + + if (!devices) + devices = defaultDeviceACL; + + for (i = 0; devices[i]; i++) { + if (qemuDomainCreateDevice(devices[i], data, true) < 0) + return -1; + } + + return 0; +} + + +static int +qemuDomainSetupDev(virQEMUDriverConfigPtr cfg, + virSecurityManagerPtr mgr, + virDomainObjPtr vm, + const struct qemuDomainCreateDeviceData *data) +{ + g_autofree char *mount_options = NULL; + g_autofree char *opts = NULL; + + VIR_DEBUG("Setting up /dev/ for domain %s", vm->def->name); + + mount_options = qemuSecurityGetMountOptions(mgr, vm->def); + + if (!mount_options) + mount_options = g_strdup(""); + + /* + * tmpfs is limited to 64kb, since we only have device nodes in there + * and don't want to DOS the entire OS RAM usage + */ + opts = g_strdup_printf("mode=755,size=65536%s", mount_options); + + if (virFileSetupDev(data->path, opts) < 0) + return -1; + + if (qemuDomainPopulateDevices(cfg, vm, data) < 0) + return -1; + + return 0; +} + + +static int +qemuDomainSetupDisk(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, + virDomainDiskDefPtr disk, + const struct qemuDomainCreateDeviceData *data) +{ + virStorageSourcePtr next; + bool hasNVMe = false; + + for (next = disk->src; virStorageSourceIsBacking(next); next = next->backingStore) { + VIR_AUTOSTRINGLIST targetPaths = NULL; + size_t i; + + if (next->type == VIR_STORAGE_TYPE_NVME) { + g_autofree char *nvmePath = NULL; + + hasNVMe = true; + + if (!(nvmePath = virPCIDeviceAddressGetIOMMUGroupDev(&next->nvme->pciAddr))) + return -1; + + if (qemuDomainCreateDevice(nvmePath, data, false) < 0) + return -1; + } else { + if (!next->path || !virStorageSourceIsLocalStorage(next)) { + /* Not creating device. Just continue. */ + continue; + } + + if (qemuDomainCreateDevice(next->path, data, false) < 0) + return -1; + + if (virDevMapperGetTargets(next->path, &targetPaths) < 0 && + errno != ENOSYS) { + virReportSystemError(errno, + _("Unable to get devmapper targets for %s"), + next->path); + return -1; + } + + for (i = 0; targetPaths && targetPaths[i]; i++) { + if (qemuDomainCreateDevice(targetPaths[i], data, false) < 0) + return -1; + } + } + } + + /* qemu-pr-helper might require access to /dev/mapper/control. */ + if (disk->src->pr && + qemuDomainCreateDevice(QEMU_DEVICE_MAPPER_CONTROL_PATH, data, true) < 0) + return -1; + + if (hasNVMe && + qemuDomainCreateDevice(QEMU_DEV_VFIO, data, false) < 0) + return -1; + + return 0; +} + + +static int +qemuDomainSetupAllDisks(virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm, + const struct qemuDomainCreateDeviceData *data) +{ + size_t i; + VIR_DEBUG("Setting up disks"); + + for (i = 0; i < vm->def->ndisks; i++) { + if (qemuDomainSetupDisk(cfg, + vm->def->disks[i], + data) < 0) + return -1; + } + + VIR_DEBUG("Setup all disks"); + return 0; +} + + +static int +qemuDomainSetupHostdev(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, + virDomainHostdevDefPtr dev, + const struct qemuDomainCreateDeviceData *data) +{ + g_autofree char *path = NULL; + + if (qemuDomainGetHostdevPath(dev, &path, NULL) < 0) + return -1; + + if (path && qemuDomainCreateDevice(path, data, false) < 0) + return -1; + + if (qemuHostdevNeedsVFIO(dev) && + qemuDomainCreateDevice(QEMU_DEV_VFIO, data, false) < 0) + return -1; + + return 0; +} + + +static int +qemuDomainSetupAllHostdevs(virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm, + const struct qemuDomainCreateDeviceData *data) +{ + size_t i; + + VIR_DEBUG("Setting up hostdevs"); + for (i = 0; i < vm->def->nhostdevs; i++) { + if (qemuDomainSetupHostdev(cfg, + vm->def->hostdevs[i], + data) < 0) + return -1; + } + VIR_DEBUG("Setup all hostdevs"); + return 0; +} + + +static int +qemuDomainSetupMemory(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, + virDomainMemoryDefPtr mem, + const struct qemuDomainCreateDeviceData *data) +{ + if (mem->model != VIR_DOMAIN_MEMORY_MODEL_NVDIMM) + return 0; + + return qemuDomainCreateDevice(mem->nvdimmPath, data, false); +} + + +static int +qemuDomainSetupAllMemories(virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm, + const struct qemuDomainCreateDeviceData *data) +{ + size_t i; + + VIR_DEBUG("Setting up memories"); + for (i = 0; i < vm->def->nmems; i++) { + if (qemuDomainSetupMemory(cfg, + vm->def->mems[i], + data) < 0) + return -1; + } + VIR_DEBUG("Setup all memories"); + return 0; +} + + +static int +qemuDomainSetupChardev(virDomainDefPtr def G_GNUC_UNUSED, + virDomainChrDefPtr dev, + void *opaque) +{ + const struct qemuDomainCreateDeviceData *data = opaque; + const char *path = NULL; + + if (!(path = virDomainChrSourceDefGetPath(dev->source))) + return 0; + + /* Socket created by qemu. It doesn't exist upfront. */ + if (dev->source->type == VIR_DOMAIN_CHR_TYPE_UNIX && + dev->source->data.nix.listen) + return 0; + + return qemuDomainCreateDevice(path, data, true); +} + + +static int +qemuDomainSetupAllChardevs(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, + virDomainObjPtr vm, + const struct qemuDomainCreateDeviceData *data) +{ + VIR_DEBUG("Setting up chardevs"); + + if (virDomainChrDefForeach(vm->def, + true, + qemuDomainSetupChardev, + (void *)data) < 0) + return -1; + + VIR_DEBUG("Setup all chardevs"); + return 0; +} + + +static int +qemuDomainSetupTPM(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, + virDomainTPMDefPtr dev, + const struct qemuDomainCreateDeviceData *data) +{ + switch (dev->type) { + case VIR_DOMAIN_TPM_TYPE_PASSTHROUGH: + if (qemuDomainCreateDevice(dev->data.passthrough.source.data.file.path, + data, false) < 0) + return -1; + break; + + case VIR_DOMAIN_TPM_TYPE_EMULATOR: + case VIR_DOMAIN_TPM_TYPE_LAST: + /* nada */ + break; + } + + return 0; +} + + +static int +qemuDomainSetupAllTPMs(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, + virDomainObjPtr vm, + const struct qemuDomainCreateDeviceData *data) +{ + size_t i; + + VIR_DEBUG("Setting up TPMs"); + + for (i = 0; i < vm->def->ntpms; i++) { + if (qemuDomainSetupTPM(cfg, vm->def->tpms[i], data) < 0) + return -1; + } + + VIR_DEBUG("Setup all TPMs"); + return 0; +} + + +static int +qemuDomainSetupGraphics(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, + virDomainGraphicsDefPtr gfx, + const struct qemuDomainCreateDeviceData *data) +{ + const char *rendernode = virDomainGraphicsGetRenderNode(gfx); + + if (!rendernode) + return 0; + + return qemuDomainCreateDevice(rendernode, data, false); +} + + +static int +qemuDomainSetupAllGraphics(virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm, + const struct qemuDomainCreateDeviceData *data) +{ + size_t i; + + VIR_DEBUG("Setting up graphics"); + for (i = 0; i < vm->def->ngraphics; i++) { + if (qemuDomainSetupGraphics(cfg, + vm->def->graphics[i], + data) < 0) + return -1; + } + + VIR_DEBUG("Setup all graphics"); + return 0; +} + + +static int +qemuDomainSetupInput(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, + virDomainInputDefPtr input, + const struct qemuDomainCreateDeviceData *data) +{ + const char *path = virDomainInputDefGetPath(input); + + if (path && qemuDomainCreateDevice(path, data, false) < 0) + return -1; + + return 0; +} + + +static int +qemuDomainSetupAllInputs(virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm, + const struct qemuDomainCreateDeviceData *data) +{ + size_t i; + + VIR_DEBUG("Setting up inputs"); + for (i = 0; i < vm->def->ninputs; i++) { + if (qemuDomainSetupInput(cfg, + vm->def->inputs[i], + data) < 0) + return -1; + } + VIR_DEBUG("Setup all inputs"); + return 0; +} + + +static int +qemuDomainSetupRNG(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, + virDomainRNGDefPtr rng, + const struct qemuDomainCreateDeviceData *data) +{ + switch ((virDomainRNGBackend) rng->backend) { + case VIR_DOMAIN_RNG_BACKEND_RANDOM: + if (qemuDomainCreateDevice(rng->source.file, data, false) < 0) + return -1; + break; + + case VIR_DOMAIN_RNG_BACKEND_EGD: + case VIR_DOMAIN_RNG_BACKEND_BUILTIN: + case VIR_DOMAIN_RNG_BACKEND_LAST: + /* nada */ + break; + } + + return 0; +} + + +static int +qemuDomainSetupAllRNGs(virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm, + const struct qemuDomainCreateDeviceData *data) +{ + size_t i; + + VIR_DEBUG("Setting up RNGs"); + for (i = 0; i < vm->def->nrngs; i++) { + if (qemuDomainSetupRNG(cfg, + vm->def->rngs[i], + data) < 0) + return -1; + } + + VIR_DEBUG("Setup all RNGs"); + return 0; +} + + +static int +qemuDomainSetupLoader(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, + virDomainObjPtr vm, + const struct qemuDomainCreateDeviceData *data) +{ + virDomainLoaderDefPtr loader = vm->def->os.loader; + + VIR_DEBUG("Setting up loader"); + + if (loader) { + switch ((virDomainLoader) loader->type) { + case VIR_DOMAIN_LOADER_TYPE_ROM: + if (qemuDomainCreateDevice(loader->path, data, false) < 0) + return -1; + break; + + case VIR_DOMAIN_LOADER_TYPE_PFLASH: + if (qemuDomainCreateDevice(loader->path, data, false) < 0) + return -1; + + if (loader->nvram && + qemuDomainCreateDevice(loader->nvram, data, false) < 0) + return -1; + break; + + case VIR_DOMAIN_LOADER_TYPE_NONE: + case VIR_DOMAIN_LOADER_TYPE_LAST: + break; + } + } + + VIR_DEBUG("Setup loader"); + return 0; +} + + +static int +qemuDomainSetupLaunchSecurity(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, + virDomainObjPtr vm, + const struct qemuDomainCreateDeviceData *data) +{ + virDomainSEVDefPtr sev = vm->def->sev; + + if (!sev || sev->sectype != VIR_DOMAIN_LAUNCH_SECURITY_SEV) + return 0; + + VIR_DEBUG("Setting up launch security"); + + if (qemuDomainCreateDevice(QEMU_DEV_SEV, data, false) < 0) + return -1; + + VIR_DEBUG("Set up launch security"); + return 0; +} + + +int +qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, + virSecurityManagerPtr mgr, + virDomainObjPtr vm) +{ + struct qemuDomainCreateDeviceData data; + const char *devPath = NULL; + char **devMountsPath = NULL, **devMountsSavePath = NULL; + size_t ndevMountsPath = 0, i; + int ret = -1; + + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) { + ret = 0; + goto cleanup; + } + + if (qemuDomainGetPreservedMounts(cfg, vm, + &devMountsPath, &devMountsSavePath, + &ndevMountsPath) < 0) + goto cleanup; + + for (i = 0; i < ndevMountsPath; i++) { + if (STREQ(devMountsPath[i], "/dev")) { + devPath = devMountsSavePath[i]; + break; + } + } + + if (!devPath) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Unable to find any /dev mount")); + goto cleanup; + } + + data.path = devPath; + data.devMountsPath = devMountsPath; + data.ndevMountsPath = ndevMountsPath; + + if (virProcessSetupPrivateMountNS() < 0) + goto cleanup; + + if (qemuDomainSetupDev(cfg, mgr, vm, &data) < 0) + goto cleanup; + + if (qemuDomainSetupAllDisks(cfg, vm, &data) < 0) + goto cleanup; + + if (qemuDomainSetupAllHostdevs(cfg, vm, &data) < 0) + goto cleanup; + + if (qemuDomainSetupAllMemories(cfg, vm, &data) < 0) + goto cleanup; + + if (qemuDomainSetupAllChardevs(cfg, vm, &data) < 0) + goto cleanup; + + if (qemuDomainSetupAllTPMs(cfg, vm, &data) < 0) + goto cleanup; + + if (qemuDomainSetupAllGraphics(cfg, vm, &data) < 0) + goto cleanup; + + if (qemuDomainSetupAllInputs(cfg, vm, &data) < 0) + goto cleanup; + + if (qemuDomainSetupAllRNGs(cfg, vm, &data) < 0) + goto cleanup; + + if (qemuDomainSetupLoader(cfg, vm, &data) < 0) + goto cleanup; + + if (qemuDomainSetupLaunchSecurity(cfg, vm, &data) < 0) + goto cleanup; + + /* Save some mount points because we want to share them with the host */ + for (i = 0; i < ndevMountsPath; i++) { + struct stat sb; + + if (devMountsSavePath[i] == devPath) + continue; + + if (stat(devMountsPath[i], &sb) < 0) { + virReportSystemError(errno, + _("Unable to stat: %s"), + devMountsPath[i]); + goto cleanup; + } + + /* At this point, devMountsPath is either: + * a file (regular or special), or + * a directory. */ + if ((S_ISDIR(sb.st_mode) && virFileMakePath(devMountsSavePath[i]) < 0) || + (!S_ISDIR(sb.st_mode) && virFileTouch(devMountsSavePath[i], sb.st_mode) < 0)) { + virReportSystemError(errno, + _("Failed to create %s"), + devMountsSavePath[i]); + goto cleanup; + } + + if (virFileMoveMount(devMountsPath[i], devMountsSavePath[i]) < 0) + goto cleanup; + } + + if (virFileMoveMount(devPath, "/dev") < 0) + goto cleanup; + + for (i = 0; i < ndevMountsPath; i++) { + struct stat sb; + + if (devMountsSavePath[i] == devPath) + continue; + + if (stat(devMountsSavePath[i], &sb) < 0) { + virReportSystemError(errno, + _("Unable to stat: %s"), + devMountsSavePath[i]); + goto cleanup; + } + + if (S_ISDIR(sb.st_mode)) { + if (virFileMakePath(devMountsPath[i]) < 0) { + virReportSystemError(errno, _("Cannot create %s"), + devMountsPath[i]); + goto cleanup; + } + } else { + if (virFileMakeParentPath(devMountsPath[i]) < 0 || + virFileTouch(devMountsPath[i], sb.st_mode) < 0) { + virReportSystemError(errno, _("Cannot create %s"), + devMountsPath[i]); + goto cleanup; + } + } + + if (virFileMoveMount(devMountsSavePath[i], devMountsPath[i]) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + for (i = 0; i < ndevMountsPath; i++) { +#if defined(__linux__) + umount(devMountsSavePath[i]); +#endif /* defined(__linux__) */ + /* The path can be either a regular file or a dir. */ + if (virFileIsDir(devMountsSavePath[i])) + virFileDeleteTree(devMountsSavePath[i]); + else + unlink(devMountsSavePath[i]); + } + virStringListFreeCount(devMountsPath, ndevMountsPath); + virStringListFreeCount(devMountsSavePath, ndevMountsPath); + return ret; +} + + +int +qemuDomainCreateNamespace(virQEMUDriverPtr driver, + virDomainObjPtr vm) +{ + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + + if (virBitmapIsBitSet(cfg->namespaces, QEMU_DOMAIN_NS_MOUNT) && + qemuDomainEnableNamespace(vm, QEMU_DOMAIN_NS_MOUNT) < 0) + return -1; + + return 0; +} + + +bool +qemuDomainNamespaceEnabled(virDomainObjPtr vm, + qemuDomainNamespace ns) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + return priv->namespaces && + virBitmapIsBitSet(priv->namespaces, ns); +} + + +int +qemuDomainEnableNamespace(virDomainObjPtr vm, + qemuDomainNamespace ns) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + if (!priv->namespaces && + !(priv->namespaces = virBitmapNew(QEMU_DOMAIN_NS_LAST))) + return -1; + + if (virBitmapSetBit(priv->namespaces, ns) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to enable namespace: %s"), + qemuDomainNamespaceTypeToString(ns)); + return -1; + } + + return 0; +} + + +static void +qemuDomainDisableNamespace(virDomainObjPtr vm, + qemuDomainNamespace ns) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + if (priv->namespaces) { + ignore_value(virBitmapClearBit(priv->namespaces, ns)); + if (virBitmapIsAllClear(priv->namespaces)) { + virBitmapFree(priv->namespaces); + priv->namespaces = NULL; + } + } +} + + +void +qemuDomainDestroyNamespace(virQEMUDriverPtr driver G_GNUC_UNUSED, + virDomainObjPtr vm) +{ + if (qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + qemuDomainDisableNamespace(vm, QEMU_DOMAIN_NS_MOUNT); +} + + +bool +qemuDomainNamespaceAvailable(qemuDomainNamespace ns G_GNUC_UNUSED) +{ +#if !defined(__linux__) + /* Namespaces are Linux specific. */ + return false; + +#else /* defined(__linux__) */ + + switch (ns) { + case QEMU_DOMAIN_NS_MOUNT: +# if !defined(HAVE_SYS_ACL_H) || !defined(WITH_SELINUX) + /* We can't create the exact copy of paths if either of + * these is not available. */ + return false; +# else + if (virProcessNamespaceAvailable(VIR_PROCESS_NAMESPACE_MNT) < 0) + return false; +# endif + break; + case QEMU_DOMAIN_NS_LAST: + break; + } + + return true; +#endif /* defined(__linux__) */ +} + + +struct qemuDomainAttachDeviceMknodData { + virQEMUDriverPtr driver; + virDomainObjPtr vm; + const char *file; + const char *target; + GStatBuf sb; + void *acl; +#ifdef WITH_SELINUX + char *tcon; +#endif +}; + + +/* Our way of creating devices is highly linux specific */ +#if defined(__linux__) +static int +qemuDomainAttachDeviceMknodHelper(pid_t pid G_GNUC_UNUSED, + void *opaque) +{ + struct qemuDomainAttachDeviceMknodData *data = opaque; + int ret = -1; + bool delDevice = false; + bool isLink = S_ISLNK(data->sb.st_mode); + bool isDev = S_ISCHR(data->sb.st_mode) || S_ISBLK(data->sb.st_mode); + bool isReg = S_ISREG(data->sb.st_mode) || S_ISFIFO(data->sb.st_mode) || S_ISSOCK(data->sb.st_mode); + bool isDir = S_ISDIR(data->sb.st_mode); + + qemuSecurityPostFork(data->driver->securityManager); + + if (virFileMakeParentPath(data->file) < 0) { + virReportSystemError(errno, + _("Unable to create %s"), data->file); + goto cleanup; + } + + if (isLink) { + VIR_DEBUG("Creating symlink %s -> %s", data->file, data->target); + + /* First, unlink the symlink target. Symlinks change and + * therefore we have no guarantees that pre-existing + * symlink is still valid. */ + if (unlink(data->file) < 0 && + errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove symlink %s"), + data->file); + goto cleanup; + } + + if (symlink(data->target, data->file) < 0) { + virReportSystemError(errno, + _("Unable to create symlink %s (pointing to %s)"), + data->file, data->target); + goto cleanup; + } else { + delDevice = true; + } + } else if (isDev) { + VIR_DEBUG("Creating dev %s (%d,%d)", + data->file, major(data->sb.st_rdev), minor(data->sb.st_rdev)); + unlink(data->file); + if (mknod(data->file, data->sb.st_mode, data->sb.st_rdev) < 0) { + virReportSystemError(errno, + _("Unable to create device %s"), + data->file); + goto cleanup; + } else { + delDevice = true; + } + } else if (isReg || isDir) { + /* We are not cleaning up disks on virDomainDetachDevice + * because disk might be still in use by different disk + * as its backing chain. This might however clash here. + * Therefore do the cleanup here. */ + if (umount(data->file) < 0 && + errno != ENOENT && errno != EINVAL) { + virReportSystemError(errno, + _("Unable to umount %s"), + data->file); + goto cleanup; + } + if ((isReg && virFileTouch(data->file, data->sb.st_mode) < 0) || + (isDir && virFileMakePathWithMode(data->file, data->sb.st_mode) < 0)) + goto cleanup; + delDevice = true; + /* Just create the file here so that code below sets + * proper owner and mode. Move the mount only after that. */ + } else { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, + _("unsupported device type %s 0%o"), + data->file, data->sb.st_mode); + goto cleanup; + } + + if (lchown(data->file, data->sb.st_uid, data->sb.st_gid) < 0) { + virReportSystemError(errno, + _("Failed to chown device %s"), + data->file); + goto cleanup; + } + + /* Symlinks don't have mode */ + if (!isLink && + chmod(data->file, data->sb.st_mode) < 0) { + virReportSystemError(errno, + _("Failed to set permissions for device %s"), + data->file); + goto cleanup; + } + + /* Symlinks don't have ACLs. */ + if (!isLink && + virFileSetACLs(data->file, data->acl) < 0 && + errno != ENOTSUP) { + virReportSystemError(errno, + _("Unable to set ACLs on %s"), data->file); + goto cleanup; + } + +# ifdef WITH_SELINUX + if (data->tcon && + lsetfilecon_raw(data->file, (const char *)data->tcon) < 0) { + VIR_WARNINGS_NO_WLOGICALOP_EQUAL_EXPR + if (errno != EOPNOTSUPP && errno != ENOTSUP) { + VIR_WARNINGS_RESET + virReportSystemError(errno, + _("Unable to set SELinux label on %s"), + data->file); + goto cleanup; + } + } +# endif + + /* Finish mount process started earlier. */ + if ((isReg || isDir) && + virFileMoveMount(data->target, data->file) < 0) + goto cleanup; + + ret = 0; + cleanup: + if (ret < 0 && delDevice) { + if (isDir) + virFileDeleteTree(data->file); + else + unlink(data->file); + } +# ifdef WITH_SELINUX + freecon(data->tcon); +# endif + virFileFreeACLs(&data->acl); + return ret; +} + + +static int +qemuDomainAttachDeviceMknodRecursive(virQEMUDriverPtr driver, + virDomainObjPtr vm, + const char *file, + char * const *devMountsPath, + size_t ndevMountsPath, + unsigned int ttl) +{ + g_autoptr(virQEMUDriverConfig) cfg = NULL; + struct qemuDomainAttachDeviceMknodData data; + int ret = -1; + g_autofree char *target = NULL; + bool isLink; + bool isReg; + bool isDir; + + if (!ttl) { + virReportSystemError(ELOOP, + _("Too many levels of symbolic links: %s"), + file); + return ret; + } + + memset(&data, 0, sizeof(data)); + + data.driver = driver; + data.vm = vm; + data.file = file; + + if (g_lstat(file, &data.sb) < 0) { + virReportSystemError(errno, + _("Unable to access %s"), file); + return ret; + } + + isLink = S_ISLNK(data.sb.st_mode); + isReg = S_ISREG(data.sb.st_mode) || S_ISFIFO(data.sb.st_mode) || S_ISSOCK(data.sb.st_mode); + isDir = S_ISDIR(data.sb.st_mode); + + if ((isReg || isDir) && STRPREFIX(file, QEMU_DEVPREFIX)) { + cfg = virQEMUDriverGetConfig(driver); + if (!(target = qemuDomainGetPreservedMountPath(cfg, vm, file))) + goto cleanup; + + if (virFileBindMountDevice(file, target) < 0) + goto cleanup; + + data.target = target; + } else if (isLink) { + g_autoptr(GError) gerr = NULL; + + if (!(target = g_file_read_link(file, &gerr))) { + virReportError(VIR_ERR_SYSTEM_ERROR, + _("failed to resolve symlink %s: %s"), file, gerr->message); + return ret; + } + + if (!g_path_is_absolute(target)) { + g_autofree char *fileTmp = g_strdup(file); + char *c = NULL, *tmp = NULL; + + if ((c = strrchr(fileTmp, '/'))) + *(c + 1) = '\0'; + + tmp = g_strdup_printf("%s%s", fileTmp, target); + VIR_FREE(target); + target = g_steal_pointer(&tmp); + } + + data.target = target; + } + + /* Symlinks don't have ACLs. */ + if (!isLink && + virFileGetACLs(file, &data.acl) < 0 && + errno != ENOTSUP) { + virReportSystemError(errno, + _("Unable to get ACLs on %s"), file); + goto cleanup; + } + +# ifdef WITH_SELINUX + if (lgetfilecon_raw(file, &data.tcon) < 0 && + (errno != ENOTSUP && errno != ENODATA)) { + virReportSystemError(errno, + _("Unable to get SELinux label from %s"), file); + goto cleanup; + } +# endif + + if (STRPREFIX(file, QEMU_DEVPREFIX)) { + size_t i; + + for (i = 0; i < ndevMountsPath; i++) { + if (STREQ(devMountsPath[i], "/dev")) + continue; + if (STRPREFIX(file, devMountsPath[i])) + break; + } + + if (i == ndevMountsPath) { + if (qemuSecurityPreFork(driver->securityManager) < 0) + goto cleanup; + + if (virProcessRunInMountNamespace(vm->pid, + qemuDomainAttachDeviceMknodHelper, + &data) < 0) { + qemuSecurityPostFork(driver->securityManager); + goto cleanup; + } + qemuSecurityPostFork(driver->securityManager); + } else { + VIR_DEBUG("Skipping dev %s because of %s mount point", + file, devMountsPath[i]); + } + } + + if (isLink && + qemuDomainAttachDeviceMknodRecursive(driver, vm, target, + devMountsPath, ndevMountsPath, + ttl -1) < 0) + goto cleanup; + + ret = 0; + cleanup: +# ifdef WITH_SELINUX + freecon(data.tcon); +# endif + virFileFreeACLs(&data.acl); + if (isReg && target) + umount(target); + return ret; +} + + +#else /* !defined(__linux__) */ + + +static int +qemuDomainAttachDeviceMknodRecursive(virQEMUDriverPtr driver G_GNUC_UNUSED, + virDomainObjPtr vm G_GNUC_UNUSED, + const char *file G_GNUC_UNUSED, + char * const *devMountsPath G_GNUC_UNUSED, + size_t ndevMountsPath G_GNUC_UNUSED, + unsigned int ttl G_GNUC_UNUSED) +{ + virReportSystemError(ENOSYS, "%s", + _("Namespaces are not supported on this platform.")); + return -1; +} + + +#endif /* !defined(__linux__) */ + + +static int +qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, + virDomainObjPtr vm, + const char *file, + char * const *devMountsPath, + size_t ndevMountsPath) +{ + long symloop_max = sysconf(_SC_SYMLOOP_MAX); + + return qemuDomainAttachDeviceMknodRecursive(driver, vm, file, + devMountsPath, ndevMountsPath, + symloop_max); +} + + +static int +qemuDomainDetachDeviceUnlinkHelper(pid_t pid G_GNUC_UNUSED, + void *opaque) +{ + const char *path = opaque; + + VIR_DEBUG("Unlinking %s", path); + if (unlink(path) < 0 && errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove device %s"), path); + return -1; + } + + return 0; +} + + +static int +qemuDomainDetachDeviceUnlink(virQEMUDriverPtr driver G_GNUC_UNUSED, + virDomainObjPtr vm, + const char *file, + char * const *devMountsPath, + size_t ndevMountsPath) +{ + size_t i; + + if (STRPREFIX(file, QEMU_DEVPREFIX)) { + for (i = 0; i < ndevMountsPath; i++) { + if (STREQ(devMountsPath[i], "/dev")) + continue; + if (STRPREFIX(file, devMountsPath[i])) + break; + } + + if (i == ndevMountsPath) { + if (virProcessRunInMountNamespace(vm->pid, + qemuDomainDetachDeviceUnlinkHelper, + (void *)file) < 0) + return -1; + } + } + + return 0; +} + + +static int +qemuDomainNamespaceMknodPaths(virDomainObjPtr vm, + const char **paths, + size_t npaths) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virQEMUDriverPtr driver = priv->driver; + g_autoptr(virQEMUDriverConfig) cfg = NULL; + char **devMountsPath = NULL; + size_t ndevMountsPath = 0; + int ret = -1; + size_t i; + + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) || + !npaths) + return 0; + + cfg = virQEMUDriverGetConfig(driver); + if (qemuDomainGetPreservedMounts(cfg, vm, + &devMountsPath, NULL, + &ndevMountsPath) < 0) + goto cleanup; + + for (i = 0; i < npaths; i++) { + if (qemuDomainAttachDeviceMknod(driver, + vm, + paths[i], + devMountsPath, ndevMountsPath) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + virStringListFreeCount(devMountsPath, ndevMountsPath); + return ret; +} + + +static int +qemuDomainNamespaceMknodPath(virDomainObjPtr vm, + const char *path) +{ + const char *paths[] = { path }; + + return qemuDomainNamespaceMknodPaths(vm, paths, 1); +} + + +static int +qemuDomainNamespaceUnlinkPaths(virDomainObjPtr vm, + const char **paths, + size_t npaths) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virQEMUDriverPtr driver = priv->driver; + g_autoptr(virQEMUDriverConfig) cfg = NULL; + char **devMountsPath = NULL; + size_t ndevMountsPath = 0; + size_t i; + int ret = -1; + + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) || + !npaths) + return 0; + + cfg = virQEMUDriverGetConfig(driver); + + if (qemuDomainGetPreservedMounts(cfg, vm, + &devMountsPath, NULL, + &ndevMountsPath) < 0) + goto cleanup; + + for (i = 0; i < npaths; i++) { + if (qemuDomainDetachDeviceUnlink(driver, vm, paths[i], + devMountsPath, ndevMountsPath) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + virStringListFreeCount(devMountsPath, ndevMountsPath); + return ret; +} + + +static int +qemuDomainNamespaceUnlinkPath(virDomainObjPtr vm, + const char *path) +{ + const char *paths[] = { path }; + + return qemuDomainNamespaceUnlinkPaths(vm, paths, 1); +} + + +int +qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, + virStorageSourcePtr src) +{ + virStorageSourcePtr next; + VIR_AUTOSTRINGLIST paths = NULL; + size_t npaths = 0; + bool hasNVMe = false; + + for (next = src; virStorageSourceIsBacking(next); next = next->backingStore) { + g_autofree char *tmpPath = NULL; + + if (next->type == VIR_STORAGE_TYPE_NVME) { + hasNVMe = true; + + if (!(tmpPath = virPCIDeviceAddressGetIOMMUGroupDev(&next->nvme->pciAddr))) + return -1; + } else { + VIR_AUTOSTRINGLIST targetPaths = NULL; + + if (virStorageSourceIsEmpty(next) || + !virStorageSourceIsLocalStorage(next)) { + /* Not creating device. Just continue. */ + continue; + } + + tmpPath = g_strdup(next->path); + + if (virDevMapperGetTargets(next->path, &targetPaths) < 0 && + errno != ENOSYS) { + virReportSystemError(errno, + _("Unable to get devmapper targets for %s"), + next->path); + return -1; + } + + if (virStringListMerge(&paths, &targetPaths) < 0) + return -1; + } + + if (virStringListAdd(&paths, tmpPath) < 0) + return -1; + } + + /* qemu-pr-helper might require access to /dev/mapper/control. */ + if (src->pr && + virStringListAdd(&paths, QEMU_DEVICE_MAPPER_CONTROL_PATH) < 0) + return -1; + + if (hasNVMe && + virStringListAdd(&paths, QEMU_DEV_VFIO) < 0) + return -1; + + npaths = virStringListLength((const char **) paths); + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths, npaths) < 0) + return -1; + + return 0; +} + + +int +qemuDomainNamespaceTeardownDisk(virDomainObjPtr vm G_GNUC_UNUSED, + virStorageSourcePtr src G_GNUC_UNUSED) +{ + /* While in hotplug case we create the whole backing chain, + * here we must limit ourselves. The disk we want to remove + * might be a part of backing chain of another disk. + * If you are reading these lines and have some spare time + * you can come up with and algorithm that checks for that. + * I don't, therefore: */ + return 0; +} + + +/** + * qemuDomainNamespaceSetupHostdev: + * @vm: domain object + * @hostdev: hostdev to create in @vm's namespace + * + * For given @hostdev, create its devfs representation (if it has one) in + * domain namespace. Note, @hostdev must not be in @vm's definition. + * + * Returns: 0 on success, + * -1 otherwise. + */ +int +qemuDomainNamespaceSetupHostdev(virDomainObjPtr vm, + virDomainHostdevDefPtr hostdev) +{ + g_autofree char *path = NULL; + + if (qemuDomainGetHostdevPath(hostdev, &path, NULL) < 0) + return -1; + + if (path && qemuDomainNamespaceMknodPath(vm, path) < 0) + return -1; + + if (qemuHostdevNeedsVFIO(hostdev) && + !qemuDomainNeedsVFIO(vm->def) && + qemuDomainNamespaceMknodPath(vm, QEMU_DEV_VFIO) < 0) + return -1; + + return 0; +} + + +/** + * qemuDomainNamespaceTeardownHostdev: + * @vm: domain object + * @hostdev: hostdev to remove in @vm's namespace + * + * For given @hostdev, remove its devfs representation (if it has one) in + * domain namespace. Note, @hostdev must not be in @vm's definition. + * + * Returns: 0 on success, + * -1 otherwise. + */ +int +qemuDomainNamespaceTeardownHostdev(virDomainObjPtr vm, + virDomainHostdevDefPtr hostdev) +{ + g_autofree char *path = NULL; + + if (qemuDomainGetHostdevPath(hostdev, &path, NULL) < 0) + return -1; + + if (path && qemuDomainNamespaceUnlinkPath(vm, path) < 0) + return -1; + + if (qemuHostdevNeedsVFIO(hostdev) && + !qemuDomainNeedsVFIO(vm->def) && + qemuDomainNamespaceUnlinkPath(vm, QEMU_DEV_VFIO) < 0) + return -1; + + return 0; +} + + +int +qemuDomainNamespaceSetupMemory(virDomainObjPtr vm, + virDomainMemoryDefPtr mem) +{ + if (mem->model != VIR_DOMAIN_MEMORY_MODEL_NVDIMM) + return 0; + + if (qemuDomainNamespaceMknodPath(vm, mem->nvdimmPath) < 0) + return -1; + + return 0; +} + + +int +qemuDomainNamespaceTeardownMemory(virDomainObjPtr vm, + virDomainMemoryDefPtr mem) +{ + if (mem->model != VIR_DOMAIN_MEMORY_MODEL_NVDIMM) + return 0; + + if (qemuDomainNamespaceUnlinkPath(vm, mem->nvdimmPath) < 0) + return -1; + + return 0; +} + + +int +qemuDomainNamespaceSetupChardev(virDomainObjPtr vm, + virDomainChrDefPtr chr) +{ + const char *path; + + if (!(path = virDomainChrSourceDefGetPath(chr->source))) + return 0; + + /* Socket created by qemu. It doesn't exist upfront. */ + if (chr->source->type == VIR_DOMAIN_CHR_TYPE_UNIX && + chr->source->data.nix.listen) + return 0; + + if (qemuDomainNamespaceMknodPath(vm, path) < 0) + return -1; + + return 0; +} + + +int +qemuDomainNamespaceTeardownChardev(virDomainObjPtr vm, + virDomainChrDefPtr chr) +{ + const char *path = NULL; + + if (chr->source->type != VIR_DOMAIN_CHR_TYPE_DEV) + return 0; + + path = chr->source->data.file.path; + + if (qemuDomainNamespaceUnlinkPath(vm, path) < 0) + return -1; + + return 0; +} + + +int +qemuDomainNamespaceSetupRNG(virDomainObjPtr vm, + virDomainRNGDefPtr rng) +{ + const char *path = NULL; + + switch ((virDomainRNGBackend) rng->backend) { + case VIR_DOMAIN_RNG_BACKEND_RANDOM: + path = rng->source.file; + break; + + case VIR_DOMAIN_RNG_BACKEND_EGD: + case VIR_DOMAIN_RNG_BACKEND_BUILTIN: + case VIR_DOMAIN_RNG_BACKEND_LAST: + break; + } + + if (path && qemuDomainNamespaceMknodPath(vm, path) < 0) + return -1; + + return 0; +} + + +int +qemuDomainNamespaceTeardownRNG(virDomainObjPtr vm, + virDomainRNGDefPtr rng) +{ + const char *path = NULL; + + switch ((virDomainRNGBackend) rng->backend) { + case VIR_DOMAIN_RNG_BACKEND_RANDOM: + path = rng->source.file; + break; + + case VIR_DOMAIN_RNG_BACKEND_EGD: + case VIR_DOMAIN_RNG_BACKEND_BUILTIN: + case VIR_DOMAIN_RNG_BACKEND_LAST: + break; + } + + if (path && qemuDomainNamespaceUnlinkPath(vm, path) < 0) + return -1; + + return 0; +} + + +int +qemuDomainNamespaceSetupInput(virDomainObjPtr vm, + virDomainInputDefPtr input) +{ + const char *path = NULL; + + if (!(path = virDomainInputDefGetPath(input))) + return 0; + + if (path && qemuDomainNamespaceMknodPath(vm, path) < 0) + return -1; + return 0; +} + + +int +qemuDomainNamespaceTeardownInput(virDomainObjPtr vm, + virDomainInputDefPtr input) +{ + const char *path = NULL; + + if (!(path = virDomainInputDefGetPath(input))) + return 0; + + if (path && qemuDomainNamespaceUnlinkPath(vm, path) < 0) + return -1; + + return 0; +} diff --git a/src/qemu/qemu_domain_namespace.h b/src/qemu/qemu_domain_namespace.h new file mode 100644 index 0000000000..df58462414 --- /dev/null +++ b/src/qemu/qemu_domain_namespace.h @@ -0,0 +1,86 @@ +/* + * qemu_domain_namespace.h: QEMU domain namespace helpers + * + * Copyright (C) 2006-2019 Red Hat, Inc. + * Copyright (C) 2006 Daniel P. Berrange + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#pragma once + +#include "virenum.h" +#include "qemu_conf.h" +#include "virconf.h" + +typedef enum { + QEMU_DOMAIN_NS_MOUNT = 0, + QEMU_DOMAIN_NS_LAST +} qemuDomainNamespace; +VIR_ENUM_DECL(qemuDomainNamespace); + +int qemuDomainEnableNamespace(virDomainObjPtr vm, + qemuDomainNamespace ns); + +bool qemuDomainNamespaceEnabled(virDomainObjPtr vm, + qemuDomainNamespace ns); + +int qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, + virSecurityManagerPtr mgr, + virDomainObjPtr vm); + +int qemuDomainCreateNamespace(virQEMUDriverPtr driver, + virDomainObjPtr vm); + +void qemuDomainDestroyNamespace(virQEMUDriverPtr driver, + virDomainObjPtr vm); + +bool qemuDomainNamespaceAvailable(qemuDomainNamespace ns); + +int qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, + virStorageSourcePtr src); + +int qemuDomainNamespaceTeardownDisk(virDomainObjPtr vm, + virStorageSourcePtr src); + +int qemuDomainNamespaceSetupHostdev(virDomainObjPtr vm, + virDomainHostdevDefPtr hostdev); + +int qemuDomainNamespaceTeardownHostdev(virDomainObjPtr vm, + virDomainHostdevDefPtr hostdev); + +int qemuDomainNamespaceSetupMemory(virDomainObjPtr vm, + virDomainMemoryDefPtr memory); + +int qemuDomainNamespaceTeardownMemory(virDomainObjPtr vm, + virDomainMemoryDefPtr memory); + +int qemuDomainNamespaceSetupChardev(virDomainObjPtr vm, + virDomainChrDefPtr chr); + +int qemuDomainNamespaceTeardownChardev(virDomainObjPtr vm, + virDomainChrDefPtr chr); + +int qemuDomainNamespaceSetupRNG(virDomainObjPtr vm, + virDomainRNGDefPtr rng); + +int qemuDomainNamespaceTeardownRNG(virDomainObjPtr vm, + virDomainRNGDefPtr rng); + +int qemuDomainNamespaceSetupInput(virDomainObjPtr vm, + virDomainInputDefPtr input); + +int qemuDomainNamespaceTeardownInput(virDomainObjPtr vm, + virDomainInputDefPtr input); diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 53980d4d78..62fa38cd55 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -50,6 +50,7 @@ #include "qemu_security.h" #include "qemu_checkpoint.h" #include "qemu_backup.h" +#include "qemu_domain_namespace.h" #include "virerror.h" #include "virlog.h" diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 26912334d2..3c72d07f32 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -27,6 +27,7 @@ #include "qemu_capabilities.h" #include "qemu_domain.h" #include "qemu_domain_address.h" +#include "qemu_domain_namespace.h" #include "qemu_command.h" #include "qemu_hostdev.h" #include "qemu_interface.h" diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 1006f41614..e368f59b8c 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -45,6 +45,7 @@ #include "qemu_block.h" #include "qemu_domain.h" #include "qemu_domain_address.h" +#include "qemu_domain_namespace.h" #include "qemu_cgroup.h" #include "qemu_capabilities.h" #include "qemu_monitor.h" diff --git a/src/qemu/qemu_security.c b/src/qemu/qemu_security.c index b9e2470b58..78fd9892a9 100644 --- a/src/qemu/qemu_security.c +++ b/src/qemu/qemu_security.c @@ -21,6 +21,7 @@ #include <config.h> #include "qemu_domain.h" +#include "qemu_domain_namespace.h" #include "qemu_security.h" #include "virlog.h" -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
The qemu_domain.c file is big as is and we should split it into separate semantic blocks. Start with code that handles domain namespaces.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- po/POTFILES.in | 1 + src/qemu/Makefile.inc.am | 2 + src/qemu/qemu_conf.c | 1 + src/qemu/qemu_domain.c | 1848 +---------------------------- src/qemu/qemu_domain.h | 57 - src/qemu/qemu_domain_namespace.c | 1885 ++++++++++++++++++++++++++++++ src/qemu/qemu_domain_namespace.h | 86 ++
Is the "domain" part necessary here? qemu_namespace.c would be enough
src/qemu/qemu_driver.c | 1 + src/qemu/qemu_hotplug.c | 1 + src/qemu/qemu_process.c | 1 + src/qemu/qemu_security.c | 1 + 11 files changed, 1980 insertions(+), 1904 deletions(-) create mode 100644 src/qemu/qemu_domain_namespace.c create mode 100644 src/qemu/qemu_domain_namespace.h
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 2058290870..92dc69ce39 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -37,6 +37,7 @@ #include "qemu_blockjob.h" #include "qemu_checkpoint.h" #include "qemu_validate.h" +#include "qemu_domain_namespace.h" #include "viralloc.h" #include "virlog.h" #include "virerror.h" @@ -65,17 +66,8 @@ #include "virutil.h" #include "virdevmapper.h"
virdevmapper.h was only used in the namespace code
-#ifdef __linux__ -# include <sys/sysmacros.h> -#endif #include <sys/time.h> #include <fcntl.h> -#if defined(HAVE_SYS_MOUNT_H) -# include <sys/mount.h> -#endif -#ifdef WITH_SELINUX -# include <selinux/selinux.h> -#endif
With the file renamed: Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

The name of this function is not very helpful, because it doesn't create anything, it just flips a bit in a bitmask when domain is starting up. Move the function internals into qemu_process.c and forget the function ever existed. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 14 -------------- src/qemu/qemu_domain_namespace.h | 3 --- src/qemu/qemu_process.c | 16 +++++++++++++++- 3 files changed, 15 insertions(+), 18 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 1e54cb2153..ec417edb60 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1035,20 +1035,6 @@ qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, } -int -qemuDomainCreateNamespace(virQEMUDriverPtr driver, - virDomainObjPtr vm) -{ - g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); - - if (virBitmapIsBitSet(cfg->namespaces, QEMU_DOMAIN_NS_MOUNT) && - qemuDomainEnableNamespace(vm, QEMU_DOMAIN_NS_MOUNT) < 0) - return -1; - - return 0; -} - - bool qemuDomainNamespaceEnabled(virDomainObjPtr vm, qemuDomainNamespace ns) diff --git a/src/qemu/qemu_domain_namespace.h b/src/qemu/qemu_domain_namespace.h index df58462414..0182ce50a2 100644 --- a/src/qemu/qemu_domain_namespace.h +++ b/src/qemu/qemu_domain_namespace.h @@ -41,9 +41,6 @@ int qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, virSecurityManagerPtr mgr, virDomainObjPtr vm); -int qemuDomainCreateNamespace(virQEMUDriverPtr driver, - virDomainObjPtr vm); - void qemuDomainDestroyNamespace(virQEMUDriverPtr driver, virDomainObjPtr vm); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index e368f59b8c..c076dcac3a 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -6640,6 +6640,20 @@ qemuProcessSetupDiskThrottlingBlockdev(virQEMUDriverPtr driver, } +static int +qemuProcessEnableDomainNamespaces(virQEMUDriverPtr driver, + virDomainObjPtr vm) +{ + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + + if (virBitmapIsBitSet(cfg->namespaces, QEMU_DOMAIN_NS_MOUNT) && + qemuDomainEnableNamespace(vm, QEMU_DOMAIN_NS_MOUNT) < 0) + return -1; + + return 0; +} + + /** * qemuProcessLaunch: * @@ -6759,7 +6773,7 @@ qemuProcessLaunch(virConnectPtr conn, VIR_DEBUG("Building mount namespace"); - if (qemuDomainCreateNamespace(driver, vm) < 0) + if (qemuProcessEnableDomainNamespaces(driver, vm) < 0) goto cleanup; VIR_DEBUG("Setting up raw IO"); -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
The name of this function is not very helpful, because it doesn't create anything, it just flips a bit in a bitmask when domain is starting up. Move the function internals into qemu_process.c and forget the function ever existed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 14 -------------- src/qemu/qemu_domain_namespace.h | 3 --- src/qemu/qemu_process.c | 16 +++++++++++++++- 3 files changed, 15 insertions(+), 18 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

There is a lot of functions called from qemuDomainBuildNamespace() that accept @cfg (virQEMUDriverConfigPtr) as an argument and don't use it. Historically, it was done so that all qemuDomainSetupAll*() functions look the same. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 91 ++++++++++++-------------------- 1 file changed, 34 insertions(+), 57 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index ec417edb60..6d8faa79fb 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -486,8 +486,7 @@ qemuDomainSetupDev(virQEMUDriverConfigPtr cfg, static int -qemuDomainSetupDisk(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainDiskDefPtr disk, +qemuDomainSetupDisk(virDomainDiskDefPtr disk, const struct qemuDomainCreateDeviceData *data) { virStorageSourcePtr next; @@ -545,16 +544,14 @@ qemuDomainSetupDisk(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, static int -qemuDomainSetupAllDisks(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, +qemuDomainSetupAllDisks(virDomainObjPtr vm, const struct qemuDomainCreateDeviceData *data) { size_t i; VIR_DEBUG("Setting up disks"); for (i = 0; i < vm->def->ndisks; i++) { - if (qemuDomainSetupDisk(cfg, - vm->def->disks[i], + if (qemuDomainSetupDisk(vm->def->disks[i], data) < 0) return -1; } @@ -565,8 +562,7 @@ qemuDomainSetupAllDisks(virQEMUDriverConfigPtr cfg, static int -qemuDomainSetupHostdev(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainHostdevDefPtr dev, +qemuDomainSetupHostdev(virDomainHostdevDefPtr dev, const struct qemuDomainCreateDeviceData *data) { g_autofree char *path = NULL; @@ -586,16 +582,14 @@ qemuDomainSetupHostdev(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, static int -qemuDomainSetupAllHostdevs(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, +qemuDomainSetupAllHostdevs(virDomainObjPtr vm, const struct qemuDomainCreateDeviceData *data) { size_t i; VIR_DEBUG("Setting up hostdevs"); for (i = 0; i < vm->def->nhostdevs; i++) { - if (qemuDomainSetupHostdev(cfg, - vm->def->hostdevs[i], + if (qemuDomainSetupHostdev(vm->def->hostdevs[i], data) < 0) return -1; } @@ -605,8 +599,7 @@ qemuDomainSetupAllHostdevs(virQEMUDriverConfigPtr cfg, static int -qemuDomainSetupMemory(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainMemoryDefPtr mem, +qemuDomainSetupMemory(virDomainMemoryDefPtr mem, const struct qemuDomainCreateDeviceData *data) { if (mem->model != VIR_DOMAIN_MEMORY_MODEL_NVDIMM) @@ -617,16 +610,14 @@ qemuDomainSetupMemory(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, static int -qemuDomainSetupAllMemories(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, +qemuDomainSetupAllMemories(virDomainObjPtr vm, const struct qemuDomainCreateDeviceData *data) { size_t i; VIR_DEBUG("Setting up memories"); for (i = 0; i < vm->def->nmems; i++) { - if (qemuDomainSetupMemory(cfg, - vm->def->mems[i], + if (qemuDomainSetupMemory(vm->def->mems[i], data) < 0) return -1; } @@ -656,8 +647,7 @@ qemuDomainSetupChardev(virDomainDefPtr def G_GNUC_UNUSED, static int -qemuDomainSetupAllChardevs(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainObjPtr vm, +qemuDomainSetupAllChardevs(virDomainObjPtr vm, const struct qemuDomainCreateDeviceData *data) { VIR_DEBUG("Setting up chardevs"); @@ -674,8 +664,7 @@ qemuDomainSetupAllChardevs(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, static int -qemuDomainSetupTPM(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainTPMDefPtr dev, +qemuDomainSetupTPM(virDomainTPMDefPtr dev, const struct qemuDomainCreateDeviceData *data) { switch (dev->type) { @@ -696,8 +685,7 @@ qemuDomainSetupTPM(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, static int -qemuDomainSetupAllTPMs(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainObjPtr vm, +qemuDomainSetupAllTPMs(virDomainObjPtr vm, const struct qemuDomainCreateDeviceData *data) { size_t i; @@ -705,7 +693,7 @@ qemuDomainSetupAllTPMs(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, VIR_DEBUG("Setting up TPMs"); for (i = 0; i < vm->def->ntpms; i++) { - if (qemuDomainSetupTPM(cfg, vm->def->tpms[i], data) < 0) + if (qemuDomainSetupTPM(vm->def->tpms[i], data) < 0) return -1; } @@ -715,8 +703,7 @@ qemuDomainSetupAllTPMs(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, static int -qemuDomainSetupGraphics(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainGraphicsDefPtr gfx, +qemuDomainSetupGraphics(virDomainGraphicsDefPtr gfx, const struct qemuDomainCreateDeviceData *data) { const char *rendernode = virDomainGraphicsGetRenderNode(gfx); @@ -729,16 +716,14 @@ qemuDomainSetupGraphics(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, static int -qemuDomainSetupAllGraphics(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, +qemuDomainSetupAllGraphics(virDomainObjPtr vm, const struct qemuDomainCreateDeviceData *data) { size_t i; VIR_DEBUG("Setting up graphics"); for (i = 0; i < vm->def->ngraphics; i++) { - if (qemuDomainSetupGraphics(cfg, - vm->def->graphics[i], + if (qemuDomainSetupGraphics(vm->def->graphics[i], data) < 0) return -1; } @@ -749,8 +734,7 @@ qemuDomainSetupAllGraphics(virQEMUDriverConfigPtr cfg, static int -qemuDomainSetupInput(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainInputDefPtr input, +qemuDomainSetupInput(virDomainInputDefPtr input, const struct qemuDomainCreateDeviceData *data) { const char *path = virDomainInputDefGetPath(input); @@ -763,16 +747,14 @@ qemuDomainSetupInput(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, static int -qemuDomainSetupAllInputs(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, +qemuDomainSetupAllInputs(virDomainObjPtr vm, const struct qemuDomainCreateDeviceData *data) { size_t i; VIR_DEBUG("Setting up inputs"); for (i = 0; i < vm->def->ninputs; i++) { - if (qemuDomainSetupInput(cfg, - vm->def->inputs[i], + if (qemuDomainSetupInput(vm->def->inputs[i], data) < 0) return -1; } @@ -782,8 +764,7 @@ qemuDomainSetupAllInputs(virQEMUDriverConfigPtr cfg, static int -qemuDomainSetupRNG(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainRNGDefPtr rng, +qemuDomainSetupRNG(virDomainRNGDefPtr rng, const struct qemuDomainCreateDeviceData *data) { switch ((virDomainRNGBackend) rng->backend) { @@ -804,16 +785,14 @@ qemuDomainSetupRNG(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, static int -qemuDomainSetupAllRNGs(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, +qemuDomainSetupAllRNGs(virDomainObjPtr vm, const struct qemuDomainCreateDeviceData *data) { size_t i; VIR_DEBUG("Setting up RNGs"); for (i = 0; i < vm->def->nrngs; i++) { - if (qemuDomainSetupRNG(cfg, - vm->def->rngs[i], + if (qemuDomainSetupRNG(vm->def->rngs[i], data) < 0) return -1; } @@ -824,8 +803,7 @@ qemuDomainSetupAllRNGs(virQEMUDriverConfigPtr cfg, static int -qemuDomainSetupLoader(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainObjPtr vm, +qemuDomainSetupLoader(virDomainObjPtr vm, const struct qemuDomainCreateDeviceData *data) { virDomainLoaderDefPtr loader = vm->def->os.loader; @@ -860,8 +838,7 @@ qemuDomainSetupLoader(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, static int -qemuDomainSetupLaunchSecurity(virQEMUDriverConfigPtr cfg G_GNUC_UNUSED, - virDomainObjPtr vm, +qemuDomainSetupLaunchSecurity(virDomainObjPtr vm, const struct qemuDomainCreateDeviceData *data) { virDomainSEVDefPtr sev = vm->def->sev; @@ -923,34 +900,34 @@ qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupDev(cfg, mgr, vm, &data) < 0) goto cleanup; - if (qemuDomainSetupAllDisks(cfg, vm, &data) < 0) + if (qemuDomainSetupAllDisks(vm, &data) < 0) goto cleanup; - if (qemuDomainSetupAllHostdevs(cfg, vm, &data) < 0) + if (qemuDomainSetupAllHostdevs(vm, &data) < 0) goto cleanup; - if (qemuDomainSetupAllMemories(cfg, vm, &data) < 0) + if (qemuDomainSetupAllMemories(vm, &data) < 0) goto cleanup; - if (qemuDomainSetupAllChardevs(cfg, vm, &data) < 0) + if (qemuDomainSetupAllChardevs(vm, &data) < 0) goto cleanup; - if (qemuDomainSetupAllTPMs(cfg, vm, &data) < 0) + if (qemuDomainSetupAllTPMs(vm, &data) < 0) goto cleanup; - if (qemuDomainSetupAllGraphics(cfg, vm, &data) < 0) + if (qemuDomainSetupAllGraphics(vm, &data) < 0) goto cleanup; - if (qemuDomainSetupAllInputs(cfg, vm, &data) < 0) + if (qemuDomainSetupAllInputs(vm, &data) < 0) goto cleanup; - if (qemuDomainSetupAllRNGs(cfg, vm, &data) < 0) + if (qemuDomainSetupAllRNGs(vm, &data) < 0) goto cleanup; - if (qemuDomainSetupLoader(cfg, vm, &data) < 0) + if (qemuDomainSetupLoader(vm, &data) < 0) goto cleanup; - if (qemuDomainSetupLaunchSecurity(cfg, vm, &data) < 0) + if (qemuDomainSetupLaunchSecurity(vm, &data) < 0) goto cleanup; /* Save some mount points because we want to share them with the host */ -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
There is a lot of functions called from qemuDomainBuildNamespace() that accept @cfg (virQEMUDriverConfigPtr) as an argument and don't use it. Historically, it was done so that all qemuDomainSetupAll*() functions look the same.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 91 ++++++++++++-------------------- 1 file changed, 34 insertions(+), 57 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

Functions that create a device node after domain startup (used from hotplug) will get a list of paths they want to create and eventually call qemuDomainNamespaceMknodPaths() which then checks whether domain mount namespace is enabled in the first place. Alternatively, on device hotunplug, we might want to delete a path inside domain namespace in which case qemuDomainNamespaceUnlinkPaths() checks whether the namespace is enabled. While this is not dangerous, it certainly burns a couple of CPU cycles needlessly. Check whether mount namespace is enabled upfront. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 39 ++++++++++++++++++++++++++++---- 1 file changed, 35 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 6d8faa79fb..41451bec9f 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1481,8 +1481,7 @@ qemuDomainNamespaceMknodPaths(virDomainObjPtr vm, int ret = -1; size_t i; - if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) || - !npaths) + if (!npaths) return 0; cfg = virQEMUDriverGetConfig(driver); @@ -1529,8 +1528,7 @@ qemuDomainNamespaceUnlinkPaths(virDomainObjPtr vm, size_t i; int ret = -1; - if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT) || - !npaths) + if (!npaths) return 0; cfg = virQEMUDriverGetConfig(driver); @@ -1572,6 +1570,9 @@ qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, size_t npaths = 0; bool hasNVMe = false; + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + return 0; + for (next = src; virStorageSourceIsBacking(next); next = next->backingStore) { g_autofree char *tmpPath = NULL; @@ -1655,6 +1656,9 @@ qemuDomainNamespaceSetupHostdev(virDomainObjPtr vm, { g_autofree char *path = NULL; + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + return 0; + if (qemuDomainGetHostdevPath(hostdev, &path, NULL) < 0) return -1; @@ -1687,6 +1691,9 @@ qemuDomainNamespaceTeardownHostdev(virDomainObjPtr vm, { g_autofree char *path = NULL; + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + return 0; + if (qemuDomainGetHostdevPath(hostdev, &path, NULL) < 0) return -1; @@ -1706,6 +1713,9 @@ int qemuDomainNamespaceSetupMemory(virDomainObjPtr vm, virDomainMemoryDefPtr mem) { + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + return 0; + if (mem->model != VIR_DOMAIN_MEMORY_MODEL_NVDIMM) return 0; @@ -1720,6 +1730,9 @@ int qemuDomainNamespaceTeardownMemory(virDomainObjPtr vm, virDomainMemoryDefPtr mem) { + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + return 0; + if (mem->model != VIR_DOMAIN_MEMORY_MODEL_NVDIMM) return 0; @@ -1736,6 +1749,9 @@ qemuDomainNamespaceSetupChardev(virDomainObjPtr vm, { const char *path; + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + return 0; + if (!(path = virDomainChrSourceDefGetPath(chr->source))) return 0; @@ -1757,6 +1773,9 @@ qemuDomainNamespaceTeardownChardev(virDomainObjPtr vm, { const char *path = NULL; + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + return 0; + if (chr->source->type != VIR_DOMAIN_CHR_TYPE_DEV) return 0; @@ -1775,6 +1794,9 @@ qemuDomainNamespaceSetupRNG(virDomainObjPtr vm, { const char *path = NULL; + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + return 0; + switch ((virDomainRNGBackend) rng->backend) { case VIR_DOMAIN_RNG_BACKEND_RANDOM: path = rng->source.file; @@ -1799,6 +1821,9 @@ qemuDomainNamespaceTeardownRNG(virDomainObjPtr vm, { const char *path = NULL; + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + return 0; + switch ((virDomainRNGBackend) rng->backend) { case VIR_DOMAIN_RNG_BACKEND_RANDOM: path = rng->source.file; @@ -1823,6 +1848,9 @@ qemuDomainNamespaceSetupInput(virDomainObjPtr vm, { const char *path = NULL; + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + return 0; + if (!(path = virDomainInputDefGetPath(input))) return 0; @@ -1838,6 +1866,9 @@ qemuDomainNamespaceTeardownInput(virDomainObjPtr vm, { const char *path = NULL; + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) + return 0; + if (!(path = virDomainInputDefGetPath(input))) return 0; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
Functions that create a device node after domain startup (used from hotplug) will get a list of paths they want to create and eventually call qemuDomainNamespaceMknodPaths() which then checks whether domain mount namespace is enabled in the first place. Alternatively, on device hotunplug, we might want to delete a path inside domain namespace in which case qemuDomainNamespaceUnlinkPaths() checks whether the namespace is enabled. While this is not dangerous, it certainly burns a couple of CPU cycles needlessly.
Check whether mount namespace is enabled upfront.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 39 ++++++++++++++++++++++++++++---- 1 file changed, 35 insertions(+), 4 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

While qemuDomainNamespaceMknodPaths() doesn't actually creates files in the namespace in one go (it forks for each path), it a few commits time it will. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 41451bec9f..6bd1fb30cf 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1655,6 +1655,8 @@ qemuDomainNamespaceSetupHostdev(virDomainObjPtr vm, virDomainHostdevDefPtr hostdev) { g_autofree char *path = NULL; + VIR_AUTOSTRINGLIST paths = NULL; + size_t npaths = 0; if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) return 0; @@ -1662,12 +1664,16 @@ qemuDomainNamespaceSetupHostdev(virDomainObjPtr vm, if (qemuDomainGetHostdevPath(hostdev, &path, NULL) < 0) return -1; - if (path && qemuDomainNamespaceMknodPath(vm, path) < 0) + if (path && virStringListAdd(&paths, path) < 0) return -1; if (qemuHostdevNeedsVFIO(hostdev) && !qemuDomainNeedsVFIO(vm->def) && - qemuDomainNamespaceMknodPath(vm, QEMU_DEV_VFIO) < 0) + virStringListAdd(&paths, QEMU_DEV_VFIO) < 0) + return -1; + + npaths = virStringListLength((const char **) paths); + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths, npaths) < 0) return -1; return 0; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
While qemuDomainNamespaceMknodPaths() doesn't actually creates
s/creates/create/
files in the namespace in one go (it forks for each path), it a few commits time it will.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

It's not really a problem since this is a helper process that dies as soon as the helper function returns, but the cleanup code will be replaced with a function soon and this change prepares the code for that. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 6bd1fb30cf..e385cda64a 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1102,7 +1102,7 @@ struct qemuDomainAttachDeviceMknodData { virQEMUDriverPtr driver; virDomainObjPtr vm; const char *file; - const char *target; + char *target; GStatBuf sb; void *acl; #ifdef WITH_SELINUX @@ -1248,6 +1248,7 @@ qemuDomainAttachDeviceMknodHelper(pid_t pid G_GNUC_UNUSED, freecon(data->tcon); # endif virFileFreeACLs(&data->acl); + VIR_FREE(data->target); return ret; } -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
It's not really a problem since this is a helper process that dies as soon as the helper function returns, but the cleanup code will be replaced with a function soon and this change prepares the code for that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

This structure is going to be used from not only device attach code, but also when building the namespace. Moreover, the code lives in a separate file so the chances of clashing with another name are minimal. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index e385cda64a..40c4fb36cb 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1098,7 +1098,7 @@ qemuDomainNamespaceAvailable(qemuDomainNamespace ns G_GNUC_UNUSED) } -struct qemuDomainAttachDeviceMknodData { +struct qemuDomainMknodData { virQEMUDriverPtr driver; virDomainObjPtr vm; const char *file; @@ -1117,7 +1117,7 @@ static int qemuDomainAttachDeviceMknodHelper(pid_t pid G_GNUC_UNUSED, void *opaque) { - struct qemuDomainAttachDeviceMknodData *data = opaque; + struct qemuDomainMknodData *data = opaque; int ret = -1; bool delDevice = false; bool isLink = S_ISLNK(data->sb.st_mode); @@ -1262,7 +1262,7 @@ qemuDomainAttachDeviceMknodRecursive(virQEMUDriverPtr driver, unsigned int ttl) { g_autoptr(virQEMUDriverConfig) cfg = NULL; - struct qemuDomainAttachDeviceMknodData data; + struct qemuDomainMknodData data; int ret = -1; g_autofree char *target = NULL; bool isLink; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
This structure is going to be used from not only device attach code, but also when building the namespace. Moreover, the code lives in a separate file so the chances of clashing with another name are minimal.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index e385cda64a..40c4fb36cb 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1098,7 +1098,7 @@ qemuDomainNamespaceAvailable(qemuDomainNamespace ns G_GNUC_UNUSED) }
-struct qemuDomainAttachDeviceMknodData { +struct qemuDomainMknodData {
qemuNamespaceMkondData would be more fitting.
virQEMUDriverPtr driver; virDomainObjPtr vm; const char *file;
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

When attaching a device into a domain, the corresponding /dev node might need to be created in the domain's namespace. For some types of files we call mknod(), for symlinks we call symlink(), but for others - which exist in the host namespace - we need to so called 'bind mount' them (which is a way of passing a file/directory between mount namespaces). There is this condition in qemuDomainAttachDeviceMknodRecursive() which decides whether a bind mount will be used, move it into a separate function so that it can be reused later. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 40c4fb36cb..0c40118938 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1253,6 +1253,17 @@ qemuDomainAttachDeviceMknodHelper(pid_t pid G_GNUC_UNUSED, } +static bool +qemuDomainMknodItemIsBindMounted(mode_t st_mode) +{ + /* A block device S_ISBLK() or a chardev S_ISCHR() is intentionally not + * handled. We want to mknod() it instead of passing in through bind + * mounting. */ + return S_ISREG(st_mode) || S_ISFIFO(st_mode) || + S_ISSOCK(st_mode) || S_ISDIR(st_mode); +} + + static int qemuDomainAttachDeviceMknodRecursive(virQEMUDriverPtr driver, virDomainObjPtr vm, @@ -1267,7 +1278,6 @@ qemuDomainAttachDeviceMknodRecursive(virQEMUDriverPtr driver, g_autofree char *target = NULL; bool isLink; bool isReg; - bool isDir; if (!ttl) { virReportSystemError(ELOOP, @@ -1289,10 +1299,9 @@ qemuDomainAttachDeviceMknodRecursive(virQEMUDriverPtr driver, } isLink = S_ISLNK(data.sb.st_mode); - isReg = S_ISREG(data.sb.st_mode) || S_ISFIFO(data.sb.st_mode) || S_ISSOCK(data.sb.st_mode); - isDir = S_ISDIR(data.sb.st_mode); + isReg = qemuDomainMknodItemIsBindMounted(data.sb.st_mode); - if ((isReg || isDir) && STRPREFIX(file, QEMU_DEVPREFIX)) { + if (isReg && STRPREFIX(file, QEMU_DEVPREFIX)) { cfg = virQEMUDriverGetConfig(driver); if (!(target = qemuDomainGetPreservedMountPath(cfg, vm, file))) goto cleanup; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
When attaching a device into a domain, the corresponding /dev node might need to be created in the domain's namespace. For some types of files we call mknod(), for symlinks we call symlink(), but for others - which exist in the host namespace - we need to so called 'bind mount' them (which is a way of passing a file/directory between mount namespaces). There is this condition in qemuDomainAttachDeviceMknodRecursive() which decides whether a bind mount will be used, move it into a separate function so that it can be reused later.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 40c4fb36cb..0c40118938 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1253,6 +1253,17 @@ qemuDomainAttachDeviceMknodHelper(pid_t pid G_GNUC_UNUSED, }
+static bool +qemuDomainMknodItemIsBindMounted(mode_t st_mode)
To me, IsBindMounted reads like it's done already, but we yet have to do that. Consider: qemuNamespaceMknodItemNeedsBindMount
+{ + /* A block device S_ISBLK() or a chardev S_ISCHR() is intentionally not + * handled. We want to mknod() it instead of passing in through bind + * mounting. */ + return S_ISREG(st_mode) || S_ISFIFO(st_mode) || + S_ISSOCK(st_mode) || S_ISDIR(st_mode); +} + + static int qemuDomainAttachDeviceMknodRecursive(virQEMUDriverPtr driver, virDomainObjPtr vm, @@ -1267,7 +1278,6 @@ qemuDomainAttachDeviceMknodRecursive(virQEMUDriverPtr driver, g_autofree char *target = NULL; bool isLink; bool isReg; - bool isDir;
if (!ttl) { virReportSystemError(ELOOP, @@ -1289,10 +1299,9 @@ qemuDomainAttachDeviceMknodRecursive(virQEMUDriverPtr driver, }
isLink = S_ISLNK(data.sb.st_mode); - isReg = S_ISREG(data.sb.st_mode) || S_ISFIFO(data.sb.st_mode) || S_ISSOCK(data.sb.st_mode); - isDir = S_ISDIR(data.sb.st_mode); + isReg = qemuDomainMknodItemIsBindMounted(data.sb.st_mode);
note that the variable is still misleadingly called isReg.
- if ((isReg || isDir) && STRPREFIX(file, QEMU_DEVPREFIX)) { + if (isReg && STRPREFIX(file, QEMU_DEVPREFIX)) { cfg = virQEMUDriverGetConfig(driver); if (!(target = qemuDomainGetPreservedMountPath(cfg, vm, file)))
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

So far, when attaching a device needs two or more /dev nodes created into a domain, we fork off and run the helper for every node separately. For majority of devices this is okay, because they need no or one node created anyway. But the idea is to use this attach code to build the namespace when starting a domain, in which case there will be way more nodes than one. To achieve this, the recursive approach for handling symlinks has to be turned into an iterative one. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 298 +++++++++++++++++++------------ 1 file changed, 185 insertions(+), 113 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 0c40118938..31acf2bde6 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1098,26 +1098,58 @@ qemuDomainNamespaceAvailable(qemuDomainNamespace ns G_GNUC_UNUSED) } -struct qemuDomainMknodData { - virQEMUDriverPtr driver; - virDomainObjPtr vm; +typedef struct _qemuDomainMknodItem qemuDomainMknodItem; +typedef qemuDomainMknodItem *qemuDomainMknodItemPtr; +struct _qemuDomainMknodItem { const char *file; char *target; + bool bindmounted; GStatBuf sb; void *acl; -#ifdef WITH_SELINUX char *tcon; +}; + +typedef struct _qemuDomainMknodData qemuDomainMknodData; +typedef qemuDomainMknodData *qemuDomainMknodDataPtr; +struct _qemuDomainMknodData { + virQEMUDriverPtr driver; + virDomainObjPtr vm; + qemuDomainMknodItemPtr items; + size_t nitems; +}; + + +static void +qemuDomainMknodItemClear(qemuDomainMknodItemPtr item) +{ + VIR_FREE(item->target); + virFileFreeACLs(&item->acl); +#ifdef WITH_SELINUX + freecon(item->tcon); #endif -}; +} + + +static void +qemuDomainMknodDataClear(qemuDomainMknodDataPtr data) +{ + size_t i; + + for (i = 0; i < data->nitems; i++) { + qemuDomainMknodItemPtr item = &data->items[i]; + + qemuDomainMknodItemClear(item); + } + + VIR_FREE(data->items); +} /* Our way of creating devices is highly linux specific */ #if defined(__linux__) static int -qemuDomainAttachDeviceMknodHelper(pid_t pid G_GNUC_UNUSED, - void *opaque) +qemuDomainMknodOne(qemuDomainMknodItemPtr data) { - struct qemuDomainMknodData *data = opaque; int ret = -1; bool delDevice = false; bool isLink = S_ISLNK(data->sb.st_mode); @@ -1125,8 +1157,6 @@ qemuDomainAttachDeviceMknodHelper(pid_t pid G_GNUC_UNUSED, bool isReg = S_ISREG(data->sb.st_mode) || S_ISFIFO(data->sb.st_mode) || S_ISSOCK(data->sb.st_mode); bool isDir = S_ISDIR(data->sb.st_mode); - qemuSecurityPostFork(data->driver->securityManager); - if (virFileMakeParentPath(data->file) < 0) { virReportSystemError(errno, _("Unable to create %s"), data->file); @@ -1244,11 +1274,6 @@ qemuDomainAttachDeviceMknodHelper(pid_t pid G_GNUC_UNUSED, else unlink(data->file); } -# ifdef WITH_SELINUX - freecon(data->tcon); -# endif - virFileFreeACLs(&data->acl); - VIR_FREE(data->target); return ret; } @@ -1265,63 +1290,66 @@ qemuDomainMknodItemIsBindMounted(mode_t st_mode) static int -qemuDomainAttachDeviceMknodRecursive(virQEMUDriverPtr driver, - virDomainObjPtr vm, - const char *file, - char * const *devMountsPath, - size_t ndevMountsPath, - unsigned int ttl) +qemuDomainMknodHelper(pid_t pid G_GNUC_UNUSED, + void *opaque) { - g_autoptr(virQEMUDriverConfig) cfg = NULL; - struct qemuDomainMknodData data; + qemuDomainMknodDataPtr data = opaque; + size_t i; int ret = -1; + + qemuSecurityPostFork(data->driver->securityManager); + + for (i = 0; i < data->nitems; i++) { + if (qemuDomainMknodOne(&data->items[i]) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + qemuDomainMknodDataClear(data); + return ret; +} + + +static int +qemuDomainMknodItemInit(qemuDomainMknodItemPtr item, + virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm, + const char *file) +{ g_autofree char *target = NULL; bool isLink; bool isReg; - if (!ttl) { - virReportSystemError(ELOOP, - _("Too many levels of symbolic links: %s"), - file); - return ret; - } + item->file = file; - memset(&data, 0, sizeof(data)); - - data.driver = driver; - data.vm = vm; - data.file = file; - - if (g_lstat(file, &data.sb) < 0) { + if (g_lstat(file, &item->sb) < 0) { virReportSystemError(errno, _("Unable to access %s"), file); - return ret; + return -1; } - isLink = S_ISLNK(data.sb.st_mode); - isReg = qemuDomainMknodItemIsBindMounted(data.sb.st_mode); + isLink = S_ISLNK(item->sb.st_mode); + isReg = qemuDomainMknodItemIsBindMounted(item->sb.st_mode); if (isReg && STRPREFIX(file, QEMU_DEVPREFIX)) { - cfg = virQEMUDriverGetConfig(driver); if (!(target = qemuDomainGetPreservedMountPath(cfg, vm, file))) - goto cleanup; + return -1; - if (virFileBindMountDevice(file, target) < 0) - goto cleanup; - - data.target = target; + item->target = g_steal_pointer(&target); } else if (isLink) { g_autoptr(GError) gerr = NULL; if (!(target = g_file_read_link(file, &gerr))) { virReportError(VIR_ERR_SYSTEM_ERROR, _("failed to resolve symlink %s: %s"), file, gerr->message); - return ret; + return -1; } if (!g_path_is_absolute(target)) { g_autofree char *fileTmp = g_strdup(file); - char *c = NULL, *tmp = NULL; + char *c = NULL; + char *tmp = NULL; if ((c = strrchr(fileTmp, '/'))) *(c + 1) = '\0'; @@ -1331,92 +1359,79 @@ qemuDomainAttachDeviceMknodRecursive(virQEMUDriverPtr driver, target = g_steal_pointer(&tmp); } - data.target = target; + item->target = g_steal_pointer(&target); } /* Symlinks don't have ACLs. */ if (!isLink && - virFileGetACLs(file, &data.acl) < 0 && + virFileGetACLs(file, &item->acl) < 0 && errno != ENOTSUP) { virReportSystemError(errno, _("Unable to get ACLs on %s"), file); - goto cleanup; + return -1; } # ifdef WITH_SELINUX - if (lgetfilecon_raw(file, &data.tcon) < 0 && + if (lgetfilecon_raw(file, &item->tcon) < 0 && (errno != ENOTSUP && errno != ENODATA)) { virReportSystemError(errno, _("Unable to get SELinux label from %s"), file); - goto cleanup; + return -1; } # endif - if (STRPREFIX(file, QEMU_DEVPREFIX)) { - size_t i; - - for (i = 0; i < ndevMountsPath; i++) { - if (STREQ(devMountsPath[i], "/dev")) - continue; - if (STRPREFIX(file, devMountsPath[i])) - break; - } - - if (i == ndevMountsPath) { - if (qemuSecurityPreFork(driver->securityManager) < 0) - goto cleanup; - - if (virProcessRunInMountNamespace(vm->pid, - qemuDomainAttachDeviceMknodHelper, - &data) < 0) { - qemuSecurityPostFork(driver->securityManager); - goto cleanup; - } - qemuSecurityPostFork(driver->securityManager); - } else { - VIR_DEBUG("Skipping dev %s because of %s mount point", - file, devMountsPath[i]); - } - } - - if (isLink && - qemuDomainAttachDeviceMknodRecursive(driver, vm, target, - devMountsPath, ndevMountsPath, - ttl -1) < 0) - goto cleanup; - - ret = 0; - cleanup: -# ifdef WITH_SELINUX - freecon(data.tcon); -# endif - virFileFreeACLs(&data.acl); - if (isReg && target) - umount(target); - return ret; + return 0; } -#else /* !defined(__linux__) */ - - static int -qemuDomainAttachDeviceMknodRecursive(virQEMUDriverPtr driver G_GNUC_UNUSED, - virDomainObjPtr vm G_GNUC_UNUSED, - const char *file G_GNUC_UNUSED, - char * const *devMountsPath G_GNUC_UNUSED, - size_t ndevMountsPath G_GNUC_UNUSED, - unsigned int ttl G_GNUC_UNUSED) +qemuDomainAttachDeviceMknodOne(qemuDomainMknodDataPtr data, + virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm, + const char *file, + char * const *devMountsPath, + size_t ndevMountsPath) { - virReportSystemError(ENOSYS, "%s", - _("Namespaces are not supported on this platform.")); - return -1; + long ttl = sysconf(_SC_SYMLOOP_MAX); + const char *next = file; + size_t i; + + while (1) { + qemuDomainMknodItem item = { 0 }; + + if (qemuDomainMknodItemInit(&item, cfg, vm, next) < 0) + return -1; + + if (STRPREFIX(next, QEMU_DEVPREFIX)) { + for (i = 0; i < ndevMountsPath; i++) { + if (STREQ(devMountsPath[i], "/dev")) + continue; + if (STRPREFIX(next, devMountsPath[i])) + break; + } + + if (i == ndevMountsPath && + VIR_APPEND_ELEMENT_COPY(data->items, data->nitems, item) < 0) + return -1; + } + + if (!S_ISLNK(item.sb.st_mode)) + break; + + if (ttl-- == 0) { + virReportSystemError(ELOOP, + _("Too many levels of symbolic links: %s"), + next); + return -1; + } + + next = item.target; + } + + return 0; } -#endif /* !defined(__linux__) */ - - static int qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, virDomainObjPtr vm, @@ -1424,14 +1439,71 @@ qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, char * const *devMountsPath, size_t ndevMountsPath) { - long symloop_max = sysconf(_SC_SYMLOOP_MAX); + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + qemuDomainMknodData data = { 0 }; + size_t i; + int ret = -1; - return qemuDomainAttachDeviceMknodRecursive(driver, vm, file, - devMountsPath, ndevMountsPath, - symloop_max); + data.driver = driver; + data.vm = vm; + + if (qemuDomainAttachDeviceMknodOne(&data, cfg, vm, file, + devMountsPath, ndevMountsPath) < 0) + return -1; + + for (i = 0; i < data.nitems; i++) { + qemuDomainMknodItemPtr item = &data.items[i]; + if (item->target && + qemuDomainMknodItemIsBindMounted(item->sb.st_mode)) { + if (virFileBindMountDevice(item->file, item->target) < 0) + goto cleanup; + item->bindmounted = true; + } + } + + if (qemuSecurityPreFork(driver->securityManager) < 0) + goto cleanup; + + if (virProcessRunInMountNamespace(vm->pid, + qemuDomainMknodHelper, + &data) < 0) { + qemuSecurityPostFork(driver->securityManager); + goto cleanup; + } + qemuSecurityPostFork(driver->securityManager); + + ret = 0; + cleanup: + for (i = 0; i < data.nitems; i++) { + if (data.items[i].bindmounted && + umount(data.items[i].target) < 0) { + VIR_WARN("Unable to unmount %s", data.items[i].target); + } + } + qemuDomainMknodDataClear(&data); + return ret; +} + + +#else /* !defined(__linux__) */ + + +static int +qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver G_GNUC_UNUSED, + virDomainObjPtr vm G_GNUC_UNUSED, + const char *file G_GNUC_UNUSED, + char * const *devMountsPath G_GNUC_UNUSED, + size_t ndevMountsPath G_GNUC_UNUSED) +{ + virReportSystemError(ENOSYS, "%s", + _("Namespaces are not supported on this platform.")); + return -1; } +#endif /* !defined(__linux__) */ + + static int qemuDomainDetachDeviceUnlinkHelper(pid_t pid G_GNUC_UNUSED, void *opaque) -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
So far, when attaching a device needs two or more /dev nodes created into a domain, we fork off and run the helper for every node separately. For majority of devices this is okay, because they need no or one node created anyway. But the idea is to use this attach code to build the namespace when starting a domain, in which case there will be way more nodes than one.
To achieve this, the recursive approach for handling symlinks has to be turned into an iterative one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 298 +++++++++++++++++++------------ 1 file changed, 185 insertions(+), 113 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

While the previous commit prepared the helper function run in a forked off helper (with corresponding struct), this commit modifies the caller, which now create all files requested in a single process and does not fork off for every single path. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 89 ++++++++++++-------------------- 1 file changed, 32 insertions(+), 57 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 31acf2bde6..b9f8c32770 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1385,12 +1385,12 @@ qemuDomainMknodItemInit(qemuDomainMknodItemPtr item, static int -qemuDomainAttachDeviceMknodOne(qemuDomainMknodDataPtr data, - virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, - const char *file, - char * const *devMountsPath, - size_t ndevMountsPath) +qemuDomainNamespacePrepareOne(qemuDomainMknodDataPtr data, + virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm, + const char *file, + char * const *devMountsPath, + size_t ndevMountsPath) { long ttl = sysconf(_SC_SYMLOOP_MAX); const char *next = file; @@ -1433,23 +1433,36 @@ qemuDomainAttachDeviceMknodOne(qemuDomainMknodDataPtr data, static int -qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, - virDomainObjPtr vm, - const char *file, - char * const *devMountsPath, - size_t ndevMountsPath) +qemuDomainNamespaceMknodPaths(virDomainObjPtr vm, + const char **paths, + size_t npaths) { - g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + qemuDomainObjPrivatePtr priv = vm->privateData; + virQEMUDriverPtr driver = priv->driver; + g_autoptr(virQEMUDriverConfig) cfg = NULL; + char **devMountsPath = NULL; + size_t ndevMountsPath = 0; qemuDomainMknodData data = { 0 }; size_t i; int ret = -1; + if (npaths == 0) + return 0; + + cfg = virQEMUDriverGetConfig(driver); + if (qemuDomainGetPreservedMounts(cfg, vm, + &devMountsPath, NULL, + &ndevMountsPath) < 0) + return -1; + data.driver = driver; data.vm = vm; - if (qemuDomainAttachDeviceMknodOne(&data, cfg, vm, file, - devMountsPath, ndevMountsPath) < 0) - return -1; + for (i = 0; i < npaths; i++) { + if (qemuDomainNamespacePrepareOne(&data, cfg, vm, paths[i], + devMountsPath, ndevMountsPath) < 0) + goto cleanup; + } for (i = 0; i < data.nitems; i++) { qemuDomainMknodItemPtr item = &data.items[i]; @@ -1481,6 +1494,7 @@ qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, } } qemuDomainMknodDataClear(&data); + virStringListFreeCount(devMountsPath, ndevMountsPath); return ret; } @@ -1489,11 +1503,9 @@ qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver, static int -qemuDomainAttachDeviceMknod(virQEMUDriverPtr driver G_GNUC_UNUSED, - virDomainObjPtr vm G_GNUC_UNUSED, - const char *file G_GNUC_UNUSED, - char * const *devMountsPath G_GNUC_UNUSED, - size_t ndevMountsPath G_GNUC_UNUSED) +qemuDomainNamespaceMknodPaths(virDomainObjPtr vm G_GNUC_UNUSED, + const char **paths G_GNUC_UNUSED, + size_t npaths G_GNUC_UNUSED) { virReportSystemError(ENOSYS, "%s", _("Namespaces are not supported on this platform.")); @@ -1550,43 +1562,6 @@ qemuDomainDetachDeviceUnlink(virQEMUDriverPtr driver G_GNUC_UNUSED, } -static int -qemuDomainNamespaceMknodPaths(virDomainObjPtr vm, - const char **paths, - size_t npaths) -{ - qemuDomainObjPrivatePtr priv = vm->privateData; - virQEMUDriverPtr driver = priv->driver; - g_autoptr(virQEMUDriverConfig) cfg = NULL; - char **devMountsPath = NULL; - size_t ndevMountsPath = 0; - int ret = -1; - size_t i; - - if (!npaths) - return 0; - - cfg = virQEMUDriverGetConfig(driver); - if (qemuDomainGetPreservedMounts(cfg, vm, - &devMountsPath, NULL, - &ndevMountsPath) < 0) - goto cleanup; - - for (i = 0; i < npaths; i++) { - if (qemuDomainAttachDeviceMknod(driver, - vm, - paths[i], - devMountsPath, ndevMountsPath) < 0) - goto cleanup; - } - - ret = 0; - cleanup: - virStringListFreeCount(devMountsPath, ndevMountsPath); - return ret; -} - - static int qemuDomainNamespaceMknodPath(virDomainObjPtr vm, const char *path) -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
While the previous commit prepared the helper function run in a forked off helper (with corresponding struct), this commit modifies the caller, which now create all files requested in a single process and does not fork off for every single path.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 89 ++++++++++++-------------------- 1 file changed, 32 insertions(+), 57 deletions(-)
diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 31acf2bde6..b9f8c32770 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1385,12 +1385,12 @@ qemuDomainMknodItemInit(qemuDomainMknodItemPtr item,
static int -qemuDomainAttachDeviceMknodOne(qemuDomainMknodDataPtr data, - virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm, - const char *file, - char * const *devMountsPath, - size_t ndevMountsPath) +qemuDomainNamespacePrepareOne(qemuDomainMknodDataPtr data,
Consider: PrepareItem or PrepareOneItem
+ virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm, + const char *file, + char * const *devMountsPath, + size_t ndevMountsPath) { long ttl = sysconf(_SC_SYMLOOP_MAX); const char *next = file;
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

Every caller does the same - counts the number of items in a string list they have, only to pass the number to qemuDomainNamespaceMknodPaths(). This is needless - the function can accept the string list and count the items itself. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index b9f8c32770..1803943fbc 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1434,18 +1434,19 @@ qemuDomainNamespacePrepareOne(qemuDomainMknodDataPtr data, static int qemuDomainNamespaceMknodPaths(virDomainObjPtr vm, - const char **paths, - size_t npaths) + const char **paths) { qemuDomainObjPrivatePtr priv = vm->privateData; virQEMUDriverPtr driver = priv->driver; g_autoptr(virQEMUDriverConfig) cfg = NULL; char **devMountsPath = NULL; size_t ndevMountsPath = 0; + size_t npaths = 0; qemuDomainMknodData data = { 0 }; size_t i; int ret = -1; + npaths = virStringListLength(paths); if (npaths == 0) return 0; @@ -1566,9 +1567,9 @@ static int qemuDomainNamespaceMknodPath(virDomainObjPtr vm, const char *path) { - const char *paths[] = { path }; + const char *paths[] = { path, NULL }; - return qemuDomainNamespaceMknodPaths(vm, paths, 1); + return qemuDomainNamespaceMknodPaths(vm, paths); } @@ -1624,7 +1625,6 @@ qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, { virStorageSourcePtr next; VIR_AUTOSTRINGLIST paths = NULL; - size_t npaths = 0; bool hasNVMe = false; if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) @@ -1674,8 +1674,7 @@ qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, virStringListAdd(&paths, QEMU_DEV_VFIO) < 0) return -1; - npaths = virStringListLength((const char **) paths); - if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths, npaths) < 0) + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; return 0; @@ -1713,7 +1712,6 @@ qemuDomainNamespaceSetupHostdev(virDomainObjPtr vm, { g_autofree char *path = NULL; VIR_AUTOSTRINGLIST paths = NULL; - size_t npaths = 0; if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) return 0; @@ -1729,8 +1727,7 @@ qemuDomainNamespaceSetupHostdev(virDomainObjPtr vm, virStringListAdd(&paths, QEMU_DEV_VFIO) < 0) return -1; - npaths = virStringListLength((const char **) paths); - if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths, npaths) < 0) + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; return 0; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
Every caller does the same - counts the number of items in a string list they have, only to pass the number to qemuDomainNamespaceMknodPaths(). This is needless - the function can accept the string list and count the items itself.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

The aim to make it look as close to qemuDomainNamespaceSetupDisk() as possible. The latter will call the former and this change makes that diff easier to read. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 1803943fbc..18ec86816c 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -486,13 +486,13 @@ qemuDomainSetupDev(virQEMUDriverConfigPtr cfg, static int -qemuDomainSetupDisk(virDomainDiskDefPtr disk, +qemuDomainSetupDisk(virStorageSourcePtr src, const struct qemuDomainCreateDeviceData *data) { virStorageSourcePtr next; bool hasNVMe = false; - for (next = disk->src; virStorageSourceIsBacking(next); next = next->backingStore) { + for (next = src; virStorageSourceIsBacking(next); next = next->backingStore) { VIR_AUTOSTRINGLIST targetPaths = NULL; size_t i; @@ -531,7 +531,7 @@ qemuDomainSetupDisk(virDomainDiskDefPtr disk, } /* qemu-pr-helper might require access to /dev/mapper/control. */ - if (disk->src->pr && + if (src->pr && qemuDomainCreateDevice(QEMU_DEVICE_MAPPER_CONTROL_PATH, data, true) < 0) return -1; @@ -551,7 +551,7 @@ qemuDomainSetupAllDisks(virDomainObjPtr vm, VIR_DEBUG("Setting up disks"); for (i = 0; i < vm->def->ndisks; i++) { - if (qemuDomainSetupDisk(vm->def->disks[i], + if (qemuDomainSetupDisk(vm->def->disks[i]->src, data) < 0) return -1; } -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
The aim to make it look as close to qemuDomainNamespaceSetupDisk() as possible. The latter will call the former and this change makes that diff easier to read.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

Okay, here is the deal. Currently, the way we build namespace is very fragile. It is done from pre-exec hook when starting a domain, after we mass closed all FDs and before we drop privileges and exec() QEMU. This fact poses some limitations onto the namespace build code, e.g. it has to make sure not to keep any FD opened (not even through a library call), because it would be leaked to QEMU. Also, it has to call only async signal safe functions. These requirements are hard to meet - in fact as of my commit v6.2.0-rc1~235 we are leaking a FD into QEMU by calling libdevmapper functions. To solve this issue and avoid similar problems in the future, we should change our paradigm. We already have functions which can populate domain's namespace with nodes from the daemon context. If we use them to populate the namespace and keep only the bare minimum in the pre-exec hook, we've mitigated the risk. Therefore, the old qemuDomainBuildNamespace() is renamed to qemuDomainUnshareNamespace() and new qemuDomainBuildNamespace() function is introduced. So far, the new function is basically a NOP and domain's namespace is still populated from the pre-exec hook - next patches will fix it. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 23 ++++++++++++++++++++--- src/qemu/qemu_domain_namespace.h | 8 +++++--- src/qemu/qemu_process.c | 6 +++++- 3 files changed, 30 insertions(+), 7 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 18ec86816c..38abed56c8 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -856,10 +856,27 @@ qemuDomainSetupLaunchSecurity(virDomainObjPtr vm, } +static int +qemuDomainNamespaceMknodPaths(virDomainObjPtr vm, + const char **paths); + + int -qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, - virSecurityManagerPtr mgr, - virDomainObjPtr vm) +qemuDomainBuildNamespace(virDomainObjPtr vm) +{ + VIR_AUTOSTRINGLIST paths = NULL; + + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) + return -1; + + return 0; +} + + +int +qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, + virSecurityManagerPtr mgr, + virDomainObjPtr vm) { struct qemuDomainCreateDeviceData data; const char *devPath = NULL; diff --git a/src/qemu/qemu_domain_namespace.h b/src/qemu/qemu_domain_namespace.h index 0182ce50a2..70eebf4dc4 100644 --- a/src/qemu/qemu_domain_namespace.h +++ b/src/qemu/qemu_domain_namespace.h @@ -37,9 +37,11 @@ int qemuDomainEnableNamespace(virDomainObjPtr vm, bool qemuDomainNamespaceEnabled(virDomainObjPtr vm, qemuDomainNamespace ns); -int qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, - virSecurityManagerPtr mgr, - virDomainObjPtr vm); +int qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, + virSecurityManagerPtr mgr, + virDomainObjPtr vm); + +int qemuDomainBuildNamespace(virDomainObjPtr vm); void qemuDomainDestroyNamespace(virQEMUDriverPtr driver, virDomainObjPtr vm); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index c076dcac3a..bee0fd031b 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -3164,7 +3164,7 @@ static int qemuProcessHook(void *data) if (qemuSecurityClearSocketLabel(h->driver->securityManager, h->vm->def) < 0) goto cleanup; - if (qemuDomainBuildNamespace(h->cfg, h->driver->securityManager, h->vm) < 0) + if (qemuDomainUnshareNamespace(h->cfg, h->driver->securityManager, h->vm) < 0) goto cleanup; if (virDomainNumatuneGetMode(h->vm->def->numa, -1, &mode) == 0) { @@ -6831,6 +6831,10 @@ qemuProcessLaunch(virConnectPtr conn, goto cleanup; } + VIR_DEBUG("Building domain mount namespace (if required)"); + if (qemuDomainBuildNamespace(vm) < 0) + goto cleanup; + VIR_DEBUG("Setting up domain cgroup (if required)"); if (qemuSetupCgroup(vm, nnicindexes, nicindexes) < 0) goto cleanup; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
Okay, here is the deal. Currently, the way we build namespace is very fragile. It is done from pre-exec hook when starting a domain, after we mass closed all FDs and before we drop privileges and exec() QEMU. This fact poses some limitations onto the namespace build code, e.g. it has to make sure not to keep any FD opened (not even through a library call), because it would be leaked to QEMU. Also, it has to call only async signal safe functions. These requirements are hard to meet - in fact as of my commit v6.2.0-rc1~235 we are leaking a FD into QEMU by calling libdevmapper functions.
To solve this issue and avoid similar problems in the future, we should change our paradigm. We already have functions which can populate domain's namespace with nodes from the daemon context. If we use them to populate the namespace and keep only the bare minimum in the pre-exec hook, we've mitigated the risk.
Therefore, the old qemuDomainBuildNamespace() is renamed to qemuDomainUnshareNamespace() and new qemuDomainBuildNamespace() function is introduced. So far, the new function is basically a NOP and domain's namespace is still populated from the pre-exec hook - next patches will fix it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 23 ++++++++++++++++++++--- src/qemu/qemu_domain_namespace.h | 8 +++++--- src/qemu/qemu_process.c | 6 +++++- 3 files changed, 30 insertions(+), 7 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

As mentioned in previous commit, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with basic /dev nodes (e.g. /dev/null, /dev/kvm, etc.) into daemon's namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 23 +++++++++++------------ src/qemu/qemu_domain_namespace.h | 3 ++- src/qemu/qemu_process.c | 2 +- 3 files changed, 14 insertions(+), 14 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 38abed56c8..17c804dfca 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -435,8 +435,7 @@ qemuDomainCreateDevice(const char *device, static int qemuDomainPopulateDevices(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm G_GNUC_UNUSED, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { const char *const *devices = (const char *const *) cfg->cgroupDeviceACL; size_t i; @@ -445,7 +444,7 @@ qemuDomainPopulateDevices(virQEMUDriverConfigPtr cfg, devices = defaultDeviceACL; for (i = 0; devices[i]; i++) { - if (qemuDomainCreateDevice(devices[i], data, true) < 0) + if (virStringListAdd(paths, devices[i]) < 0) return -1; } @@ -454,10 +453,9 @@ qemuDomainPopulateDevices(virQEMUDriverConfigPtr cfg, static int -qemuDomainSetupDev(virQEMUDriverConfigPtr cfg, - virSecurityManagerPtr mgr, +qemuDomainSetupDev(virSecurityManagerPtr mgr, virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) + const char *path) { g_autofree char *mount_options = NULL; g_autofree char *opts = NULL; @@ -475,10 +473,7 @@ qemuDomainSetupDev(virQEMUDriverConfigPtr cfg, */ opts = g_strdup_printf("mode=755,size=65536%s", mount_options); - if (virFileSetupDev(data->path, opts) < 0) - return -1; - - if (qemuDomainPopulateDevices(cfg, vm, data) < 0) + if (virFileSetupDev(path, opts) < 0) return -1; return 0; @@ -862,10 +857,14 @@ qemuDomainNamespaceMknodPaths(virDomainObjPtr vm, int -qemuDomainBuildNamespace(virDomainObjPtr vm) +qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm) { VIR_AUTOSTRINGLIST paths = NULL; + if (qemuDomainPopulateDevices(cfg, &paths) < 0) + return -1; + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; @@ -914,7 +913,7 @@ qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, if (virProcessSetupPrivateMountNS() < 0) goto cleanup; - if (qemuDomainSetupDev(cfg, mgr, vm, &data) < 0) + if (qemuDomainSetupDev(mgr, vm, devPath) < 0) goto cleanup; if (qemuDomainSetupAllDisks(vm, &data) < 0) diff --git a/src/qemu/qemu_domain_namespace.h b/src/qemu/qemu_domain_namespace.h index 70eebf4dc4..644f2adef3 100644 --- a/src/qemu/qemu_domain_namespace.h +++ b/src/qemu/qemu_domain_namespace.h @@ -41,7 +41,8 @@ int qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, virSecurityManagerPtr mgr, virDomainObjPtr vm); -int qemuDomainBuildNamespace(virDomainObjPtr vm); +int qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm); void qemuDomainDestroyNamespace(virQEMUDriverPtr driver, virDomainObjPtr vm); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index bee0fd031b..e2f32dc25a 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -6832,7 +6832,7 @@ qemuProcessLaunch(virConnectPtr conn, } VIR_DEBUG("Building domain mount namespace (if required)"); - if (qemuDomainBuildNamespace(vm) < 0) + if (qemuDomainBuildNamespace(cfg, vm) < 0) goto cleanup; VIR_DEBUG("Setting up domain cgroup (if required)"); -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
As mentioned in previous commit, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with basic /dev nodes (e.g. /dev/null, /dev/kvm, etc.) into daemon's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 23 +++++++++++------------ src/qemu/qemu_domain_namespace.h | 3 ++- src/qemu/qemu_process.c | 2 +- 3 files changed, 14 insertions(+), 14 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

On Wed, Jul 22, 2020 at 11:40:10AM +0200, Michal Privoznik wrote:
As mentioned in previous commit, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with basic /dev nodes (e.g. /dev/null, /dev/kvm, etc.) into daemon's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 23 +++++++++++------------ src/qemu/qemu_domain_namespace.h | 3 ++- src/qemu/qemu_process.c | 2 +- 3 files changed, 14 insertions(+), 14 deletions(-)
I don't understand why, but this commit has broken QEMU startup on hosts without KVM. It now always dies with error : qemuNamespaceMknodItemInit:1341 : Unable to access /dev/kvm: No such file or directory This was git bisect identified, but since theres no mention of kvm in this patch, I'm going to assume the actual bug is hiding dormant in a previous patch until this patch activates the bug.
diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 38abed56c8..17c804dfca 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -435,8 +435,7 @@ qemuDomainCreateDevice(const char *device,
static int qemuDomainPopulateDevices(virQEMUDriverConfigPtr cfg, - virDomainObjPtr vm G_GNUC_UNUSED, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { const char *const *devices = (const char *const *) cfg->cgroupDeviceACL; size_t i; @@ -445,7 +444,7 @@ qemuDomainPopulateDevices(virQEMUDriverConfigPtr cfg, devices = defaultDeviceACL;
for (i = 0; devices[i]; i++) { - if (qemuDomainCreateDevice(devices[i], data, true) < 0) + if (virStringListAdd(paths, devices[i]) < 0) return -1; }
@@ -454,10 +453,9 @@ qemuDomainPopulateDevices(virQEMUDriverConfigPtr cfg,
static int -qemuDomainSetupDev(virQEMUDriverConfigPtr cfg, - virSecurityManagerPtr mgr, +qemuDomainSetupDev(virSecurityManagerPtr mgr, virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) + const char *path) { g_autofree char *mount_options = NULL; g_autofree char *opts = NULL; @@ -475,10 +473,7 @@ qemuDomainSetupDev(virQEMUDriverConfigPtr cfg, */ opts = g_strdup_printf("mode=755,size=65536%s", mount_options);
- if (virFileSetupDev(data->path, opts) < 0) - return -1; - - if (qemuDomainPopulateDevices(cfg, vm, data) < 0) + if (virFileSetupDev(path, opts) < 0) return -1;
return 0; @@ -862,10 +857,14 @@ qemuDomainNamespaceMknodPaths(virDomainObjPtr vm,
int -qemuDomainBuildNamespace(virDomainObjPtr vm) +qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm) { VIR_AUTOSTRINGLIST paths = NULL;
+ if (qemuDomainPopulateDevices(cfg, &paths) < 0) + return -1; + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1;
@@ -914,7 +913,7 @@ qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, if (virProcessSetupPrivateMountNS() < 0) goto cleanup;
- if (qemuDomainSetupDev(cfg, mgr, vm, &data) < 0) + if (qemuDomainSetupDev(mgr, vm, devPath) < 0) goto cleanup;
if (qemuDomainSetupAllDisks(vm, &data) < 0) diff --git a/src/qemu/qemu_domain_namespace.h b/src/qemu/qemu_domain_namespace.h index 70eebf4dc4..644f2adef3 100644 --- a/src/qemu/qemu_domain_namespace.h +++ b/src/qemu/qemu_domain_namespace.h @@ -41,7 +41,8 @@ int qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, virSecurityManagerPtr mgr, virDomainObjPtr vm);
-int qemuDomainBuildNamespace(virDomainObjPtr vm); +int qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, + virDomainObjPtr vm);
void qemuDomainDestroyNamespace(virQEMUDriverPtr driver, virDomainObjPtr vm); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index bee0fd031b..e2f32dc25a 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -6832,7 +6832,7 @@ qemuProcessLaunch(virConnectPtr conn, }
VIR_DEBUG("Building domain mount namespace (if required)"); - if (qemuDomainBuildNamespace(vm) < 0) + if (qemuDomainBuildNamespace(cfg, vm) < 0) goto cleanup;
VIR_DEBUG("Setting up domain cgroup (if required)"); -- 2.26.2
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 9/3/20 2:09 PM, Daniel P. Berrangé wrote:
On Wed, Jul 22, 2020 at 11:40:10AM +0200, Michal Privoznik wrote:
As mentioned in previous commit, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with basic /dev nodes (e.g. /dev/null, /dev/kvm, etc.) into daemon's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 23 +++++++++++------------ src/qemu/qemu_domain_namespace.h | 3 ++- src/qemu/qemu_process.c | 2 +- 3 files changed, 14 insertions(+), 14 deletions(-)
I don't understand why, but this commit has broken QEMU startup on hosts without KVM. It now always dies with
error : qemuNamespaceMknodItemInit:1341 : Unable to access /dev/kvm: No such file or directory
This was git bisect identified, but since theres no mention of kvm in this patch, I'm going to assume the actual bug is hiding dormant in a previous patch until this patch activates the bug.
Let me try to reproduce and write a fix. I assume unloading KVM module is enough, isn't it? Michal

On Thu, Sep 03, 2020 at 04:40:52PM +0200, Michal Prívozník wrote:
On 9/3/20 2:09 PM, Daniel P. Berrangé wrote:
On Wed, Jul 22, 2020 at 11:40:10AM +0200, Michal Privoznik wrote:
As mentioned in previous commit, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with basic /dev nodes (e.g. /dev/null, /dev/kvm, etc.) into daemon's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 23 +++++++++++------------ src/qemu/qemu_domain_namespace.h | 3 ++- src/qemu/qemu_process.c | 2 +- 3 files changed, 14 insertions(+), 14 deletions(-)
I don't understand why, but this commit has broken QEMU startup on hosts without KVM. It now always dies with
error : qemuNamespaceMknodItemInit:1341 : Unable to access /dev/kvm: No such file or directory
This was git bisect identified, but since theres no mention of kvm in this patch, I'm going to assume the actual bug is hiding dormant in a previous patch until this patch activates the bug.
Let me try to reproduce and write a fix. I assume unloading KVM module is enough, isn't it?
Yep, unloading, or even just rm /dev/kvm is enough Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 9/3/20 4:42 PM, Daniel P. Berrangé wrote:
On Thu, Sep 03, 2020 at 04:40:52PM +0200, Michal Prívozník wrote:
On 9/3/20 2:09 PM, Daniel P. Berrangé wrote:
On Wed, Jul 22, 2020 at 11:40:10AM +0200, Michal Privoznik wrote:
As mentioned in previous commit, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with basic /dev nodes (e.g. /dev/null, /dev/kvm, etc.) into daemon's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 23 +++++++++++------------ src/qemu/qemu_domain_namespace.h | 3 ++- src/qemu/qemu_process.c | 2 +- 3 files changed, 14 insertions(+), 14 deletions(-)
I don't understand why, but this commit has broken QEMU startup on hosts without KVM. It now always dies with
error : qemuNamespaceMknodItemInit:1341 : Unable to access /dev/kvm: No such file or directory
This was git bisect identified, but since theres no mention of kvm in this patch, I'm going to assume the actual bug is hiding dormant in a previous patch until this patch activates the bug.
Let me try to reproduce and write a fix. I assume unloading KVM module is enough, isn't it?
Yep, unloading, or even just rm /dev/kvm is enough
So I think I know what the problem is. When domain's /dev is being built, it's firstly populated with cfg->cgroupDeviceACL (which contains /dev/kvm by default). Previously, the code was ENOENT tolerant, now it is not. Let me post a patch for that. Michal

As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain disks into daemon's namespace. Fixes: a30078cb832646177defd256e77c632905f1e6d0 Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1858260 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 89 ++++++++------------------------ 1 file changed, 22 insertions(+), 67 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 17c804dfca..61f7846009 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -482,33 +482,29 @@ qemuDomainSetupDev(virSecurityManagerPtr mgr, static int qemuDomainSetupDisk(virStorageSourcePtr src, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { virStorageSourcePtr next; bool hasNVMe = false; for (next = src; virStorageSourceIsBacking(next); next = next->backingStore) { - VIR_AUTOSTRINGLIST targetPaths = NULL; - size_t i; + g_autofree char *tmpPath = NULL; if (next->type == VIR_STORAGE_TYPE_NVME) { - g_autofree char *nvmePath = NULL; - hasNVMe = true; - if (!(nvmePath = virPCIDeviceAddressGetIOMMUGroupDev(&next->nvme->pciAddr))) - return -1; - - if (qemuDomainCreateDevice(nvmePath, data, false) < 0) + if (!(tmpPath = virPCIDeviceAddressGetIOMMUGroupDev(&next->nvme->pciAddr))) return -1; } else { - if (!next->path || !virStorageSourceIsLocalStorage(next)) { + VIR_AUTOSTRINGLIST targetPaths = NULL; + + if (virStorageSourceIsEmpty(next) || + !virStorageSourceIsLocalStorage(next)) { /* Not creating device. Just continue. */ continue; } - if (qemuDomainCreateDevice(next->path, data, false) < 0) - return -1; + tmpPath = g_strdup(next->path); if (virDevMapperGetTargets(next->path, &targetPaths) < 0 && errno != ENOSYS) { @@ -518,20 +514,21 @@ qemuDomainSetupDisk(virStorageSourcePtr src, return -1; } - for (i = 0; targetPaths && targetPaths[i]; i++) { - if (qemuDomainCreateDevice(targetPaths[i], data, false) < 0) - return -1; - } + if (virStringListMerge(paths, &targetPaths) < 0) + return -1; } + + if (virStringListAdd(paths, tmpPath) < 0) + return -1; } /* qemu-pr-helper might require access to /dev/mapper/control. */ if (src->pr && - qemuDomainCreateDevice(QEMU_DEVICE_MAPPER_CONTROL_PATH, data, true) < 0) + virStringListAdd(paths, QEMU_DEVICE_MAPPER_CONTROL_PATH) < 0) return -1; if (hasNVMe && - qemuDomainCreateDevice(QEMU_DEV_VFIO, data, false) < 0) + virStringListAdd(paths, QEMU_DEV_VFIO) < 0) return -1; return 0; @@ -540,14 +537,15 @@ qemuDomainSetupDisk(virStorageSourcePtr src, static int qemuDomainSetupAllDisks(virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { size_t i; + VIR_DEBUG("Setting up disks"); for (i = 0; i < vm->def->ndisks; i++) { if (qemuDomainSetupDisk(vm->def->disks[i]->src, - data) < 0) + paths) < 0) return -1; } @@ -865,6 +863,9 @@ qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainPopulateDevices(cfg, &paths) < 0) return -1; + if (qemuDomainSetupAllDisks(vm, &paths) < 0) + return -1; + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; @@ -916,9 +917,6 @@ qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupDev(mgr, vm, devPath) < 0) goto cleanup; - if (qemuDomainSetupAllDisks(vm, &data) < 0) - goto cleanup; - if (qemuDomainSetupAllHostdevs(vm, &data) < 0) goto cleanup; @@ -1639,55 +1637,12 @@ int qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, virStorageSourcePtr src) { - virStorageSourcePtr next; VIR_AUTOSTRINGLIST paths = NULL; - bool hasNVMe = false; if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) return 0; - for (next = src; virStorageSourceIsBacking(next); next = next->backingStore) { - g_autofree char *tmpPath = NULL; - - if (next->type == VIR_STORAGE_TYPE_NVME) { - hasNVMe = true; - - if (!(tmpPath = virPCIDeviceAddressGetIOMMUGroupDev(&next->nvme->pciAddr))) - return -1; - } else { - VIR_AUTOSTRINGLIST targetPaths = NULL; - - if (virStorageSourceIsEmpty(next) || - !virStorageSourceIsLocalStorage(next)) { - /* Not creating device. Just continue. */ - continue; - } - - tmpPath = g_strdup(next->path); - - if (virDevMapperGetTargets(next->path, &targetPaths) < 0 && - errno != ENOSYS) { - virReportSystemError(errno, - _("Unable to get devmapper targets for %s"), - next->path); - return -1; - } - - if (virStringListMerge(&paths, &targetPaths) < 0) - return -1; - } - - if (virStringListAdd(&paths, tmpPath) < 0) - return -1; - } - - /* qemu-pr-helper might require access to /dev/mapper/control. */ - if (src->pr && - virStringListAdd(&paths, QEMU_DEVICE_MAPPER_CONTROL_PATH) < 0) - return -1; - - if (hasNVMe && - virStringListAdd(&paths, QEMU_DEV_VFIO) < 0) + if (qemuDomainSetupDisk(src, &paths) < 0) return -1; if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain disks into daemon's namespace.
Fixes: a30078cb832646177defd256e77c632905f1e6d0 Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1858260
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 89 ++++++++------------------------ 1 file changed, 22 insertions(+), 67 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain hostdevs into daemon's namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 43 ++++++++++++++++---------------- 1 file changed, 21 insertions(+), 22 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 61f7846009..2517832a8d 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -555,19 +555,22 @@ qemuDomainSetupAllDisks(virDomainObjPtr vm, static int -qemuDomainSetupHostdev(virDomainHostdevDefPtr dev, - const struct qemuDomainCreateDeviceData *data) +qemuDomainSetupHostdev(virDomainObjPtr vm, + virDomainHostdevDefPtr hostdev, + bool hotplug, + char ***paths) { g_autofree char *path = NULL; - if (qemuDomainGetHostdevPath(dev, &path, NULL) < 0) + if (qemuDomainGetHostdevPath(hostdev, &path, NULL) < 0) return -1; - if (path && qemuDomainCreateDevice(path, data, false) < 0) + if (path && virStringListAdd(paths, path) < 0) return -1; - if (qemuHostdevNeedsVFIO(dev) && - qemuDomainCreateDevice(QEMU_DEV_VFIO, data, false) < 0) + if (qemuHostdevNeedsVFIO(hostdev) && + (!hotplug || !qemuDomainNeedsVFIO(vm->def)) && + virStringListAdd(paths, QEMU_DEV_VFIO) < 0) return -1; return 0; @@ -576,14 +579,16 @@ qemuDomainSetupHostdev(virDomainHostdevDefPtr dev, static int qemuDomainSetupAllHostdevs(virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { size_t i; VIR_DEBUG("Setting up hostdevs"); for (i = 0; i < vm->def->nhostdevs; i++) { - if (qemuDomainSetupHostdev(vm->def->hostdevs[i], - data) < 0) + if (qemuDomainSetupHostdev(vm, + vm->def->hostdevs[i], + false, + paths) < 0) return -1; } VIR_DEBUG("Setup all hostdevs"); @@ -866,6 +871,9 @@ qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupAllDisks(vm, &paths) < 0) return -1; + if (qemuDomainSetupAllHostdevs(vm, &paths) < 0) + return -1; + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; @@ -917,9 +925,6 @@ qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupDev(mgr, vm, devPath) < 0) goto cleanup; - if (qemuDomainSetupAllHostdevs(vm, &data) < 0) - goto cleanup; - if (qemuDomainSetupAllMemories(vm, &data) < 0) goto cleanup; @@ -1681,21 +1686,15 @@ int qemuDomainNamespaceSetupHostdev(virDomainObjPtr vm, virDomainHostdevDefPtr hostdev) { - g_autofree char *path = NULL; VIR_AUTOSTRINGLIST paths = NULL; if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) return 0; - if (qemuDomainGetHostdevPath(hostdev, &path, NULL) < 0) - return -1; - - if (path && virStringListAdd(&paths, path) < 0) - return -1; - - if (qemuHostdevNeedsVFIO(hostdev) && - !qemuDomainNeedsVFIO(vm->def) && - virStringListAdd(&paths, QEMU_DEV_VFIO) < 0) + if (qemuDomainSetupHostdev(vm, + hostdev, + true, + &paths) < 0) return -1; if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain hostdevs into daemon's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 43 ++++++++++++++++---------------- 1 file changed, 21 insertions(+), 22 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain memory (nvdimms) into daemon's namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 2517832a8d..bafb08fac8 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -598,25 +598,25 @@ qemuDomainSetupAllHostdevs(virDomainObjPtr vm, static int qemuDomainSetupMemory(virDomainMemoryDefPtr mem, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { if (mem->model != VIR_DOMAIN_MEMORY_MODEL_NVDIMM) return 0; - return qemuDomainCreateDevice(mem->nvdimmPath, data, false); + return virStringListAdd(paths, mem->nvdimmPath); } static int qemuDomainSetupAllMemories(virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { size_t i; VIR_DEBUG("Setting up memories"); for (i = 0; i < vm->def->nmems; i++) { if (qemuDomainSetupMemory(vm->def->mems[i], - data) < 0) + paths) < 0) return -1; } VIR_DEBUG("Setup all memories"); @@ -874,6 +874,9 @@ qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupAllHostdevs(vm, &paths) < 0) return -1; + if (qemuDomainSetupAllMemories(vm, &paths) < 0) + return -1; + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; @@ -925,9 +928,6 @@ qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupDev(mgr, vm, devPath) < 0) goto cleanup; - if (qemuDomainSetupAllMemories(vm, &data) < 0) - goto cleanup; - if (qemuDomainSetupAllChardevs(vm, &data) < 0) goto cleanup; @@ -1743,13 +1743,15 @@ int qemuDomainNamespaceSetupMemory(virDomainObjPtr vm, virDomainMemoryDefPtr mem) { + VIR_AUTOSTRINGLIST paths = NULL; + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) return 0; - if (mem->model != VIR_DOMAIN_MEMORY_MODEL_NVDIMM) - return 0; + if (qemuDomainSetupMemory(mem, &paths) < 0) + return -1; - if (qemuDomainNamespaceMknodPath(vm, mem->nvdimmPath) < 0) + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; return 0; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain memory (nvdimms) into daemon's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain chardevs into daemon's namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 27 +++++++++++---------------- 1 file changed, 11 insertions(+), 16 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index bafb08fac8..36d22b42f2 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -629,7 +629,7 @@ qemuDomainSetupChardev(virDomainDefPtr def G_GNUC_UNUSED, virDomainChrDefPtr dev, void *opaque) { - const struct qemuDomainCreateDeviceData *data = opaque; + char ***paths = opaque; const char *path = NULL; if (!(path = virDomainChrSourceDefGetPath(dev->source))) @@ -640,20 +640,20 @@ qemuDomainSetupChardev(virDomainDefPtr def G_GNUC_UNUSED, dev->source->data.nix.listen) return 0; - return qemuDomainCreateDevice(path, data, true); + return virStringListAdd(paths, path); } static int qemuDomainSetupAllChardevs(virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { VIR_DEBUG("Setting up chardevs"); if (virDomainChrDefForeach(vm->def, true, qemuDomainSetupChardev, - (void *)data) < 0) + paths) < 0) return -1; VIR_DEBUG("Setup all chardevs"); @@ -877,6 +877,9 @@ qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupAllMemories(vm, &paths) < 0) return -1; + if (qemuDomainSetupAllChardevs(vm, &paths) < 0) + return -1; + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; @@ -928,9 +931,6 @@ qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupDev(mgr, vm, devPath) < 0) goto cleanup; - if (qemuDomainSetupAllChardevs(vm, &data) < 0) - goto cleanup; - if (qemuDomainSetupAllTPMs(vm, &data) < 0) goto cleanup; @@ -1779,20 +1779,15 @@ int qemuDomainNamespaceSetupChardev(virDomainObjPtr vm, virDomainChrDefPtr chr) { - const char *path; + VIR_AUTOSTRINGLIST paths = NULL; if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) return 0; - if (!(path = virDomainChrSourceDefGetPath(chr->source))) - return 0; + if (qemuDomainSetupChardev(vm->def, chr, &paths) < 0) + return -1; - /* Socket created by qemu. It doesn't exist upfront. */ - if (chr->source->type == VIR_DOMAIN_CHR_TYPE_UNIX && - chr->source->data.nix.listen) - return 0; - - if (qemuDomainNamespaceMknodPath(vm, path) < 0) + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; return 0; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain chardevs into daemon's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 27 +++++++++++---------------- 1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index bafb08fac8..36d22b42f2 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -629,7 +629,7 @@ qemuDomainSetupChardev(virDomainDefPtr def G_GNUC_UNUSED, virDomainChrDefPtr dev, void *opaque) { - const struct qemuDomainCreateDeviceData *data = opaque; + char ***paths = opaque; const char *path = NULL;
if (!(path = virDomainChrSourceDefGetPath(dev->source))) @@ -640,20 +640,20 @@ qemuDomainSetupChardev(virDomainDefPtr def G_GNUC_UNUSED, dev->source->data.nix.listen) return 0;
- return qemuDomainCreateDevice(path, data, true); + return virStringListAdd(paths, path); }
static int qemuDomainSetupAllChardevs(virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { VIR_DEBUG("Setting up chardevs");
if (virDomainChrDefForeach(vm->def, true, qemuDomainSetupChardev, - (void *)data) < 0) + paths) < 0) return -1;
VIR_DEBUG("Setup all chardevs"); @@ -877,6 +877,9 @@ qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupAllMemories(vm, &paths) < 0) return -1;
+ if (qemuDomainSetupAllChardevs(vm, &paths) < 0) + return -1; + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1;
@@ -928,9 +931,6 @@ qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupDev(mgr, vm, devPath) < 0) goto cleanup;
- if (qemuDomainSetupAllChardevs(vm, &data) < 0) - goto cleanup; - if (qemuDomainSetupAllTPMs(vm, &data) < 0) goto cleanup;
@@ -1779,20 +1779,15 @@ int qemuDomainNamespaceSetupChardev(virDomainObjPtr vm, virDomainChrDefPtr chr) { - const char *path; + VIR_AUTOSTRINGLIST paths = NULL;
if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) return 0;
- if (!(path = virDomainChrSourceDefGetPath(chr->source))) - return 0; + if (qemuDomainSetupChardev(vm->def, chr, &paths) < 0) + return -1;
- /* Socket created by qemu. It doesn't exist upfront. */ - if (chr->source->type == VIR_DOMAIN_CHR_TYPE_UNIX && - chr->source->data.nix.listen) - return 0; -
Hmm, this is not necessarily true. qemuBuildChrChardevStr opens listen type sockets if QEMU supports FD passing for them.
- if (qemuDomainNamespaceMknodPath(vm, path) < 0) + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1;
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain TPM into daemon's namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 36d22b42f2..138dc63489 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -663,12 +663,11 @@ qemuDomainSetupAllChardevs(virDomainObjPtr vm, static int qemuDomainSetupTPM(virDomainTPMDefPtr dev, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { switch (dev->type) { case VIR_DOMAIN_TPM_TYPE_PASSTHROUGH: - if (qemuDomainCreateDevice(dev->data.passthrough.source.data.file.path, - data, false) < 0) + if (virStringListAdd(paths, dev->data.passthrough.source.data.file.path) < 0) return -1; break; @@ -684,14 +683,14 @@ qemuDomainSetupTPM(virDomainTPMDefPtr dev, static int qemuDomainSetupAllTPMs(virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { size_t i; VIR_DEBUG("Setting up TPMs"); for (i = 0; i < vm->def->ntpms; i++) { - if (qemuDomainSetupTPM(vm->def->tpms[i], data) < 0) + if (qemuDomainSetupTPM(vm->def->tpms[i], paths) < 0) return -1; } @@ -880,6 +879,9 @@ qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupAllChardevs(vm, &paths) < 0) return -1; + if (qemuDomainSetupAllTPMs(vm, &paths) < 0) + return -1; + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; @@ -931,9 +933,6 @@ qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupDev(mgr, vm, devPath) < 0) goto cleanup; - if (qemuDomainSetupAllTPMs(vm, &data) < 0) - goto cleanup; - if (qemuDomainSetupAllGraphics(vm, &data) < 0) goto cleanup; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain TPM into daemon's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain graphics (render node) into daemon's namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 138dc63489..8a77c067c8 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -701,27 +701,27 @@ qemuDomainSetupAllTPMs(virDomainObjPtr vm, static int qemuDomainSetupGraphics(virDomainGraphicsDefPtr gfx, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { const char *rendernode = virDomainGraphicsGetRenderNode(gfx); if (!rendernode) return 0; - return qemuDomainCreateDevice(rendernode, data, false); + return virStringListAdd(paths, rendernode); } static int qemuDomainSetupAllGraphics(virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { size_t i; VIR_DEBUG("Setting up graphics"); for (i = 0; i < vm->def->ngraphics; i++) { if (qemuDomainSetupGraphics(vm->def->graphics[i], - data) < 0) + paths) < 0) return -1; } @@ -882,6 +882,9 @@ qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupAllTPMs(vm, &paths) < 0) return -1; + if (qemuDomainSetupAllGraphics(vm, &paths) < 0) + return -1; + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; @@ -933,9 +936,6 @@ qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupDev(mgr, vm, devPath) < 0) goto cleanup; - if (qemuDomainSetupAllGraphics(vm, &data) < 0) - goto cleanup; - if (qemuDomainSetupAllInputs(vm, &data) < 0) goto cleanup; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain graphics (render node) into daemon's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain inputs into daemon's namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 8a77c067c8..f709fbb616 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -732,11 +732,11 @@ qemuDomainSetupAllGraphics(virDomainObjPtr vm, static int qemuDomainSetupInput(virDomainInputDefPtr input, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { const char *path = virDomainInputDefGetPath(input); - if (path && qemuDomainCreateDevice(path, data, false) < 0) + if (path && virStringListAdd(paths, path) < 0) return -1; return 0; @@ -745,14 +745,14 @@ qemuDomainSetupInput(virDomainInputDefPtr input, static int qemuDomainSetupAllInputs(virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { size_t i; VIR_DEBUG("Setting up inputs"); for (i = 0; i < vm->def->ninputs; i++) { if (qemuDomainSetupInput(vm->def->inputs[i], - data) < 0) + paths) < 0) return -1; } VIR_DEBUG("Setup all inputs"); @@ -885,6 +885,9 @@ qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupAllGraphics(vm, &paths) < 0) return -1; + if (qemuDomainSetupAllInputs(vm, &paths) < 0) + return -1; + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; @@ -936,9 +939,6 @@ qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupDev(mgr, vm, devPath) < 0) goto cleanup; - if (qemuDomainSetupAllInputs(vm, &data) < 0) - goto cleanup; - if (qemuDomainSetupAllRNGs(vm, &data) < 0) goto cleanup; @@ -1872,15 +1872,15 @@ int qemuDomainNamespaceSetupInput(virDomainObjPtr vm, virDomainInputDefPtr input) { - const char *path = NULL; + VIR_AUTOSTRINGLIST paths = NULL; if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) return 0; - if (!(path = virDomainInputDefGetPath(input))) - return 0; + if (qemuDomainSetupInput(input, &paths) < 0) + return -1; - if (path && qemuDomainNamespaceMknodPath(vm, path) < 0) + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; return 0; } -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain inputs into daemon's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain RNGs into daemon's namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 40 +++++++++----------------------- 1 file changed, 11 insertions(+), 29 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index f709fbb616..2ab10cb9f0 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -762,11 +762,11 @@ qemuDomainSetupAllInputs(virDomainObjPtr vm, static int qemuDomainSetupRNG(virDomainRNGDefPtr rng, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { switch ((virDomainRNGBackend) rng->backend) { case VIR_DOMAIN_RNG_BACKEND_RANDOM: - if (qemuDomainCreateDevice(rng->source.file, data, false) < 0) + if (virStringListAdd(paths, rng->source.file) < 0) return -1; break; @@ -783,14 +783,14 @@ qemuDomainSetupRNG(virDomainRNGDefPtr rng, static int qemuDomainSetupAllRNGs(virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { size_t i; VIR_DEBUG("Setting up RNGs"); for (i = 0; i < vm->def->nrngs; i++) { if (qemuDomainSetupRNG(vm->def->rngs[i], - data) < 0) + paths) < 0) return -1; } @@ -888,6 +888,9 @@ qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupAllInputs(vm, &paths) < 0) return -1; + if (qemuDomainSetupAllRNGs(vm, &paths) < 0) + return -1; + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; @@ -939,9 +942,6 @@ qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupDev(mgr, vm, devPath) < 0) goto cleanup; - if (qemuDomainSetupAllRNGs(vm, &data) < 0) - goto cleanup; - if (qemuDomainSetupLoader(vm, &data) < 0) goto cleanup; @@ -1581,16 +1581,6 @@ qemuDomainDetachDeviceUnlink(virQEMUDriverPtr driver G_GNUC_UNUSED, } -static int -qemuDomainNamespaceMknodPath(virDomainObjPtr vm, - const char *path) -{ - const char *paths[] = { path, NULL }; - - return qemuDomainNamespaceMknodPaths(vm, paths); -} - - static int qemuDomainNamespaceUnlinkPaths(virDomainObjPtr vm, const char **paths, @@ -1818,23 +1808,15 @@ int qemuDomainNamespaceSetupRNG(virDomainObjPtr vm, virDomainRNGDefPtr rng) { - const char *path = NULL; + VIR_AUTOSTRINGLIST paths = NULL; if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) return 0; - switch ((virDomainRNGBackend) rng->backend) { - case VIR_DOMAIN_RNG_BACKEND_RANDOM: - path = rng->source.file; - break; + if (qemuDomainSetupRNG(rng, &paths) < 0) + return -1; - case VIR_DOMAIN_RNG_BACKEND_EGD: - case VIR_DOMAIN_RNG_BACKEND_BUILTIN: - case VIR_DOMAIN_RNG_BACKEND_LAST: - break; - } - - if (path && qemuDomainNamespaceMknodPath(vm, path) < 0) + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; return 0; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain RNGs into daemon's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 40 +++++++++----------------------- 1 file changed, 11 insertions(+), 29 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain loader into daemon's namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 2ab10cb9f0..66c6cedadf 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -801,7 +801,7 @@ qemuDomainSetupAllRNGs(virDomainObjPtr vm, static int qemuDomainSetupLoader(virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { virDomainLoaderDefPtr loader = vm->def->os.loader; @@ -810,16 +810,16 @@ qemuDomainSetupLoader(virDomainObjPtr vm, if (loader) { switch ((virDomainLoader) loader->type) { case VIR_DOMAIN_LOADER_TYPE_ROM: - if (qemuDomainCreateDevice(loader->path, data, false) < 0) + if (virStringListAdd(paths, loader->path) < 0) return -1; break; case VIR_DOMAIN_LOADER_TYPE_PFLASH: - if (qemuDomainCreateDevice(loader->path, data, false) < 0) + if (virStringListAdd(paths, loader->path) < 0) return -1; if (loader->nvram && - qemuDomainCreateDevice(loader->nvram, data, false) < 0) + virStringListAdd(paths, loader->nvram) < 0) return -1; break; @@ -891,6 +891,9 @@ qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupAllRNGs(vm, &paths) < 0) return -1; + if (qemuDomainSetupLoader(vm, &paths) < 0) + return -1; + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; @@ -942,9 +945,6 @@ qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupDev(mgr, vm, devPath) < 0) goto cleanup; - if (qemuDomainSetupLoader(vm, &data) < 0) - goto cleanup; - if (qemuDomainSetupLaunchSecurity(vm, &data) < 0) goto cleanup; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain loader into daemon's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain SEV into daemon's namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 17 ++++++----------- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 66c6cedadf..e569b1dbe1 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -421,7 +421,7 @@ qemuDomainCreateDeviceRecursive(const char *device, } -static int +static int G_GNUC_UNUSED qemuDomainCreateDevice(const char *device, const struct qemuDomainCreateDeviceData *data, bool allow_noent) @@ -836,7 +836,7 @@ qemuDomainSetupLoader(virDomainObjPtr vm, static int qemuDomainSetupLaunchSecurity(virDomainObjPtr vm, - const struct qemuDomainCreateDeviceData *data) + char ***paths) { virDomainSEVDefPtr sev = vm->def->sev; @@ -845,7 +845,7 @@ qemuDomainSetupLaunchSecurity(virDomainObjPtr vm, VIR_DEBUG("Setting up launch security"); - if (qemuDomainCreateDevice(QEMU_DEV_SEV, data, false) < 0) + if (virStringListAdd(paths, QEMU_DEV_SEV) < 0) return -1; VIR_DEBUG("Set up launch security"); @@ -894,6 +894,9 @@ qemuDomainBuildNamespace(virQEMUDriverConfigPtr cfg, if (qemuDomainSetupLoader(vm, &paths) < 0) return -1; + if (qemuDomainSetupLaunchSecurity(vm, &paths) < 0) + return -1; + if (qemuDomainNamespaceMknodPaths(vm, (const char **) paths) < 0) return -1; @@ -906,7 +909,6 @@ qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, virSecurityManagerPtr mgr, virDomainObjPtr vm) { - struct qemuDomainCreateDeviceData data; const char *devPath = NULL; char **devMountsPath = NULL, **devMountsSavePath = NULL; size_t ndevMountsPath = 0, i; @@ -935,19 +937,12 @@ qemuDomainUnshareNamespace(virQEMUDriverConfigPtr cfg, goto cleanup; } - data.path = devPath; - data.devMountsPath = devMountsPath; - data.ndevMountsPath = ndevMountsPath; - if (virProcessSetupPrivateMountNS() < 0) goto cleanup; if (qemuDomainSetupDev(mgr, vm, devPath) < 0) goto cleanup; - if (qemuDomainSetupLaunchSecurity(vm, &data) < 0) - goto cleanup; - /* Save some mount points because we want to share them with the host */ for (i = 0; i < ndevMountsPath; i++) { struct stat sb; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
As mentioned in one of previous commits, populating domain's namespace from pre-exec() hook is dangerous. This commit moves population of the namespace with domain SEV into daemon's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 17 ++++++----------- 1 file changed, 6 insertions(+), 11 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

After previous cleanup, creating /dev nodes from pre-exec hook is no longer needed and thus can be removed. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 248 ------------------------------- 1 file changed, 248 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index e569b1dbe1..41d79559f9 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -185,254 +185,6 @@ qemuDomainGetPreservedMounts(virQEMUDriverConfigPtr cfg, } -struct qemuDomainCreateDeviceData { - const char *path; /* Path to temp new /dev location */ - char * const *devMountsPath; - size_t ndevMountsPath; -}; - - -static int -qemuDomainCreateDeviceRecursive(const char *device, - const struct qemuDomainCreateDeviceData *data, - bool allow_noent, - unsigned int ttl) -{ - g_autofree char *devicePath = NULL; - g_autofree char *target = NULL; - GStatBuf sb; - int ret = -1; - bool isLink = false; - bool isDev = false; - bool isReg = false; - bool isDir = false; - bool create = false; -#ifdef WITH_SELINUX - char *tcon = NULL; -#endif - - if (!ttl) { - virReportSystemError(ELOOP, - _("Too many levels of symbolic links: %s"), - device); - return ret; - } - - if (g_lstat(device, &sb) < 0) { - if (errno == ENOENT && allow_noent) { - /* Ignore non-existent device. */ - return 0; - } - virReportSystemError(errno, _("Unable to stat %s"), device); - return ret; - } - - isLink = S_ISLNK(sb.st_mode); - isDev = S_ISCHR(sb.st_mode) || S_ISBLK(sb.st_mode); - isReg = S_ISREG(sb.st_mode) || S_ISFIFO(sb.st_mode) || S_ISSOCK(sb.st_mode); - isDir = S_ISDIR(sb.st_mode); - - /* Here, @device might be whatever path in the system. We - * should create the path in the namespace iff it's "/dev" - * prefixed. However, if it is a symlink, we need to traverse - * it too (it might point to something in "/dev"). Just - * consider: - * - * /var/sym1 -> /var/sym2 -> /dev/sda (because users can) - * - * This means, "/var/sym1" is not created (it's shared with - * the parent namespace), nor "/var/sym2", but "/dev/sda". - * - * TODO Remove all `.' and `..' from the @device path. - * Otherwise we might get fooled with `/dev/../var/my_image'. - * For now, lets hope callers play nice. - */ - if (STRPREFIX(device, QEMU_DEVPREFIX)) { - size_t i; - - for (i = 0; i < data->ndevMountsPath; i++) { - if (STREQ(data->devMountsPath[i], "/dev")) - continue; - if (STRPREFIX(device, data->devMountsPath[i])) - break; - } - - if (i == data->ndevMountsPath) { - /* Okay, @device is in /dev but not in any mount point under /dev. - * Create it. */ - devicePath = g_strdup_printf("%s/%s", data->path, - device + strlen(QEMU_DEVPREFIX)); - - if (virFileMakeParentPath(devicePath) < 0) { - virReportSystemError(errno, - _("Unable to create %s"), - devicePath); - goto cleanup; - } - VIR_DEBUG("Creating dev %s", device); - create = true; - } else { - VIR_DEBUG("Skipping dev %s because of %s mount point", - device, data->devMountsPath[i]); - } - } - - if (isLink) { - g_autoptr(GError) gerr = NULL; - - /* We are dealing with a symlink. Create a dangling symlink and descend - * down one level which hopefully creates the symlink's target. */ - if (!(target = g_file_read_link(device, &gerr))) { - virReportError(VIR_ERR_SYSTEM_ERROR, - _("failed to resolve symlink %s: %s"), device, gerr->message); - goto cleanup; - } - - if (create && - symlink(target, devicePath) < 0) { - if (errno == EEXIST) { - ret = 0; - } else { - virReportSystemError(errno, - _("unable to create symlink %s"), - devicePath); - } - goto cleanup; - } - - /* Tricky part. If the target starts with a slash then we need to take - * it as it is. Otherwise we need to replace the last component in the - * original path with the link target: - * /dev/rtc -> rtc0 (want /dev/rtc0) - * /dev/disk/by-id/ata-SanDisk_SDSSDXPS480G_161101402485 -> ../../sda - * (want /dev/disk/by-id/../../sda) - * /dev/stdout -> /proc/self/fd/1 (no change needed) - */ - if (!g_path_is_absolute(target)) { - g_autofree char *devTmp = g_strdup(device); - char *c = NULL, *tmp = NULL; - - if ((c = strrchr(devTmp, '/'))) - *(c + 1) = '\0'; - - tmp = g_strdup_printf("%s%s", devTmp, target); - VIR_FREE(target); - target = g_steal_pointer(&tmp); - } - - if (qemuDomainCreateDeviceRecursive(target, data, - allow_noent, ttl - 1) < 0) - goto cleanup; - } else if (isDev) { - if (create) { - unlink(devicePath); - if (mknod(devicePath, sb.st_mode, sb.st_rdev) < 0) { - virReportSystemError(errno, - _("Failed to make device %s"), - devicePath); - goto cleanup; - } - } - } else if (isReg) { - if (create && - virFileTouch(devicePath, sb.st_mode) < 0) - goto cleanup; - /* Just create the file here so that code below sets - * proper owner and mode. Bind mount only after that. */ - } else if (isDir) { - if (create && - virFileMakePathWithMode(devicePath, sb.st_mode) < 0) { - virReportSystemError(errno, - _("Unable to make dir %s"), - devicePath); - goto cleanup; - } - } else { - virReportError(VIR_ERR_OPERATION_UNSUPPORTED, - _("unsupported device type %s 0%o"), - device, sb.st_mode); - goto cleanup; - } - - if (!create) { - ret = 0; - goto cleanup; - } - - if (lchown(devicePath, sb.st_uid, sb.st_gid) < 0) { - virReportSystemError(errno, - _("Failed to chown device %s"), - devicePath); - goto cleanup; - } - - /* Symlinks don't have mode */ - if (!isLink && - chmod(devicePath, sb.st_mode) < 0) { - virReportSystemError(errno, - _("Failed to set permissions for device %s"), - devicePath); - goto cleanup; - } - - /* Symlinks don't have ACLs. */ - if (!isLink && - virFileCopyACLs(device, devicePath) < 0 && - errno != ENOTSUP) { - virReportSystemError(errno, - _("Failed to copy ACLs on device %s"), - devicePath); - goto cleanup; - } - -#ifdef WITH_SELINUX - if (lgetfilecon_raw(device, &tcon) < 0 && - (errno != ENOTSUP && errno != ENODATA)) { - virReportSystemError(errno, - _("Unable to get SELinux label from %s"), - device); - goto cleanup; - } - - if (tcon && - lsetfilecon_raw(devicePath, (const char *)tcon) < 0) { - VIR_WARNINGS_NO_WLOGICALOP_EQUAL_EXPR - if (errno != EOPNOTSUPP && errno != ENOTSUP) { - VIR_WARNINGS_RESET - virReportSystemError(errno, - _("Unable to set SELinux label on %s"), - devicePath); - goto cleanup; - } - } -#endif - - /* Finish mount process started earlier. */ - if ((isReg || isDir) && - virFileBindMountDevice(device, devicePath) < 0) - goto cleanup; - - ret = 0; - cleanup: -#ifdef WITH_SELINUX - freecon(tcon); -#endif - return ret; -} - - -static int G_GNUC_UNUSED -qemuDomainCreateDevice(const char *device, - const struct qemuDomainCreateDeviceData *data, - bool allow_noent) -{ - long symloop_max = sysconf(_SC_SYMLOOP_MAX); - - return qemuDomainCreateDeviceRecursive(device, data, - allow_noent, symloop_max); -} - - static int qemuDomainPopulateDevices(virQEMUDriverConfigPtr cfg, char ***paths) -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
After previous cleanup, creating /dev nodes from pre-exec hook is no longer needed and thus can be removed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 248 ------------------------------- 1 file changed, 248 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

Simirarly to qemuDomainAttachDeviceMknodHelper() which was modified just a couple of commits ago, modify the unlink helper which is called on device detach so that it can unlink multiple files in one go instead of forking off for every single one of them. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 65 +++++++++++++++----------------- 1 file changed, 30 insertions(+), 35 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 41d79559f9..4e0b50d885 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1286,44 +1286,21 @@ static int qemuDomainDetachDeviceUnlinkHelper(pid_t pid G_GNUC_UNUSED, void *opaque) { - const char *path = opaque; - - VIR_DEBUG("Unlinking %s", path); - if (unlink(path) < 0 && errno != ENOENT) { - virReportSystemError(errno, - _("Unable to remove device %s"), path); - return -1; - } - - return 0; -} - - -static int -qemuDomainDetachDeviceUnlink(virQEMUDriverPtr driver G_GNUC_UNUSED, - virDomainObjPtr vm, - const char *file, - char * const *devMountsPath, - size_t ndevMountsPath) -{ + char **paths = opaque; size_t i; - if (STRPREFIX(file, QEMU_DEVPREFIX)) { - for (i = 0; i < ndevMountsPath; i++) { - if (STREQ(devMountsPath[i], "/dev")) - continue; - if (STRPREFIX(file, devMountsPath[i])) - break; - } + for (i = 0; paths[i]; i++) { + const char *path = paths[i]; - if (i == ndevMountsPath) { - if (virProcessRunInMountNamespace(vm->pid, - qemuDomainDetachDeviceUnlinkHelper, - (void *)file) < 0) - return -1; + VIR_DEBUG("Unlinking %s", path); + if (unlink(path) < 0 && errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove device %s"), path); + return -1; } } + virStringListFree(paths); return 0; } @@ -1336,6 +1313,7 @@ qemuDomainNamespaceUnlinkPaths(virDomainObjPtr vm, qemuDomainObjPrivatePtr priv = vm->privateData; virQEMUDriverPtr driver = priv->driver; g_autoptr(virQEMUDriverConfig) cfg = NULL; + VIR_AUTOSTRINGLIST unlinkPaths = NULL; char **devMountsPath = NULL; size_t ndevMountsPath = 0; size_t i; @@ -1352,11 +1330,28 @@ qemuDomainNamespaceUnlinkPaths(virDomainObjPtr vm, goto cleanup; for (i = 0; i < npaths; i++) { - if (qemuDomainDetachDeviceUnlink(driver, vm, paths[i], - devMountsPath, ndevMountsPath) < 0) - goto cleanup; + const char *file = paths[i]; + + if (STRPREFIX(file, QEMU_DEVPREFIX)) { + for (i = 0; i < ndevMountsPath; i++) { + if (STREQ(devMountsPath[i], "/dev")) + continue; + if (STRPREFIX(file, devMountsPath[i])) + break; + } + + if (i == ndevMountsPath && + virStringListAdd(&unlinkPaths, file) < 0) + return -1; + } } + if (unlinkPaths && + virProcessRunInMountNamespace(vm->pid, + qemuDomainDetachDeviceUnlinkHelper, + unlinkPaths) < 0) + return -1; + ret = 0; cleanup: virStringListFreeCount(devMountsPath, ndevMountsPath); -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
Simirarly to qemuDomainAttachDeviceMknodHelper() which was modified just a couple of commits ago, modify the unlink helper which is called on device detach so that it can unlink multiple files in one go instead of forking off for every single one of them.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 65 +++++++++++++++----------------- 1 file changed, 30 insertions(+), 35 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

So far, the only caller qemuDomainNamespaceUnlinkPath() will always pass a single path to unlink, but similarly to qemuDomainNamespaceMknodPaths() - there are a few callers that would like to pass two or more files to unlink at once (held in a string list). Make the @paths argument a string list then. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 4e0b50d885..135842e212 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1307,8 +1307,7 @@ qemuDomainDetachDeviceUnlinkHelper(pid_t pid G_GNUC_UNUSED, static int qemuDomainNamespaceUnlinkPaths(virDomainObjPtr vm, - const char **paths, - size_t npaths) + const char **paths) { qemuDomainObjPrivatePtr priv = vm->privateData; virQEMUDriverPtr driver = priv->driver; @@ -1316,9 +1315,11 @@ qemuDomainNamespaceUnlinkPaths(virDomainObjPtr vm, VIR_AUTOSTRINGLIST unlinkPaths = NULL; char **devMountsPath = NULL; size_t ndevMountsPath = 0; + size_t npaths; size_t i; int ret = -1; + npaths = virStringListLength(paths); if (!npaths) return 0; @@ -1363,9 +1364,9 @@ static int qemuDomainNamespaceUnlinkPath(virDomainObjPtr vm, const char *path) { - const char *paths[] = { path }; + const char *paths[] = { path, NULL }; - return qemuDomainNamespaceUnlinkPaths(vm, paths, 1); + return qemuDomainNamespaceUnlinkPaths(vm, paths); } -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
So far, the only caller qemuDomainNamespaceUnlinkPath() will always pass a single path to unlink, but similarly to qemuDomainNamespaceMknodPaths() - there are a few callers that would like to pass two or more files to unlink at once (held in a string list). Make the @paths argument a string list then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

In my attempt to deduplicate the code, we can use qemuDomainSetupHostdev() to obtain the list of paths to unlink and then pass it to qemuDomainNamespaceUnlinkPaths() to unlink them in a single fork. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 135842e212..8251554e73 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1451,20 +1451,18 @@ int qemuDomainNamespaceTeardownHostdev(virDomainObjPtr vm, virDomainHostdevDefPtr hostdev) { - g_autofree char *path = NULL; + VIR_AUTOSTRINGLIST paths = NULL; if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) return 0; - if (qemuDomainGetHostdevPath(hostdev, &path, NULL) < 0) + if (qemuDomainSetupHostdev(vm, + hostdev, + true, + &paths) < 0) return -1; - if (path && qemuDomainNamespaceUnlinkPath(vm, path) < 0) - return -1; - - if (qemuHostdevNeedsVFIO(hostdev) && - !qemuDomainNeedsVFIO(vm->def) && - qemuDomainNamespaceUnlinkPath(vm, QEMU_DEV_VFIO) < 0) + if (qemuDomainNamespaceUnlinkPaths(vm, (const char **) paths) < 0) return -1; return 0; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
In my attempt to deduplicate the code, we can use qemuDomainSetupHostdev() to obtain the list of paths to unlink and then pass it to qemuDomainNamespaceUnlinkPaths() to unlink them in a single fork.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 135842e212..8251554e73 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1451,20 +1451,18 @@ int qemuDomainNamespaceTeardownHostdev(virDomainObjPtr vm, virDomainHostdevDefPtr hostdev) { - g_autofree char *path = NULL; + VIR_AUTOSTRINGLIST paths = NULL;
if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) return 0;
- if (qemuDomainGetHostdevPath(hostdev, &path, NULL) < 0) + if (qemuDomainSetupHostdev(vm,
Yeah, SetupHostdev is definitely a misleading name. GetHostdevPaths? PrepareHostdevPaths?
+ hostdev, + true, + &paths) < 0) return -1;
- if (path && qemuDomainNamespaceUnlinkPath(vm, path) < 0) - return -1; - - if (qemuHostdevNeedsVFIO(hostdev) && - !qemuDomainNeedsVFIO(vm->def) && - qemuDomainNamespaceUnlinkPath(vm, QEMU_DEV_VFIO) < 0) + if (qemuDomainNamespaceUnlinkPaths(vm, (const char **) paths) < 0) return -1;
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

We can use qemuDomainSetupMemory() to obtain the path that we need to unlink() from within domain's namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 8251554e73..aaf45859d3 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1492,13 +1492,15 @@ int qemuDomainNamespaceTeardownMemory(virDomainObjPtr vm, virDomainMemoryDefPtr mem) { + VIR_AUTOSTRINGLIST paths = NULL; + if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) return 0; - if (mem->model != VIR_DOMAIN_MEMORY_MODEL_NVDIMM) - return 0; + if (qemuDomainSetupMemory(mem, &paths) < 0) + return -1; - if (qemuDomainNamespaceUnlinkPath(vm, mem->nvdimmPath) < 0) + if (qemuDomainNamespaceUnlinkPaths(vm, (const char **) paths) < 0) return -1; return 0; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
We can use qemuDomainSetupMemory() to obtain the path that we need to unlink() from within domain's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

We can use qemuDomainSetupChardev() to obtain the path that we need to unlink() from within domain's namespace. Note, while previously we unlinked only VIR_DOMAIN_CHR_TYPE_DEV chardevs, with this change we unlink some other types too - exactly those types we created when plugging the device in. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index aaf45859d3..7a329c0c4a 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1530,17 +1530,15 @@ int qemuDomainNamespaceTeardownChardev(virDomainObjPtr vm, virDomainChrDefPtr chr) { - const char *path = NULL; + VIR_AUTOSTRINGLIST paths = NULL; if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) return 0; - if (chr->source->type != VIR_DOMAIN_CHR_TYPE_DEV) - return 0; + if (qemuDomainSetupChardev(vm->def, chr, &paths) < 0) + return -1; - path = chr->source->data.file.path; - - if (qemuDomainNamespaceUnlinkPath(vm, path) < 0) + if (qemuDomainNamespaceUnlinkPaths(vm, (const char **) paths) < 0) return -1; return 0; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
We can use qemuDomainSetupChardev() to obtain the path that we need to unlink() from within domain's namespace. Note, while previously we unlinked only VIR_DOMAIN_CHR_TYPE_DEV chardevs, with this change we unlink some other types too - exactly those types we created when plugging the device in.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

We can use qemuDomainSetupRNG() to obtain the path that we need to unlink() from within domain's namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 16 ++++------------ 1 file changed, 4 insertions(+), 12 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 7a329c0c4a..89d73b26ef 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1568,23 +1568,15 @@ int qemuDomainNamespaceTeardownRNG(virDomainObjPtr vm, virDomainRNGDefPtr rng) { - const char *path = NULL; + VIR_AUTOSTRINGLIST paths = NULL; if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) return 0; - switch ((virDomainRNGBackend) rng->backend) { - case VIR_DOMAIN_RNG_BACKEND_RANDOM: - path = rng->source.file; - break; + if (qemuDomainSetupRNG(rng, &paths) < 0) + return -1; - case VIR_DOMAIN_RNG_BACKEND_EGD: - case VIR_DOMAIN_RNG_BACKEND_BUILTIN: - case VIR_DOMAIN_RNG_BACKEND_LAST: - break; - } - - if (path && qemuDomainNamespaceUnlinkPath(vm, path) < 0) + if (qemuDomainNamespaceUnlinkPaths(vm, (const char **) paths) < 0) return -1; return 0; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
We can use qemuDomainSetupRNG() to obtain the path that we need to unlink() from within domain's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 16 ++++------------ 1 file changed, 4 insertions(+), 12 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

We can use qemuDomainSetupInput() to obtain the path that we need to unlink() from within domain's namespace. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 18 ++++-------------- 1 file changed, 4 insertions(+), 14 deletions(-) diff --git a/src/qemu/qemu_domain_namespace.c b/src/qemu/qemu_domain_namespace.c index 89d73b26ef..51d3497670 100644 --- a/src/qemu/qemu_domain_namespace.c +++ b/src/qemu/qemu_domain_namespace.c @@ -1360,16 +1360,6 @@ qemuDomainNamespaceUnlinkPaths(virDomainObjPtr vm, } -static int -qemuDomainNamespaceUnlinkPath(virDomainObjPtr vm, - const char *path) -{ - const char *paths[] = { path, NULL }; - - return qemuDomainNamespaceUnlinkPaths(vm, paths); -} - - int qemuDomainNamespaceSetupDisk(virDomainObjPtr vm, virStorageSourcePtr src) @@ -1605,15 +1595,15 @@ int qemuDomainNamespaceTeardownInput(virDomainObjPtr vm, virDomainInputDefPtr input) { - const char *path = NULL; + VIR_AUTOSTRINGLIST paths = NULL; if (!qemuDomainNamespaceEnabled(vm, QEMU_DOMAIN_NS_MOUNT)) return 0; - if (!(path = virDomainInputDefGetPath(input))) - return 0; + if (qemuDomainSetupInput(input, &paths) < 0) + return -1; - if (path && qemuDomainNamespaceUnlinkPath(vm, path) < 0) + if (qemuDomainNamespaceUnlinkPaths(vm, (const char **) paths) < 0) return -1; return 0; -- 2.26.2

On a Wednesday in 2020, Michal Privoznik wrote:
We can use qemuDomainSetupInput() to obtain the path that we need to unlink() from within domain's namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain_namespace.c | 18 ++++-------------- 1 file changed, 4 insertions(+), 14 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano
participants (4)
-
Daniel P. Berrangé
-
Ján Tomko
-
Michal Privoznik
-
Michal Prívozník