[PATCH v2 0/1] qemu: Allow sockets in long or deep paths.

v1: https://listman.redhat.com/archives/libvir-list/2023-April/239441.html Address v1's cleanup bugs by using g_autofree and the concurrency bugs by forking a subprocess before chdir(). It works! I have been working on the other solution of putting paths in /var/run, from https://listman.redhat.com/archives/libvir-list/2023-April/239536.html, but so my attempts have ballooned out of control. I would like to offer this improved version of my original patch in the meantime. Plus there's the unclear impact of the other strategy on upgrades: https://listman.redhat.com/archives/libvir-list/2023-April/239470.html, which means even once that's implemented it will take a long time to vet. The chdir() approach cuts right to the point. I'm hoping you'll accept both approaches as complementary and worth adopting. Original issue: https://gitlab.com/libvirt/libvirt/-/issues/466 Cheers! Nick Guenther (1): qemu: Allow sockets in long or deep paths. src/qemu/qemu_command.c | 72 ++++++++++++++++++++++++++++++++--------- 1 file changed, 57 insertions(+), 15 deletions(-) -- 2.34.1

The qemu driver creates IPC sockets using absolute paths, but under POSIX socket paths are constrained pretty tightly. On systems with homedirs on an unusual mount point, like network homedirs, or just particularly long usernames, this could make starting VMs under qemu:///session impossible. Resolves https://gitlab.com/libvirt/libvirt/-/issues/466 Signed-off-by: Nick Guenther <nick.guenther@polymtl.ca> --- src/qemu/qemu_command.c | 72 ++++++++++++++++++++++++++++++++--------- 1 file changed, 57 insertions(+), 15 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 4ca93bf3dc..4bedbb515f 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -4860,12 +4860,64 @@ qemuBuildSCSIHostdevDevProps(const virDomainDef *def, return g_steal_pointer(&props); } +struct qemuBindSocketData { + int fd; + char* path; +}; + +static int +qemuBindSocket(pid_t ppid G_GNUC_UNUSED, void *opaque) +{ + /* The path length of a unix socket is limited to what fits in sockaddr_un. + * It's pretty short: 108 on Linux, and this is too easy to hit. + * + * Work around this limit by using a *relative path*, by chdir()ing first. + * But chdir() isn't thread-safe, so run it in a *subprocess* (this function) + * where the chdir() will be instantly forgotten once it has helped configure fd. + * + * background: https://stackoverflow.com/questions/34829600/why-is-the-maximal-path-length-... + */ + + g_autofree char *dir = NULL; + g_autofree char *name = NULL; + + struct sockaddr_un addr; + struct qemuBindSocketData *data = opaque; + + dir = g_path_get_dirname(data->path); + name = g_path_get_basename(data->path); + + memset(&addr, 0, sizeof(addr)); + addr.sun_family = AF_UNIX; + if (virStrcpyStatic(addr.sun_path, name) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("UNIX socket name '%1$s' too long"), + name); + return -1; + } + + if (chdir(dir) < 0) { + virReportSystemError(errno, + _("Unable to chdir to containing directory '%1$s' while binding UNIX socket '%2$s'"), + dir, name); + return -1; + } + + if (bind(data->fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) { + virReportSystemError(errno, + _("Unable to bind UNIX socket '%1$s/%2$s'"), + dir, name); + return -1; + } + + return 0; +} + int qemuOpenChrChardevUNIXSocket(const virDomainChrSourceDef *dev) { - struct sockaddr_un addr; - socklen_t addrlen = sizeof(addr); int fd; + struct qemuBindSocketData bindData; if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) { virReportSystemError(errno, "%s", @@ -4873,15 +4925,6 @@ qemuOpenChrChardevUNIXSocket(const virDomainChrSourceDef *dev) goto error; } - memset(&addr, 0, sizeof(addr)); - addr.sun_family = AF_UNIX; - if (virStrcpyStatic(addr.sun_path, dev->data.nix.path) < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("UNIX socket path '%1$s' too long"), - dev->data.nix.path); - goto error; - } - if (unlink(dev->data.nix.path) < 0 && errno != ENOENT) { virReportSystemError(errno, _("Unable to unlink %1$s"), @@ -4889,10 +4932,9 @@ qemuOpenChrChardevUNIXSocket(const virDomainChrSourceDef *dev) goto error; } - if (bind(fd, (struct sockaddr *)&addr, addrlen) < 0) { - virReportSystemError(errno, - _("Unable to bind to UNIX socket path '%1$s'"), - dev->data.nix.path); + bindData.fd = fd; + bindData.path = dev->data.nix.path; + if (virProcessRunInFork(qemuBindSocket, &bindData) < 0) { goto error; } -- 2.34.1

On 4/28/23 07:14, Nick Guenther wrote:
The qemu driver creates IPC sockets using absolute paths, but under POSIX socket paths are constrained pretty tightly. On systems with homedirs on an unusual mount point, like network homedirs, or just particularly long usernames, this could make starting VMs under qemu:///session impossible.
Resolves https://gitlab.com/libvirt/libvirt/-/issues/466
Signed-off-by: Nick Guenther <nick.guenther@polymtl.ca> --- src/qemu/qemu_command.c | 72 ++++++++++++++++++++++++++++++++--------- 1 file changed, 57 insertions(+), 15 deletions(-)
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 4ca93bf3dc..4bedbb515f 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -4860,12 +4860,64 @@ qemuBuildSCSIHostdevDevProps(const virDomainDef *def, return g_steal_pointer(&props); }
+struct qemuBindSocketData { + int fd; + char* path; +}; + +static int +qemuBindSocket(pid_t ppid G_GNUC_UNUSED, void *opaque) +{ + /* The path length of a unix socket is limited to what fits in sockaddr_un. + * It's pretty short: 108 on Linux, and this is too easy to hit. + * + * Work around this limit by using a *relative path*, by chdir()ing first. + * But chdir() isn't thread-safe, so run it in a *subprocess* (this function) + * where the chdir() will be instantly forgotten once it has helped configure fd. + * + * background: https://stackoverflow.com/questions/34829600/why-is-the-maximal-path-length-... + */ + + g_autofree char *dir = NULL; + g_autofree char *name = NULL; + + struct sockaddr_un addr; + struct qemuBindSocketData *data = opaque; + + dir = g_path_get_dirname(data->path); + name = g_path_get_basename(data->path); + + memset(&addr, 0, sizeof(addr)); + addr.sun_family = AF_UNIX; + if (virStrcpyStatic(addr.sun_path, name) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("UNIX socket name '%1$s' too long"), + name); + return -1; + } + + if (chdir(dir) < 0) { + virReportSystemError(errno, + _("Unable to chdir to containing directory '%1$s' while binding UNIX socket '%2$s'"), + dir, name); + return -1; + } + + if (bind(data->fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) { + virReportSystemError(errno, + _("Unable to bind UNIX socket '%1$s/%2$s'"), + dir, name); + return -1; + }
This only fixes the part that creates the socket. In some cases, libvirt still needs to connect to them (e.g. monitor/guest agent sockets). That code path still suffers from the limitation. And for other sockets, mgmt apps are expected to connect to them. And surely we shouldn't expect them to do this kind of tricks only to connect. I believe these commits solve the problem better: https://gitlab.com/MichalPrivoznik/libvirt/-/commits/qemu_long_unix_paths/ I haven't sent them yet, wanted to wait for your v2. Can you please give them a test since you have the env set up? Thanks, Michal

April 28, 2023 8:58 AM, "Michal Prívozník" <mprivozn@redhat.com> wrote:
This only fixes the part that creates the socket. In some cases, libvirt still needs to connect to them (e.g. monitor/guest agent sockets). That code path still suffers from the limitation.
True. But there's only 7 other places where sockaddr_un is used; I can make this bind code into a util function, and make a connect() version too I guess -- and it won't be so bad!
And for other sockets, mgmt apps are expected to connect to them. And surely we shouldn't expect them to do this kind of tricks only to connect.
I was hoping most libvirt clients would use libvirt itself to connect back to it. But that's not true? virt-manager has its own independent socket code? I haven't run into socket problems with virsh yet; but virsh calls libvirt, doesn't it?
I believe these commits solve the problem better:
https://gitlab.com/MichalPrivoznik/libvirt/-/commits/qemu_long_unix_paths
I haven't sent them yet, wanted to wait for your v2. Can you please give them a test since you have the env set up? Thanks,
Thank you Michal! I was trying to patch the same part of src/qemu/qemu_conf.c as you but was flailing around with it. I was getting chaotic results. In retrospect, I think it was simply because I wasn't killing the local libvirtd -- it times out in some cases, and presumably will reload then, but not immediately and not if you have any running VMs. So I was probably just confusing myself. With your patch built and installed, and double-checking it's actually loading by making sure `lsof | grep driver_qemu` is empty before trying it, it successfully uses /var/run, avoiding the original crash I had, even with a very long VM name: u123456@vmserver:~$ export LIBVIRT_DEFAULT_URI=qemu:///session u123456@vmserver:~$ (cd /tmp/ && curl -O https://dl-cdn.alpinelinux.org/alpine/v3.14/releases/x86_64/alpine-extended-...) % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 588M 100 588M 0 0 40.8M 0 0:00:14 0:00:14 --:--:-- 41.7M u123456@vmserver:~$ virt-install --name ffdsfifiosdofidsiofiosdiofsdiofiosdfiosdiofiosdioff --cdrom /tmp/alpine-extended-3.14.0-x86_64.iso --disk size=20,format=qcow2 --memory 4096 --noautoconsole --graphics vnc,listen=socket Début d’installation… Allocating 'ffdsfifiosdofidsiofiosdiofsdiofiosdfiosdiofiosdioff-1.qcow2' | 0 B 00:00:00 ... Création du domaine… | 0 B 00:00:00 Domain is still running. Installation may be in progress. You can reconnect to the console to complete the installation process. u123456@vmserver:~$ tree /var/run/user/$UID/libvirt/qemu /var/run/user/703204575/libvirt/qemu └── run ├── autostarted ├── channel │ └── 1-ffdsfifiosdofidsiofi │ └── org.qemu.guest_agent.0 ├── dbus ├── driver.pid ├── ffdsfifiosdofidsiofiosdiofsdiofiosdfiosdiofiosdioff.pid ├── ffdsfifiosdofidsiofiosdiofsdiofiosdfiosdiofiosdioff.xml └── slirp 5 directories, 5 files Some sockets are still placed in ~/; so far they aren't a problem because their lengths cap out at 95 characters, but these should probably(?) also be moved u123456@vmserver:~$ tree ~/.config/libvirt/qemu/ /home/GRAMES.POLYMTL.CA/p115628/.config/libvirt/qemu/ ├── checkpoint ├── dump ├── ffdsfifiosdofidsiofiosdiofsdiofiosdfiosdiofiosdioff.xml ├── lib │ └── domain-1-ffdsfifiosdofidsiofi │ ├── master-key.aes │ ├── monitor.sock │ └── vnc.sock ├── nvram ├── ram ├── save └── snapshot 8 directories, 4 files I should mention that I am backporting your patch against https://packages.ubuntu.com/source/jammy/libvirt; I don't know how to install master on my environment -- and I doubt it would work anyway due to all the other slightly incompatible library versions. And I'm building and installing with the Debian tooling: root@vmserver:~# apt-get update root@vmserver:~# apt-get install build-essential root@vmserver:~# apt-get build-dep libvirt-daemon-driver-qemu u123456@vmserver:~/virsh/libvirt-patch$ apt-get source libvirt-daemon-driver-qemu && cd libvirt-8.0.0 u123456@vmserver:~/virsh/libvirt-patch/libvirt-8.0.0$ time dpkg-buildpackage -rfakeroot -uc -b > build.log 2>&1 root@vmserver:~# dpkg -i /home/WAVES.EXAMPLE.ORG/u123456/virsh/libvirt-patch/libvirt-daemon-driver-qemu_8.0.0-1ubuntu7.4_amd64.deb Here's the backported patch. I disable tests at the moment to get it to build; I tried to figure out those .xml test case files but I don't understand them yet and couldn't figure out how to adapt them to your patch: --- debian/rules | 1 + src/qemu/qemu_conf.c | 9 +++---- src/qemu/qemu_domain.c | 60 +++++++++++++++++++++++++++++++++++++----- 3 files changed, 57 insertions(+), 13 deletions(-) diff --git a/debian/rules b/debian/rules index 64690a1..f55d95f 100755 --- a/debian/rules +++ b/debian/rules @@ -212,6 +212,7 @@ override_dh_auto_configure: override_dh_auto_test: export LD_PRELOAD=""; \ export VIR_TEST_DEBUG=1; \ + exit 0; \ if ! dh_auto_test -- --timeout-multiplier 10; then \ cat $(DEB_BUILDDIR)/meson-logs/testlog.txt; \ exit 1; \ diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index a838c16..2a4fb48 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -147,6 +147,7 @@ virQEMUDriverConfig *virQEMUDriverConfigNew(bool privileged, cfg->configBaseDir = g_strdup_printf("%s/etc", root); cfg->stateDir = g_strdup_printf("%s/run/qemu", root); cfg->swtpmStateDir = g_strdup_printf("%s/run/swtpm", root); + cfg->channelTargetDir = g_strdup_printf("%s/channel", cfg->stateDir); cfg->cacheDir = g_strdup_printf("%s/cache/qemu", root); cfg->libDir = g_strdup_printf("%s/lib/qemu", root); cfg->swtpmStorageDir = g_strdup_printf("%s/lib/swtpm", root); @@ -155,7 +156,6 @@ virQEMUDriverConfig *virQEMUDriverConfigNew(bool privileged, cfg->snapshotDir = g_strdup_printf("%s/snapshot", cfg->libDir); cfg->checkpointDir = g_strdup_printf("%s/checkpoint", cfg->libDir); cfg->autoDumpPath = g_strdup_printf("%s/dump", cfg->libDir); - cfg->channelTargetDir = g_strdup_printf("%s/channel/target", cfg->libDir); cfg->nvramDir = g_strdup_printf("%s/nvram", cfg->libDir); cfg->memoryBackingDir = g_strdup_printf("%s/ram", cfg->libDir); } else if (privileged) { @@ -167,8 +167,8 @@ virQEMUDriverConfig *virQEMUDriverConfigNew(bool privileged, cfg->configBaseDir = g_strdup(SYSCONFDIR "/libvirt"); cfg->stateDir = g_strdup_printf("%s/libvirt/qemu", RUNSTATEDIR); - cfg->swtpmStateDir = g_strdup_printf("%s/libvirt/qemu/swtpm", RUNSTATEDIR); + cfg->channelTargetDir = g_strdup_printf("%s/channel", cfg->stateDir); cfg->cacheDir = g_strdup_printf("%s/cache/libvirt/qemu", LOCALSTATEDIR); @@ -177,7 +177,6 @@ virQEMUDriverConfig *virQEMUDriverConfigNew(bool privileged, cfg->snapshotDir = g_strdup_printf("%s/snapshot", cfg->libDir); cfg->checkpointDir = g_strdup_printf("%s/checkpoint", cfg->libDir); cfg->autoDumpPath = g_strdup_printf("%s/dump", cfg->libDir); - cfg->channelTargetDir = g_strdup_printf("%s/channel/target", cfg->libDir); cfg->nvramDir = g_strdup_printf("%s/nvram", cfg->libDir); cfg->memoryBackingDir = g_strdup_printf("%s/ram", cfg->libDir); cfg->swtpmStorageDir = g_strdup_printf("%s/lib/libvirt/swtpm", @@ -200,8 +199,8 @@ virQEMUDriverConfig *virQEMUDriverConfigNew(bool privileged, rundir = virGetUserRuntimeDirectory(); cfg->stateDir = g_strdup_printf("%s/qemu/run", rundir); - cfg->swtpmStateDir = g_strdup_printf("%s/swtpm", cfg->stateDir); + cfg->channelTargetDir = g_strdup_printf("%s/channel", cfg->stateDir); cfg->configBaseDir = virGetUserConfigDirectory(); @@ -211,8 +210,6 @@ virQEMUDriverConfig *virQEMUDriverConfigNew(bool privileged, cfg->checkpointDir = g_strdup_printf("%s/qemu/checkpoint", cfg->configBaseDir); cfg->autoDumpPath = g_strdup_printf("%s/qemu/dump", cfg->configBaseDir); - cfg->channelTargetDir = g_strdup_printf("%s/qemu/channel/target", - cfg->configBaseDir); cfg->nvramDir = g_strdup_printf("%s/qemu/nvram", cfg->configBaseDir); cfg->memoryBackingDir = g_strdup_printf("%s/qemu/ram", cfg->configBaseDir); cfg->swtpmStorageDir = g_strdup_printf("%s/qemu/swtpm", diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 42aea3b..09c80ac 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1563,7 +1563,7 @@ qemuDomainSetPrivatePaths(virQEMUDriver *driver, priv->libDir = g_strdup_printf("%s/domain-%s", cfg->libDir, domname); if (!priv->channelTargetDir) - priv->channelTargetDir = g_strdup_printf("%s/domain-%s", + priv->channelTargetDir = g_strdup_printf("%s/%s", cfg->channelTargetDir, domname); return 0; @@ -4908,6 +4908,28 @@ qemuDomainDefaultNetModel(const virDomainDef *def, } + +static bool +qemuDomainChrMatchDefaultPath(const char *prefix, + const char *infix, + const char *target, + const char *path) +{ + g_auto(virBuffer) buf = VIR_BUFFER_INITIALIZER; + g_autofree char *regexp = NULL; + + virBufferEscapeRegex(&buf, "^%s", prefix); + if (infix) + virBufferEscapeRegex(&buf, "%s", infix); + virBufferAddLit(&buf, "/(target/)?([^/]+\\.)|(domain-[^/]+/)|([0-9]+-[^/]+/)"); + virBufferEscapeRegex(&buf, "%s$", target); + + regexp = virBufferContentAndReset(&buf); + + return virStringMatch(path, regexp); +} + + /* * Clear auto generated unix socket paths: * @@ -4917,14 +4939,20 @@ qemuDomainDefaultNetModel(const virDomainDef *def, * libvirt 1.2.19 - 1.3.2: * {cfg->channelTargetDir}/domain-{dom-name}/{target-name} * - * libvirt 1.3.3 and newer: + * libvirt 1.3.3 - 9.2.0: * {cfg->channelTargetDir}/domain-{dom-id}-{short-dom-name}/{target-name} * + * libvirt 9.3.0 and newer: + * {cfg->channelTargetDir}/{dom-id}-{short-dom-name}/{target-name} + * * The unix socket path was stored in config XML until libvirt 1.3.0. * If someone specifies the same path as we generate, they shouldn't do it. * * This function clears the path for migration as well, so we need to clear * the path even if we are not storing it in the XML. + * + * Please note, in libvirt 9.3.0 the channelTargetDir is no longer derived from + * cfg->libDir but rather cfg->stateDir */ static void qemuDomainChrDefDropDefaultPath(virDomainChrDef *chr, @@ -4943,14 +4971,32 @@ qemuDomainChrDefDropDefaultPath(virDomainChrDef *chr, cfg = virQEMUDriverGetConfig(driver); - virBufferEscapeRegex(&buf, "^%s", cfg->channelTargetDir); - virBufferAddLit(&buf, "/([^/]+\\.)|(domain-[^/]+/)"); - virBufferEscapeRegex(&buf, "%s$", chr->target.name); + if (qemuDomainChrMatchDefaultPath(cfg->channelTargetDir, + NULL, + chr->target.name, + chr->source->data.nix.path)) { + VIR_FREE(chr->source->data.nix.path); + return; + } - regexp = virBufferContentAndReset(&buf); + /* Previously, channelTargetDir was derived from cfg->libdir, or + * cfg->configBaseDir even. Try them too. */ + if (qemuDomainChrMatchDefaultPath(cfg->libDir, + "/channel", + chr->target.name, + chr->source->data.nix.path)) { + VIR_FREE(chr->source->data.nix.path); + return; + } - if (virStringMatch(chr->source->data.nix.path, regexp)) + + if (qemuDomainChrMatchDefaultPath(cfg->configBaseDir, + "/qemu/channel", + chr->target.name, + chr->source->data.nix.path)) { VIR_FREE(chr->source->data.nix.path); + return; + } } -- 2.34.1 I haven't tested upgrade paths. No one besides me is using this yet (because they can't because of this bug). I'm not sure how I would even do that. I think the simplest way to set up a test env, though, is just to do `useradd -m -d /home/itsaverysuperlonghomedirectorynamethatweallsaluteandpraiseegads virsh-test1`. And then we should all be able to test against master. In fact we should be able to each set up test envs inside of libvirt VMs :). Except again I am coming at this sideways and I haven't yet figured out how to install libvirt from master. Thanks for all being really friendly by the way. This library is down in the infrastructural guts of my systems, and clearly I don't really know what I'm doing with it. I appreciate the help and the welcome as I try to muddle through getting it working for my users! I'm looking forward to having this patched so that, say, in six months (but more likely, 12) when I upgrade our Ubuntus finally this problem quietly vanishes. -Nick
participants (2)
-
Michal Prívozník
-
Nick Guenther