[libvirt] [PATCH v3 0/4] qemu: use FD passing for chardev UNIX sockets

This series makes use of the chardev fd passing arriving in QEMU 2.12 to get rid of the startup race wrt opening the QEMU monitor. It is actually enabled in all chardev UNIX sockets for sake of having the same codepath everywhere, but is only important for the monitor socket.

Changed in v3:

 - Refactor UNIX socket opening code to allow it to be mocked in the unit tests to avoid creating real UNIX sockets

Daniel P. Berrangé (4):
  qemu: probe for -chardev 'fd' parameter for FD passing
  qemu: support passing pre-opened UNIX socket listen FD
  qemu: don't retry connect() if doing FD passing
  qemu: remove pointless connect retry logic in agent

 src/qemu/qemu_agent.c                         | 84 ++-----------------
 src/qemu/qemu_capabilities.c                  |  4 +-
 src/qemu/qemu_capabilities.h                  |  1 +
 src/qemu/qemu_command.c                       | 64 +++++++++++++-
 src/qemu/qemu_command.h                       |  4 +
 src/qemu/qemu_monitor.c                       | 54 +++++++-----
 src/qemu/qemu_monitor.h                       |  1 +
 src/qemu/qemu_process.c                       | 27 ++++--
 .../caps_2.12.0.aarch64.xml                   |  1 +
 .../caps_2.12.0.ppc64.xml                     |  1 +
 .../caps_2.12.0.s390x.xml                     |  1 +
 .../caps_2.12.0.x86_64.xml                    |  1 +
 tests/qemumonitortestutils.c                  |  1 +
 .../disk-drive-write-cache.x86_64-latest.args |  3 +-
 ...irtio-scsi-reservations.x86_64-latest.args |  3 +-
 tests/qemuxml2argvmock.c                      | 16 ++++
 16 files changed, 153 insertions(+), 113 deletions(-)

--
2.17.0

QEMU >= 2.12 will support passing of pre-opened file descriptors for socket based character devices.

Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 src/qemu/qemu_capabilities.c                       | 2 ++
 src/qemu/qemu_capabilities.h                       | 1 +
 tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml | 1 +
 tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml   | 1 +
 tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml   | 1 +
 tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml  | 1 +
 6 files changed, 7 insertions(+)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index a5cb24fec6..9a2b976e46 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -486,6 +486,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST,

               /* 300 */
               "sdl-gl",
+              "chardev-fd-pass",
     );


@@ -2495,6 +2496,7 @@ static struct virQEMUCapsCommandLineProps virQEMUCapsCommandLine[] = {
     { "vnc", "vnc", QEMU_CAPS_VNC_MULTI_SERVERS },
     { "chardev", "reconnect", QEMU_CAPS_CHARDEV_RECONNECT },
     { "sandbox", "elevateprivileges", QEMU_CAPS_SECCOMP_BLACKLIST },
+    { "chardev", "fd", QEMU_CAPS_CHARDEV_FD_PASS },
 };

 static int
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index d23c34c24d..f8f8c3e0cb 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -470,6 +470,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */

     /* 300 */
     QEMU_CAPS_SDL_GL, /* -sdl gl */
+    QEMU_CAPS_CHARDEV_FD_PASS, /* Passing pre-opened FDs for chardevs */

     QEMU_CAPS_LAST /* this must always be the last item */
 } virQEMUCapsFlags;
diff --git a/tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml
index cabe4f2f07..3c6d9ef7ed 100644
--- a/tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml
+++ b/tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml
@@ -162,6 +162,7 @@
   <flag name='qom-list-properties'/>
   <flag name='memory-backend-file.discard-data'/>
   <flag name='sdl-gl'/>
+  <flag name='chardev-fd-pass'/>
   <version>2011090</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>343099</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml b/tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml
index bffe3b3b97..b7d6b0c5f4 100644
--- a/tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml
+++ b/tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml
@@ -159,6 +159,7 @@
   <flag name='qom-list-properties'/>
   <flag name='memory-backend-file.discard-data'/>
   <flag name='sdl-gl'/>
+  <flag name='chardev-fd-pass'/>
   <version>2011090</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>419968</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml b/tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml
index 138be92fad..bdf08f8ffd 100644
--- a/tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml
+++ b/tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml
@@ -127,6 +127,7 @@
   <flag name='virtual-css-bridge.cssid-unrestricted'/>
   <flag name='vfio-ccw'/>
   <flag name='sdl-gl'/>
+  <flag name='chardev-fd-pass'/>
   <version>2011090</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>0</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml
index 4247afeb31..c9846ba45a 100644
--- a/tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml
@@ -200,6 +200,7 @@
   <flag name='qom-list-properties'/>
   <flag name='memory-backend-file.discard-data'/>
   <flag name='sdl-gl'/>
+  <flag name='chardev-fd-pass'/>
   <version>2011090</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>390813</microcodeVersion>
--
2.17.0

There is a race condition when spawning QEMU where libvirt has spawned QEMU but the monitor socket is not yet open. Libvirt has to repeatedly try to connect() to QEMU's monitor until eventually it succeeds, or times out. We use kill() to check if QEMU is still alive so we avoid waiting a long time if QEMU exited, but having a timeout at all is still unpleasant.

With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP sockets. If libvirt has called bind() and listen() on this FD, then we have a guarantee that libvirt can immediately call connect() and succeed without any race.

Although we only really care about this for the monitor socket and agent socket, this patch does FD passing for all UNIX socket based character devices since there appears to be no downside to it.

We don't do FD passing for TCP sockets, however, because it is only possible to pass a single FD, while some hostnames may require listening on multiple FDs to cover IPv4 and IPv6 concurrently.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 src/qemu/qemu_command.c                       | 64 ++++++++++++++++++-
 src/qemu/qemu_command.h                       |  4 ++
 .../disk-drive-write-cache.x86_64-latest.args |  3 +-
 ...irtio-scsi-reservations.x86_64-latest.args |  3 +-
 tests/qemuxml2argvmock.c                      | 16 +++++
 5 files changed, 84 insertions(+), 6 deletions(-)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index c4237339bf..6834480e1f 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -4913,6 +4913,56 @@ qemuBuildChrChardevReconnectStr(virBufferPtr buf,
 }


+int
+qemuOpenChrChardevUNIXSocket(const virDomainChrSourceDef *dev)
+{
+    struct sockaddr_un addr;
+    socklen_t addrlen = sizeof(addr);
+    int fd;
+
+    if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
+        virReportSystemError(errno, "%s",
+                             _("Unable to create UNIX socket"));
+        goto error;
+    }
+
+    memset(&addr, 0, sizeof(addr));
+    addr.sun_family = AF_UNIX;
+    if (virStrcpyStatic(addr.sun_path, dev->data.nix.path) == NULL) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("UNIX socket path '%s' too long"),
+                       dev->data.nix.path);
+        goto error;
+    }
+
+    if (unlink(dev->data.nix.path) < 0 && errno != ENOENT) {
+        virReportSystemError(errno,
+                             _("Unable to unlink %s"),
+                             dev->data.nix.path);
+        goto error;
+    }
+
+    if (bind(fd, (struct sockaddr *)&addr, addrlen) < 0) {
+        virReportSystemError(errno,
+                             _("Unable to bind to UNIX socket path '%s'"),
+                             dev->data.nix.path);
+        goto error;
+    }
+
+    if (listen(fd, 1) < 0) {
+        virReportSystemError(errno,
+                             _("Unable to listen to UNIX socket path '%s'"),
+                             dev->data.nix.path);
+        goto error;
+    }
+
+    return fd;
+
+ error:
+    VIR_FORCE_CLOSE(fd);
+    return -1;
+}
+
 /* This function outputs a -chardev command line option which describes only the
  * host side of the character device */
 static char *
@@ -5042,8 +5092,18 @@ qemuBuildChrChardevStr(virLogManagerPtr logManager,
         break;

     case VIR_DOMAIN_CHR_TYPE_UNIX:
-        virBufferAsprintf(&buf, "socket,id=%s,path=", charAlias);
-        virQEMUBuildBufferEscapeComma(&buf, dev->data.nix.path);
+        if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_CHARDEV_FD_PASS)) {
+            int fd = qemuOpenChrChardevUNIXSocket(dev);
+            if (fd < 0)
+                goto cleanup;
+
+            virBufferAsprintf(&buf, "socket,id=%s,fd=%d", charAlias, fd);
+
+            virCommandPassFD(cmd, fd, VIR_COMMAND_PASS_FD_CLOSE_PARENT);
+        } else {
+            virBufferAsprintf(&buf, "socket,id=%s,path=", charAlias);
+            virQEMUBuildBufferEscapeComma(&buf, dev->data.nix.path);
+        }
         if (dev->data.nix.listen)
             virBufferAdd(&buf, nowait ? ",server,nowait" : ",server", -1);

diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h
index 28bc33558b..f722b1be72 100644
--- a/src/qemu/qemu_command.h
+++ b/src/qemu/qemu_command.h
@@ -70,6 +70,10 @@ int qemuBuildTLSx509BackendProps(const char *tlspath,
                                  virQEMUCapsPtr qemuCaps,
                                  virJSONValuePtr *propsret);

+/* Open a UNIX socket for chardev FD passing */
+int
+qemuOpenChrChardevUNIXSocket(const virDomainChrSourceDef *dev);
+
 /* Generate '-device' string for chardev device */
 int
 qemuBuildChrDeviceStr(char **deviceStr,
diff --git a/tests/qemuxml2argvdata/disk-drive-write-cache.x86_64-latest.args b/tests/qemuxml2argvdata/disk-drive-write-cache.x86_64-latest.args
index a63c5b7477..9e5b611351 100644
--- a/tests/qemuxml2argvdata/disk-drive-write-cache.x86_64-latest.args
+++ b/tests/qemuxml2argvdata/disk-drive-write-cache.x86_64-latest.args
@@ -17,8 +17,7 @@ file=/tmp/lib/domain--1-QEMUGuest1/master-key.aes \
 -display none \
 -no-user-config \
 -nodefaults \
--chardev socket,id=charmonitor,path=/tmp/lib/domain--1-QEMUGuest1/monitor.sock,\
-server,nowait \
+-chardev socket,id=charmonitor,fd=1729,server,nowait \
 -mon chardev=charmonitor,id=monitor,mode=control \
 -rtc base=utc \
 -no-shutdown \
diff --git a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.x86_64-latest.args b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.x86_64-latest.args
index 768bc22f9f..ad88d3319a 100644
--- a/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.x86_64-latest.args
+++ b/tests/qemuxml2argvdata/disk-virtio-scsi-reservations.x86_64-latest.args
@@ -21,8 +21,7 @@ path=/path/to/qemu-pr-helper.sock \
 -display none \
 -no-user-config \
 -nodefaults \
--chardev socket,id=charmonitor,path=/tmp/lib/domain--1-QEMUGuest1/monitor.sock,\
-server,nowait \
+-chardev socket,id=charmonitor,fd=1729,server,nowait \
 -mon chardev=charmonitor,id=monitor,mode=control \
 -rtc base=utc \
 -no-shutdown \
diff --git a/tests/qemuxml2argvmock.c b/tests/qemuxml2argvmock.c
index 6d78063f00..b1edbdd0e6 100644
--- a/tests/qemuxml2argvmock.c
+++ b/tests/qemuxml2argvmock.c
@@ -37,6 +37,7 @@
 #include "virtpm.h"
 #include "virutil.h"
 #include "qemu/qemu_interface.h"
+#include "qemu/qemu_command.h"
 #include <time.h>
 #include <unistd.h>

@@ -227,3 +228,18 @@ qemuInterfaceOpenVhostNet(virDomainDefPtr def ATTRIBUTE_UNUSED,
         vhostfd[i] = STDERR_FILENO + 42 + i;
     return 0;
 }
+
+
+int
+qemuOpenChrChardevUNIXSocket(const virDomainChrSourceDef *dev ATTRIBUTE_UNUSED)
+
+{
+    /* We need to return an FD number for a UNIX listener socket,
+     * which will be given to QEMU via a CLI arg. We need a fixed
+     * number to get stable tests. This is obviously not a real
+     * FD number, so when virCommand closes the FD in the parent
+     * it will get EINVAL, but that's (hopefully) not going to
+     * be a problem....
+     */
+    return 1729;
+}
--
2.17.0
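
To illustrate the guarantee the commit message relies on, here is a minimal standalone sketch. It is not libvirt code: the helper names open_listener() and connect_once() and the socket path used with them are made up for this example. It shows that once bind() and listen() have been called on a UNIX stream socket, a connect() to the same path succeeds right away, before anything has called accept(). In the real flow the listening FD would be inherited by QEMU instead of staying in the same process.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill 'addr' for 'path'; returns -1 (ENAMETOOLONG) if it does not fit. */
static int fill_addr(struct sockaddr_un *addr, const char *path)
{
    if (strlen(path) >= sizeof(addr->sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
    return 0;
}

/* Hypothetical helper: create a listening UNIX socket at 'path'.
 * Returns the listening FD, or -1 with errno set. */
static int open_listener(const char *path)
{
    struct sockaddr_un addr;
    int fd, saved;

    if (fill_addr(&addr, path) < 0)
        return -1;
    if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
        return -1;

    /* Remove any stale socket file, then bind and listen. */
    if ((unlink(path) < 0 && errno != ENOENT) ||
        bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 1) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Hypothetical helper: a single connect() attempt with no retry loop.
 * Returns the connected FD, or -1 with errno set. */
static int connect_once(const char *path)
{
    struct sockaddr_un addr;
    int fd, saved;

    if (fill_addr(&addr, path) < 0)
        return -1;
    if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

Calling open_listener() and then connect_once() on the same path should give two valid descriptors without any waiting, because the kernel queues the connection in the listen backlog until the eventual owner of the listening FD calls accept().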

On 05/17/2018 09:40 AM, Daniel P. Berrangé wrote:
> There is a race condition when spawning QEMU where libvirt has spawned QEMU but the monitor socket is not yet open. Libvirt has to repeatedly try to connect() to QEMU's monitor until eventually it succeeds, or times out. We use kill() to check if QEMU is still alive so we avoid waiting a long time if QEMU exited, but having a timeout at all is still unpleasant.
>
> With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP sockets. If libvirt has called bind() and listen() on this FD, then we have a guarantee that libvirt can immediately call connect() and succeed without any race.
>
> Although we only really care about this for the monitor socket and agent socket, this patch does FD passing for all UNIX socket based character devices since there appears to be no downside to it.
>
> We don't do FD passing for TCP sockets, however, because it is only possible to pass a single FD, while some hostnames may require listening on multiple FDs to cover IPv4 and IPv6 concurrently.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  src/qemu/qemu_command.c                       | 64 ++++++++++++++++++-
>  src/qemu/qemu_command.h                       |  4 ++
>  .../disk-drive-write-cache.x86_64-latest.args |  3 +-
>  ...irtio-scsi-reservations.x86_64-latest.args |  3 +-
>  tests/qemuxml2argvmock.c                      | 16 +++++
>  5 files changed, 84 insertions(+), 6 deletions(-)

Using a mocked socket number seems to be a reasonable mechanism to achieve the goal. There's certainly other tests that used mocked paths or results to get a standard result/answer.

Reviewed-by: John Ferlan <jferlan@redhat.com>

John

On 05/17/2018 08:40 AM, Daniel P. Berrangé wrote:
> There is a race condition when spawning QEMU where libvirt has spawned QEMU but the monitor socket is not yet open. Libvirt has to repeatedly try to connect() to QEMU's monitor until eventually it succeeds, or times out. We use kill() to check if QEMU is still alive so we avoid waiting a long time if QEMU exited, but having a timeout at all is still unpleasant.
>
> With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP sockets. If libvirt has called bind() and listen() on this FD, then we have a guarantee that libvirt can immediately call connect() and succeed without any race.
>
> Although we only really care about this for the monitor socket and agent socket, this patch does FD passing for all UNIX socket based character devices since there appears to be no downside to it.
>
> We don't do FD passing for TCP sockets, however, because it is only possible to pass a single FD, while some hostnames may require listening on multiple FDs to cover IPv4 and IPv6 concurrently.
>
> +++ b/tests/qemuxml2argvmock.c
> +int
> +qemuOpenChrChardevUNIXSocket(const virDomainChrSourceDef *dev ATTRIBUTE_UNUSED)
> +
> +{
> +    /* We need to return an FD number for a UNIX listener socket,
> +     * which will be given to QEMU via a CLI arg. We need a fixed
> +     * number to get stable tests. This is obviously not a real
> +     * FD number, so when virCommand closes the FD in the parent
> +     * it will get EINVAL, but that's (hopefully) not going to
> +     * be a problem....
> +     */
> +    return 1729;

Is it worth asserting that 1729 is not an open fd (perhaps by fcntl(1729, F_GETFD) == -1) because of something strange in the test environment?

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org

On Thu, May 24, 2018 at 04:50:36PM -0500, Eric Blake wrote:
> On 05/17/2018 08:40 AM, Daniel P. Berrangé wrote:
> > There is a race condition when spawning QEMU where libvirt has spawned QEMU but the monitor socket is not yet open. Libvirt has to repeatedly try to connect() to QEMU's monitor until eventually it succeeds, or times out. We use kill() to check if QEMU is still alive so we avoid waiting a long time if QEMU exited, but having a timeout at all is still unpleasant.
> >
> > With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP sockets. If libvirt has called bind() and listen() on this FD, then we have a guarantee that libvirt can immediately call connect() and succeed without any race.
> >
> > Although we only really care about this for the monitor socket and agent socket, this patch does FD passing for all UNIX socket based character devices since there appears to be no downside to it.
> >
> > We don't do FD passing for TCP sockets, however, because it is only possible to pass a single FD, while some hostnames may require listening on multiple FDs to cover IPv4 and IPv6 concurrently.
> >
> > +++ b/tests/qemuxml2argvmock.c
> > +int
> > +qemuOpenChrChardevUNIXSocket(const virDomainChrSourceDef *dev ATTRIBUTE_UNUSED)
> > +
> > +{
> > +    /* We need to return an FD number for a UNIX listener socket,
> > +     * which will be given to QEMU via a CLI arg. We need a fixed
> > +     * number to get stable tests. This is obviously not a real
> > +     * FD number, so when virCommand closes the FD in the parent
> > +     * it will get EINVAL, but that's (hopefully) not going to
> > +     * be a problem....
> > +     */
> > +    return 1729;
>
> Is it worth asserting that 1729 is not an open fd (perhaps by fcntl(1729, F_GETFD) == -1) because of something strange in the test environment?

Sure I'll add

diff --git a/tests/qemuxml2argvmock.c b/tests/qemuxml2argvmock.c
index 36d25dfc3f..56a6d4892b 100644
--- a/tests/qemuxml2argvmock.c
+++ b/tests/qemuxml2argvmock.c
@@ -228,5 +228,7 @@ qemuOpenChrChardevUNIXSocket(const virDomainChrSourceDef *dev ATTRIBUTE_UNUSED)
      * it will get EINVAL, but that's (hopefully) not going to
      * be a problem....
      */
+    if (fcntl(1729, F_GETFD) != -1)
+        abort();
     return 1729;
 }

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
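
The fcntl(fd, F_GETFD) probe discussed above can be tried in isolation. The following is a small sketch, not part of the patch: fd_is_open() is a hypothetical helper name chosen for this example. On a descriptor number that is not open, fcntl() returns -1 and sets errno to EBADF, which is what the check in the mock relies on.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper: report whether 'fd' is an open descriptor in
 * the calling process. F_GETFD only reads the descriptor flags, so
 * the probe has no side effects on the descriptor. */
static bool fd_is_open(int fd)
{
    if (fcntl(fd, F_GETFD) != -1)
        return true;
    /* EBADF means "not an open descriptor"; treat any other error
     * conservatively as "open". */
    return errno != EBADF;
}
```

A test mock that hands out a fixed fake number could call such a helper once and abort if it ever returns true, which is the same idea as the two added lines in the diff above.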

Since libvirt called bind() and listen() on the UNIX socket, it is guaranteed that connect() will immediately succeed, if QEMU is running normally. It will only fail if QEMU has closed the monitor socket by mistake or if QEMU has exited, letting the kernel close it.

With this in mind we can remove the retry loop and timeout when connecting to the QEMU monitor if we are doing FD passing. Libvirt can go straight to sending the QMP greeting and will simply block waiting for a reply until QEMU is ready.

Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 src/qemu/qemu_capabilities.c |  2 +-
 src/qemu/qemu_monitor.c      | 54 +++++++++++++++++++++---------------
 src/qemu/qemu_monitor.h      |  1 +
 src/qemu/qemu_process.c      | 27 +++++++++++++-----
 tests/qemumonitortestutils.c |  1 +
 5 files changed, 55 insertions(+), 30 deletions(-)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 9a2b976e46..2d6776ab4a 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -4175,7 +4175,7 @@ virQEMUCapsInitQMPCommandRun(virQEMUCapsInitQMPCommandPtr cmd,

     cmd->vm->pid = cmd->pid;

-    if (!(cmd->mon = qemuMonitorOpen(cmd->vm, &cmd->config, true,
+    if (!(cmd->mon = qemuMonitorOpen(cmd->vm, &cmd->config, true, true,
                                      0, &callbacks, NULL)))
         goto ignore;

diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 3d7ca3ccfc..ef1f7321b2 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -341,6 +341,7 @@ qemuMonitorDispose(void *obj)
 static int
 qemuMonitorOpenUnix(const char *monitor,
                     pid_t cpid,
+                    bool retry,
                     unsigned long long timeout)
 {
     struct sockaddr_un addr;
@@ -362,31 +363,39 @@ qemuMonitorOpenUnix(const char *monitor,
         goto error;
     }

-    if (virTimeBackOffStart(&timebackoff, 1, timeout * 1000) < 0)
-        goto error;
-    while (virTimeBackOffWait(&timebackoff)) {
-        ret = connect(monfd, (struct sockaddr *)&addr, sizeof(addr));
-
-        if (ret == 0)
-            break;
+    if (retry) {
+        if (virTimeBackOffStart(&timebackoff, 1, timeout * 1000) < 0)
+            goto error;
+        while (virTimeBackOffWait(&timebackoff)) {
+            ret = connect(monfd, (struct sockaddr *)&addr, sizeof(addr));

-        if ((errno == ENOENT || errno == ECONNREFUSED) &&
-            (!cpid || virProcessKill(cpid, 0) == 0)) {
-            /* ENOENT : Socket may not have shown up yet
-             * ECONNREFUSED : Leftover socket hasn't been removed yet */
-            continue;
-        }
+            if (ret == 0)
+                break;

-        virReportSystemError(errno, "%s",
-                             _("failed to connect to monitor socket"));
-        goto error;
+            if ((errno == ENOENT || errno == ECONNREFUSED) &&
+                (!cpid || virProcessKill(cpid, 0) == 0)) {
+                /* ENOENT : Socket may not have shown up yet
+                 * ECONNREFUSED : Leftover socket hasn't been removed yet */
+                continue;
+            }

-    }
+            virReportSystemError(errno, "%s",
+                                 _("failed to connect to monitor socket"));
+            goto error;
+        }

-    if (ret != 0) {
-        virReportSystemError(errno, "%s",
-                             _("monitor socket did not show up"));
-        goto error;
+        if (ret != 0) {
+            virReportSystemError(errno, "%s",
+                                 _("monitor socket did not show up"));
+            goto error;
+        }
+    } else {
+        ret = connect(monfd, (struct sockaddr *) &addr, sizeof(addr));
+        if (ret < 0) {
+            virReportSystemError(errno, "%s",
+                                 _("failed to connect to monitor socket"));
+            goto error;
+        }
     }

     return monfd;
@@ -906,6 +915,7 @@ qemuMonitorPtr
 qemuMonitorOpen(virDomainObjPtr vm,
                 virDomainChrSourceDefPtr config,
                 bool json,
+                bool retry,
                 unsigned long long timeout,
                 qemuMonitorCallbacksPtr cb,
                 void *opaque)
@@ -920,7 +930,7 @@ qemuMonitorOpen(virDomainObjPtr vm,
     case VIR_DOMAIN_CHR_TYPE_UNIX:
         hasSendFD = true;
         if ((fd = qemuMonitorOpenUnix(config->data.nix.path,
-                                      vm->pid, timeout)) < 0)
+                                      vm->pid, retry, timeout)) < 0)
             return NULL;
         break;

diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
index 33dc521e83..5821847f5a 100644
--- a/src/qemu/qemu_monitor.h
+++ b/src/qemu/qemu_monitor.h
@@ -320,6 +320,7 @@ char *qemuMonitorUnescapeArg(const char *in);
 qemuMonitorPtr qemuMonitorOpen(virDomainObjPtr vm,
                                virDomainChrSourceDefPtr config,
                                bool json,
+                               bool retry,
                                unsigned long long timeout,
                                qemuMonitorCallbacksPtr cb,
                                void *opaque)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 5b73a61962..84b66521fa 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -1772,7 +1772,7 @@ qemuProcessInitMonitor(virQEMUDriverPtr driver,

 static int
 qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob,
-                   qemuDomainLogContextPtr logCtxt)
+                   bool retry, qemuDomainLogContextPtr logCtxt)
 {
     qemuDomainObjPrivatePtr priv = vm->privateData;
     qemuMonitorPtr mon = NULL;
@@ -1803,6 +1803,7 @@ qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob,
     mon = qemuMonitorOpen(vm,
                           monConfig,
                           priv->monJSON,
+                          retry,
                           timeout,
                           &monitorCallbacks,
                           driver);
@@ -2180,17 +2181,23 @@ qemuProcessWaitForMonitor(virQEMUDriverPtr driver,
 {
     int ret = -1;
     virHashTablePtr info = NULL;
-    qemuDomainObjPrivatePtr priv;
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+    bool retry = true;
+
+    if (priv->qemuCaps &&
+        virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_CHARDEV_FD_PASS))
+        retry = false;

-    VIR_DEBUG("Connect monitor to %p '%s'", vm, vm->def->name);
-    if (qemuConnectMonitor(driver, vm, asyncJob, logCtxt) < 0)
+    VIR_DEBUG("Connect monitor to vm=%p name='%s' retry=%d",
+              vm, vm->def->name, retry);
+
+    if (qemuConnectMonitor(driver, vm, asyncJob, retry, logCtxt) < 0)
         goto cleanup;

     /* Try to get the pty path mappings again via the monitor. This is much more
      * reliable if it's available.
      * Note that the monitor itself can be on a pty, so we still need to try the
      * log output method. */
-    priv = vm->privateData;
     if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0)
         goto cleanup;
     ret = qemuMonitorGetChardevInfo(priv->mon, &info);
@@ -7468,6 +7475,7 @@ qemuProcessReconnect(void *opaque)
     unsigned int stopFlags = 0;
     bool jobStarted = false;
     virCapsPtr caps = NULL;
+    bool retry = true;

     VIR_FREE(data);

@@ -7498,10 +7506,15 @@ qemuProcessReconnect(void *opaque)
      * allowReboot in status XML and we need to initialize it. */
     qemuProcessPrepareAllowReboot(obj);

-    VIR_DEBUG("Reconnect monitor to %p '%s'", obj, obj->def->name);
+    if (priv->qemuCaps &&
+        virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_CHARDEV_FD_PASS))
+        retry = false;
+
+    VIR_DEBUG("Reconnect monitor to def=%p name='%s' retry=%d",
+              obj, obj->def->name, retry);

     /* XXX check PID liveliness & EXE path */
-    if (qemuConnectMonitor(driver, obj, QEMU_ASYNC_JOB_NONE, NULL) < 0)
+    if (qemuConnectMonitor(driver, obj, QEMU_ASYNC_JOB_NONE, retry, NULL) < 0)
         goto error;

     if (qemuHostdevUpdateActiveDomainDevices(driver, obj->def) < 0)
diff --git a/tests/qemumonitortestutils.c b/tests/qemumonitortestutils.c
index 62f68ee699..789eb72196 100644
--- a/tests/qemumonitortestutils.c
+++ b/tests/qemumonitortestutils.c
@@ -1252,6 +1252,7 @@ qemuMonitorTestNew(bool json,
     if (!(test->mon = qemuMonitorOpen(test->vm,
                                       &src,
                                       json,
+                                      true,
                                       0,
                                       &qemuMonitorTestCallbacks,
                                       driver)))
--
2.17.0
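
The difference between the two connection modes in this patch can be sketched outside libvirt as follows. This is an illustration under stated assumptions, not the patch's implementation: connect_unix() is a hypothetical name, and the backoff is a plain doubling sleep via nanosleep() standing in for libvirt's virTimeBackOff helpers, without the kill()-based liveness check. With retry set to false it makes exactly one connect() attempt, which is all that is needed when the listener was created before the server process was spawned.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: connect to the UNIX socket at 'path'.
 * retry == false: one attempt only (the FD passing case).
 * retry == true : keep retrying on ENOENT/ECONNREFUSED until roughly
 *                 'timeout_ms' milliseconds have been spent sleeping.
 * Returns the connected FD, or -1 with errno set. */
static int connect_unix(const char *path, bool retry, unsigned int timeout_ms)
{
    struct sockaddr_un addr;
    unsigned int waited_ms = 0;
    unsigned int delay_ms = 1;
    int fd, saved;

    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    for (;;) {
        struct timespec ts;

        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
            return fd;

        /* Give up unless retrying is allowed, the error is one that
         * can clear up by itself, and there is time budget left. */
        if (!retry ||
            (errno != ENOENT && errno != ECONNREFUSED) ||
            waited_ms >= timeout_ms)
            break;

        ts.tv_sec = delay_ms / 1000;
        ts.tv_nsec = (long)(delay_ms % 1000) * 1000000L;
        nanosleep(&ts, NULL);
        waited_ms += delay_ms;
        if (delay_ms < 100)
            delay_ms *= 2;
    }

    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}
```

A caller would pick the mode the same way the patch does: pass retry as false when the pre-opened listener is known to exist, and true otherwise.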

When the agent code was first introduced back in

  commit c160ce3316852a797d7b06b4ee101233866e69a9
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Wed Oct 5 18:31:54 2011 +0100

    QEMU guest agent support

there was code that would loop and retry the connection when opening the agent socket. At this time, the only thing done in between the opening of the monitor socket & opening of the agent socket was a call to set the monitor capabilities. This was a no-op on non-QMP versions, so in theory there could be a race which let us connect to the monitor while the agent socket was still not created by QEMU.

In the modern world, however, we long ago mandated the use of QMP for managing QEMU, so we're guaranteed to have a set capabilities QMP call. Once we've seen a reply to this, we're guaranteed that QEMU has fully initialized all backends and is in its event loop. We can thus be sure the QEMU agent socket is present and don't need to retry connections to it, even without having the chardev FD passing feature.

Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 src/qemu/qemu_agent.c | 84 ++++---------------------------------------
 1 file changed, 7 insertions(+), 77 deletions(-)

diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c
index b838f75207..e508abcc24 100644
--- a/src/qemu/qemu_agent.c
+++ b/src/qemu/qemu_agent.c
@@ -106,7 +106,6 @@ struct _qemuAgent {
     int fd;
     int watch;

-    bool connectPending;
     bool running;

     virDomainObjPtr vm;
@@ -180,15 +179,12 @@ static void qemuAgentDispose(void *obj)
 }

 static int
-qemuAgentOpenUnix(const char *monitor, pid_t cpid, bool *inProgress)
+qemuAgentOpenUnix(const char *monitor)
 {
     struct sockaddr_un addr;
     int monfd;
-    virTimeBackOffVar timeout;
     int ret = -1;

-    *inProgress = false;
-
     if ((monfd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
         virReportSystemError(errno, "%s",
                              _("failed to create socket"));
@@ -217,39 +213,11 @@ qemuAgentOpenUnix(const char *monitor, pid_t cpid, bool *inProgress)
         goto error;
     }

-    if (virTimeBackOffStart(&timeout, 1, 3*1000 /* ms */) < 0)
-        goto error;
-    while (virTimeBackOffWait(&timeout)) {
-        ret = connect(monfd, (struct sockaddr *)&addr, sizeof(addr));
-
-        if (ret == 0)
-            break;
-
-        if ((errno == ENOENT || errno == ECONNREFUSED) &&
-            virProcessKill(cpid, 0) == 0) {
-            /* ENOENT : Socket may not have shown up yet
-             * ECONNREFUSED : Leftover socket hasn't been removed yet */
-            continue;
-        }
-
-        if ((errno == EINPROGRESS) ||
-            (errno == EAGAIN)) {
-            VIR_DEBUG("Connection attempt continuing in background");
-            *inProgress = true;
-            ret = 0;
-            break;
-        }
-
+    ret = connect(monfd, (struct sockaddr *)&addr, sizeof(addr));
+    if (ret < 0) {
         virReportSystemError(errno, "%s",
                              _("failed to connect to monitor socket"));
         goto error;
-
-    }
-
-    if (ret != 0) {
-        virReportSystemError(errno, "%s",
-                             _("monitor socket did not show up"));
-        goto error;
     }

     return monfd;
@@ -470,35 +438,6 @@ qemuAgentIOProcess(qemuAgentPtr mon)
 }


-static int
-qemuAgentIOConnect(qemuAgentPtr mon)
-{
-    int optval;
-    socklen_t optlen;
-
-    VIR_DEBUG("Checking on background connection status");
-
-    mon->connectPending = false;
-
-    optlen = sizeof(optval);
-
-    if (getsockopt(mon->fd, SOL_SOCKET, SO_ERROR,
-                   &optval, &optlen) < 0) {
-        virReportSystemError(errno, "%s",
-                             _("Cannot check socket connection status"));
-        return -1;
-    }
-
-    if (optval != 0) {
-        virReportSystemError(optval, "%s",
-                             _("Cannot connect to agent socket"));
-        return -1;
-    }
-
-    VIR_DEBUG("Agent is now connected");
-    return 0;
-}
-
 /*
  * Called when the monitor is able to write data
  * Call this function while holding the monitor lock.
@@ -630,13 +569,8 @@ qemuAgentIO(int watch, int fd, int events, void *opaque)
         error = true;
     } else {
         if (events & VIR_EVENT_HANDLE_WRITABLE) {
-            if (mon->connectPending) {
-                if (qemuAgentIOConnect(mon) < 0)
-                    error = true;
-            } else {
-                if (qemuAgentIOWrite(mon) < 0)
-                    error = true;
-            }
+            if (qemuAgentIOWrite(mon) < 0)
+                error = true;
             events &= ~VIR_EVENT_HANDLE_WRITABLE;
         }

@@ -768,8 +702,7 @@ qemuAgentOpen(virDomainObjPtr vm,

     switch (config->type) {
     case VIR_DOMAIN_CHR_TYPE_UNIX:
-        mon->fd = qemuAgentOpenUnix(config->data.nix.path, vm->pid,
-                                    &mon->connectPending);
+        mon->fd = qemuAgentOpenUnix(config->data.nix.path);
         break;

     case VIR_DOMAIN_CHR_TYPE_PTY:
@@ -790,10 +723,7 @@ qemuAgentOpen(virDomainObjPtr vm,
     if ((mon->watch = virEventAddHandle(mon->fd,
                                         VIR_EVENT_HANDLE_HANGUP |
                                         VIR_EVENT_HANDLE_ERROR |
-                                        VIR_EVENT_HANDLE_READABLE |
-                                        (mon->connectPending ?
-                                         VIR_EVENT_HANDLE_WRITABLE :
-                                         0),
+                                        VIR_EVENT_HANDLE_READABLE,
                                         qemuAgentIO,
                                         mon,
                                         virObjectFreeCallback)) < 0) {
--
2.17.0
participants (3):
 - Daniel P. Berrangé
 - Eric Blake
 - John Ferlan