[libvirt] [PATCH v2 0/4] qemu: use FD passing for chardev UNIX sockets

This series makes use of the chardev fd passing arriving in QEMU 2.12 to get rid of the startup race wrt opening the QEMU monitor. It is actually enabled in all chardev UNIX sockets for sake of having the same codepath everywhere, but is only important for the monitor socket. Daniel P. Berrangé (4): qemu: probe for -chardev 'fd' parameter for FD passing qemu: support passing pre-opened UNIX socket listen FD qemu: don't retry connect() if doing FD passing qemu: remove pointless connect retry logic in agent src/qemu/qemu_agent.c | 84 ++-------------------- src/qemu/qemu_capabilities.c | 4 +- src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 54 +++++++++++++- src/qemu/qemu_monitor.c | 54 ++++++++------ src/qemu/qemu_monitor.h | 1 + src/qemu/qemu_process.c | 27 +++++-- tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml | 1 + tests/qemumonitortestutils.c | 1 + 12 files changed, 121 insertions(+), 109 deletions(-) -- 2.14.3

QEMU >= 2.12 will support passing of pre-opened file descriptors for socket based character devices. Reviewed-by: John Ferlan <jferlan@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml | 1 + 6 files changed, 7 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 4fb586f1d9..59bae67945 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -468,6 +468,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "virtio-tablet-ccw", "qcow2-luks", "pcie-pci-bridge", + "chardev-fd-pass", ); @@ -2425,6 +2426,7 @@ static struct virQEMUCapsCommandLineProps virQEMUCapsCommandLine[] = { { "machine", "loadparm", QEMU_CAPS_LOADPARM }, { "vnc", "vnc", QEMU_CAPS_VNC_MULTI_SERVERS }, { "chardev", "reconnect", QEMU_CAPS_CHARDEV_RECONNECT }, + { "chardev", "fd", QEMU_CAPS_CHARDEV_FD_PASS }, }; static int diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index 2f15da0729..53de5dff9b 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -452,6 +452,7 @@ typedef enum { QEMU_CAPS_DEVICE_VIRTIO_TABLET_CCW, /* -device virtio-tablet-ccw */ QEMU_CAPS_QCOW2_LUKS, /* qcow2 format support LUKS encryption */ QEMU_CAPS_DEVICE_PCIE_PCI_BRIDGE, /* -device pcie-pci-bridge */ + QEMU_CAPS_CHARDEV_FD_PASS, /* Passing pre-opened FDs for chardevs */ QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml index 1eadfb6cc0..c733de7700 100644 --- a/tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml +++ b/tests/qemucapabilitiesdata/caps_2.12.0.aarch64.xml @@ -180,6 +180,7 @@ <flag name='dump-completed'/> <flag name='qcow2-luks'/> <flag name='pcie-pci-bridge'/> + <flag name='chardev-fd-pass'/> <version>2011090</version> <kvmVersion>0</kvmVersion> <microcodeVersion>342346</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml b/tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml index 34f5567146..c64ea44d4f 100644 --- a/tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml +++ b/tests/qemucapabilitiesdata/caps_2.12.0.ppc64.xml @@ -177,6 +177,7 @@ <flag name='machine.pseries.max-cpu-compat'/> <flag name='dump-completed'/> <flag name='qcow2-luks'/> + <flag name='chardev-fd-pass'/> <version>2011090</version> <kvmVersion>0</kvmVersion> <microcodeVersion>419215</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml b/tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml index bd675a946f..c5c48eb07c 100644 --- a/tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml +++ b/tests/qemucapabilitiesdata/caps_2.12.0.s390x.xml @@ -142,6 +142,7 @@ <flag name='virtio-mouse-ccw'/> <flag name='virtio-tablet-ccw'/> <flag name='qcow2-luks'/> + <flag name='chardev-fd-pass'/> <version>2011090</version> <kvmVersion>0</kvmVersion> <microcodeVersion>0</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml index 821ddca4f5..33867499ae 100644 --- a/tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_2.12.0.x86_64.xml @@ -218,6 +218,7 @@ <flag name='dump-completed'/> <flag name='qcow2-luks'/> <flag name='pcie-pci-bridge'/> + <flag name='chardev-fd-pass'/> <version>2011090</version> <kvmVersion>0</kvmVersion> <microcodeVersion>390060</microcodeVersion> -- 2.14.3

There is a race condition when spawning QEMU where libvirt has spawned QEMU but the monitor socket is not yet open. Libvirt has to repeatedly try to connect() to QEMU's monitor until eventually it succeeds, or times out. We use kill() to check if QEMU is still alive so we avoid waiting a long time if QEMU exited, but having a timeout at all is still unpleasant. With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP sockets. If libvirt has called bind() and listen() on this FD, then we have a guarantee that libvirt can immediately call connect() and succeed without any race. Although we only really care about this for the monitor socket and agent socket, this patch does FD passing for all UNIX socket based character devices since there appears to be no downside to it. We don't do FD passing for TCP sockets, however, because it is only possible to pass a single FD, while some hostnames may require listening on multiple FDs to cover IPv4 and IPv6 concurrently. Reviewed-by: John Ferlan <jferlan@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_command.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 52 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 0c109c63e7..9fc48eb829 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -5034,8 +5034,58 @@ qemuBuildChrChardevStr(virLogManagerPtr logManager, break; case VIR_DOMAIN_CHR_TYPE_UNIX: - virBufferAsprintf(&buf, "socket,id=%s,path=", charAlias); - virQEMUBuildBufferEscapeComma(&buf, dev->data.nix.path); + if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_CHARDEV_FD_PASS)) { + struct sockaddr_un addr; + socklen_t addrlen = sizeof(addr); + int fd; + + if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) { + virReportSystemError(errno, "%s", + _("Unable to create UNIX socket")); + goto cleanup; + } + + memset(&addr, 0, sizeof(addr)); + addr.sun_family = AF_UNIX; + if (virStrcpyStatic(addr.sun_path, dev->data.nix.path) == NULL) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("UNIX socket path '%s' too long"), + dev->data.nix.path); + VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + if (unlink(dev->data.nix.path) < 0 && errno != ENOENT) { + virReportSystemError(errno, + _("Unable to unlink %s"), + dev->data.nix.path); + VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + if (bind(fd, (struct sockaddr *)&addr, addrlen) < 0) { + virReportSystemError(errno, + _("Unable to bind to UNIX socket path '%s'"), + dev->data.nix.path); + VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + if (listen(fd, 1) < 0) { + virReportSystemError(errno, + _("Unable to listen to UNIX socket path '%s'"), + dev->data.nix.path); + VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + virBufferAsprintf(&buf, "socket,id=%s,fd=%d", charAlias, fd); + + virCommandPassFD(cmd, fd, VIR_COMMAND_PASS_FD_CLOSE_PARENT); + } else { + virBufferAsprintf(&buf, "socket,id=%s,path=", charAlias); + virQEMUBuildBufferEscapeComma(&buf, dev->data.nix.path); + } if (dev->data.nix.listen) virBufferAdd(&buf, nowait ? ",server,nowait" : ",server", -1); -- 2.14.3

On 04/18/2018 01:30 PM, Daniel P. Berrangé wrote:
There is a race condition when spawning QEMU where libvirt has spawned QEMU but the monitor socket is not yet open. Libvirt has to repeatedly try to connect() to QEMU's monitor until eventually it succeeds, or times out. We use kill() to check if QEMU is still alive so we avoid waiting a long time if QEMU exited, but having a timeout at all is still unpleasant.
With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP sockets. If libvirt has called bind() and listen() on this FD, then we have a guarantee that libvirt can immediately call connect() and succeed without any race.
Although we only really care about this for the monitor socket and agent socket, this patch does FD passing for all UNIX socket based character devices since there appears to be no downside to it.
We don't do FD passing for TCP sockets, however, because it is only possible to pass a single FD, while some hostnames may require listening on multiple FDs to cover IPv4 and IPv6 concurrently.
Reviewed-by: John Ferlan <jferlan@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_command.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 52 insertions(+), 2 deletions(-)
This one is now affected by Peter's recent xml2argv adjustment for: tests/qemuxml2argvdata/disk-drive-write-cache.x86_64-latest.args which now fails w/ : 253) QEMU XML-2-ARGV disk-drive-write-cache.x86_64-latest ... libvirt: QEMU Driver error : Unable to bind to UNIX socket path '/tmp/lib/domain--1-QEMUGuest1/monitor.sock': No such file or directory FAILED because the qemuProcessCreatePretendCmd doesn't allow this code to distinguish that it shouldn't attempt a bind/listen <sigh> I "think" this may require moving the socket setup code into qemuProcessLaunch or somehow passing a flag into BuildCommandLine to tell it not to do anything "real". John
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 0c109c63e7..9fc48eb829 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -5034,8 +5034,58 @@ qemuBuildChrChardevStr(virLogManagerPtr logManager, break;
case VIR_DOMAIN_CHR_TYPE_UNIX: - virBufferAsprintf(&buf, "socket,id=%s,path=", charAlias); - virQEMUBuildBufferEscapeComma(&buf, dev->data.nix.path); + if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_CHARDEV_FD_PASS)) { + struct sockaddr_un addr; + socklen_t addrlen = sizeof(addr); + int fd; + + if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) { + virReportSystemError(errno, "%s", + _("Unable to create UNIX socket")); + goto cleanup; + } + + memset(&addr, 0, sizeof(addr)); + addr.sun_family = AF_UNIX; + if (virStrcpyStatic(addr.sun_path, dev->data.nix.path) == NULL) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("UNIX socket path '%s' too long"), + dev->data.nix.path); + VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + if (unlink(dev->data.nix.path) < 0 && errno != ENOENT) { + virReportSystemError(errno, + _("Unable to unlink %s"), + dev->data.nix.path); + VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + if (bind(fd, (struct sockaddr *)&addr, addrlen) < 0) { + virReportSystemError(errno, + _("Unable to bind to UNIX socket path '%s'"), + dev->data.nix.path); + VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + if (listen(fd, 1) < 0) { + virReportSystemError(errno, + _("Unable to listen to UNIX socket path '%s'"), + dev->data.nix.path); + VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + virBufferAsprintf(&buf, "socket,id=%s,fd=%d", charAlias, fd); + + virCommandPassFD(cmd, fd, VIR_COMMAND_PASS_FD_CLOSE_PARENT); + } else { + virBufferAsprintf(&buf, "socket,id=%s,path=", charAlias); + virQEMUBuildBufferEscapeComma(&buf, dev->data.nix.path); + } if (dev->data.nix.listen) virBufferAdd(&buf, nowait ? ",server,nowait" : ",server", -1);

On Fri, Apr 20, 2018 at 10:18:18 -0400, John Ferlan wrote:
On 04/18/2018 01:30 PM, Daniel P. Berrangé wrote:
There is a race condition when spawning QEMU where libvirt has spawned QEMU but the monitor socket is not yet open. Libvirt has to repeatedly try to connect() to QEMU's monitor until eventually it succeeds, or times out. We use kill() to check if QEMU is still alive so we avoid waiting a long time if QEMU exited, but having a timeout at all is still unpleasant.
With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP sockets. If libvirt has called bind() and listen() on this FD, then we have a guarantee that libvirt can immediately call connect() and succeed without any race.
Although we only really care about this for the monitor socket and agent socket, this patch does FD passing for all UNIX socket based character devices since there appears to be no downside to it.
We don't do FD passing for TCP sockets, however, because it is only possible to pass a single FD, while some hostnames may require listening on multiple FDs to cover IPv4 and IPv6 concurrently.
Reviewed-by: John Ferlan <jferlan@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_command.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 52 insertions(+), 2 deletions(-)
This one is now affected by Peter's recent xml2argv adjustment for:
tests/qemuxml2argvdata/disk-drive-write-cache.x86_64-latest.args
which now fails w/ :
253) QEMU XML-2-ARGV disk-drive-write-cache.x86_64-latest ... libvirt: QEMU Driver error : Unable to bind to UNIX socket path '/tmp/lib/domain--1-QEMUGuest1/monitor.sock': No such file or directory FAILED
I'm kind of glad I've added this test :)
because the qemuProcessCreatePretendCmd doesn't allow this code to distinguish that it shouldn't attempt a bind/listen <sigh>
I "think" this may require moving the socket setup code into qemuProcessLaunch or somehow passing a flag into BuildCommandLine to tell it not to do anything "real".
Yeah. The command line generator should be inert to the host, so this needs to be moved to the preparation steps.

Since libvirt called bind() and listen() on the UNIX socket, it is guaranteed that connect() will immediately succeed, if QEMU is running normally. It will only fail if QEMU has closed the monitor socket by mistake or if QEMU has exited, letting the kernel close it. With this in mind we can remove the retry loop and timeout when connecting to the QEMU monitor if we are doing FD passing. Libvirt can go straight to sending the QMP greeting and will simply block waiting for a reply until QEMU is ready. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_capabilities.c | 2 +- src/qemu/qemu_monitor.c | 54 ++++++++++++++++++++++++++------------------ src/qemu/qemu_monitor.h | 1 + src/qemu/qemu_process.c | 27 ++++++++++++++++------ tests/qemumonitortestutils.c | 1 + 5 files changed, 55 insertions(+), 30 deletions(-) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 59bae67945..f27d0544ff 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -4128,7 +4128,7 @@ virQEMUCapsInitQMPCommandRun(virQEMUCapsInitQMPCommandPtr cmd, cmd->vm->pid = cmd->pid; - if (!(cmd->mon = qemuMonitorOpen(cmd->vm, &cmd->config, true, + if (!(cmd->mon = qemuMonitorOpen(cmd->vm, &cmd->config, true, true, 0, &callbacks, NULL))) goto ignore; diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 22f05222db..130490bde8 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -349,6 +349,7 @@ qemuMonitorDispose(void *obj) static int qemuMonitorOpenUnix(const char *monitor, pid_t cpid, + bool retry, unsigned long long timeout) { struct sockaddr_un addr; @@ -370,31 +371,39 @@ qemuMonitorOpenUnix(const char *monitor, goto error; } - if (virTimeBackOffStart(&timebackoff, 1, timeout * 1000) < 0) - goto error; - while (virTimeBackOffWait(&timebackoff)) { - ret = connect(monfd, (struct sockaddr *) &addr, sizeof(addr)); - - if (ret == 0) - break; + if (retry) { + if (virTimeBackOffStart(&timebackoff, 1, timeout * 1000) < 0) + goto error; + while (virTimeBackOffWait(&timebackoff)) { + ret = connect(monfd, (struct sockaddr *) &addr, sizeof(addr)); - if ((errno == ENOENT || errno == ECONNREFUSED) && - (!cpid || virProcessKill(cpid, 0) == 0)) { - /* ENOENT : Socket may not have shown up yet - * ECONNREFUSED : Leftover socket hasn't been removed yet */ - continue; - } + if (ret == 0) + break; - virReportSystemError(errno, "%s", - _("failed to connect to monitor socket")); - goto error; + if ((errno == ENOENT || errno == ECONNREFUSED) && + (!cpid || virProcessKill(cpid, 0) == 0)) { + /* ENOENT : Socket may not have shown up yet + * ECONNREFUSED : Leftover socket hasn't been removed yet */ + continue; + } - } + virReportSystemError(errno, "%s", + _("failed to connect to monitor socket")); + goto error; + } - if (ret != 0) { - virReportSystemError(errno, "%s", - _("monitor socket did not show up")); - goto error; + if (ret != 0) { + virReportSystemError(errno, "%s", + _("monitor socket did not show up")); + goto error; + } + } else { + ret = connect(monfd, (struct sockaddr *) &addr, sizeof(addr)); + if (ret < 0) { + virReportSystemError(errno, "%s", + _("failed to connect to monitor socket")); + goto error; + } } return monfd; @@ -914,6 +923,7 @@ qemuMonitorPtr qemuMonitorOpen(virDomainObjPtr vm, virDomainChrSourceDefPtr config, bool json, + bool retry, unsigned long long timeout, qemuMonitorCallbacksPtr cb, void *opaque) @@ -928,7 +938,7 @@ qemuMonitorOpen(virDomainObjPtr vm, case VIR_DOMAIN_CHR_TYPE_UNIX: hasSendFD = true; if ((fd = qemuMonitorOpenUnix(config->data.nix.path, - vm->pid, timeout)) < 0) + vm->pid, retry, timeout)) < 0) return NULL; break; diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index 9556a51341..c539c41b1d 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -320,6 +320,7 @@ char *qemuMonitorUnescapeArg(const char *in); qemuMonitorPtr qemuMonitorOpen(virDomainObjPtr vm, virDomainChrSourceDefPtr config, bool json, + bool retry, unsigned long long timeout, qemuMonitorCallbacksPtr cb, void *opaque) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index d0adc8aa3b..3c8718676d 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1771,7 +1771,7 @@ qemuProcessInitMonitor(virQEMUDriverPtr driver, static int qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob, - qemuDomainLogContextPtr logCtxt) + bool retry, qemuDomainLogContextPtr logCtxt) { qemuDomainObjPrivatePtr priv = vm->privateData; qemuMonitorPtr mon = NULL; @@ -1802,6 +1802,7 @@ qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob, mon = qemuMonitorOpen(vm, monConfig, priv->monJSON, + retry, timeout, &monitorCallbacks, driver); @@ -2179,17 +2180,23 @@ qemuProcessWaitForMonitor(virQEMUDriverPtr driver, { int ret = -1; virHashTablePtr info = NULL; - qemuDomainObjPrivatePtr priv; + qemuDomainObjPrivatePtr priv = vm->privateData; + bool retry = true; + + if (priv->qemuCaps && + virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_CHARDEV_FD_PASS)) + retry = false; - VIR_DEBUG("Connect monitor to %p '%s'", vm, vm->def->name); - if (qemuConnectMonitor(driver, vm, asyncJob, logCtxt) < 0) + VIR_DEBUG("Connect monitor to vm=%p name='%s' retry=%d", + vm, vm->def->name, retry); + + if (qemuConnectMonitor(driver, vm, asyncJob, retry, logCtxt) < 0) goto cleanup; /* Try to get the pty path mappings again via the monitor. This is much more * reliable if it's available. * Note that the monitor itself can be on a pty, so we still need to try the * log output method. */ - priv = vm->privateData; if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) goto cleanup; ret = qemuMonitorGetChardevInfo(priv->mon, &info); @@ -7263,6 +7270,7 @@ qemuProcessReconnect(void *opaque) unsigned int stopFlags = 0; bool jobStarted = false; virCapsPtr caps = NULL; + bool retry = true; VIR_FREE(data); @@ -7293,10 +7301,15 @@ qemuProcessReconnect(void *opaque) * allowReboot in status XML and we need to initialize it. */ qemuProcessPrepareAllowReboot(obj); - VIR_DEBUG("Reconnect monitor to %p '%s'", obj, obj->def->name); + if (priv->qemuCaps && + virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_CHARDEV_FD_PASS)) + retry = false; + + VIR_DEBUG("Reconnect monitor to def=%p name='%s' retry=%d", + obj, obj->def->name, retry); /* XXX check PID liveliness & EXE path */ - if (qemuConnectMonitor(driver, obj, QEMU_ASYNC_JOB_NONE, NULL) < 0) + if (qemuConnectMonitor(driver, obj, QEMU_ASYNC_JOB_NONE, retry, NULL) < 0) goto error; if (qemuHostdevUpdateActiveDomainDevices(driver, obj->def) < 0) diff --git a/tests/qemumonitortestutils.c b/tests/qemumonitortestutils.c index 66bccac9a4..9808e972c8 100644 --- a/tests/qemumonitortestutils.c +++ b/tests/qemumonitortestutils.c @@ -1252,6 +1252,7 @@ qemuMonitorTestNew(bool json, if (!(test->mon = qemuMonitorOpen(test->vm, &src, json, + true, 0, &qemuMonitorTestCallbacks, driver))) -- 2.14.3

On 04/18/2018 01:30 PM, Daniel P. Berrangé wrote:
Since libvirt called bind() and listen() on the UNIX socket, it is guaranteed that connect() will immediately succeed, if QEMU is running normally. It will only fail if QEMU has closed the monitor socket by mistake or if QEMU has exited, letting the kernel close it.
With this in mind we can remove the retry loop and timeout when connecting to the QEMU monitor if we are doing FD passing. Libvirt can go straight to sending the QMP greeting and will simply block waiting for a reply until QEMU is ready.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_capabilities.c | 2 +- src/qemu/qemu_monitor.c | 54 ++++++++++++++++++++++++++------------------ src/qemu/qemu_monitor.h | 1 + src/qemu/qemu_process.c | 27 ++++++++++++++++------ tests/qemumonitortestutils.c | 1 + 5 files changed, 55 insertions(+), 30 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John

When the agent code was first introduced back in commit c160ce3316852a797d7b06b4ee101233866e69a9 Author: Daniel P. Berrange <berrange@redhat.com> Date: Wed Oct 5 18:31:54 2011 +0100 QEMU guest agent support there was code that would loop and retry the connection when opening the agent socket. At this time, the only thing done in between the opening of the monitor socket & opening of the agent socket was a call to set the monitor capabilities. This was a no-op on non-QMP versions, so in theory there could be a race which let us connect to the monitor while the agent socket was still not created by QEMU. In the modern world, however, we long ago mandated the use of QMP for managing QEMU, so we're guaranteed to have a set capabilities QMP call. Once we've seen a reply to this, we're guaranteed that QEMU has fully initialized all backends and is in its event loop. We can thus be sure the QEMU agent socket is present and don't need to retry connections to it, even without having the chardev FD passing feature. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_agent.c | 84 +++++---------------------------------------------- 1 file changed, 7 insertions(+), 77 deletions(-) diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c index 85af53d194..a68c1e3897 100644 --- a/src/qemu/qemu_agent.c +++ b/src/qemu/qemu_agent.c @@ -106,7 +106,6 @@ struct _qemuAgent { int fd; int watch; - bool connectPending; bool running; virDomainObjPtr vm; @@ -183,15 +182,12 @@ static void qemuAgentDispose(void *obj) } static int -qemuAgentOpenUnix(const char *monitor, pid_t cpid, bool *inProgress) +qemuAgentOpenUnix(const char *monitor) { struct sockaddr_un addr; int monfd; - virTimeBackOffVar timeout; int ret = -1; - *inProgress = false; - if ((monfd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) { virReportSystemError(errno, "%s", _("failed to create socket")); @@ -220,39 +216,11 @@ qemuAgentOpenUnix(const char *monitor, pid_t cpid, bool *inProgress) goto error; } - if (virTimeBackOffStart(&timeout, 1, 3*1000 /* ms */) < 0) - goto error; - while (virTimeBackOffWait(&timeout)) { - ret = connect(monfd, (struct sockaddr *) &addr, sizeof(addr)); - - if (ret == 0) - break; - - if ((errno == ENOENT || errno == ECONNREFUSED) && - virProcessKill(cpid, 0) == 0) { - /* ENOENT : Socket may not have shown up yet - * ECONNREFUSED : Leftover socket hasn't been removed yet */ - continue; - } - - if ((errno == EINPROGRESS) || - (errno == EAGAIN)) { - VIR_DEBUG("Connection attempt continuing in background"); - *inProgress = true; - ret = 0; - break; - } - + ret = connect(monfd, (struct sockaddr *) &addr, sizeof(addr)); + if (ret < 0) { virReportSystemError(errno, "%s", _("failed to connect to monitor socket")); goto error; - - } - - if (ret != 0) { - virReportSystemError(errno, "%s", - _("monitor socket did not show up")); - goto error; } return monfd; @@ -473,35 +441,6 @@ qemuAgentIOProcess(qemuAgentPtr mon) } -static int -qemuAgentIOConnect(qemuAgentPtr mon) -{ - int optval; - socklen_t optlen; - - VIR_DEBUG("Checking on background connection status"); - - mon->connectPending = false; - - optlen = sizeof(optval); - - if (getsockopt(mon->fd, SOL_SOCKET, SO_ERROR, - &optval, &optlen) < 0) { - virReportSystemError(errno, "%s", - _("Cannot check socket connection status")); - return -1; - } - - if (optval != 0) { - virReportSystemError(optval, "%s", - _("Cannot connect to agent socket")); - return -1; - } - - VIR_DEBUG("Agent is now connected"); - return 0; -} - /* * Called when the monitor is able to write data * Call this function while holding the monitor lock. @@ -633,13 +572,8 @@ qemuAgentIO(int watch, int fd, int events, void *opaque) error = true; } else { if (events & VIR_EVENT_HANDLE_WRITABLE) { - if (mon->connectPending) { - if (qemuAgentIOConnect(mon) < 0) - error = true; - } else { - if (qemuAgentIOWrite(mon) < 0) - error = true; - } + if (qemuAgentIOWrite(mon) < 0) + error = true; events &= ~VIR_EVENT_HANDLE_WRITABLE; } @@ -771,8 +705,7 @@ qemuAgentOpen(virDomainObjPtr vm, switch (config->type) { case VIR_DOMAIN_CHR_TYPE_UNIX: - mon->fd = qemuAgentOpenUnix(config->data.nix.path, vm->pid, - &mon->connectPending); + mon->fd = qemuAgentOpenUnix(config->data.nix.path); break; case VIR_DOMAIN_CHR_TYPE_PTY: @@ -793,10 +726,7 @@ qemuAgentOpen(virDomainObjPtr vm, if ((mon->watch = virEventAddHandle(mon->fd, VIR_EVENT_HANDLE_HANGUP | VIR_EVENT_HANDLE_ERROR | - VIR_EVENT_HANDLE_READABLE | - (mon->connectPending ? - VIR_EVENT_HANDLE_WRITABLE : - 0), + VIR_EVENT_HANDLE_READABLE, qemuAgentIO, mon, virObjectFreeCallback)) < 0) { -- 2.14.3

On 04/18/2018 01:30 PM, Daniel P. Berrangé wrote:
When the agent code was first introduced back in
commit c160ce3316852a797d7b06b4ee101233866e69a9 Author: Daniel P. Berrange <berrange@redhat.com> Date: Wed Oct 5 18:31:54 2011 +0100
QEMU guest agent support
there was code that would loop and retry the connection when opening the agent socket. At this time, the only thing done in between the opening of the monitor socket & opening of the agent socket was a call to set the monitor capabilities. This was a no-op on non-QMP versions, so in theory there could be a race which let us connect to the monitor while the agent socket was still not created by QEMU.
In the modern world, however, we long ago mandated the use of QMP for managing QEMU, so we're guaranteed to have a set capabilities QMP call. Once we've seen a reply to this, we're guaranteed that QEMU has fully initialized all backends and is in its event loop.
We can thus be sure the QEMU agent socket is present and don't need to retry connections to it, even without having the chardev FD passing feature.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_agent.c | 84 +++++---------------------------------------------- 1 file changed, 7 insertions(+), 77 deletions(-)
Reviewed-by: John Ferlan <jferlan@redhat.com> John
participants (3)
-
Daniel P. Berrangé
-
John Ferlan
-
Peter Krempa