[libvirt] [PATCH 0/3] qemu: use FD passing for monitor (and other unix sockets)

This series makes use of the chardev fd passing arriving in QEMU 2.12 to get rid of the startup race wrt opening the QEMU monitor. The same thing benefits the agent too but I've not tackled that yet. Daniel P. Berrangé (3): qemu: probe for -chardev 'fd' parameter for FD passing qemu: support passing pre-opened UNIX socket listen FD qemu: don't retry connect() if doing FD passing src/qemu/qemu_capabilities.c | 4 ++- src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 58 ++++++++++++++++++++++++++++++++++++++++++-- src/qemu/qemu_monitor.c | 54 ++++++++++++++++++++++++----------------- src/qemu/qemu_monitor.h | 1 + src/qemu/qemu_process.c | 27 +++++++++++++++------ tests/qemumonitortestutils.c | 1 + 7 files changed, 114 insertions(+), 32 deletions(-) -- 2.14.3

QEMU >= 2.12 will support passing of pre-opened file descriptors for socket based character devices. XXXX should we include 2.12 pre-release git snapshot of capaibilities ? Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + 2 files changed, 3 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 3eb5ed6d1a..c20dc32126 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -459,6 +459,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "pl011", "machine.pseries.max-cpu-compat", "dump-completed", + "chardev-fd-pass", ); @@ -3196,6 +3197,7 @@ static struct virQEMUCapsCommandLineProps virQEMUCapsCommandLine[] = { { "machine", "loadparm", QEMU_CAPS_LOADPARM }, { "vnc", "vnc", QEMU_CAPS_VNC_MULTI_SERVERS }, { "chardev", "reconnect", QEMU_CAPS_CHARDEV_RECONNECT }, + { "chardev", "fd", QEMU_CAPS_CHARDEV_FD_PASS }, }; static int diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index c2ec2be193..aedf3c2d5e 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -444,6 +444,7 @@ typedef enum { QEMU_CAPS_DEVICE_PL011, /* -device pl011 (not user-instantiable) */ QEMU_CAPS_MACHINE_PSERIES_MAX_CPU_COMPAT, /* -machine pseries,max-cpu-compat= */ QEMU_CAPS_DUMP_COMPLETED, /* DUMP_COMPLETED event */ + QEMU_CAPS_CHARDEV_FD_PASS, /* Passing pre-opened FDs for chardevs */ QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags; -- 2.14.3

On 03/14/2018 01:33 PM, Daniel P. Berrangé wrote:
QEMU >= 2.12 will support passing of pre-opened file descriptors for socket based character devices.
XXXX should we include 2.12 pre-release git snapshot of capaibilities ?
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + 2 files changed, 3 insertions(+)
Going to need some merging and the above commit message will need some tweaking and of course the qemucapabilitiestest run to update the 2.12 caps files to add the flag... With those adjustments, Reviewed-by: John Ferlan <jferlan@redhat.com> John
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 3eb5ed6d1a..c20dc32126 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -459,6 +459,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "pl011", "machine.pseries.max-cpu-compat", "dump-completed", + "chardev-fd-pass", );
@@ -3196,6 +3197,7 @@ static struct virQEMUCapsCommandLineProps virQEMUCapsCommandLine[] = { { "machine", "loadparm", QEMU_CAPS_LOADPARM }, { "vnc", "vnc", QEMU_CAPS_VNC_MULTI_SERVERS }, { "chardev", "reconnect", QEMU_CAPS_CHARDEV_RECONNECT }, + { "chardev", "fd", QEMU_CAPS_CHARDEV_FD_PASS }, };
static int diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index c2ec2be193..aedf3c2d5e 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -444,6 +444,7 @@ typedef enum { QEMU_CAPS_DEVICE_PL011, /* -device pl011 (not user-instantiable) */ QEMU_CAPS_MACHINE_PSERIES_MAX_CPU_COMPAT, /* -machine pseries,max-cpu-compat= */ QEMU_CAPS_DUMP_COMPLETED, /* DUMP_COMPLETED event */ + QEMU_CAPS_CHARDEV_FD_PASS, /* Passing pre-opened FDs for chardevs */
QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags;

There is a race condition when spawning QEMU where libvirt has spawned QEMU but the monitor socket is not yet open. Libvirt has to repeatedly try to connect() to QEMU's monitor until eventually it succeeds, or times out. We use kill() to check if QEMU is still alive so we avoid waiting a long time if QEMU exited, but having a timeout at all is still unpleasant. With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP sockets. If libvirt has called bind() and listen() on this FD, then we have a guarantee that libvirt can immediately call connect() and succeed without any race. Although we only really care about this for the monitor socket and agent socket, this patch does FD passing for all UNIX socket based character devices since there appears to be no downside to it. We don't do FD passing for TCP sockets, however, because it is only possible to pass a single FD, while some hostnames may require listening on multiple FDs to cover IPv4 and IPv6 concurrently. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_command.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 56 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index fa0aa5d5c3..291aad13cf 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -5010,8 +5010,62 @@ qemuBuildChrChardevStr(virLogManagerPtr logManager, break; case VIR_DOMAIN_CHR_TYPE_UNIX: - virBufferAsprintf(&buf, "socket,id=%s,path=", charAlias); - virQEMUBuildBufferEscapeComma(&buf, dev->data.nix.path); +#ifndef WIN32 + if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_CHARDEV_FD_PASS)) { + struct sockaddr_un addr; + socklen_t addrlen = sizeof(addr); + int fd; + + if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) { + virReportSystemError(errno, "%s", + _("Unable to create UNIX socket")); + goto cleanup; + } + + memset(&addr, 0, sizeof(addr)); + addr.sun_family = AF_UNIX; + if (virStrcpyStatic(addr.sun_path, dev->data.nix.path) == NULL) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Monitor path %s too big for destination"), + dev->data.nix.path); + VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + if (unlink(dev->data.nix.path) < 0 && errno != ENOENT) { + virReportSystemError(errno, + _("Unable to unlink %s"), + dev->data.nix.path); + VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + if (bind(fd, (struct sockaddr *)&addr, addrlen) < 0) { + virReportSystemError(errno, + _("Unable to bind to monitor %s"), + dev->data.nix.path); + VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + if (listen(fd, 1) < 0) { + virReportSystemError(errno, + _("Unable to listen to monitor %s"), + dev->data.nix.path); + VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + virBufferAsprintf(&buf, "socket,id=%s,fd=%d", charAlias, fd); + + virCommandPassFD(cmd, fd, VIR_COMMAND_PASS_FD_CLOSE_PARENT); + } else { +#endif /* WIN32 */ + virBufferAsprintf(&buf, "socket,id=%s,path=", charAlias); + virQEMUBuildBufferEscapeComma(&buf, dev->data.nix.path); +#ifndef WIN32 + } +#endif /* WIN32 */ if (dev->data.nix.listen) virBufferAdd(&buf, nowait ? ",server,nowait" : ",server", -1); -- 2.14.3

On 03/14/2018 01:33 PM, Daniel P. Berrangé wrote:
There is a race condition when spawning QEMU where libvirt has spawned QEMU but the monitor socket is not yet open. Libvirt has to repeatedly try to connect() to QEMU's monitor until eventually it succeeds, or times out. We use kill() to check if QEMU is still alive so we avoid waiting a long time if QEMU exited, but having a timeout at all is still unpleasant.
With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP sockets. If libvirt has called bind() and listen() on this FD, then we have a guarantee that libvirt can immediately call connect() and succeed without any race.
Although we only really care about this for the monitor socket and agent socket, this patch does FD passing for all UNIX socket based character devices since there appears to be no downside to it.
We don't do FD passing for TCP sockets, however, because it is only possible to pass a single FD, while some hostnames may require listening on multiple FDs to cover IPv4 and IPv6 concurrently.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_command.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 56 insertions(+), 2 deletions(-)
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index fa0aa5d5c3..291aad13cf 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -5010,8 +5010,62 @@ qemuBuildChrChardevStr(virLogManagerPtr logManager, break;
case VIR_DOMAIN_CHR_TYPE_UNIX: - virBufferAsprintf(&buf, "socket,id=%s,path=", charAlias); - virQEMUBuildBufferEscapeComma(&buf, dev->data.nix.path); +#ifndef WIN32 + if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_CHARDEV_FD_PASS)) { + struct sockaddr_un addr; + socklen_t addrlen = sizeof(addr); + int fd; + + if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) { + virReportSystemError(errno, "%s", + _("Unable to create UNIX socket")); + goto cleanup; + } + + memset(&addr, 0, sizeof(addr)); + addr.sun_family = AF_UNIX; + if (virStrcpyStatic(addr.sun_path, dev->data.nix.path) == NULL) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Monitor path %s too big for destination"), + dev->data.nix.path);
The commit message implies more general than monitor path...
+ VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + if (unlink(dev->data.nix.path) < 0 && errno != ENOENT) { + virReportSystemError(errno, + _("Unable to unlink %s"), + dev->data.nix.path); + VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + if (bind(fd, (struct sockaddr *)&addr, addrlen) < 0) { + virReportSystemError(errno, + _("Unable to bind to monitor %s"), + dev->data.nix.path);
Same here...
+ VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + if (listen(fd, 1) < 0) { + virReportSystemError(errno, + _("Unable to listen to monitor %s"), + dev->data.nix.path);
Again... W/ minor adjustments, Reviewed-by: John Ferlan <jferlan@redhat.com> John
+ VIR_FORCE_CLOSE(fd); + goto cleanup; + } + + virBufferAsprintf(&buf, "socket,id=%s,fd=%d", charAlias, fd); + + virCommandPassFD(cmd, fd, VIR_COMMAND_PASS_FD_CLOSE_PARENT); + } else { +#endif /* WIN32 */ + virBufferAsprintf(&buf, "socket,id=%s,path=", charAlias); + virQEMUBuildBufferEscapeComma(&buf, dev->data.nix.path); +#ifndef WIN32 + } +#endif /* WIN32 */ if (dev->data.nix.listen) virBufferAdd(&buf, nowait ? ",server,nowait" : ",server", -1);

On Wed, Mar 14, 2018 at 05:33:05PM +0000, Daniel P. Berrangé wrote:
There is a race condition when spawning QEMU where libvirt has spawned QEMU but the monitor socket is not yet open. Libvirt has to repeatedly try to connect() to QEMU's monitor until eventually it succeeds, or times out. We use kill() to check if QEMU is still alive so we avoid waiting a long time if QEMU exited, but having a timeout at all is still unpleasant.
With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP sockets. If libvirt has called bind() and listen() on this FD, then we have a guarantee that libvirt can immediately call connect() and succeed without any race.
Although we only really care about this for the monitor socket and agent socket, this patch does FD passing for all UNIX socket based character devices since there appears to be no downside to it.
We don't do FD passing for TCP sockets, however, because it is only possible to pass a single FD, while some hostnames may require listening on multiple FDs to cover IPv4 and IPv6 concurrently.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_command.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 56 insertions(+), 2 deletions(-)
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index fa0aa5d5c3..291aad13cf 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -5010,8 +5010,62 @@ qemuBuildChrChardevStr(virLogManagerPtr logManager, break;
case VIR_DOMAIN_CHR_TYPE_UNIX: - virBufferAsprintf(&buf, "socket,id=%s,path=", charAlias); - virQEMUBuildBufferEscapeComma(&buf, dev->data.nix.path); +#ifndef WIN32
Why the ifdef? Do we even build QEMU driver on WIN32? Jan

On Mon, Mar 26, 2018 at 03:19:36PM +0200, Ján Tomko wrote:
On Wed, Mar 14, 2018 at 05:33:05PM +0000, Daniel P. Berrangé wrote:
There is a race condition when spawning QEMU where libvirt has spawned QEMU but the monitor socket is not yet open. Libvirt has to repeatedly try to connect() to QEMU's monitor until eventually it succeeds, or times out. We use kill() to check if QEMU is still alive so we avoid waiting a long time if QEMU exited, but having a timeout at all is still unpleasant.
With QEMU 2.12 we can pass in a pre-opened FD for UNIX domain or TCP sockets. If libvirt has called bind() and listen() on this FD, then we have a guarantee that libvirt can immediately call connect() and succeed without any race.
Although we only really care about this for the monitor socket and agent socket, this patch does FD passing for all UNIX socket based character devices since there appears to be no downside to it.
We don't do FD passing for TCP sockets, however, because it is only possible to pass a single FD, while some hostnames may require listening on multiple FDs to cover IPv4 and IPv6 concurrently.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_command.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 56 insertions(+), 2 deletions(-)
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index fa0aa5d5c3..291aad13cf 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -5010,8 +5010,62 @@ qemuBuildChrChardevStr(virLogManagerPtr logManager, break;
case VIR_DOMAIN_CHR_TYPE_UNIX: - virBufferAsprintf(&buf, "socket,id=%s,path=", charAlias); - virQEMUBuildBufferEscapeComma(&buf, dev->data.nix.path); +#ifndef WIN32
Why the ifdef?
Do we even build QEMU driver on WIN32?
Good point - this was just habit for any UNIX sockets code. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Since libvirt called bind() and listen() on the UNIX socket, it is guaranteed that connect() will immediately succeed, if QEMU is running normally. It will only fail if QEMU has closed the monitor socket by mistake or if QEMU has exited, letting the kernel close it. With this in mind we can remove the retry loop and timeout when connecting to the QEMU monitor if we are doing FD passing. Libvirt can go straight to sending the QMP greeting and will simply block waiting for a reply until QEMU is ready. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_capabilities.c | 2 +- src/qemu/qemu_monitor.c | 54 ++++++++++++++++++++++++++------------------ src/qemu/qemu_monitor.h | 1 + src/qemu/qemu_process.c | 27 ++++++++++++++++------ tests/qemumonitortestutils.c | 1 + 5 files changed, 55 insertions(+), 30 deletions(-) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index c20dc32126..dd392c9c22 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -5123,7 +5123,7 @@ virQEMUCapsInitQMPCommandRun(virQEMUCapsInitQMPCommandPtr cmd, cmd->vm->pid = cmd->pid; - if (!(cmd->mon = qemuMonitorOpen(cmd->vm, &cmd->config, true, + if (!(cmd->mon = qemuMonitorOpen(cmd->vm, &cmd->config, true, true, 0, &callbacks, NULL))) goto ignore; diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 1d67a97789..bf1a305656 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -349,6 +349,7 @@ qemuMonitorDispose(void *obj) static int qemuMonitorOpenUnix(const char *monitor, pid_t cpid, + bool retry, unsigned long long timeout) { struct sockaddr_un addr; @@ -370,31 +371,39 @@ qemuMonitorOpenUnix(const char *monitor, goto error; } - if (virTimeBackOffStart(&timebackoff, 1, timeout * 1000) < 0) - goto error; - while (virTimeBackOffWait(&timebackoff)) { - ret = connect(monfd, (struct sockaddr *) &addr, sizeof(addr)); - - if (ret == 0) - break; + if (retry) { + if (virTimeBackOffStart(&timebackoff, 1, timeout * 1000) < 0) + goto error; + while (virTimeBackOffWait(&timebackoff)) { + ret = connect(monfd, (struct sockaddr *) &addr, sizeof(addr)); - if ((errno == ENOENT || errno == ECONNREFUSED) && - (!cpid || virProcessKill(cpid, 0) == 0)) { - /* ENOENT : Socket may not have shown up yet - * ECONNREFUSED : Leftover socket hasn't been removed yet */ - continue; - } + if (ret == 0) + break; - virReportSystemError(errno, "%s", - _("failed to connect to monitor socket")); - goto error; + if ((errno == ENOENT || errno == ECONNREFUSED) && + (!cpid || virProcessKill(cpid, 0) == 0)) { + /* ENOENT : Socket may not have shown up yet + * ECONNREFUSED : Leftover socket hasn't been removed yet */ + continue; + } - } + virReportSystemError(errno, "%s", + _("failed to connect to monitor socket")); + goto error; + } - if (ret != 0) { - virReportSystemError(errno, "%s", - _("monitor socket did not show up")); - goto error; + if (ret != 0) { + virReportSystemError(errno, "%s", + _("monitor socket did not show up")); + goto error; + } + } else { + ret = connect(monfd, (struct sockaddr *) &addr, sizeof(addr)); + if (ret < 0) { + virReportSystemError(errno, "%s", + _("failed to connect to monitor socket")); + goto error; + } } return monfd; @@ -914,6 +923,7 @@ qemuMonitorPtr qemuMonitorOpen(virDomainObjPtr vm, virDomainChrSourceDefPtr config, bool json, + bool retry, unsigned long long timeout, qemuMonitorCallbacksPtr cb, void *opaque) @@ -928,7 +938,7 @@ qemuMonitorOpen(virDomainObjPtr vm, case VIR_DOMAIN_CHR_TYPE_UNIX: hasSendFD = true; if ((fd = qemuMonitorOpenUnix(config->data.nix.path, - vm->pid, timeout)) < 0) + vm->pid, retry, timeout)) < 0) return NULL; break; diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index adfa87aba9..c1483c52c4 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -320,6 +320,7 @@ char *qemuMonitorUnescapeArg(const char *in); qemuMonitorPtr qemuMonitorOpen(virDomainObjPtr vm, virDomainChrSourceDefPtr config, bool json, + bool retry, unsigned long long timeout, qemuMonitorCallbacksPtr cb, void *opaque) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 57c06c7c15..9950461810 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1771,7 +1771,7 @@ qemuProcessInitMonitor(virQEMUDriverPtr driver, static int qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob, - qemuDomainLogContextPtr logCtxt) + bool retry, qemuDomainLogContextPtr logCtxt) { qemuDomainObjPrivatePtr priv = vm->privateData; qemuMonitorPtr mon = NULL; @@ -1799,6 +1799,7 @@ qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob, mon = qemuMonitorOpen(vm, priv->monConfig, priv->monJSON, + retry, timeout, &monitorCallbacks, driver); @@ -2174,17 +2175,23 @@ qemuProcessWaitForMonitor(virQEMUDriverPtr driver, { int ret = -1; virHashTablePtr info = NULL; - qemuDomainObjPrivatePtr priv; + qemuDomainObjPrivatePtr priv = vm->privateData; + bool retry = true; + + if (priv->qemuCaps && + virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_CHARDEV_FD_PASS)) + retry = false; - VIR_DEBUG("Connect monitor to %p '%s'", vm, vm->def->name); - if (qemuConnectMonitor(driver, vm, asyncJob, logCtxt) < 0) + VIR_DEBUG("Connect monitor to vm=%p name='%s' retry=%d", + vm, vm->def->name, retry); + + if (qemuConnectMonitor(driver, vm, asyncJob, retry, logCtxt) < 0) goto cleanup; /* Try to get the pty path mappings again via the monitor. This is much more * reliable if it's available. * Note that the monitor itself can be on a pty, so we still need to try the * log output method. */ - priv = vm->privateData; if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) goto cleanup; ret = qemuMonitorGetChardevInfo(priv->mon, &info); @@ -7267,6 +7274,7 @@ qemuProcessReconnect(void *opaque) unsigned int stopFlags = 0; bool jobStarted = false; virCapsPtr caps = NULL; + bool retry = true; VIR_FREE(data); @@ -7297,10 +7305,15 @@ qemuProcessReconnect(void *opaque) * allowReboot in status XML and we need to initialize it. */ qemuProcessPrepareAllowReboot(obj); - VIR_DEBUG("Reconnect monitor to %p '%s'", obj, obj->def->name); + if (priv->qemuCaps && + virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_CHARDEV_FD_PASS)) + retry = false; + + VIR_DEBUG("Reconnect monitor to def=%p name='%s' retry=%d", + obj, obj->def->name, retry); /* XXX check PID liveliness & EXE path */ - if (qemuConnectMonitor(driver, obj, QEMU_ASYNC_JOB_NONE, NULL) < 0) + if (qemuConnectMonitor(driver, obj, QEMU_ASYNC_JOB_NONE, retry, NULL) < 0) goto error; if (qemuHostdevUpdateActiveDomainDevices(driver, obj->def) < 0) diff --git a/tests/qemumonitortestutils.c b/tests/qemumonitortestutils.c index 5e30fb067c..bd15e16598 100644 --- a/tests/qemumonitortestutils.c +++ b/tests/qemumonitortestutils.c @@ -1174,6 +1174,7 @@ qemuMonitorTestNew(bool json, if (!(test->mon = qemuMonitorOpen(test->vm, &src, json, + true, 0, &qemuMonitorTestCallbacks, driver))) -- 2.14.3

On 03/14/2018 01:33 PM, Daniel P. Berrangé wrote:
Since libvirt called bind() and listen() on the UNIX socket, it is guaranteed that connect() will immediately succeed, if QEMU is running normally. It will only fail if QEMU has closed the monitor socket by mistake or if QEMU has exited, letting the kernel close it.
With this in mind we can remove the retry loop and timeout when connecting to the QEMU monitor if we are doing FD passing. Libvirt can go straight to sending the QMP greeting and will simply block waiting for a reply until QEMU is ready.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_capabilities.c | 2 +- src/qemu/qemu_monitor.c | 54 ++++++++++++++++++++++++++------------------ src/qemu/qemu_monitor.h | 1 + src/qemu/qemu_process.c | 27 ++++++++++++++++------ tests/qemumonitortestutils.c | 1 + 5 files changed, 55 insertions(+), 30 deletions(-)
So just doing the monitor for now... Eventually the agent too? How about having a news.xml patch - is this a newsworthy change? [...]
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 57c06c7c15..9950461810 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1771,7 +1771,7 @@ qemuProcessInitMonitor(virQEMUDriverPtr driver,
static int qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob, - qemuDomainLogContextPtr logCtxt) + bool retry, qemuDomainLogContextPtr logCtxt) { qemuDomainObjPrivatePtr priv = vm->privateData; qemuMonitorPtr mon = NULL; @@ -1799,6 +1799,7 @@ qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob, mon = qemuMonitorOpen(vm, priv->monConfig, priv->monJSON, + retry, timeout, &monitorCallbacks, driver); @@ -2174,17 +2175,23 @@ qemuProcessWaitForMonitor(virQEMUDriverPtr driver, { int ret = -1; virHashTablePtr info = NULL; - qemuDomainObjPrivatePtr priv; + qemuDomainObjPrivatePtr priv = vm->privateData; + bool retry = true;
This is also called from qemuProcessAttach where it wouldn't seem there'd be necessarily be a chardev on the command line with the pre-opened fd, but then again since it's being used for an already running qemu instance, that would "I hope" work properly... With that assumption in place, Reviewed-by: John Ferlan <jferlan@redhat.com> John
+ + if (priv->qemuCaps && + virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_CHARDEV_FD_PASS)) + retry = false;
- VIR_DEBUG("Connect monitor to %p '%s'", vm, vm->def->name); - if (qemuConnectMonitor(driver, vm, asyncJob, logCtxt) < 0) + VIR_DEBUG("Connect monitor to vm=%p name='%s' retry=%d", + vm, vm->def->name, retry); + + if (qemuConnectMonitor(driver, vm, asyncJob, retry, logCtxt) < 0) goto cleanup;
/* Try to get the pty path mappings again via the monitor. This is much more * reliable if it's available. * Note that the monitor itself can be on a pty, so we still need to try the * log output method. */ - priv = vm->privateData; if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) goto cleanup; ret = qemuMonitorGetChardevInfo(priv->mon, &info);
[...]

On Mon, Mar 26, 2018 at 09:10:04AM -0400, John Ferlan wrote:
On 03/14/2018 01:33 PM, Daniel P. Berrangé wrote:
Since libvirt called bind() and listen() on the UNIX socket, it is guaranteed that connect() will immediately succeed, if QEMU is running normally. It will only fail if QEMU has closed the monitor socket by mistake or if QEMU has exited, letting the kernel close it.
With this in mind we can remove the retry loop and timeout when connecting to the QEMU monitor if we are doing FD passing. Libvirt can go straight to sending the QMP greeting and will simply block waiting for a reply until QEMU is ready.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- src/qemu/qemu_capabilities.c | 2 +- src/qemu/qemu_monitor.c | 54 ++++++++++++++++++++++++++------------------ src/qemu/qemu_monitor.h | 1 + src/qemu/qemu_process.c | 27 ++++++++++++++++------ tests/qemumonitortestutils.c | 1 + 5 files changed, 55 insertions(+), 30 deletions(-)
So just doing the monitor for now... Eventually the agent too?
Once we've attached to the monitor & got the QMP handshake done, we know everything else has been setup correctly. So from a functional POV there's no need to change the agent. That said I will add another a patch, since there's no reason todo the retry loop for the agent - even before FD passing, we should never have been retrying AFAICT, since we should be guaranteed that it exists at the time we connect.
How about having a news.xml patch - is this a newsworthy change?
Its mostly invisible to the user I would hope
[...]
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 57c06c7c15..9950461810 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1771,7 +1771,7 @@ qemuProcessInitMonitor(virQEMUDriverPtr driver,
static int qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob, - qemuDomainLogContextPtr logCtxt) + bool retry, qemuDomainLogContextPtr logCtxt) { qemuDomainObjPrivatePtr priv = vm->privateData; qemuMonitorPtr mon = NULL; @@ -1799,6 +1799,7 @@ qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob, mon = qemuMonitorOpen(vm, priv->monConfig, priv->monJSON, + retry, timeout, &monitorCallbacks, driver); @@ -2174,17 +2175,23 @@ qemuProcessWaitForMonitor(virQEMUDriverPtr driver, { int ret = -1; virHashTablePtr info = NULL; - qemuDomainObjPrivatePtr priv; + qemuDomainObjPrivatePtr priv = vm->privateData; + bool retry = true;
This is also called from qemuProcessAttach where it wouldn't seem there'd be necessarily be a chardev on the command line with the pre-opened fd, but then again since it's being used for an already running qemu instance, that would "I hope" work properly...
With that assumption in place,
Yes, exactly - when attaching to an existing QEMU, we only want to try once and then abort, because either the socket exists, or the QEMU we're attaching to has just quit. Retrying was never right in that scenario.
Reviewed-by: John Ferlan <jferlan@redhat.com>
John
+ + if (priv->qemuCaps && + virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_CHARDEV_FD_PASS)) + retry = false;
- VIR_DEBUG("Connect monitor to %p '%s'", vm, vm->def->name); - if (qemuConnectMonitor(driver, vm, asyncJob, logCtxt) < 0) + VIR_DEBUG("Connect monitor to vm=%p name='%s' retry=%d", + vm, vm->def->name, retry); + + if (qemuConnectMonitor(driver, vm, asyncJob, retry, logCtxt) < 0) goto cleanup;
/* Try to get the pty path mappings again via the monitor. This is much more * reliable if it's available. * Note that the monitor itself can be on a pty, so we still need to try the * log output method. */ - priv = vm->privateData; if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) goto cleanup; ret = qemuMonitorGetChardevInfo(priv->mon, &info);
[...]
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
participants (3)
-
Daniel P. Berrangé
-
John Ferlan
-
Ján Tomko