[libvirt] [PATCH v3 0/5] Don't blindly connect to qemu-ga

diff to v2: - completely reworked, new approach Michal Privoznik (5): qemu: Set virtio channel state sooner qemu: Rename qemuProcessReconnectRefreshChannelVirtioState qemu: change qemuFindAgentConfig return type qemu: Introduce QEMU_CAPS_VSERPORT_CHANGE qemu: Connect to guest agent iff needed src/qemu/qemu_agent.c | 2 +- src/qemu/qemu_agent.h | 2 +- src/qemu/qemu_capabilities.c | 3 +++ src/qemu/qemu_capabilities.h | 3 +++ src/qemu/qemu_domain.c | 11 ++++------ src/qemu/qemu_domain.h | 2 +- src/qemu/qemu_driver.c | 33 ++++++++++++++++++++++------ src/qemu/qemu_migration.c | 20 +++++++++++++++++ src/qemu/qemu_process.c | 21 +++++++++++++----- src/qemu/qemu_process.h | 3 +++ tests/qemucapabilitiesdata/caps_2.1.1-1.caps | 1 + tests/qemucapabilitiesdata/caps_2.4.0-1.caps | 1 + tests/qemucapabilitiesdata/caps_2.5.0-1.caps | 1 + tests/qemucapabilitiesdata/caps_2.6.0-1.caps | 1 + 14 files changed, 81 insertions(+), 23 deletions(-) -- 2.4.10

In qemu driver we listen to virtio channel events like an agent connected to or disconnected from the guest part of socket. However, with a little exception - when we find out that the socket in question is the guest agent one, we connect or disconnect guest agent which is done prior setting new state in internal structure. Due to a bug in our code it may happen that we got the event but failed to set it in internal structure representing the channel. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_driver.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index e2134ee..8ccf68b 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -4457,6 +4457,12 @@ processSerialChangedEvent(virQEMUDriverPtr driver, dev.data.chr->targetType != VIR_DOMAIN_CHR_CHANNEL_TARGET_TYPE_VIRTIO) goto endjob; + dev.data.chr->state = newstate; + + if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm) < 0) + VIR_WARN("unable to save status of domain %s after updating state of " + "channel %s", vm->def->name, devAlias); + if (STREQ_NULLABLE(dev.data.chr->target.name, "org.qemu.guest_agent.0")) { if (newstate == VIR_DOMAIN_CHR_DEVICE_STATE_CONNECTED) { if (!priv->agent) { @@ -4479,12 +4485,6 @@ processSerialChangedEvent(virQEMUDriverPtr driver, qemuDomainEventQueue(driver, event); } - dev.data.chr->state = newstate; - - if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm) < 0) - VIR_WARN("unable to save status of domain %s after updating state of " - "channel %s", vm->def->name, devAlias); - endjob: qemuDomainObjEndJob(driver, vm); -- 2.4.10

This function is going to called from other areas not just reconnect. New name is qemuProcessRefreshChannelState. At the same time expose the function as it's going to be needed very soon. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_process.c | 8 ++++---- src/qemu/qemu_process.h | 3 +++ 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 730eb2c..5133804 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1857,9 +1857,9 @@ qemuProcessRefreshChannelVirtioState(virQEMUDriverPtr driver, } -static int -qemuProcessReconnectRefreshChannelVirtioState(virQEMUDriverPtr driver, - virDomainObjPtr vm) +int +qemuProcessRefreshChannelState(virQEMUDriverPtr driver, + virDomainObjPtr vm) { qemuDomainObjPrivatePtr priv = vm->privateData; virHashTablePtr info = NULL; @@ -3621,7 +3621,7 @@ qemuProcessReconnect(void *opaque) if (qemuDomainCheckEjectableMedia(driver, obj, QEMU_ASYNC_JOB_NONE) < 0) goto error; - if (qemuProcessReconnectRefreshChannelVirtioState(driver, obj) < 0) + if (qemuProcessRefreshChannelState(driver, obj) < 0) goto error; if (qemuProcessRefreshBalloonState(driver, obj, QEMU_ASYNC_JOB_NONE) < 0) diff --git a/src/qemu/qemu_process.h b/src/qemu/qemu_process.h index c674111..2e348f5 100644 --- a/src/qemu/qemu_process.h +++ b/src/qemu/qemu_process.h @@ -155,4 +155,7 @@ virDomainDiskDefPtr qemuProcessFindDomainDiskByAlias(virDomainObjPtr vm, int qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm); +int qemuProcessRefreshChannelState(virQEMUDriverPtr driver, + virDomainObjPtr vm); + #endif /* __QEMU_PROCESS_H__ */ -- 2.4.10

On Mon, Jan 11, 2016 at 11:31:09AM +0100, Michal Privoznik wrote:
This function is going to called from other areas not just
s/going to called/going to be called/
reconnect. New name is qemuProcessRefreshChannelState. At the same time expose the function as it's going to be needed very soon.

While this is no functional change, whole channel definition is going to be needed very soon. Moreover, while touching this obey const correctness rule in qemuAgentOpen() - so far it was passed regular pointer to channel config even though the function is expected to not change pointee at all. Pass const pointer instead. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_agent.c | 2 +- src/qemu/qemu_agent.h | 2 +- src/qemu/qemu_domain.c | 11 ++++------- src/qemu/qemu_domain.h | 2 +- src/qemu/qemu_process.c | 4 ++-- 5 files changed, 9 insertions(+), 12 deletions(-) diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c index 5735ed8..f979f82 100644 --- a/src/qemu/qemu_agent.c +++ b/src/qemu/qemu_agent.c @@ -710,7 +710,7 @@ qemuAgentIO(int watch, int fd, int events, void *opaque) qemuAgentPtr qemuAgentOpen(virDomainObjPtr vm, - virDomainChrSourceDefPtr config, + const virDomainChrSourceDef *config, qemuAgentCallbacksPtr cb) { qemuAgentPtr mon; diff --git a/src/qemu/qemu_agent.h b/src/qemu/qemu_agent.h index 7cbf8eb..c092504 100644 --- a/src/qemu/qemu_agent.h +++ b/src/qemu/qemu_agent.h @@ -44,7 +44,7 @@ struct _qemuAgentCallbacks { qemuAgentPtr qemuAgentOpen(virDomainObjPtr vm, - virDomainChrSourceDefPtr config, + const virDomainChrSourceDef *config, qemuAgentCallbacksPtr cb); void qemuAgentClose(qemuAgentPtr mon); diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 55d5e71..e8c291f 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -3704,10 +3704,9 @@ qemuDomainSupportsBlockJobs(virDomainObjPtr vm, * Returns the pointer to the channel definition that is used to access the * guest agent if the agent is configured or NULL otherwise. */ -virDomainChrSourceDefPtr +virDomainChrDefPtr qemuFindAgentConfig(virDomainDefPtr def) { - virDomainChrSourceDefPtr config = NULL; size_t i; for (i = 0; i < def->nchannels; i++) { @@ -3716,13 +3715,11 @@ qemuFindAgentConfig(virDomainDefPtr def) if (channel->targetType != VIR_DOMAIN_CHR_CHANNEL_TARGET_TYPE_VIRTIO) continue; - if (STREQ_NULLABLE(channel->target.name, "org.qemu.guest_agent.0")) { - config = &channel->source; - break; - } + if (STREQ_NULLABLE(channel->target.name, "org.qemu.guest_agent.0")) + return channel; } - return config; + return NULL; } diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 82495dc..7fc4fff 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -487,7 +487,7 @@ int qemuDomainAlignMemorySizes(virDomainDefPtr def); void qemuDomainMemoryDeviceAlignSize(virDomainDefPtr def, virDomainMemoryDefPtr mem); -virDomainChrSourceDefPtr qemuFindAgentConfig(virDomainDefPtr def); +virDomainChrDefPtr qemuFindAgentConfig(virDomainDefPtr def); bool qemuDomainMachineIsQ35(const virDomainDef *def); bool qemuDomainMachineIsI440FX(const virDomainDef *def); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 5133804..43092d4 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -203,7 +203,7 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm) qemuDomainObjPrivatePtr priv = vm->privateData; int ret = -1; qemuAgentPtr agent = NULL; - virDomainChrSourceDefPtr config = qemuFindAgentConfig(vm->def); + virDomainChrDefPtr config = qemuFindAgentConfig(vm->def); if (!config) return 0; @@ -223,7 +223,7 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm) virObjectUnlock(vm); agent = qemuAgentOpen(vm, - config, + &config->source, &agentCallbacks); virObjectLock(vm); -- 2.4.10

This capability tells if qemu is capable of vserport_change events. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_capabilities.c | 3 +++ src/qemu/qemu_capabilities.h | 3 +++ tests/qemucapabilitiesdata/caps_2.1.1-1.caps | 1 + tests/qemucapabilitiesdata/caps_2.4.0-1.caps | 1 + tests/qemucapabilitiesdata/caps_2.5.0-1.caps | 1 + tests/qemucapabilitiesdata/caps_2.6.0-1.caps | 1 + 6 files changed, 10 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 92f42dc..4583a8a 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -311,6 +311,8 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "chardev-file-append", "ich9-disable-s3", "ich9-disable-s4", + + "vserport-change-event", /* 210 */ ); @@ -1482,6 +1484,7 @@ struct virQEMUCapsStringFlags virQEMUCapsEvents[] = { { "SPICE_MIGRATE_COMPLETED", QEMU_CAPS_SEAMLESS_MIGRATION }, { "DEVICE_DELETED", QEMU_CAPS_DEVICE_DEL_EVENT }, { "MIGRATION", QEMU_CAPS_MIGRATION_EVENT }, + { "VSERPORT_CHANGE", QEMU_CAPS_VSERPORT_CHANGE }, }; struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = { diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index 336031d..d0b941e 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -339,6 +339,9 @@ typedef enum { QEMU_CAPS_ICH9_DISABLE_S3, /* -M q35 S3 BIOS Advertisement on/off */ QEMU_CAPS_ICH9_DISABLE_S4, /* -M q35 S4 BIOS Advertisement on/off */ + /* 210 */ + QEMU_CAPS_VSERPORT_CHANGE, /* VSERPORT_CHANGE event */ + QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/tests/qemucapabilitiesdata/caps_2.1.1-1.caps b/tests/qemucapabilitiesdata/caps_2.1.1-1.caps index 1098dcf..332b85a 100644 --- a/tests/qemucapabilitiesdata/caps_2.1.1-1.caps +++ b/tests/qemucapabilitiesdata/caps_2.1.1-1.caps @@ -159,4 +159,5 @@ <flag name='rtl8139'/> <flag name='e1000'/> <flag name='virtio-net'/> + <flag name='vserport-change-event'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_2.4.0-1.caps b/tests/qemucapabilitiesdata/caps_2.4.0-1.caps index ffc09c6..97ac901 100644 --- a/tests/qemucapabilitiesdata/caps_2.4.0-1.caps +++ b/tests/qemucapabilitiesdata/caps_2.4.0-1.caps @@ -169,4 +169,5 @@ <flag name='virtio-input-host'/> <flag name='ich9-disable-s3'/> <flag name='ich9-disable-s4'/> + <flag name='vserport-change-event'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_2.5.0-1.caps b/tests/qemucapabilitiesdata/caps_2.5.0-1.caps index 2a4f5c0..1d49d9d 100644 --- a/tests/qemucapabilitiesdata/caps_2.5.0-1.caps +++ b/tests/qemucapabilitiesdata/caps_2.5.0-1.caps @@ -170,4 +170,5 @@ <flag name='virtio-input-host'/> <flag name='ich9-disable-s3'/> <flag name='ich9-disable-s4'/> + <flag name='vserport-change-event'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_2.6.0-1.caps b/tests/qemucapabilitiesdata/caps_2.6.0-1.caps index 944208e..10c07cd 100644 --- a/tests/qemucapabilitiesdata/caps_2.6.0-1.caps +++ b/tests/qemucapabilitiesdata/caps_2.6.0-1.caps @@ -172,4 +172,5 @@ <flag name='chardev-file-append'/> <flag name='ich9-disable-s3'/> <flag name='ich9-disable-s4'/> + <flag name='vserport-change-event'/> </qemuCaps> -- 2.4.10

https://bugzilla.redhat.com/show_bug.cgi?id=1293351 Since we already have virtio channel events, we know when guest agent within guest has (dis-)connected. Instead of us blindly connecting to a socket that no one is listening to, we can just follow what qemu-ga does. This has a nice benefit that we don't need to 'guest-ping' the agent just to timeout and find out nobody is listening. The way that this commit is implemented: - don't connect in qemuProcessStart directly, defer that to event callback (which already follows the agent). - after migration is settled, before we resume vCPUs, ask qemu whether somebody is listening on the socket and if so, connect to it. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_driver.c | 21 ++++++++++++++++++++- src/qemu/qemu_migration.c | 20 ++++++++++++++++++++ src/qemu/qemu_process.c | 9 +++++++++ 3 files changed, 49 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 8ccf68b..43ef18a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -6650,12 +6650,13 @@ qemuDomainSaveImageStartVM(virConnectPtr conn, bool start_paused, qemuDomainAsyncJob asyncJob) { - int ret = -1; + int rc, ret = -1; virObjectEventPtr event; int intermediatefd = -1; virCommandPtr cmd = NULL; char *errbuf = NULL; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); + qemuDomainObjPrivatePtr priv = vm->privateData; if ((header->version == 2) && (header->compressed != QEMU_SAVE_FORMAT_RAW)) { @@ -6710,6 +6711,24 @@ qemuDomainSaveImageStartVM(virConnectPtr conn, goto cleanup; } + if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_VSERPORT_CHANGE)) { + /* If QEMU is capable of vserport_change events, we ought to + * refresh channel states here and possibly connect to the guest + * agent too. */ + if (qemuProcessRefreshChannelState(driver, vm) < 0) + goto cleanup; + + if ((rc = qemuConnectAgent(driver, vm)) < 0) { + if (rc == -2) + goto cleanup; + + VIR_WARN("Cannot connect to QEMU guest agent for %s", + vm->def->name); + virResetLastError(); + priv->agentError = true; + } + } + event = virDomainEventLifecycleNewFromObj(vm, VIR_DOMAIN_EVENT_STARTED, VIR_DOMAIN_EVENT_STARTED_RESTORED); diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 51e7125..33ed738 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -5795,6 +5795,7 @@ qemuMigrationFinish(virQEMUDriverPtr driver, unsigned short port; unsigned long long timeReceived = 0; virObjectEventPtr event; + int rc; VIR_DEBUG("driver=%p, dconn=%p, vm=%p, cookiein=%s, cookieinlen=%d, " "cookieout=%p, cookieoutlen=%p, flags=%lx, retcode=%d", @@ -5863,6 +5864,25 @@ qemuMigrationFinish(virQEMUDriverPtr driver, if (qemuMigrationStopNBDServer(driver, vm, mig) < 0) goto endjob; + if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_VSERPORT_CHANGE)) { + /* If QEMU is capable of vserport_change events, we ought to + * refresh channel states here and possibly connect to the guest + * agent too. */ + if (qemuProcessRefreshChannelState(driver, vm) < 0) + goto endjob; + + if ((rc = qemuConnectAgent(driver, vm)) < 0) { + if (rc == -2) + goto endjob; + + VIR_WARN("Cannot connect to QEMU guest agent for %s", + vm->def->name); + virResetLastError(); + priv->agentError = true; + } + } + + if (flags & VIR_MIGRATE_PERSIST_DEST) { if (qemuMigrationPersist(driver, vm, mig, !v3proto) < 0) { /* Hmpf. Migration was successful, but making it persistent diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 43092d4..03b8d16 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -208,6 +208,15 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm) if (!config) return 0; + if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_VSERPORT_CHANGE) && + config->state != VIR_DOMAIN_CHR_DEVICE_STATE_CONNECTED) { + /* So, qemu is capable of sending us event whenever guest agent + * (dis-)connects. And we reflect its state in config->state. So when + * we are called without proper state set, just defer connecting. */ + VIR_DEBUG("Deferring connecting to guest agent"); + return 0; + } + if (virSecurityManagerSetDaemonSocketLabel(driver->securityManager, vm->def) < 0) { VIR_ERROR(_("Failed to set security context for agent for %s"), -- 2.4.10

On Mon, Jan 11, 2016 at 11:31:12AM +0100, Michal Privoznik wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1293351
Since we already have virtio channel events, we know when guest agent within guest has (dis-)connected. Instead of us blindly connecting to a socket that no one is listening to, we can just follow what qemu-ga does. This has a nice benefit that we don't need to 'guest-ping' the agent just to timeout and find out nobody is listening.
The way that this commit is implemented: - don't connect in qemuProcessStart directly, defer that to event callback (which already follows the agent). - after migration is settled, before we resume vCPUs, ask qemu whether somebody is listening on the socket and if so, connect to it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_driver.c | 21 ++++++++++++++++++++- src/qemu/qemu_migration.c | 20 ++++++++++++++++++++ src/qemu/qemu_process.c | 9 +++++++++ 3 files changed, 49 insertions(+), 1 deletion(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 8ccf68b..43ef18a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -6650,12 +6650,13 @@ qemuDomainSaveImageStartVM(virConnectPtr conn, bool start_paused, qemuDomainAsyncJob asyncJob) { - int ret = -1; + int rc, ret = -1; virObjectEventPtr event; int intermediatefd = -1; virCommandPtr cmd = NULL; char *errbuf = NULL; virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); + qemuDomainObjPrivatePtr priv = vm->privateData;
if ((header->version == 2) && (header->compressed != QEMU_SAVE_FORMAT_RAW)) { @@ -6710,6 +6711,24 @@ qemuDomainSaveImageStartVM(virConnectPtr conn, goto cleanup; }
+ if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_VSERPORT_CHANGE)) { + /* If QEMU is capable of vserport_change events, we ought to + * refresh channel states here and possibly connect to the guest + * agent too. */ + if (qemuProcessRefreshChannelState(driver, vm) < 0) + goto cleanup; + + if ((rc = qemuConnectAgent(driver, vm)) < 0) { + if (rc == -2) + goto cleanup; + + VIR_WARN("Cannot connect to QEMU guest agent for %s", + vm->def->name); + virResetLastError(); + priv->agentError = true; + } + } +
This pattern is repeating. If we moved the channel state refresh into qemuConnectAgent() (it could even be called only once per domain lifetime, all other times we would depend on the state being correct), mainly if there is a check for the state anyway, that would help in more cases than just this one. And more generally, we could always skip connecting to the agent if qemu supports VSERPORT_CHANGE, so that condition could be moved inside qemuConnectAgent() as well (or creating a wrapper around it). That way your problem would be also solved for domains we connected to after libvirt's restart and all other cases that none of us thought about. If there is a reason against that, which I missed, then at least putting this into it's own function, and considering calling it from other parts of the code where qemuConnectAgent() is being called currently, would be nice. ACK to 1-4, by the way. Martin
participants (3)
-
Martin Kletzander
-
Michal Privoznik
-
Pavel Hrdina