[libvirt] [PATCH v4] qemu: Connect to guest agent iff needed

https://bugzilla.redhat.com/show_bug.cgi?id=1293351

Since we already have virtio channel events, we know when guest agent
within guest has (dis-)connected. Instead of us blindly connecting to a
socket that no one is listening to, we can just follow what qemu-ga
does. This has a nice benefit that we don't need to 'guest-ping' the
agent just to timeout and find out nobody is listening.

The way that this commit is implemented:
- don't connect in qemuProcessStart directly, defer that to event
  callback (which already follows the agent).
- after migration is settled, before we resume vCPUs, ask qemu whether
  somebody is listening on the socket and if so, connect to it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
diff to v3:
- Move cap detection into qemuConnectAgent

 src/qemu/qemu_driver.c    | 13 ++++++++++++-
 src/qemu/qemu_migration.c | 12 ++++++++++++
 src/qemu/qemu_process.c   | 24 ++++++++++++++++++++++++
 3 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 8ccf68b..3751ba6 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -6650,12 +6650,13 @@ qemuDomainSaveImageStartVM(virConnectPtr conn,
                           bool start_paused,
                           qemuDomainAsyncJob asyncJob)
 {
-    int ret = -1;
+    int rc, ret = -1;
     virObjectEventPtr event;
     int intermediatefd = -1;
     virCommandPtr cmd = NULL;
     char *errbuf = NULL;
     virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
+    qemuDomainObjPrivatePtr priv = vm->privateData;

     if ((header->version == 2) &&
         (header->compressed != QEMU_SAVE_FORMAT_RAW)) {
@@ -6710,6 +6711,16 @@ qemuDomainSaveImageStartVM(virConnectPtr conn,
         goto cleanup;
     }

+    if ((rc = qemuConnectAgent(driver, vm)) < 0) {
+        if (rc == -2)
+            goto cleanup;
+
+        VIR_WARN("Cannot connect to QEMU guest agent for %s",
+                 vm->def->name);
+        virResetLastError();
+        priv->agentError = true;
+    }
+
     event = virDomainEventLifecycleNewFromObj(vm,
                                               VIR_DOMAIN_EVENT_STARTED,
                                               VIR_DOMAIN_EVENT_STARTED_RESTORED);
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 51e7125..9678adf 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -5795,6 +5795,7 @@ qemuMigrationFinish(virQEMUDriverPtr driver,
     unsigned short port;
     unsigned long long timeReceived = 0;
     virObjectEventPtr event;
+    int rc;

     VIR_DEBUG("driver=%p, dconn=%p, vm=%p, cookiein=%s, cookieinlen=%d, "
               "cookieout=%p, cookieoutlen=%p, flags=%lx, retcode=%d",
@@ -5863,6 +5864,17 @@ qemuMigrationFinish(virQEMUDriverPtr driver,
     if (qemuMigrationStopNBDServer(driver, vm, mig) < 0)
         goto endjob;

+    if ((rc = qemuConnectAgent(driver, vm)) < 0) {
+        if (rc == -2)
+            goto endjob;
+
+        VIR_WARN("Cannot connect to QEMU guest agent for %s",
+                 vm->def->name);
+        virResetLastError();
+        priv->agentError = true;
+    }
+
+
     if (flags & VIR_MIGRATE_PERSIST_DEST) {
         if (qemuMigrationPersist(driver, vm, mig, !v3proto) < 0) {
             /* Hmpf. Migration was successful, but making it persistent
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 05cbda2..e1a25dd 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -77,6 +77,10 @@

 VIR_LOG_INIT("qemu.qemu_process");

+static int
+qemuProcessReconnectRefreshChannelVirtioState(virQEMUDriverPtr driver,
+                                              virDomainObjPtr vm);
+
 /**
  * qemuProcessRemoveDomainStatus
  *
@@ -208,6 +212,26 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm)
     if (!config)
         return 0;

+    if (priv->agent)
+        return 0;
+
+    if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_VSERPORT_CHANGE) &&
+        config->state != VIR_DOMAIN_CHR_DEVICE_STATE_CONNECTED) {
+        /* So qemu is capable of sending us event whenever guest agent
+         * (dis-)connects. And we reflect its state in config->state. Well,
+         * nearly. It may happen that what we think about the guest agent state
+         * is not accurate, e.g. right after migration. Ask qemu to be sure. */
+
+        if (qemuProcessReconnectRefreshChannelVirtioState(driver, vm) < 0)
+            goto cleanup;
+
+        /* Now we are sure about the channel state. */
+        if (config->state != VIR_DOMAIN_CHR_DEVICE_STATE_CONNECTED) {
+            VIR_DEBUG("Deferring connecting to guest agent");
+            return 0;
+        }
+    }
+
     if (virSecurityManagerSetDaemonSocketLabel(driver->securityManager,
                                                vm->def) < 0) {
         VIR_ERROR(_("Failed to set security context for agent for %s"),
-- 
2.4.10

On 18.01.2016 11:54, Michal Privoznik wrote:
> https://bugzilla.redhat.com/show_bug.cgi?id=1293351
>
> Since we already have virtio channel events, we know when guest agent
> within guest has (dis-)connected. Instead of us blindly connecting to a
> socket that no one is listening to, we can just follow what qemu-ga
> does. This has a nice benefit that we don't need to 'guest-ping' the
> agent just to timeout and find out nobody is listening.
>
> The way that this commit is implemented:
> - don't connect in qemuProcessStart directly, defer that to event
>   callback (which already follows the agent).
> - after migration is settled, before we resume vCPUs, ask qemu whether
>   somebody is listening on the socket and if so, connect to it.
>
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
> diff to v3:
> - Move cap detection into qemuConnectAgent
>
>  src/qemu/qemu_driver.c    | 13 ++++++++++++-
>  src/qemu/qemu_migration.c | 12 ++++++++++++
>  src/qemu/qemu_process.c   | 24 ++++++++++++++++++++++++
>  3 files changed, 48 insertions(+), 1 deletion(-)

ping

Michal

On 01/18/2016 05:54 AM, Michal Privoznik wrote:
> https://bugzilla.redhat.com/show_bug.cgi?id=1293351
>
> Since we already have virtio channel events, we know when guest agent
> within guest has (dis-)connected. Instead of us blindly connecting to a
> socket that no one is listening to, we can just follow what qemu-ga
> does. This has a nice benefit that we don't need to 'guest-ping' the
> agent just to timeout and find out nobody is listening.
>
> The way that this commit is implemented:
> - don't connect in qemuProcessStart directly, defer that to event
>   callback (which already follows the agent).
qemuConnectAgent is called in qemuProcessLaunch... Curious: are you
referring to processSerialChangedEvent handling the connect in the event
callback?
> - after migration is settled, before we resume vCPUs, ask qemu whether
>   somebody is listening on the socket and if so, connect to it.
>
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
> diff to v3:
> - Move cap detection into qemuConnectAgent
>
>  src/qemu/qemu_driver.c    | 13 ++++++++++++-
>  src/qemu/qemu_migration.c | 12 ++++++++++++
>  src/qemu/qemu_process.c   | 24 ++++++++++++++++++++++++
>  3 files changed, 48 insertions(+), 1 deletion(-)
Concept seems reasonable - have a question though...

With the change to qemuConnectAgent, qemuProcessReconnect would call
qemuProcessReconnectRefreshChannelVirtioState again later in processing.
So should it matter? Is there anything in the interceding code between
the qemuConnectAgent call and where that Refresh call is now that may do
"something" that perhaps the Channel code may be assuming would be done?
I'm perhaps thinking mostly of security mgr settings. Compared to when
qemuConnectAgent is called during qemuProcessLaunch processing, it's
done much earlier for Reconnect.

John
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 8ccf68b..3751ba6 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -6650,12 +6650,13 @@ qemuDomainSaveImageStartVM(virConnectPtr conn,
>                            bool start_paused,
>                            qemuDomainAsyncJob asyncJob)
>  {
> -    int ret = -1;
> +    int rc, ret = -1;
>      virObjectEventPtr event;
>      int intermediatefd = -1;
>      virCommandPtr cmd = NULL;
>      char *errbuf = NULL;
>      virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
> +    qemuDomainObjPrivatePtr priv = vm->privateData;
>
>      if ((header->version == 2) &&
>          (header->compressed != QEMU_SAVE_FORMAT_RAW)) {
> @@ -6710,6 +6711,16 @@ qemuDomainSaveImageStartVM(virConnectPtr conn,
>          goto cleanup;
>      }
>
> +    if ((rc = qemuConnectAgent(driver, vm)) < 0) {
> +        if (rc == -2)
> +            goto cleanup;
> +
> +        VIR_WARN("Cannot connect to QEMU guest agent for %s",
> +                 vm->def->name);
> +        virResetLastError();
> +        priv->agentError = true;
> +    }
> +
>      event = virDomainEventLifecycleNewFromObj(vm,
>                                                VIR_DOMAIN_EVENT_STARTED,
>                                                VIR_DOMAIN_EVENT_STARTED_RESTORED);
> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
> index 51e7125..9678adf 100644
> --- a/src/qemu/qemu_migration.c
> +++ b/src/qemu/qemu_migration.c
> @@ -5795,6 +5795,7 @@ qemuMigrationFinish(virQEMUDriverPtr driver,
>      unsigned short port;
>      unsigned long long timeReceived = 0;
>      virObjectEventPtr event;
> +    int rc;
>
>      VIR_DEBUG("driver=%p, dconn=%p, vm=%p, cookiein=%s, cookieinlen=%d, "
>                "cookieout=%p, cookieoutlen=%p, flags=%lx, retcode=%d",
> @@ -5863,6 +5864,17 @@ qemuMigrationFinish(virQEMUDriverPtr driver,
>      if (qemuMigrationStopNBDServer(driver, vm, mig) < 0)
>          goto endjob;
>
> +    if ((rc = qemuConnectAgent(driver, vm)) < 0) {
> +        if (rc == -2)
> +            goto endjob;
> +
> +        VIR_WARN("Cannot connect to QEMU guest agent for %s",
> +                 vm->def->name);
> +        virResetLastError();
> +        priv->agentError = true;
> +    }
> +
> +
>      if (flags & VIR_MIGRATE_PERSIST_DEST) {
>          if (qemuMigrationPersist(driver, vm, mig, !v3proto) < 0) {
>              /* Hmpf. Migration was successful, but making it persistent
> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index 05cbda2..e1a25dd 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c
> @@ -77,6 +77,10 @@
>
>  VIR_LOG_INIT("qemu.qemu_process");
>
> +static int
> +qemuProcessReconnectRefreshChannelVirtioState(virQEMUDriverPtr driver,
> +                                              virDomainObjPtr vm);
> +
>  /**
>   * qemuProcessRemoveDomainStatus
>   *
> @@ -208,6 +212,26 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm)
>      if (!config)
>          return 0;
>
> +    if (priv->agent)
> +        return 0;
> +
> +    if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_VSERPORT_CHANGE) &&
> +        config->state != VIR_DOMAIN_CHR_DEVICE_STATE_CONNECTED) {
> +        /* So qemu is capable of sending us event whenever guest agent
> +         * (dis-)connects. And we reflect its state in config->state. Well,
> +         * nearly. It may happen that what we think about the guest agent state
> +         * is not accurate, e.g. right after migration. Ask qemu to be sure. */
> +
> +        if (qemuProcessReconnectRefreshChannelVirtioState(driver, vm) < 0)
> +            goto cleanup;
> +
> +        /* Now we are sure about the channel state. */
> +        if (config->state != VIR_DOMAIN_CHR_DEVICE_STATE_CONNECTED) {
> +            VIR_DEBUG("Deferring connecting to guest agent");
> +            return 0;
> +        }
> +    }
> +
>      if (virSecurityManagerSetDaemonSocketLabel(driver->securityManager,
>                                                 vm->def) < 0) {
>          VIR_ERROR(_("Failed to set security context for agent for %s"),

On 02.02.2016 18:47, John Ferlan wrote:
> On 01/18/2016 05:54 AM, Michal Privoznik wrote:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1293351
>>
>> Since we already have virtio channel events, we know when guest agent
>> within guest has (dis-)connected. Instead of us blindly connecting to a
>> socket that no one is listening to, we can just follow what qemu-ga
>> does. This has a nice benefit that we don't need to 'guest-ping' the
>> agent just to timeout and find out nobody is listening.
>>
>> The way that this commit is implemented:
>> - don't connect in qemuProcessStart directly, defer that to event
>>   callback (which already follows the agent).
>
> qemuConnectAgent is called in qemuProcessLaunch... Curious: are you
> referring to processSerialChangedEvent handling the connect in the
> event callback?
Oh right. I will update the commit message. I am still not used to the
qemuProcessStart split. Yes, it should have said qemuProcessLaunch and
processSerialChangedEvent.
>> - after migration is settled, before we resume vCPUs, ask qemu whether
>>   somebody is listening on the socket and if so, connect to it.
>>
>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
>> ---
>> diff to v3:
>> - Move cap detection into qemuConnectAgent
>>
>>  src/qemu/qemu_driver.c    | 13 ++++++++++++-
>>  src/qemu/qemu_migration.c | 12 ++++++++++++
>>  src/qemu/qemu_process.c   | 24 ++++++++++++++++++++++++
>>  3 files changed, 48 insertions(+), 1 deletion(-)
> Concept seems reasonable - have a question though...
>
> With the change to qemuConnectAgent, qemuProcessReconnect would call
> qemuProcessReconnectRefreshChannelVirtioState again later in
> processing. So should it matter?
Good catch. I guess my patch can be updated so that Refresh() is called
just once. Generally it wouldn't matter, but what's the point in calling
it twice, right?
> Is there anything in the interceding code between the qemuConnectAgent
> call and where that Refresh call is now that may do "something" that
> perhaps the Channel code may be assuming would be done? I'm perhaps
> thinking mostly of security mgr settings. Compared to when
> qemuConnectAgent is called during qemuProcessLaunch processing, it's
> done much earlier for Reconnect.
Yes, it's called very early. But even so, I don't think any code should
have a problem. Firstly, the host-visible part (= the agent socket) is
always there regardless of whether qemu-ga is running inside the guest
or not, so the seclabel managers have something to work with. Secondly,
the rest of our code that actually relies on the qemuAgent connection
must be adapted to a disappearing agent anyway, because we already
follow the (dis-)connecting agent.

Thanks!

Michal
participants (2):
- John Ferlan
- Michal Privoznik