[libvirt] [PATCH 0/5] Make migration more robust when client dies

The libvirt daemon was not completely ready to lose the connection to the client (which may even be the source libvirtd) that is controlling the migration. We might end up with dangling migration jobs on both the source and destination daemons. More details about the scenarios in which this can happen are in patches 2/5 and 5/5, which fix the issue using the infrastructure set up by the other patches in this series.

Jiri Denemark (5):
  qemu: Add support for domain cleanup callbacks
  qemu: Avoid dangling migration-in job on shutoff domains
  qemu: Add connection close callbacks
  qemu: Make autodestroy utilize connection close callbacks
  qemu: Avoid dangling migration-out job when client dies

 src/qemu/qemu_conf.c      |  172 +++++++++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_conf.h      |   30 +++++++-
 src/qemu/qemu_domain.c    |   75 +++++++++++++++++++-
 src/qemu/qemu_domain.h    |   17 +++++
 src/qemu/qemu_driver.c    |   10 ++-
 src/qemu/qemu_migration.c |   88 +++++++++++++++++++++++
 src/qemu/qemu_migration.h |    4 +
 src/qemu/qemu_process.c   |  109 +++++++----------------------
 8 files changed, 413 insertions(+), 92 deletions(-)

-- 
1.7.8.5

Add support for registering cleanup callbacks to be run when a domain transitions to shutoff state.
---
 src/qemu/qemu_domain.c  |   73 +++++++++++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_domain.h  |   15 +++++++++
 src/qemu/qemu_process.c |    2 +
 3 files changed, 90 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 625c595..a9469cf 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -232,6 +232,7 @@ static void qemuDomainObjPrivateFree(void *data)
         VIR_ERROR(_("Unexpected QEMU agent still active during domain deletion"));
         qemuAgentClose(priv->agent);
     }
+    VIR_FREE(priv->cleanupCallbacks);
     VIR_FREE(priv);
 }

@@ -1769,3 +1770,75 @@ qemuDomainCheckDiskPresence(struct qemud_driver *driver,
 cleanup:
     return ret;
 }
+
+/*
+ * The vm must be locked when any of the following cleanup functions is
+ * called.
+ */
+int
+qemuDomainCleanupAdd(virDomainObjPtr vm,
+                     qemuDomainCleanupCallback cb)
+{
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+    int i;
+
+    VIR_DEBUG("vm=%s, cb=%p", vm->def->name, cb);
+
+    for (i = 0; i < priv->ncleanupCallbacks; i++) {
+        if (priv->cleanupCallbacks[i] == cb)
+            return 0;
+    }
+
+    if (VIR_RESIZE_N(priv->cleanupCallbacks,
+                     priv->ncleanupCallbacks_max,
+                     priv->ncleanupCallbacks, 1) < 0) {
+        virReportOOMError();
+        return -1;
+    }
+
+    priv->cleanupCallbacks[priv->ncleanupCallbacks++] = cb;
+    return 0;
+}
+
+void
+qemuDomainCleanupRemove(virDomainObjPtr vm,
+                        qemuDomainCleanupCallback cb)
+{
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+    int i;
+
+    VIR_DEBUG("vm=%s, cb=%p", vm->def->name, cb);
+
+    for (i = 0; i < priv->ncleanupCallbacks; i++) {
+        if (priv->cleanupCallbacks[i] == cb) {
+            memmove(priv->cleanupCallbacks + i,
+                    priv->cleanupCallbacks + i + 1,
+                    sizeof(*priv->cleanupCallbacks) *
+                    (priv->ncleanupCallbacks - i - 1));
+            priv->ncleanupCallbacks--;
+        }
+    }
+
+    VIR_SHRINK_N(priv->cleanupCallbacks,
+                 priv->ncleanupCallbacks_max,
+                 priv->ncleanupCallbacks_max - priv->ncleanupCallbacks);
+}
+
+void
+qemuDomainCleanupRun(struct qemud_driver *driver,
+                     virDomainObjPtr vm)
+{
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+    int i;
+
+    VIR_DEBUG("driver=%p, vm=%s", driver, vm->def->name);
+
+    /* run cleanup callbacks in reverse order */
+    for (i = priv->ncleanupCallbacks - 1; i >= 0; i--) {
+        if (priv->cleanupCallbacks[i])
+            priv->cleanupCallbacks[i](driver, vm);
+    }
+
+    VIR_FREE(priv->cleanupCallbacks);
+    priv->ncleanupCallbacks = 0;
+    priv->ncleanupCallbacks_max = 0;
+}
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index f8e943f..af83c0e 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -94,6 +94,9 @@ struct qemuDomainJobObj {
 typedef struct _qemuDomainPCIAddressSet qemuDomainPCIAddressSet;
 typedef qemuDomainPCIAddressSet *qemuDomainPCIAddressSetPtr;

+typedef void (*qemuDomainCleanupCallback)(struct qemud_driver *driver,
+                                          virDomainObjPtr vm);
+
 typedef struct _qemuDomainObjPrivate qemuDomainObjPrivate;
 typedef qemuDomainObjPrivate *qemuDomainObjPrivatePtr;
 struct _qemuDomainObjPrivate {
@@ -130,6 +133,10 @@ struct _qemuDomainObjPrivate {

     char *origname;
     virConsolesPtr cons;
+
+    qemuDomainCleanupCallback *cleanupCallbacks;
+    size_t ncleanupCallbacks;
+    size_t ncleanupCallbacks_max;
 };

 struct qemuDomainWatchdogEvent
@@ -307,4 +314,12 @@ bool qemuDomainJobAllowed(qemuDomainObjPrivatePtr priv,
 int qemuDomainCheckDiskPresence(struct qemud_driver *driver,
                                 virDomainObjPtr vm,
                                 bool start_with_state);
+
+int qemuDomainCleanupAdd(virDomainObjPtr vm,
+                         qemuDomainCleanupCallback cb);
+void qemuDomainCleanupRemove(virDomainObjPtr vm,
+                             qemuDomainCleanupCallback cb);
+void qemuDomainCleanupRun(struct qemud_driver *driver,
+                          virDomainObjPtr vm);
+
 #endif /* __QEMU_DOMAIN_H__ */
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 0af3751..1945864 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3803,6 +3803,8 @@ void qemuProcessStop(struct qemud_driver *driver,
     /* shut it off for sure */
     ignore_value(qemuProcessKill(driver, vm, VIR_QEMU_PROCESS_KILL_FORCE));

+    qemuDomainCleanupRun(driver, vm);
+
     /* Stop autodestroy in case guest is restarted */
     qemuProcessAutoDestroyRemove(driver, vm);

-- 
1.7.8.5
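
[To make the intended usage concrete, here is a minimal sketch of how a caller would pair these functions around a long-running operation. This is an illustration only, not part of the patch; qemuMyOperationCleanup is a hypothetical name.]

/* Hypothetical callback; the signature is qemuDomainCleanupCallback
 * from the patch above.  The vm is locked when this runs from
 * qemuProcessStop. */
static void
qemuMyOperationCleanup(struct qemud_driver *driver, virDomainObjPtr vm)
{
    /* discard any per-operation state attached to vm */
}

/* Register the callback before starting the operation... */
if (qemuDomainCleanupAdd(vm, qemuMyOperationCleanup) < 0)
    goto error;

/* ...and unregister it once the operation completes normally.  If the
 * domain dies first, qemuProcessStop runs all registered callbacks in
 * reverse order via qemuDomainCleanupRun. */
qemuDomainCleanupRemove(vm, qemuMyOperationCleanup);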

On 03/19/2012 10:18 AM, Jiri Denemark wrote:
> Add support for registering cleanup callbacks to be run when a domain transitions to shutoff state.
> ---
>  src/qemu/qemu_domain.c  |   73 +++++++++++++++++++++++++++++++++++++++++++++++
>  src/qemu/qemu_domain.h  |   15 +++++++++
>  src/qemu/qemu_process.c |    2 +
>  3 files changed, 90 insertions(+), 0 deletions(-)
> +int
> +qemuDomainCleanupAdd(virDomainObjPtr vm,
> +                     qemuDomainCleanupCallback cb)
> +{
No opaque 'void *' parameter to pass to the cleanup? I guess we can add it later if we find ourselves needing it. I didn't spot anything else fishy in this patch, so: ACK.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
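
[For reference, the opaque-parameter variant Eric alludes to would look roughly like this. It is purely hypothetical; nothing in this series implements it.]

/* Hypothetical: the callback type grows a caller-supplied pointer */
typedef void (*qemuDomainCleanupCallback)(struct qemud_driver *driver,
                                          virDomainObjPtr vm,
                                          void *opaque);

/* Hypothetical: registration would then store the (cb, opaque) pair,
 * and removal would have to match on both the function and the data */
int qemuDomainCleanupAdd(virDomainObjPtr vm,
                         qemuDomainCleanupCallback cb,
                         void *opaque);
void qemuDomainCleanupRemove(virDomainObjPtr vm,
                             qemuDomainCleanupCallback cb,
                             void *opaque);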

The destination daemon should not rely on the client or the source daemon (depending on the type of migration) to call Finish when migration fails, because the client may crash before it can do so. The domain prepared for incoming migration is set to be destroyed (and the migration job cleaned up) when the connection to the client closes, but this is not enough. If the associated qemu process crashes after the Prepare step and the domain is cleaned up before the connection gets closed, autodestroy is not called for the domain and the migration job remains set. In case the domain is defined on the destination host (i.e., it is not completely removed once destroyed), we keep the job set forever. To fix this, we register a cleanup callback which is responsible for cleaning up the migration-in job when a domain dies anywhere between the Prepare and Finish steps. Note that we can't blindly clean up any job when spotting EOF on the monitor, since normally an API is running at that time.
---
 src/qemu/qemu_domain.c    |    2 --
 src/qemu/qemu_domain.h    |    2 ++
 src/qemu/qemu_migration.c |   22 ++++++++++++++++++++++
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index a9469cf..41ffd6a 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -51,7 +51,6 @@
     (VIR_DOMAIN_XML_SECURE |            \
      VIR_DOMAIN_XML_UPDATE_CPU)

-VIR_ENUM_DECL(qemuDomainJob)
 VIR_ENUM_IMPL(qemuDomainJob, QEMU_JOB_LAST,
               "none",
               "query",
@@ -64,7 +63,6 @@ VIR_ENUM_IMPL(qemuDomainJob, QEMU_JOB_LAST,
               "async nested",
 );

-VIR_ENUM_DECL(qemuDomainAsyncJob)
 VIR_ENUM_IMPL(qemuDomainAsyncJob, QEMU_ASYNC_JOB_LAST,
               "none",
               "migration out",
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index af83c0e..d79ff1d 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -64,6 +64,7 @@ enum qemuDomainJob {

     QEMU_JOB_LAST
 };
+VIR_ENUM_DECL(qemuDomainJob)

 /* Async job consists of a series of jobs that may change state. Independent
  * jobs that do not change state (and possibly others if explicitly allowed by
@@ -78,6 +79,7 @@ enum qemuDomainAsyncJob {

     QEMU_ASYNC_JOB_LAST
 };
+VIR_ENUM_DECL(qemuDomainAsyncJob)

 struct qemuDomainJobObj {
     virCond cond;                       /* Use to coordinate jobs */
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 81b2d5b..4eb3bf4 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -1107,6 +1107,23 @@ cleanup:

 /* Prepare is the first step, and it runs on the destination host. */

+static void
+qemuMigrationPrepareCleanup(struct qemud_driver *driver,
+                            virDomainObjPtr vm)
+{
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+
+    VIR_DEBUG("driver=%p, vm=%s, job=%s, asyncJob=%s",
+              driver,
+              vm->def->name,
+              qemuDomainJobTypeToString(priv->job.active),
+              qemuDomainAsyncJobTypeToString(priv->job.asyncJob));
+
+    if (!qemuMigrationJobIsActive(vm, QEMU_ASYNC_JOB_MIGRATION_IN))
+        return;
+    qemuDomainObjDiscardAsyncJob(driver, vm);
+}
+
 static int
 qemuMigrationPrepareAny(struct qemud_driver *driver,
                         virConnectPtr dconn,
@@ -1264,6 +1281,9 @@ qemuMigrationPrepareAny(struct qemud_driver *driver,
             VIR_WARN("Unable to encode migration cookie");
     }

+    if (qemuDomainCleanupAdd(vm, qemuMigrationPrepareCleanup) < 0)
+        goto endjob;
+
     virDomainAuditStart(vm, "migrated", true);
     event = virDomainEventNewFromObj(vm,
                                      VIR_DOMAIN_EVENT_STARTED,
@@ -2703,6 +2723,8 @@ qemuMigrationFinish(struct qemud_driver *driver,
                              v3proto ? QEMU_MIGRATION_PHASE_FINISH3
                                      : QEMU_MIGRATION_PHASE_FINISH2);

+    qemuDomainCleanupRemove(vm, qemuMigrationPrepareCleanup);
+
     if (flags & VIR_MIGRATE_PERSIST_DEST)
         cookie_flags |= QEMU_MIGRATION_COOKIE_PERSISTENT;

-- 
1.7.8.5

On 03/19/2012 10:18 AM, Jiri Denemark wrote:
> The destination daemon should not rely on the client or the source daemon (depending on the type of migration) to call Finish when migration fails, because the client may crash before it can do so. The domain prepared for incoming migration is set to be destroyed (and the migration job cleaned up) when the connection to the client closes, but this is not enough. If the associated qemu process crashes after the Prepare step and the domain is cleaned up before the connection gets closed, autodestroy is not called for the domain and the migration job remains set. In case the domain is defined on the destination host (i.e., it is not completely removed once destroyed), we keep the job set forever. To fix this, we register a cleanup callback which is responsible for cleaning up the migration-in job when a domain dies anywhere between the Prepare and Finish steps. Note that we can't blindly clean up any job when spotting EOF on the monitor, since normally an API is running at that time.
> ---
>  src/qemu/qemu_domain.c    |    2 --
>  src/qemu/qemu_domain.h    |    2 ++
>  src/qemu/qemu_migration.c |   22 ++++++++++++++++++++++
>  3 files changed, 24 insertions(+), 2 deletions(-)
I'm restating my understanding of the bug, to make sure I understand why your patch helps:

- src requests a migration
- dest starts a qemu process using information from the src, but the
  destination happens to be running an older qemu that can't support
  the full migration
- qemu dies, but the destination hasn't seen a 'Finish' from the
  source, so the job remains open and the domain remains
- connection is broken, but the open job prevents reclaiming the
  autodestroy domain on the destination
- new connection is made, but source can't migrate because destination
  is already locked up on the stale attempt

and the fix is adding a new callback, which says if qemu dies while the callback is registered, we cancel the migration job; therefore, even without a 'Finish' from the source, the autodestroy can now kick in.

ACK.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On Tue, Mar 20, 2012 at 15:56:39 -0600, Eric Blake wrote:
> On 03/19/2012 10:18 AM, Jiri Denemark wrote:
>> The destination daemon should not rely on the client or the source daemon (depending on the type of migration) to call Finish when migration fails, because the client may crash before it can do so. The domain prepared for incoming migration is set to be destroyed (and the migration job cleaned up) when the connection to the client closes, but this is not enough. If the associated qemu process crashes after the Prepare step and the domain is cleaned up before the connection gets closed, autodestroy is not called for the domain and the migration job remains set. In case the domain is defined on the destination host (i.e., it is not completely removed once destroyed), we keep the job set forever. To fix this, we register a cleanup callback which is responsible for cleaning up the migration-in job when a domain dies anywhere between the Prepare and Finish steps. Note that we can't blindly clean up any job when spotting EOF on the monitor, since normally an API is running at that time.
>> ---
>>  src/qemu/qemu_domain.c    |    2 --
>>  src/qemu/qemu_domain.h    |    2 ++
>>  src/qemu/qemu_migration.c |   22 ++++++++++++++++++++++
>>  3 files changed, 24 insertions(+), 2 deletions(-)
> I'm restating my understanding of the bug, to make sure I understand why your patch helps:
> - src requests a migration

Right :-)
> - dest starts a qemu process using information from the src, but the
>   destination happens to be running an older qemu that can't support
>   the full migration

Perhaps, but there might be several reasons for qemu to die during migration, even if it's exactly the same version as on the source.
> - qemu dies, but the destination hasn't seen a 'Finish' from the
>   source, so the job remains open and the domain remains

The domain is persistent, so it is still there but inactive, since we got EOF on the qemu monitor. And because we haven't seen Finish, the job remains open even on the inactive domain.
> - connection is broken, but the open job prevents reclaiming the
>   autodestroy domain on the destination

The domain is inactive, so there's nothing to autodestroy in the first place. The EOF handler which destroyed the domain also removed its autodestroy callback. If we were lucky and saw the broken connection earlier than the monitor EOF, we'd be fine, since autodestroy always removes any async job active on the domain.
> - new connection is made, but source can't migrate because destination
>   is already locked up on the stale attempt

Right.
> and the fix is adding a new callback, which says if qemu dies while the callback is registered, we cancel the migration job; therefore, even without a 'Finish' from the source, the autodestroy can now kick in

...therefore, even without a 'Finish' from the source, we don't end up with a stale job in case the monitor EOF handler is faster and destroys the domain (and unregisters autodestroy) before autodestroy is called on the broken connection.
> ACK.

Thanks.
Should I clarify the commit message a bit before pushing?

Jirka

On 03/21/2012 02:41 AM, Jiri Denemark wrote:
> On Tue, Mar 20, 2012 at 15:56:39 -0600, Eric Blake wrote:
>> On 03/19/2012 10:18 AM, Jiri Denemark wrote:
>>> The destination daemon should not rely on the client or the source daemon (depending on the type of migration) to call Finish when migration fails, because the client may crash before it can do so. The domain prepared for incoming migration is set to be destroyed (and the migration job cleaned up) when the connection to the client closes, but this is not enough. If the associated qemu process crashes after the Prepare step and the domain is cleaned up before the connection gets closed, autodestroy is not called for the domain and the migration job remains set. In case the domain is defined on the destination host (i.e., it is not completely removed once destroyed), we keep the job set forever. To fix this, we register a cleanup callback which is responsible for cleaning up the migration-in job when a domain dies anywhere between the Prepare and Finish steps. Note that we can't blindly clean up any job when spotting EOF on the monitor, since normally an API is running at that time.
>>> ---
>>>  src/qemu/qemu_domain.c    |    2 --
>>>  src/qemu/qemu_domain.h    |    2 ++
>>>  src/qemu/qemu_migration.c |   22 ++++++++++++++++++++++
>>>  3 files changed, 24 insertions(+), 2 deletions(-)
>> I'm restating my understanding of the bug, to make sure I understand why your patch helps:
>> ACK.
>
> Thanks.
> Should I clarify the commit message a bit before pushing?
Dunno that it is needed. If you think a better wording would help, then go for it, but I was able to understand the patch and form my own rephrasing using just your original wording, so I'm okay if you push as-is.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Add support for registering an arbitrary callback to be called for a domain when a connection gets closed.
---
 src/qemu/qemu_conf.c   |  172 ++++++++++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_conf.h   |   27 ++++++++
 src/qemu/qemu_driver.c |    5 ++
 3 files changed, 204 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
index a709cbf..88a04bc 100644
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -56,6 +56,11 @@

 #define VIR_FROM_THIS VIR_FROM_QEMU

+struct _qemuDriverCloseDef {
+    virConnectPtr conn;
+    qemuDriverCloseCallback cb;
+};
+
 void qemuDriverLock(struct qemud_driver *driver)
 {
     virMutexLock(&driver->lock);
@@ -490,3 +495,170 @@ int qemudLoadDriverConfig(struct qemud_driver *driver,
     virConfFree (conf);
     return 0;
 }
+
+static void
+qemuDriverCloseCallbackFree(void *payload,
+                            const void *name ATTRIBUTE_UNUSED)
+{
+    VIR_FREE(payload);
+}
+
+int
+qemuDriverCloseCallbackInit(struct qemud_driver *driver)
+{
+    driver->closeCallbacks = virHashCreate(5, qemuDriverCloseCallbackFree);
+    if (!driver->closeCallbacks)
+        return -1;
+
+    return 0;
+}
+
+void
+qemuDriverCloseCallbackShutdown(struct qemud_driver *driver)
+{
+    virHashFree(driver->closeCallbacks);
+}
+
+int
+qemuDriverCloseCallbackSet(struct qemud_driver *driver,
+                           virDomainObjPtr vm,
+                           virConnectPtr conn,
+                           qemuDriverCloseCallback cb)
+{
+    char uuidstr[VIR_UUID_STRING_BUFLEN];
+    qemuDriverCloseDefPtr closeDef;
+
+    virUUIDFormat(vm->def->uuid, uuidstr);
+    VIR_DEBUG("vm=%s, uuid=%s, conn=%p, cb=%p",
+              vm->def->name, uuidstr, conn, cb);
+
+    closeDef = virHashLookup(driver->closeCallbacks, uuidstr);
+    if (closeDef) {
+        if (closeDef->conn != conn) {
+            qemuReportError(VIR_ERR_INTERNAL_ERROR,
+                            _("Close callback for domain %s already registered"
+                              " with another connection %p"),
+                            vm->def->name, closeDef->conn);
+            return -1;
+        }
+        if (closeDef->cb && closeDef->cb != cb) {
+            qemuReportError(VIR_ERR_INTERNAL_ERROR,
+                            _("Another close callback is already defined for"
+                              " domain %s"), vm->def->name);
+            return -1;
+        }
+
+        closeDef->cb = cb;
+    } else {
+        if (VIR_ALLOC(closeDef) < 0) {
+            virReportOOMError();
+            return -1;
+        }
+
+        closeDef->conn = conn;
+        closeDef->cb = cb;
+        if (virHashAddEntry(driver->closeCallbacks, uuidstr, closeDef) < 0) {
+            VIR_FREE(closeDef);
+            return -1;
+        }
+    }
+    return 0;
+}
+
+int
+qemuDriverCloseCallbackUnset(struct qemud_driver *driver,
+                             virDomainObjPtr vm,
+                             qemuDriverCloseCallback cb)
+{
+    char uuidstr[VIR_UUID_STRING_BUFLEN];
+    qemuDriverCloseDefPtr closeDef;
+
+    virUUIDFormat(vm->def->uuid, uuidstr);
+    VIR_DEBUG("vm=%s, uuid=%s, cb=%p",
+              vm->def->name, uuidstr, cb);
+
+    closeDef = virHashLookup(driver->closeCallbacks, uuidstr);
+    if (!closeDef)
+        return -1;
+
+    if (closeDef->cb && closeDef->cb != cb) {
+        qemuReportError(VIR_ERR_INTERNAL_ERROR,
+                        _("Trying to remove mismatching close callback for"
+                          " domain %s"), vm->def->name);
+        return -1;
+    }
+
+    return virHashRemoveEntry(driver->closeCallbacks, uuidstr);
+}
+
+qemuDriverCloseCallback
+qemuDriverCloseCallbackGet(struct qemud_driver *driver,
+                           virDomainObjPtr vm,
+                           virConnectPtr conn)
+{
+    char uuidstr[VIR_UUID_STRING_BUFLEN];
+    qemuDriverCloseDefPtr closeDef;
+    qemuDriverCloseCallback cb = NULL;
+
+    virUUIDFormat(vm->def->uuid, uuidstr);
+    VIR_DEBUG("vm=%s, uuid=%s, conn=%p",
+              vm->def->name, uuidstr, conn);
+
+    closeDef = virHashLookup(driver->closeCallbacks, uuidstr);
+    if (closeDef && (!conn || closeDef->conn == conn))
+        cb = closeDef->cb;
+
+    VIR_DEBUG("cb=%p", cb);
+    return cb;
+}
+
+struct qemuDriverCloseCallbackData {
+    struct qemud_driver *driver;
+    virConnectPtr conn;
+};
+
+static void
+qemuDriverCloseCallbackRun(void *payload,
+                           const void *name,
+                           void *opaque)
+{
+    struct qemuDriverCloseCallbackData *data = opaque;
+    qemuDriverCloseDefPtr closeDef = payload;
+    const char *uuidstr = name;
+    unsigned char uuid[VIR_UUID_BUFLEN];
+    virDomainObjPtr dom;
+
+    VIR_DEBUG("conn=%p, thisconn=%p, uuid=%s, cb=%p",
+              closeDef->conn, data->conn, uuidstr, closeDef->cb);
+
+    if (data->conn != closeDef->conn || !closeDef->cb)
+        return;
+
+    if (virUUIDParse(uuidstr, uuid) < 0) {
+        VIR_WARN("Failed to parse %s", uuidstr);
+        return;
+    }
+
+    if (!(dom = virDomainFindByUUID(&data->driver->domains, uuid))) {
+        VIR_DEBUG("No domain object with UUID %s", uuidstr);
+        return;
+    }
+
+    dom = closeDef->cb(data->driver, dom, data->conn);
+    if (dom)
+        virDomainObjUnlock(dom);
+
+    virHashRemoveEntry(data->driver->closeCallbacks, uuidstr);
+}
+
+void
+qemuDriverCloseCallbackRunAll(struct qemud_driver *driver,
+                              virConnectPtr conn)
+{
+    struct qemuDriverCloseCallbackData data = {
+        driver, conn
+    };
+    VIR_DEBUG("conn=%p", conn);
+
+    virHashForEach(driver->closeCallbacks, qemuDriverCloseCallbackRun, &data);
+}
diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h
index 36f1c4c..a22ce0c 100644
--- a/src/qemu/qemu_conf.h
+++ b/src/qemu/qemu_conf.h
@@ -46,6 +46,8 @@

 # define QEMUD_CPUMASK_LEN CPU_SETSIZE

+typedef struct _qemuDriverCloseDef qemuDriverCloseDef;
+typedef qemuDriverCloseDef *qemuDriverCloseDefPtr;

 /* Main driver state */
 struct qemud_driver {
@@ -144,6 +146,13 @@ struct qemud_driver {
      * when the virConnectPtr is closed*/
     virHashTablePtr autodestroy;

+    /* Mapping of 'char *uuidstr' -> qemuDriverCloseDefPtr of domains
+     * which want a specific cleanup to be done when a connection is
+     * closed.  Such cleanup may be to automatically destroy the
+     * domain or abort a particular job running on it.
+     */
+    virHashTablePtr closeCallbacks;
+
     int keepAliveInterval;
     unsigned int keepAliveCount;
 };
@@ -180,4 +189,22 @@ struct qemuDomainDiskInfo {
     int io_status;
 };

+typedef virDomainObjPtr (*qemuDriverCloseCallback)(struct qemud_driver *driver,
+                                                   virDomainObjPtr vm,
+                                                   virConnectPtr conn);
+int qemuDriverCloseCallbackInit(struct qemud_driver *driver);
+void qemuDriverCloseCallbackShutdown(struct qemud_driver *driver);
+int qemuDriverCloseCallbackSet(struct qemud_driver *driver,
+                               virDomainObjPtr vm,
+                               virConnectPtr conn,
+                               qemuDriverCloseCallback cb);
+int qemuDriverCloseCallbackUnset(struct qemud_driver *driver,
+                                 virDomainObjPtr vm,
+                                 qemuDriverCloseCallback cb);
+qemuDriverCloseCallback qemuDriverCloseCallbackGet(struct qemud_driver *driver,
+                                                   virDomainObjPtr vm,
+                                                   virConnectPtr conn);
+void qemuDriverCloseCallbackRunAll(struct qemud_driver *driver,
+                                   virConnectPtr conn);
+
 #endif /* __QEMUD_CONF_H */
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 2c467ab..ce82535 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -662,6 +662,9 @@ qemudStartup(int privileged) {
     if (qemuProcessAutoDestroyInit(qemu_driver) < 0)
         goto error;

+    if (qemuDriverCloseCallbackInit(qemu_driver) < 0)
+        goto error;
+
     /* Get all the running persistent or transient configs first */
     if (virDomainLoadAllConfigs(qemu_driver->caps,
                                 &qemu_driver->domains,
@@ -801,6 +804,7 @@ qemudShutdown(void) {
     virSysinfoDefFree(qemu_driver->hostsysinfo);

     qemuProcessAutoDestroyShutdown(qemu_driver);
+    qemuDriverCloseCallbackShutdown(qemu_driver);

     VIR_FREE(qemu_driver->configDir);
     VIR_FREE(qemu_driver->autostartDir);
@@ -922,6 +926,7 @@ static int qemudClose(virConnectPtr conn) {
     virDomainEventStateDeregisterConn(conn,
                                       driver->domainEventState);
     qemuProcessAutoDestroyRun(driver, conn);
+    qemuDriverCloseCallbackRunAll(driver, conn);
     qemuDriverUnlock(driver);

     conn->privateData = NULL;

-- 
1.7.8.5
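
[As an aside before the review below, the expected lifecycle of this API can be sketched as follows. This is an illustration only; qemuMyCloseCallback is a hypothetical name, while patches 4/5 and 5/5 add the real users.]

/* Hypothetical user of the close callback API.  The callback must
 * return the domain object (still locked) or NULL if it dropped the
 * last reference to it; qemuDriverCloseCallbackRun unlocks it. */
static virDomainObjPtr
qemuMyCloseCallback(struct qemud_driver *driver,
                    virDomainObjPtr vm,
                    virConnectPtr conn)
{
    /* abort whatever work was being done on behalf of conn */
    return vm;
}

/* when starting work tied to the lifetime of conn */
if (qemuDriverCloseCallbackSet(driver, vm, conn, qemuMyCloseCallback) < 0)
    goto error;

/* when the work finishes normally; otherwise qemudClose invokes the
 * callback via qemuDriverCloseCallbackRunAll when conn is closed */
qemuDriverCloseCallbackUnset(driver, vm, qemuMyCloseCallback);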

On 03/19/2012 10:18 AM, Jiri Denemark wrote:
> Add support for registering an arbitrary callback to be called for a domain when a connection gets closed.
> ---
>  src/qemu/qemu_conf.c   |  172 ++++++++++++++++++++++++++++++++++++++++++++++
>  src/qemu/qemu_conf.h   |   27 ++++++++
>  src/qemu/qemu_driver.c |    5 ++
>  3 files changed, 204 insertions(+), 0 deletions(-)
> +    /* Mapping of 'char *uuidstr' -> qemuDriverCloseDefPtr of domains
> +     * which want a specific cleanup to be done when a connection is
> +     * closed.  Such cleanup may be to automatically destroy the
Are the two spaces intentional? Other than that, ACK.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

---
 src/qemu/qemu_conf.h    |    5 --
 src/qemu/qemu_driver.c  |    5 --
 src/qemu/qemu_process.c |  107 ++++++++++------------------------------------
 3 files changed, 24 insertions(+), 93 deletions(-)

diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h
index a22ce0c..3306014 100644
--- a/src/qemu/qemu_conf.h
+++ b/src/qemu/qemu_conf.h
@@ -141,11 +141,6 @@ struct qemud_driver {

     virLockManagerPluginPtr lockManager;

-    /* Mapping of 'char *uuidstr' -> virConnectPtr
-     * of guests which will be automatically killed
-     * when the virConnectPtr is closed*/
-    virHashTablePtr autodestroy;
-
     /* Mapping of 'char *uuidstr' -> qemuDriverCloseDefPtr of domains
      * which want a specific cleanup to be done when a connection is
      * closed.  Such cleanup may be to automatically destroy the
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index ce82535..3a5ef09 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -659,9 +659,6 @@ qemudStartup(int privileged) {
         qemu_driver->hugepage_path = mempath;
     }

-    if (qemuProcessAutoDestroyInit(qemu_driver) < 0)
-        goto error;
-
     if (qemuDriverCloseCallbackInit(qemu_driver) < 0)
         goto error;

@@ -803,7 +800,6 @@ qemudShutdown(void) {

     virSysinfoDefFree(qemu_driver->hostsysinfo);

-    qemuProcessAutoDestroyShutdown(qemu_driver);
     qemuDriverCloseCallbackShutdown(qemu_driver);

     VIR_FREE(qemu_driver->configDir);
@@ -925,7 +921,6 @@ static int qemudClose(virConnectPtr conn) {
     qemuDriverLock(driver);
     virDomainEventStateDeregisterConn(conn,
                                       driver->domainEventState);
-    qemuProcessAutoDestroyRun(driver, conn);
     qemuDriverCloseCallbackRunAll(driver, conn);
     qemuDriverUnlock(driver);

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 1945864..8915a9a 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -4120,124 +4120,65 @@ cleanup:
 }

-int qemuProcessAutoDestroyInit(struct qemud_driver *driver)
+static virDomainObjPtr
+qemuProcessAutoDestroy(struct qemud_driver *driver,
+                       virDomainObjPtr dom,
+                       virConnectPtr conn)
 {
-    if (!(driver->autodestroy = virHashCreate(5, NULL)))
-        return -1;
-
-    return 0;
-}
-
-struct qemuProcessAutoDestroyData {
-    struct qemud_driver *driver;
-    virConnectPtr conn;
-};
-
-static void qemuProcessAutoDestroyDom(void *payload,
-                                      const void *name,
-                                      void *opaque)
-{
-    struct qemuProcessAutoDestroyData *data = opaque;
-    virConnectPtr conn = payload;
-    const char *uuidstr = name;
-    unsigned char uuid[VIR_UUID_BUFLEN];
-    virDomainObjPtr dom;
-    qemuDomainObjPrivatePtr priv;
+    qemuDomainObjPrivatePtr priv = dom->privateData;
     virDomainEventPtr event = NULL;

-    VIR_DEBUG("conn=%p uuidstr=%s thisconn=%p", conn, uuidstr, data->conn);
+    VIR_DEBUG("vm=%s, conn=%p", dom->def->name, conn);

-    if (data->conn != conn)
-        return;
-
-    if (virUUIDParse(uuidstr, uuid) < 0) {
-        VIR_WARN("Failed to parse %s", uuidstr);
-        return;
-    }
-
-    if (!(dom = virDomainFindByUUID(&data->driver->domains,
-                                    uuid))) {
-        VIR_DEBUG("No domain object to kill");
-        return;
-    }
-
-    priv = dom->privateData;
     if (priv->job.asyncJob) {
         VIR_DEBUG("vm=%s has long-term job active, cancelling",
                   dom->def->name);
-        qemuDomainObjDiscardAsyncJob(data->driver, dom);
+        qemuDomainObjDiscardAsyncJob(driver, dom);
     }

-    if (qemuDomainObjBeginJobWithDriver(data->driver, dom,
+    if (qemuDomainObjBeginJobWithDriver(driver, dom,
                                         QEMU_JOB_DESTROY) < 0)
         goto cleanup;

     VIR_DEBUG("Killing domain");

-    qemuProcessStop(data->driver, dom, 1, VIR_DOMAIN_SHUTOFF_DESTROYED);
+    qemuProcessStop(driver, dom, 1, VIR_DOMAIN_SHUTOFF_DESTROYED);
     virDomainAuditStop(dom, "destroyed");
     event = virDomainEventNewFromObj(dom,
                                      VIR_DOMAIN_EVENT_STOPPED,
                                      VIR_DOMAIN_EVENT_STOPPED_DESTROYED);

-    if (qemuDomainObjEndJob(data->driver, dom) == 0)
+
+    if (qemuDomainObjEndJob(driver, dom) == 0)
         dom = NULL;
     if (dom && !dom->persistent)
-        qemuDomainRemoveInactive(data->driver, dom);
-
-cleanup:
-    if (dom)
-        virDomainObjUnlock(dom);
+        qemuDomainRemoveInactive(driver, dom);

     if (event)
-        qemuDomainEventQueue(data->driver, event);
-    virHashRemoveEntry(data->driver->autodestroy, uuidstr);
-}
-
-/*
- * Precondition: driver is locked
- */
-void qemuProcessAutoDestroyRun(struct qemud_driver *driver, virConnectPtr conn)
-{
-    struct qemuProcessAutoDestroyData data = {
-        driver, conn
-    };
-    VIR_DEBUG("conn=%p", conn);
-    virHashForEach(driver->autodestroy, qemuProcessAutoDestroyDom, &data);
-}
+        qemuDomainEventQueue(driver, event);

-void qemuProcessAutoDestroyShutdown(struct qemud_driver *driver)
-{
-    virHashFree(driver->autodestroy);
+cleanup:
+    return dom;
 }

 int qemuProcessAutoDestroyAdd(struct qemud_driver *driver,
                               virDomainObjPtr vm,
                               virConnectPtr conn)
 {
-    char uuidstr[VIR_UUID_STRING_BUFLEN];
-    virUUIDFormat(vm->def->uuid, uuidstr);
-    VIR_DEBUG("vm=%s uuid=%s conn=%p", vm->def->name, uuidstr, conn);
-    if (virHashAddEntry(driver->autodestroy, uuidstr, conn) < 0)
-        return -1;
-    return 0;
+    VIR_DEBUG("vm=%s, conn=%p", vm->def->name, conn);
+    return qemuDriverCloseCallbackSet(driver, vm, conn,
+                                      qemuProcessAutoDestroy);
 }

 int qemuProcessAutoDestroyRemove(struct qemud_driver *driver,
                                  virDomainObjPtr vm)
 {
-    char uuidstr[VIR_UUID_STRING_BUFLEN];
-    virUUIDFormat(vm->def->uuid, uuidstr);
-    VIR_DEBUG("vm=%s uuid=%s", vm->def->name, uuidstr);
-    if (virHashRemoveEntry(driver->autodestroy, uuidstr) < 0)
-        return -1;
-    return 0;
+    VIR_DEBUG("vm=%s", vm->def->name);
+    return qemuDriverCloseCallbackUnset(driver, vm, qemuProcessAutoDestroy);
 }

 bool qemuProcessAutoDestroyActive(struct qemud_driver *driver,
                                   virDomainObjPtr vm)
 {
-    char uuidstr[VIR_UUID_STRING_BUFLEN];
-    virUUIDFormat(vm->def->uuid, uuidstr);
-    VIR_DEBUG("vm=%s uuid=%s", vm->def->name, uuidstr);
-    if (virHashLookup(driver->autodestroy, uuidstr) != NULL)
-        return true;
-    return false;
+    qemuDriverCloseCallback cb;
+    VIR_DEBUG("vm=%s", vm->def->name);
+    cb = qemuDriverCloseCallbackGet(driver, vm, NULL);
+    return cb == qemuProcessAutoDestroy;
 }

-- 
1.7.8.5

On 03/19/2012 10:18 AM, Jiri Denemark wrote:
> ---
>  src/qemu/qemu_conf.h    |    5 --
>  src/qemu/qemu_driver.c  |    5 --
>  src/qemu/qemu_process.c |  107 ++++++++++------------------------------------
>  3 files changed, 24 insertions(+), 93 deletions(-)
Good! I was thinking about this very cleanup myself as I reviewed 3/5, and I agree with your decision to separate it into different patches.

ACK.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

When a client which started a non-p2p migration dies at a bad time, the source libvirtd never clears the migration job and almost nothing can be done with the domain without restarting the daemon. This patch makes use of connection close callbacks and ensures that the migration job is properly discarded when the client disconnects.
---
 src/qemu/qemu_driver.c    |    4 +++
 src/qemu/qemu_migration.c |   66 +++++++++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_migration.h |    4 +++
 3 files changed, 74 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 3a5ef09..c2622d3 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -8838,6 +8838,9 @@ qemuDomainMigrateBegin3(virDomainPtr domain,
          * This prevents any other APIs being invoked while migration is taking
          * place.
          */
+        if (qemuDriverCloseCallbackSet(driver, vm, domain->conn,
+                                       qemuMigrationCleanup) < 0)
+            goto endjob;
         if (qemuMigrationJobContinue(vm) == 0) {
             vm = NULL;
             qemuReportError(VIR_ERR_OPERATION_FAILED,
@@ -9069,6 +9072,7 @@ qemuDomainMigrateConfirm3(virDomainPtr domain,
         phase = QEMU_MIGRATION_PHASE_CONFIRM3;

     qemuMigrationJobStartPhase(driver, vm, phase);
+    qemuDriverCloseCallbackUnset(driver, vm, qemuMigrationCleanup);

     ret = qemuMigrationConfirm(driver, domain->conn, vm,
                                cookiein, cookieinlen,
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 4eb3bf4..e02ba86 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -1037,6 +1037,67 @@ qemuDomainMigrateGraphicsRelocate(struct qemud_driver *driver,
 }

+/* This is called for outgoing non-p2p migrations when a connection to the
+ * client which initiated the migration was closed and we are waiting for it
+ * to follow up with the next phase, that is, in between
+ * virDomainMigrateBegin3 and qemuDomainMigratePerform3 or
+ * qemuDomainMigratePerform3 and qemuDomainMigrateConfirm3.
+ */
+virDomainObjPtr
+qemuMigrationCleanup(struct qemud_driver *driver,
+                     virDomainObjPtr vm,
+                     virConnectPtr conn)
+{
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+
+    VIR_DEBUG("vm=%s, conn=%p, asyncJob=%s, phase=%s",
+              vm->def->name, conn,
+              qemuDomainAsyncJobTypeToString(priv->job.asyncJob),
+              qemuDomainAsyncJobPhaseToString(priv->job.asyncJob,
+                                              priv->job.phase));
+
+    if (!qemuMigrationJobIsActive(vm, QEMU_ASYNC_JOB_MIGRATION_OUT))
+        goto cleanup;
+
+    VIR_DEBUG("The connection which started outgoing migration of domain %s"
+              " was closed; cancelling the migration",
+              vm->def->name);
+
+    switch ((enum qemuMigrationJobPhase) priv->job.phase) {
+    case QEMU_MIGRATION_PHASE_BEGIN3:
+        /* just forget we were about to migrate */
+        qemuDomainObjDiscardAsyncJob(driver, vm);
+        break;
+
+    case QEMU_MIGRATION_PHASE_PERFORM3_DONE:
+        VIR_WARN("Migration of domain %s finished but we don't know if the"
+                 " domain was successfully started on destination or not",
+                 vm->def->name);
+        /* clear the job and let higher levels decide what to do */
+        qemuDomainObjDiscardAsyncJob(driver, vm);
+        break;
+
+    case QEMU_MIGRATION_PHASE_PERFORM3:
+        /* cannot be seen without an active migration API; unreachable */
+    case QEMU_MIGRATION_PHASE_CONFIRM3:
+    case QEMU_MIGRATION_PHASE_CONFIRM3_CANCELLED:
+        /* all done; unreachable */
+    case QEMU_MIGRATION_PHASE_PREPARE:
+    case QEMU_MIGRATION_PHASE_FINISH2:
+    case QEMU_MIGRATION_PHASE_FINISH3:
+        /* incoming migration; unreachable */
+    case QEMU_MIGRATION_PHASE_PERFORM2:
+        /* single phase outgoing migration; unreachable */
+    case QEMU_MIGRATION_PHASE_NONE:
+    case QEMU_MIGRATION_PHASE_LAST:
+        /* unreachable */
+        ;
+    }
+
+cleanup:
+    return vm;
+}
+
 /* The caller is supposed to lock the vm and start a migration job. */
 char *qemuMigrationBegin(struct qemud_driver *driver,
                          virDomainObjPtr vm,
@@ -2547,6 +2608,7 @@ qemuMigrationPerformPhase(struct qemud_driver *driver,
     }

     qemuMigrationJobStartPhase(driver, vm, QEMU_MIGRATION_PHASE_PERFORM3);
+    qemuDriverCloseCallbackUnset(driver, vm, qemuMigrationCleanup);

     resume = virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING;
     ret = doNativeMigrate(driver, vm, uri, cookiein, cookieinlen,
@@ -2577,6 +2639,10 @@ qemuMigrationPerformPhase(struct qemud_driver *driver,
     qemuMigrationJobSetPhase(driver, vm,
                              QEMU_MIGRATION_PHASE_PERFORM3_DONE);

+    if (qemuDriverCloseCallbackSet(driver, vm, conn,
+                                   qemuMigrationCleanup) < 0)
+        goto endjob;
+
 endjob:
     if (ret < 0)
         refs = qemuMigrationJobFinish(driver, vm);
diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h
index 41e4eac..5fab0ca 100644
--- a/src/qemu/qemu_migration.h
+++ b/src/qemu/qemu_migration.h
@@ -77,6 +77,10 @@ int qemuMigrationJobFinish(struct qemud_driver *driver, virDomainObjPtr obj)
 int qemuMigrationSetOffline(struct qemud_driver *driver,
                             virDomainObjPtr vm);

+virDomainObjPtr qemuMigrationCleanup(struct qemud_driver *driver,
+                                     virDomainObjPtr vm,
+                                     virConnectPtr conn);
+
 char *qemuMigrationBegin(struct qemud_driver *driver,
                          virDomainObjPtr vm,
                          const char *xmlin,

-- 
1.7.8.5

On 03/19/2012 10:19 AM, Jiri Denemark wrote:
> When a client which started a non-p2p migration dies at a bad time, the source libvirtd never clears the migration job and almost nothing can be done with the domain without restarting the daemon. This patch makes use of connection close callbacks and ensures that the migration job is properly discarded when the client disconnects.
> ---
>  src/qemu/qemu_driver.c    |    4 +++
>  src/qemu/qemu_migration.c |   66 +++++++++++++++++++++++++++++++++++++++++++++
>  src/qemu/qemu_migration.h |    4 +++
>  3 files changed, 74 insertions(+), 0 deletions(-)
Can't have been a fun bug to track down, but I'm glad you found it.
> +/* This is called for outgoing non-p2p migrations when a connection to the
> + * client which initiated the migration was closed and we are waiting for it
s/and we are waiting/but we were waiting/
> +
> +    VIR_DEBUG("The connection which started outgoing migration of domain %s"
> +              " was closed; cancelling the migration",
I think that in user-visible messages, we have been favoring US spelling (canceling) over UK spelling (cancelling); but VIR_DEBUG is borderline on whether it is user-visible.

ACK.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On Tue, Mar 20, 2012 at 16:33:49 -0600, Eric Blake wrote:
> On 03/19/2012 10:19 AM, Jiri Denemark wrote:
>> When a client which started a non-p2p migration dies at a bad time, the source libvirtd never clears the migration job and almost nothing can be done with the domain without restarting the daemon. This patch makes use of connection close callbacks and ensures that the migration job is properly discarded when the client disconnects.
>> ---
>>  src/qemu/qemu_driver.c    |    4 +++
>>  src/qemu/qemu_migration.c |   66 +++++++++++++++++++++++++++++++++++++++++++++
>>  src/qemu/qemu_migration.h |    4 +++
>>  3 files changed, 74 insertions(+), 0 deletions(-)
> Can't have been a fun bug to track down, but I'm glad you found it.
Heh, I found it as a side effect of trying to reproduce the bug fixed by 2/5. Anyway, I usually like dealing with such beasts so it was actually fun for me :-)
>> +/* This is called for outgoing non-p2p migrations when a connection to the
>> + * client which initiated the migration was closed and we are waiting for it
> s/and we are waiting/but we were waiting/
>> +
>> +    VIR_DEBUG("The connection which started outgoing migration of domain %s"
>> +              " was closed; cancelling the migration",
> I think that in user-visible messages, we have been favoring US spelling (canceling) over UK spelling (cancelling); but VIR_DEBUG is borderline on whether it is user-visible.
Yeah. Double letters are always fun for me. Why is there only a single 'p' in "grouped" but a double 'p' in "stopped"? Is that because we know there is no word "groupe" and thus we don't need to distinguish which word "grouped" was derived from? (While both "stop" and "stope" exist.) If so, what happens if a "groupe" appears and starts to mean something? :-)

Jirka

On Mon, Mar 19, 2012 at 17:18:55 +0100, Jiri Denemark wrote:
> The libvirt daemon was not completely ready to lose the connection to the client (which may even be the source libvirtd) that is controlling the migration. We might end up with dangling migration jobs on both the source and destination daemons. More details about the scenarios in which this can happen are in patches 2/5 and 5/5, which fix the issue using the infrastructure set up by the other patches in this series.
> Jiri Denemark (5):
>   qemu: Add support for domain cleanup callbacks
>   qemu: Avoid dangling migration-in job on shutoff domains
>   qemu: Add connection close callbacks
>   qemu: Make autodestroy utilize connection close callbacks
>   qemu: Avoid dangling migration-out job when client dies
Pushed.

Jirka