[libvirt] [PATCH 0/5] Fix async job type when doing external snapshots and improve error message handling

This series adds error messages when trying to take a snapshot of a machine that has host devices assigned.

Peter Krempa (5):
  qemu: Restart CPUs with valid async job type when doing external snapshots
  qemu: Make qemuMigrationIsAllowed more reusable
  qemu: snapshot: Report better error message if migration isn't allowed
  qemu: reuse qemuMigrationIsAllowed when doing save and managedsave
  [OPTIONAL] qemu: allow save and managedsave on guests marked as autodestroyable

 src/qemu/qemu_driver.c    | 16 ++++++----------
 src/qemu/qemu_migration.c | 45 ++++++++++++++++++++++++++-------------------
 src/qemu/qemu_migration.h |  2 ++
 3 files changed, 34 insertions(+), 29 deletions(-)

-- 
1.8.0

When restarting CPUs after an external snapshot, the restarting function was called without the appropriate async job type. This meant that a new sync job wasn't created, which allowed races in the monitor.
---
 src/qemu/qemu_driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index e099c5c..6f8f840 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -11481,7 +11481,7 @@ endjob:
     if (resume && vm && virDomainObjIsActive(vm) &&
         qemuProcessStartCPUs(driver, vm, conn,
                              VIR_DOMAIN_RUNNING_UNPAUSED,
-                             QEMU_ASYNC_JOB_NONE) < 0) {
+                             QEMU_ASYNC_JOB_SNAPSHOT) < 0) {
         virDomainEventPtr event = NULL;
         event = virDomainEventNewFromObj(vm,
                                          VIR_DOMAIN_EVENT_SUSPENDED,
-- 
1.8.0

On 12/07/2012 12:29 PM, Peter Krempa wrote:
When restarting CPUs after an external snapshot, the restarting function was called without the appropriate async job type. This meant that a new sync job wasn't created, which allowed races in the monitor.
---
 src/qemu/qemu_driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index e099c5c..6f8f840 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -11481,7 +11481,7 @@ endjob:
     if (resume && vm && virDomainObjIsActive(vm) &&
         qemuProcessStartCPUs(driver, vm, conn,
                              VIR_DOMAIN_RUNNING_UNPAUSED,
-                             QEMU_ASYNC_JOB_NONE) < 0) {
+                             QEMU_ASYNC_JOB_SNAPSHOT) < 0) {
         virDomainEventPtr event = NULL;
         event = virDomainEventNewFromObj(vm,
                                          VIR_DOMAIN_EVENT_SUSPENDED,
Makes sense, ACK.

Martin

On 12/11/12 11:04, Martin Kletzander wrote:
On 12/07/2012 12:29 PM, Peter Krempa wrote:
When restarting CPUs after an external snapshot, the restarting function was called without the appropriate async job type. This meant that a new sync job wasn't created, which allowed races in the monitor.
---
 src/qemu/qemu_driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Pushed. Thanks. Peter

This patch exports qemuMigrationIsAllowed and adds a new parameter to it to denote whether it's a remote or a local migration. Local migrations are used in snapshots and saving of the machine state and have fewer restrictions. This patch also adjusts callers of the function and tweaks some error messages to be more universal.
---
 src/qemu/qemu_migration.c | 45 ++++++++++++++++++++++++++-------------------
 src/qemu/qemu_migration.h |  2 ++
 2 files changed, 28 insertions(+), 19 deletions(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 86060dc..7e5cf01 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -1011,30 +1011,38 @@ error:
  * assume that checking on source is sufficient to prevent ever
  * talking to the destination in the first place, we are stuck with
  * the fact that older servers did not do checks on the source. */
-static bool
+bool
 qemuMigrationIsAllowed(virQEMUDriverPtr driver, virDomainObjPtr vm,
-                       virDomainDefPtr def)
+                       virDomainDefPtr def, bool remote)
 {
     int nsnapshots;
     bool forbid;
     int i;
 
     if (vm) {
-        if (qemuProcessAutoDestroyActive(driver, vm)) {
-            virReportError(VIR_ERR_OPERATION_INVALID,
-                           "%s", _("domain is marked for auto destroy"));
-            return false;
-        }
-        if ((nsnapshots = virDomainSnapshotObjListNum(vm->snapshots, NULL,
-                                                      0))) {
-            virReportError(VIR_ERR_OPERATION_INVALID,
-                           _("cannot migrate domain with %d snapshots"),
-                           nsnapshots);
-            return false;
+        /* perform these checks only when migrating to remote hosts */
+        if (remote) {
+            if (qemuProcessAutoDestroyActive(driver, vm)) {
+                virReportError(VIR_ERR_OPERATION_INVALID,
+                               "%s", _("domain is marked for auto destroy"));
+                return false;
+            }
+
+            nsnapshots = virDomainSnapshotObjListNum(vm->snapshots, NULL, 0);
+            if (nsnapshots < 0)
+                return false;
+
+            if (nsnapshots > 0) {
+                virReportError(VIR_ERR_OPERATION_INVALID,
+                               _("cannot migrate domain with %d snapshots"),
+                               nsnapshots);
+                return false;
+            }
         }
+
         if (virDomainHasDiskMirror(vm)) {
             virReportError(VIR_ERR_OPERATION_INVALID, "%s",
-                           _("cannot migrate domain with active block job"));
+                           _("domain has an active block job"));
             return false;
         }
@@ -1055,8 +1063,7 @@ qemuMigrationIsAllowed(virQEMUDriverPtr driver, virDomainObjPtr vm,
     }
     if (forbid) {
         virReportError(VIR_ERR_OPERATION_INVALID, "%s",
-                       _("Domain with assigned non-USB host devices "
-                         "cannot be migrated"));
+                       _("domain has assigned non-USB host devices"));
         return false;
     }
@@ -1428,7 +1435,7 @@ char *qemuMigrationBegin(virQEMUDriverPtr driver,
     if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT)
         qemuMigrationJobSetPhase(driver, vm, QEMU_MIGRATION_PHASE_BEGIN3);
 
-    if (!qemuMigrationIsAllowed(driver, vm, NULL))
+    if (!qemuMigrationIsAllowed(driver, vm, NULL, true))
         goto cleanup;
 
     if (!(flags & VIR_MIGRATE_UNSAFE) && !qemuMigrationIsSafe(vm->def))
@@ -1521,7 +1528,7 @@ qemuMigrationPrepareAny(virQEMUDriverPtr driver,
                                         VIR_DOMAIN_XML_INACTIVE)))
         goto cleanup;
 
-    if (!qemuMigrationIsAllowed(driver, NULL, def))
+    if (!qemuMigrationIsAllowed(driver, NULL, def, true))
         goto cleanup;
 
     /* Target domain name, maybe renamed. */
@@ -2927,7 +2934,7 @@ qemuMigrationPerformJob(virQEMUDriverPtr driver,
         goto endjob;
     }
 
-    if (!qemuMigrationIsAllowed(driver, vm, NULL))
+    if (!qemuMigrationIsAllowed(driver, vm, NULL, true))
         goto endjob;
 
     if (!(flags & VIR_MIGRATE_UNSAFE) && !qemuMigrationIsSafe(vm->def))
diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h
index 62e39a0..5751ea5 100644
--- a/src/qemu/qemu_migration.h
+++ b/src/qemu/qemu_migration.h
@@ -144,6 +144,8 @@ int qemuMigrationConfirm(virQEMUDriverPtr driver,
                          unsigned int flags,
                          int retcode);
+bool qemuMigrationIsAllowed(virQEMUDriverPtr driver, virDomainObjPtr vm,
+                            virDomainDefPtr def, bool remote);
 int qemuMigrationToFile(virQEMUDriverPtr driver, virDomainObjPtr vm,
                         int fd, off_t offset, const char *path,
-- 
1.8.0

On 12/07/2012 04:30 AM, Peter Krempa wrote:
This patch exports qemuMigrationIsAllowed and adds a new parameter to it to denote if it's a remote migration or a local migration. Local migrations are used in snapshots and saving of the machine state and have fewer restrictions. This patch also adjusts callers of the function and tweaks some error messages to be more universal. --- src/qemu/qemu_migration.c | 45 ++++++++++++++++++++++++++------------------- src/qemu/qemu_migration.h | 2 ++ 2 files changed, 28 insertions(+), 19 deletions(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 86060dc..7e5cf01 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -1011,30 +1011,38 @@ error:
  * assume that checking on source is sufficient to prevent ever
  * talking to the destination in the first place, we are stuck with
  * the fact that older servers did not do checks on the source. */
-static bool
+bool
 qemuMigrationIsAllowed(virQEMUDriverPtr driver, virDomainObjPtr vm,
-                       virDomainDefPtr def)
+                       virDomainDefPtr def, bool remote)
 {
+        /* perform these checks only when migrating to remote hosts */
+        if (remote) {
+            if (qemuProcessAutoDestroyActive(driver, vm)) {
+                virReportError(VIR_ERR_OPERATION_INVALID,
+                               "%s", _("domain is marked for auto destroy"));
+                return false;
+            }
The check for autodestroy should be universal - we don't want to permit migration-to-file on an autodestroy guest.
+
+            nsnapshots = virDomainSnapshotObjListNum(vm->snapshots, NULL, 0);
+            if (nsnapshots < 0)
+                return false;
+
+            if (nsnapshots > 0) {
+                virReportError(VIR_ERR_OPERATION_INVALID,
+                               _("cannot migrate domain with %d snapshots"),
+                               nsnapshots);
+                return false;
+            }
Agree with making this remote-only (and even then, it would be nice to someday lift this restriction, if I can figure out a way to migrate an arbitrary amount of snapshot metadata during migration. The problem we have is that the migration cookie is fixed size, but if you take enough snapshots, you can exceed that size).

ACK with the autodestroy hunk hoisted out of the 'if (remote)' block.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On 12/11/12 17:17, Eric Blake wrote:
On 12/07/2012 04:30 AM, Peter Krempa wrote:
This patch exports qemuMigrationIsAllowed and adds a new parameter to it to denote if it's a remote migration or a local migration. Local migrations are used in snapshots and saving of the machine state and have fewer restrictions. This patch also adjusts callers of the function and tweaks some error messages to be more universal. --- src/qemu/qemu_migration.c | 45 ++++++++++++++++++++++++++------------------- src/qemu/qemu_migration.h | 2 ++ 2 files changed, 28 insertions(+), 19 deletions(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 86060dc..7e5cf01 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1011,30 +1011,38 @@ error:
[...]
Agree with making this remote-only (and even then, it would be nice to someday lift this restriction, if I can figure out a way to migrate an arbitrary amount of snapshot metadata during migration. The problem we have is that the migration cookie is fixed size, but if you take enough snapshots, you can exceed that size).
ACK with the autodestroy hunk hoisted out of the 'if (remote)' block.
I moved the hunk to the unconditional part and pushed this patch. Thanks. Peter

Qemu doesn't support migration on guests with host devices. This patch adds a check to ensure migration is safe before actually doing so.
---
 src/qemu/qemu_driver.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 6f8f840..4423e69 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -11428,6 +11428,10 @@ qemuDomainSnapshotCreateActiveExternal(virConnectPtr conn,
 
     /* do the memory snapshot if necessary */
     if (memory) {
+        /* check if migration is possible */
+        if (!qemuMigrationIsAllowed(driver, vm, vm->def, false))
+            goto endjob;
+
         /* allow the migration job to be cancelled or the domain to be paused */
         qemuDomainObjSetAsyncJobMask(vm, DEFAULT_JOB_MASK |
                                          JOB_MASK(QEMU_JOB_SUSPEND) |
-- 
1.8.0

On 12/07/2012 04:30 AM, Peter Krempa wrote:
Qemu doesn't support migration on guests with host devices. This patch adds a check to ensure migration is safe before actualy doing so.
s/actualy/actually/
---
 src/qemu/qemu_driver.c | 4 ++++
 1 file changed, 4 insertions(+)
ACK.
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 6f8f840..4423e69 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -11428,6 +11428,10 @@ qemuDomainSnapshotCreateActiveExternal(virConnectPtr conn,
 
     /* do the memory snapshot if necessary */
     if (memory) {
+        /* check if migration is possible */
+        if (!qemuMigrationIsAllowed(driver, vm, vm->def, false))
+            goto endjob;
+
         /* allow the migration job to be cancelled or the domain to be paused */
         qemuDomainObjSetAsyncJobMask(vm, DEFAULT_JOB_MASK |
                                          JOB_MASK(QEMU_JOB_SUSPEND) |
-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On 12/11/12 17:17, Eric Blake wrote:
On 12/07/2012 04:30 AM, Peter Krempa wrote:
Qemu doesn't support migration on guests with host devices. This patch adds a check to ensure migration is safe before actualy doing so.
s/actualy/actually/
--- src/qemu/qemu_driver.c | 4 ++++ 1 file changed, 4 insertions(+)
ACK.
I fixed the commit message and pushed this patch. Thanks. Peter

Save and managedsave both use migration to file. This patch reuses qemuMigrationIsAllowed to check if the migration could happen before trying.
---
 src/qemu/qemu_driver.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 4423e69..7e19cdc 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -3010,11 +3010,9 @@ qemuDomainSaveInternal(virQEMUDriverPtr driver, virDomainPtr dom,
                        "%s", _("domain is marked for auto destroy"));
         goto cleanup;
     }
-    if (virDomainHasDiskMirror(vm)) {
-        virReportError(VIR_ERR_BLOCK_COPY_ACTIVE, "%s",
-                       _("domain has active block copy job"));
+
+    if (!qemuMigrationIsAllowed(driver, vm, vm->def, false))
         goto cleanup;
-    }
 
     if (qemuDomainObjBeginAsyncJobWithDriver(driver, vm,
                                              QEMU_ASYNC_JOB_SAVE) < 0)
-- 
1.8.0

On 12/07/2012 04:30 AM, Peter Krempa wrote:
Save and managedsave use both migration to file. This patch reuses
s/use both/both use/
qemuMigrationIsAllowed to check if the migration could happen before trying.
---
 src/qemu/qemu_driver.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
ACK.
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 4423e69..7e19cdc 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -3010,11 +3010,9 @@ qemuDomainSaveInternal(virQEMUDriverPtr driver, virDomainPtr dom,
                        "%s", _("domain is marked for auto destroy"));
         goto cleanup;
     }
-    if (virDomainHasDiskMirror(vm)) {
-        virReportError(VIR_ERR_BLOCK_COPY_ACTIVE, "%s",
-                       _("domain has active block copy job"));
+
+    if (!qemuMigrationIsAllowed(driver, vm, vm->def, false))
         goto cleanup;
-    }
 
     if (qemuDomainObjBeginAsyncJobWithDriver(driver, vm,
                                              QEMU_ASYNC_JOB_SAVE) < 0)
-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On 12/11/12 17:18, Eric Blake wrote:
On 12/07/2012 04:30 AM, Peter Krempa wrote:
Save and managedsave use both migration to file. This patch reuses
s/use both/both use/
qemuMigrationIsAllowed to check if the migration could happen before trying.
---
 src/qemu/qemu_driver.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
ACK.
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 4423e69..7e19cdc 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -3010,11 +3010,9 @@ qemuDomainSaveInternal(virQEMUDriverPtr driver, virDomainPtr dom,
                        "%s", _("domain is marked for auto destroy"));
         goto cleanup;
Now that patch 2/5 makes the autodestroy check unconditional, this check is redundant here. I squashed in the changes from 5/5, getting rid of this one instance without changing semantics.
     }
-    if (virDomainHasDiskMirror(vm)) {
-        virReportError(VIR_ERR_BLOCK_COPY_ACTIVE, "%s",
-                       _("domain has active block copy job"));
+
+    if (!qemuMigrationIsAllowed(driver, vm, vm->def, false))
         goto cleanup;
-    }
 
     if (qemuDomainObjBeginAsyncJobWithDriver(driver, vm,
                                              QEMU_ASYNC_JOB_SAVE) < 0)
and pushed the patch. Thanks for the review. Peter

I don't see any reason to forbid saving machines that should be autodestroyed. I will squash this into the previous patch if this is okay.
---
 src/qemu/qemu_driver.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 7e19cdc..e5b6295 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -3005,12 +3005,6 @@ qemuDomainSaveInternal(virQEMUDriverPtr driver, virDomainPtr dom,
     virDomainEventPtr event = NULL;
     qemuDomainObjPrivatePtr priv = vm->privateData;
 
-    if (qemuProcessAutoDestroyActive(driver, vm)) {
-        virReportError(VIR_ERR_OPERATION_INVALID,
-                       "%s", _("domain is marked for auto destroy"));
-        goto cleanup;
-    }
-
     if (!qemuMigrationIsAllowed(driver, vm, vm->def, false))
         goto cleanup;
-- 
1.8.0

2012/12/7 Peter Krempa <pkrempa@redhat.com>
I don't see any reason to forbid saving machines that should be autodestroyed. I will squash this into the previous patch if this is okay.
---
 src/qemu/qemu_driver.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 7e19cdc..e5b6295 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -3005,12 +3005,6 @@ qemuDomainSaveInternal(virQEMUDriverPtr driver, virDomainPtr dom,
     virDomainEventPtr event = NULL;
     qemuDomainObjPrivatePtr priv = vm->privateData;
 
-    if (qemuProcessAutoDestroyActive(driver, vm)) {
-        virReportError(VIR_ERR_OPERATION_INVALID,
-                       "%s", _("domain is marked for auto destroy"));
-        goto cleanup;
-    }
-
     if (!qemuMigrationIsAllowed(driver, vm, vm->def, false))
         goto cleanup;
-- 
1.8.0
I think that if a VM is marked as autodestroy, it means we do not need the VM again after the next reboot. If so, what's the reason for us to save this VM? And if we want to save it, why did we mark it as autodestroy? ^ ^

On 12/07/12 14:19, Gao Yongwei wrote:
2012/12/7 Peter Krempa <pkrempa@redhat.com <mailto:pkrempa@redhat.com>>
[...]
I think that if a VM is marked as autodestroy, it means we do not need the VM again after the next reboot. If so, what's the reason for us to save this VM? And if we want to save it, why did we mark it as autodestroy? ^ ^
Well, autodestroy means that the guest is destroyed when the current libvirt connection (the connection that started the guest) is terminated. I don't see any reason why the user shouldn't be allowed to save the guest if he explicitly wishes to do so.

Peter

On 12/07/2012 06:27 AM, Peter Krempa wrote:
On 12/07/12 14:19, Gao Yongwei wrote:
2012/12/7 Peter Krempa <pkrempa@redhat.com <mailto:pkrempa@redhat.com>>
[...]
I think that if a VM is marked as autodestroy, it means we do not need the VM again after the next reboot. If so, what's the reason for us to save this VM? And if we want to save it, why did we mark it as autodestroy? ^ ^
Well, autodestroy means that the guest is destroyed when the current libvirt connection (the connection that started the guest) is terminated. I don't see any reason why the user shouldn't be allowed to save the guest if he explicitly wishes to do so.
Forbidding saves of autodestroy guests was an intentional design decision. If you don't want the guest to survive a libvirtd restart, then you should not need to save the guest for any other reason, either. NACK to this patch.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
participants (4):
- Eric Blake
- Gao Yongwei
- Martin Kletzander
- Peter Krempa