On 11/01/2012 10:22 AM, Peter Krempa wrote:
When pausing the guest while migration is running (to speed up
convergence) the virDomainSuspend API checks if the migration job is
active before entering the job. This could cause a possible race if the
virDomainSuspend is called while the job is active but ends before the
Suspend API enters the job (this would require that the migration is
aborted). This would cause a incorrect event to be emitted.
Not just aborted, but also completed normally. At any rate, I agree
that we shouldn't be relying on the state of job.asyncJob until we have
obtained the sync job.
---
src/qemu/qemu_driver.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 7dedb96..98dc441 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -1658,14 +1658,6 @@ static int qemudDomainSuspend(virDomainPtr dom) {
priv = vm->privateData;
- if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT) {
- reason = VIR_DOMAIN_PAUSED_MIGRATION;
- eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED;
- } else {
- reason = VIR_DOMAIN_PAUSED_USER;
- eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_PAUSED;
- }
-
if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_SUSPEND) < 0)
goto cleanup;
@@ -1675,6 +1667,14 @@ static int qemudDomainSuspend(virDomainPtr dom) {
goto endjob;
}
+ if (priv->job.asyncJob == QEMU_ASYNC_JOB_MIGRATION_OUT) {
+ reason = VIR_DOMAIN_PAUSED_MIGRATION;
+ eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED;
+ } else {
+ reason = VIR_DOMAIN_PAUSED_USER;
+ eventDetail = VIR_DOMAIN_EVENT_SUSPENDED_PAUSED;
+ }
+
I think we're missing something. The idea here is that if the user
starts an async job, then they can request pause but not resume until
that async job has completed. How do we make sure resume is blocked?
By setting reason to something other than VIR_DOMAIN_PAUSED_USER.
The existing code worked for outgoing live migration by setting up the
async job, then if the user requests pause, sets things to the same
state as if the user had requested paused migration in the first place.
So in a similar vein - if the user requests live snapshot, but then
changes their mind and pauses, we should set things to the same state as
if they had requested paused snapshot in the first place.
[time passes, as well as an IRC chat...]
Oh, you did exactly that in patch 10/20!
In that case, this code is fixing a pre-existing bug, in preparation for
the work in patch 10. ACK as-is.
--
Eric Blake eblake(a)redhat.com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org