[libvirt] [PATCH 0/7] Fix domain state handling when reverting snapshots in PMSUSPENDED state

Unfortunately qemu behaves strangely so it probably can't be fully verified :/ Peter Krempa (7): event: Add transition reason for PMSUSPENDED state from snapshot ops lib: Add reason for a domain reaching the PMSUSPENDED state qemu: snapshot: Convert if-else switch to switch statement qemu: snapshot: Reject revertion from clearly bad states qemu: snapshot: Add helper to generate lifecycle events qemu: snapshot: Refactor event creation when reverting snapshots qemu: snapshot: Correctly revert snapshots in PMSUSPENDED state examples/object-events/event-test.c | 3 + include/libvirt/libvirt.h.in | 6 + src/conf/domain_conf.c | 3 +- src/qemu/qemu_driver.c | 301 +++++++++++++++++++++++++----------- tools/virsh-domain-monitor.c | 3 +- tools/virsh-domain.c | 3 +- 6 files changed, 228 insertions(+), 91 deletions(-) -- 2.0.0

Add event reason to support transitions to PM suspended state from snapshot revert operations. --- examples/object-events/event-test.c | 3 +++ include/libvirt/libvirt.h.in | 3 +++ tools/virsh-domain.c | 3 ++- 3 files changed, 8 insertions(+), 1 deletion(-) diff --git a/examples/object-events/event-test.c b/examples/object-events/event-test.c index 2a5a83b..84c878e 100644 --- a/examples/object-events/event-test.c +++ b/examples/object-events/event-test.c @@ -210,6 +210,9 @@ static const char *eventDetailToString(int event, int detail) { case VIR_DOMAIN_EVENT_PMSUSPENDED_DISK: ret = "Disk"; break; + case VIR_DOMAIN_EVENT_PMSUSPENDED_FROM_SNAPSHOT: + ret = "Snapshot"; + break; } break; case VIR_DOMAIN_EVENT_CRASHED: diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index ad6785f..af339f2 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -3693,6 +3693,9 @@ typedef enum { typedef enum { VIR_DOMAIN_EVENT_PMSUSPENDED_MEMORY = 0, /* Guest was PM suspended to memory */ VIR_DOMAIN_EVENT_PMSUSPENDED_DISK = 1, /* Guest was PM suspended to disk */ + VIR_DOMAIN_EVENT_PMSUSPENDED_FROM_SNAPSHOT = 2, /* Guest changed state to + PM suspended due to + snapshot operation */ #ifdef VIR_ENUM_SENTINELS VIR_DOMAIN_EVENT_PMSUSPENDED_LAST diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index ba47258..cade830 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -10686,7 +10686,8 @@ VIR_ENUM_DECL(vshDomainEventPMSuspended) VIR_ENUM_IMPL(vshDomainEventPMSuspended, VIR_DOMAIN_EVENT_PMSUSPENDED_LAST, N_("Memory"), - N_("Disk")) + N_("Disk"), + N_("Snapshot")) VIR_ENUM_DECL(vshDomainEventCrashed) VIR_ENUM_IMPL(vshDomainEventCrashed, -- 2.0.0

On 07/18/2014 10:11 AM, Peter Krempa wrote:
Add event reason to support transitions to PM suspended state from snapshot revert operations. --- examples/object-events/event-test.c | 3 +++ include/libvirt/libvirt.h.in | 3 +++ tools/virsh-domain.c | 3 ++- 3 files changed, 8 insertions(+), 1 deletion(-)
+++ b/include/libvirt/libvirt.h.in @@ -3693,6 +3693,9 @@ typedef enum { typedef enum { VIR_DOMAIN_EVENT_PMSUSPENDED_MEMORY = 0, /* Guest was PM suspended to memory */ VIR_DOMAIN_EVENT_PMSUSPENDED_DISK = 1, /* Guest was PM suspended to disk */ + VIR_DOMAIN_EVENT_PMSUSPENDED_FROM_SNAPSHOT = 2, /* Guest changed state to + PM suspended due to + snapshot operation */
Umm, is this even possible with qemu? My understanding is that qemu is unable to migrate S3 status, and therefore, the moment you migrate to another machine or to a file, the receiving end of the migration will act as if the guest had a PM wakeup event and is no longer in S3. If you are proposing immediately re-suspending the guest into S3 after resuming it, that's wrong - we should only support states that can be entered immediately, and not states that require an unknown amount of guest CPU cycles. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

When libvirt reverts a snapshot in the PMSUSPENDED state we need to record how it entered this state. Add the "from snapshot" reason. --- include/libvirt/libvirt.h.in | 3 +++ src/conf/domain_conf.c | 3 ++- tools/virsh-domain-monitor.c | 3 ++- 3 files changed, 7 insertions(+), 2 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index af339f2..5c13c28 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -223,6 +223,9 @@ typedef enum { typedef enum { VIR_DOMAIN_PMSUSPENDED_UNKNOWN = 0, + VIR_DOMAIN_PMSUSPENDED_FROM_SNAPSHOT = 1, /* restored from a snapshot which + was taken while the domain was + pmsuspended */ #ifdef VIR_ENUM_SENTINELS VIR_DOMAIN_PMSUSPENDED_LAST diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 1e27165..78a365f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -682,7 +682,8 @@ VIR_ENUM_IMPL(virDomainCrashedReason, VIR_DOMAIN_CRASHED_LAST, "panicked") VIR_ENUM_IMPL(virDomainPMSuspendedReason, VIR_DOMAIN_PMSUSPENDED_LAST, - "unknown") + "unknown", + "from snapshot") VIR_ENUM_IMPL(virDomainSeclabel, VIR_DOMAIN_SECLABEL_LAST, "default", diff --git a/tools/virsh-domain-monitor.c b/tools/virsh-domain-monitor.c index 8bd58ad..3a598e9 100644 --- a/tools/virsh-domain-monitor.c +++ b/tools/virsh-domain-monitor.c @@ -213,7 +213,8 @@ VIR_ENUM_IMPL(vshDomainCrashedReason, VIR_ENUM_DECL(vshDomainPMSuspendedReason) VIR_ENUM_IMPL(vshDomainPMSuspendedReason, VIR_DOMAIN_PMSUSPENDED_LAST, - N_("unknown")) + N_("unknown"), + N_("from snapshot")) static const char * vshDomainStateReasonToString(int state, int reason) -- 2.0.0

On 07/18/2014 10:11 AM, Peter Krempa wrote:
When libvirt reverts a snapshot in the PMSUSPENDED state we need to record how it entered this state. Add the "from snapshot" reason. --- include/libvirt/libvirt.h.in | 3 +++ src/conf/domain_conf.c | 3 ++- tools/virsh-domain-monitor.c | 3 ++- 3 files changed, 7 insertions(+), 2 deletions(-)
Again, I'm not sure this is possible with current qemu. Rather, what really needs to happen is whatever we do when migrating an S3 guest or doing a 'virsh restore file' on a file created by 'virsh save' on an S3 guest. Basically, waking the file is forced to have the guest back in pm wakeup state (or at best paused). Yes, if future qemu is ever patched to support S3 migration, then we can revisit this. But that won't be qemu 2.1, so I don't think we need this patch. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Convert the target snapshot state selector to a switch statement enumerating all possible values. This points out a few mistakes in the original selector. The logic of the code is preserved until later patches. --- src/qemu/qemu_driver.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 3096688..7ab0f81 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13922,6 +13922,8 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, virDomainDefPtr config = NULL; virQEMUDriverConfigPtr cfg = NULL; virCapsPtr caps = NULL; + bool was_running = false; + bool was_stopped = false; virCheckFlags(VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING | VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED | @@ -14022,12 +14024,10 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) goto cleanup; - if (snap->def->state == VIR_DOMAIN_RUNNING - || snap->def->state == VIR_DOMAIN_PAUSED) { + switch ((virDomainState) snap->def->state) { + case VIR_DOMAIN_RUNNING: + case VIR_DOMAIN_PAUSED: /* Transitions 2, 3, 5, 6, 8, 9 */ - bool was_running = false; - bool was_stopped = false; - /* When using the loadvm monitor command, qemu does not know * whether to pause or run the reverted domain, and just stays * in the same state as before the monitor command, whether @@ -14153,7 +14153,16 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, detail); } } - } else { + break; + + case VIR_DOMAIN_SHUTDOWN: + case VIR_DOMAIN_SHUTOFF: + case VIR_DOMAIN_CRASHED: + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_BLOCKED: + /* XXX: The following one is clearly wrong! */ + case VIR_DOMAIN_PMSUSPENDED: + case VIR_DOMAIN_LAST: /* Transitions 1, 4, 7 */ /* Newer qemu -loadvm refuses to revert to the state of a snapshot * created by qemu-img snapshot -c. If the domain is running, we @@ -14217,6 +14226,7 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, detail); } } + break; } ret = 0; -- 2.0.0

On 07/18/2014 10:11 AM, Peter Krempa wrote:
Convert the target snapshot state selector to a switch statement enumerating all possible values. This points out a few mistakes in the original selector.
The logic of the code is preserved until later patches. --- src/qemu/qemu_driver.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-)
ACK. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 07/19/14 00:31, Eric Blake wrote:
On 07/18/2014 10:11 AM, Peter Krempa wrote:
Convert the target snapshot state selector to a switch statement enumerating all possible values. This points out a few mistakes in the original selector.
The logic of the code is preserved until later patches. --- src/qemu/qemu_driver.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-)
ACK.
Pushed; Thanks. Peter

Report errors on some states snapshots done by qemu should never reach --- src/qemu/qemu_driver.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 7ab0f81..2d58b53 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -14158,11 +14158,8 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, case VIR_DOMAIN_SHUTDOWN: case VIR_DOMAIN_SHUTOFF: case VIR_DOMAIN_CRASHED: - case VIR_DOMAIN_NOSTATE: - case VIR_DOMAIN_BLOCKED: /* XXX: The following one is clearly wrong! */ case VIR_DOMAIN_PMSUSPENDED: - case VIR_DOMAIN_LAST: /* Transitions 1, 4, 7 */ /* Newer qemu -loadvm refuses to revert to the state of a snapshot * created by qemu-img snapshot -c. If the domain is running, we @@ -14227,6 +14224,15 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, } } break; + + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_BLOCKED: + case VIR_DOMAIN_LAST: + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid target domain state '%s'. Refusing " + "snapshot revertion "), + virDomainStateTypeToString(snap->def->state)); + goto cleanup; } ret = 0; -- 2.0.0

On 07/18/2014 10:11 AM, Peter Krempa wrote:
Report errors on some states snapshots done by qemu should never reach --- src/qemu/qemu_driver.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
+ + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_BLOCKED: + case VIR_DOMAIN_LAST: + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid target domain state '%s'. Refusing " + "snapshot revertion "),
s/revertion /reversion/ (typo and trailing space)
+ virDomainStateTypeToString(snap->def->state)); + goto cleanup; }
ACK with that fixed. It might also be nice to patch qemuDomainSnapshotCreateXML with the _REDEFINE flag to likewise reject modifying an existing snapshot into one of these states. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 07/19/14 00:33, Eric Blake wrote:
On 07/18/2014 10:11 AM, Peter Krempa wrote:
Report errors on some states snapshots done by qemu should never reach --- src/qemu/qemu_driver.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
+ + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_BLOCKED: + case VIR_DOMAIN_LAST: + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid target domain state '%s'. Refusing " + "snapshot revertion "),
s/revertion /reversion/
(typo and trailing space)
+ virDomainStateTypeToString(snap->def->state)); + goto cleanup; }
ACK with that fixed. It might also be nice to patch qemuDomainSnapshotCreateXML with the _REDEFINE flag to likewise reject modifying an existing snapshot into one of these states.
Fixed && pushed; Thanks. Peter

Add helper that will generate domain lifecycle events regarding to snapshot revert operations and use it in a demonstration case in the code. The helper will be later used in internal snaphsot revert refactor more widely. --- src/qemu/qemu_driver.c | 142 ++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 136 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 2d58b53..e794e02 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -13907,6 +13907,139 @@ qemuDomainSnapshotRevertInactive(virQEMUDriverPtr driver, return ret > 0 ? -1 : ret; } + +static int +qemuDomainSnapshotEmitLifecycleEvent(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainState oldState, + virDomainState newState) +{ + int reason = 0; + int type = -1; + virObjectEventPtr event = NULL; + + switch (newState) { + case VIR_DOMAIN_RUNNING: + switch (oldState) { + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_SHUTOFF: + case VIR_DOMAIN_CRASHED: + case VIR_DOMAIN_SHUTDOWN: + type = VIR_DOMAIN_EVENT_STARTED; + reason = VIR_DOMAIN_EVENT_STARTED_FROM_SNAPSHOT; + break; + + case VIR_DOMAIN_PMSUSPENDED: + case VIR_DOMAIN_PAUSED: + type = VIR_DOMAIN_EVENT_RESUMED; + reason = VIR_DOMAIN_EVENT_RESUMED_FROM_SNAPSHOT; + break; + + case VIR_DOMAIN_LAST: + case VIR_DOMAIN_RUNNING: + case VIR_DOMAIN_BLOCKED: + /* no event */ + break; + } + break; + + case VIR_DOMAIN_PAUSED: + switch (oldState) { + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_SHUTOFF: + case VIR_DOMAIN_CRASHED: + /* the machine was started here, so we need an additional event */ + event = virDomainEventLifecycleNewFromObj(vm, + VIR_DOMAIN_EVENT_STARTED, + VIR_DOMAIN_EVENT_STARTED_FROM_SNAPSHOT); + if (!event) + goto cleanup; + qemuDomainEventQueue(driver, event); + /* fallthrough! */ + case VIR_DOMAIN_PMSUSPENDED: + case VIR_DOMAIN_RUNNING: + case VIR_DOMAIN_SHUTDOWN: + type = VIR_DOMAIN_EVENT_SUSPENDED; + reason = VIR_DOMAIN_EVENT_SUSPENDED_FROM_SNAPSHOT; + break; + + case VIR_DOMAIN_PAUSED: + case VIR_DOMAIN_LAST: + case VIR_DOMAIN_BLOCKED: + /* no event */ + break; + } + break; + + case VIR_DOMAIN_SHUTOFF: + switch (oldState) { + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_BLOCKED: + case VIR_DOMAIN_SHUTDOWN: + case VIR_DOMAIN_PAUSED: + case VIR_DOMAIN_RUNNING: + case VIR_DOMAIN_PMSUSPENDED: + type = VIR_DOMAIN_EVENT_STOPPED; + reason = VIR_DOMAIN_EVENT_STOPPED_FROM_SNAPSHOT; + break; + + case VIR_DOMAIN_SHUTOFF: + case VIR_DOMAIN_CRASHED: + case VIR_DOMAIN_LAST: + /* no event */ + break; + } + break; + + case VIR_DOMAIN_PMSUSPENDED: + switch (oldState) { + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_SHUTOFF: + case VIR_DOMAIN_CRASHED: + /* the machine was started here, so we need an additional event */ + event = virDomainEventLifecycleNewFromObj(vm, + VIR_DOMAIN_EVENT_STARTED, + VIR_DOMAIN_EVENT_STARTED_FROM_SNAPSHOT); + if (!event) + goto cleanup; + qemuDomainEventQueue(driver, event); + /* fallthrough! */ + case VIR_DOMAIN_RUNNING: + case VIR_DOMAIN_SHUTDOWN: + case VIR_DOMAIN_PAUSED: + type = VIR_DOMAIN_EVENT_PMSUSPENDED; + reason = VIR_DOMAIN_EVENT_PMSUSPENDED_FROM_SNAPSHOT; + break; + + case VIR_DOMAIN_LAST: + case VIR_DOMAIN_BLOCKED: + case VIR_DOMAIN_PMSUSPENDED: + /* no event */ + break; + } + break; + + case VIR_DOMAIN_NOSTATE: + case VIR_DOMAIN_BLOCKED: + case VIR_DOMAIN_SHUTDOWN: + case VIR_DOMAIN_CRASHED: + case VIR_DOMAIN_LAST: + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Suspicious domain state '%d' after snapshot"), + newState); + return -1; + } + + if (!(event = virDomainEventLifecycleNewFromObj(vm, type, reason))) + goto cleanup; + + qemuDomainEventQueue(driver, event); + return 0; + + cleanup: + return -1; +} + static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, unsigned int flags) { @@ -14051,12 +14184,9 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, qemuProcessStop(driver, vm, VIR_DOMAIN_SHUTOFF_FROM_SNAPSHOT, 0); virDomainAuditStop(vm, "from-snapshot"); - detail = VIR_DOMAIN_EVENT_STOPPED_FROM_SNAPSHOT; - event = virDomainEventLifecycleNewFromObj(vm, - VIR_DOMAIN_EVENT_STOPPED, - detail); - if (event) - qemuDomainEventQueue(driver, event); + qemuDomainSnapshotEmitLifecycleEvent(driver, vm, + vm->state.state, + VIR_DOMAIN_SHUTOFF); goto load; } -- 2.0.0

The internal snapshot revert code used a very unintuitive approach to create lifecycle events while doing the revert. Refactor the code to use the new automagic function to do the revertion. --- src/qemu/qemu_driver.c | 76 +++++++++----------------------------------------- 1 file changed, 13 insertions(+), 63 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index e794e02..1f98f4a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -14047,16 +14047,12 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, virDomainObjPtr vm = NULL; int ret = -1; virDomainSnapshotObjPtr snap = NULL; - virObjectEventPtr event = NULL; - virObjectEventPtr event2 = NULL; - int detail; qemuDomainObjPrivatePtr priv; int rc; virDomainDefPtr config = NULL; virQEMUDriverConfigPtr cfg = NULL; virCapsPtr caps = NULL; - bool was_running = false; - bool was_stopped = false; + int oldState = vm->state.state; virCheckFlags(VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING | VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED | @@ -14187,24 +14183,18 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, qemuDomainSnapshotEmitLifecycleEvent(driver, vm, vm->state.state, VIR_DOMAIN_SHUTOFF); + oldState = VIR_DOMAIN_SHUTOFF; goto load; } priv = vm->privateData; if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { /* Transitions 5, 6 */ - was_running = true; if (qemuProcessStopCPUs(driver, vm, VIR_DOMAIN_PAUSED_FROM_SNAPSHOT, QEMU_ASYNC_JOB_NONE) < 0) goto endjob; - /* Create an event now in case the restore fails, so - * that user will be alerted that they are now paused. - * If restore later succeeds, we might replace this. */ - detail = VIR_DOMAIN_EVENT_SUSPENDED_FROM_SNAPSHOT; - event = virDomainEventLifecycleNewFromObj(vm, - VIR_DOMAIN_EVENT_SUSPENDED, - detail); + if (!virDomainObjIsActive(vm)) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("guest unexpectedly quit")); @@ -14224,7 +14214,6 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, } else { /* Transitions 2, 3 */ load: - was_stopped = true; if (config) virDomainObjAssignDef(vm, config, false, NULL); @@ -14233,10 +14222,7 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, VIR_NETDEV_VPORT_PROFILE_OP_CREATE, VIR_QEMU_PROCESS_START_PAUSED); virDomainAuditStart(vm, "from-snapshot", rc >= 0); - detail = VIR_DOMAIN_EVENT_STARTED_FROM_SNAPSHOT; - event = virDomainEventLifecycleNewFromObj(vm, - VIR_DOMAIN_EVENT_STARTED, - detail); + if (rc < 0) goto endjob; } @@ -14248,13 +14234,6 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, /* Transitions 3, 6, 9 */ virDomainObjSetState(vm, VIR_DOMAIN_PAUSED, VIR_DOMAIN_PAUSED_FROM_SNAPSHOT); - if (was_stopped) { - /* Transition 3, use event as-is and add event2 */ - detail = VIR_DOMAIN_EVENT_SUSPENDED_FROM_SNAPSHOT; - event2 = virDomainEventLifecycleNewFromObj(vm, - VIR_DOMAIN_EVENT_SUSPENDED, - detail); - } /* else transition 6 and 9 use event as-is */ } else { /* Transitions 2, 5, 8 */ if (!virDomainObjIsActive(vm)) { @@ -14267,21 +14246,6 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, QEMU_ASYNC_JOB_NONE); if (rc < 0) goto endjob; - virObjectUnref(event); - event = NULL; - if (was_stopped) { - /* Transition 2 */ - detail = VIR_DOMAIN_EVENT_STARTED_FROM_SNAPSHOT; - event = virDomainEventLifecycleNewFromObj(vm, - VIR_DOMAIN_EVENT_STARTED, - detail); - } else if (was_running) { - /* Transition 8 */ - detail = VIR_DOMAIN_EVENT_RESUMED; - event = virDomainEventLifecycleNewFromObj(vm, - VIR_DOMAIN_EVENT_RESUMED, - detail); - } } break; @@ -14300,10 +14264,9 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, /* Transitions 4, 7 */ qemuProcessStop(driver, vm, VIR_DOMAIN_SHUTOFF_FROM_SNAPSHOT, 0); virDomainAuditStop(vm, "from-snapshot"); - detail = VIR_DOMAIN_EVENT_STOPPED_FROM_SNAPSHOT; - event = virDomainEventLifecycleNewFromObj(vm, - VIR_DOMAIN_EVENT_STOPPED, - detail); + qemuDomainSnapshotEmitLifecycleEvent(driver, vm, vm->state.state, + VIR_DOMAIN_SHUTOFF); + oldState = VIR_DOMAIN_SHUTOFF; } if (qemuDomainSnapshotRevertInactive(driver, vm, snap) < 0) { @@ -14326,8 +14289,6 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, start_flags |= paused ? VIR_QEMU_PROCESS_START_PAUSED : 0; - if (event) - qemuDomainEventQueue(driver, event); rc = qemuProcessStart(snapshot->domain->conn, driver, vm, NULL, -1, NULL, NULL, VIR_NETDEV_VPORT_PROFILE_OP_CREATE, @@ -14342,16 +14303,6 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, } goto endjob; } - detail = VIR_DOMAIN_EVENT_STARTED_FROM_SNAPSHOT; - event = virDomainEventLifecycleNewFromObj(vm, - VIR_DOMAIN_EVENT_STARTED, - detail); - if (paused) { - detail = VIR_DOMAIN_EVENT_SUSPENDED_FROM_SNAPSHOT; - event2 = virDomainEventLifecycleNewFromObj(vm, - VIR_DOMAIN_EVENT_SUSPENDED, - detail); - } } break; @@ -14381,13 +14332,12 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, } else if (snap) { snap->def->current = false; } - if (event) { - qemuDomainEventQueue(driver, event); - if (event2) - qemuDomainEventQueue(driver, event2); - } - if (vm) - virObjectUnlock(vm); + + /* emit all appropriate events */ + qemuDomainSnapshotEmitLifecycleEvent(driver, vm, oldState, + vm->state.state); + + virObjectUnlock(vm); virObjectUnref(caps); virObjectUnref(cfg); -- 2.0.0

PMSUSPENDED state implies the qemu process running, so we should treat it that way. This increases the possible state space of transitions that may happen during the snapshot revert so this patch documents them as well. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1079162 --- src/qemu/qemu_driver.c | 67 ++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 48 insertions(+), 19 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 1f98f4a..f93e0fd 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -14052,22 +14052,29 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, virDomainDefPtr config = NULL; virQEMUDriverConfigPtr cfg = NULL; virCapsPtr caps = NULL; - int oldState = vm->state.state; + int oldState; virCheckFlags(VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING | VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED | VIR_DOMAIN_SNAPSHOT_REVERT_FORCE, -1); /* We have the following transitions, which create the following events: - * 1. inactive -> inactive: none - * 2. inactive -> running: EVENT_STARTED - * 3. inactive -> paused: EVENT_STARTED, EVENT_PAUSED - * 4. running -> inactive: EVENT_STOPPED - * 5. running -> running: none - * 6. running -> paused: EVENT_PAUSED - * 7. paused -> inactive: EVENT_STOPPED - * 8. paused -> running: EVENT_RESUMED - * 9. paused -> paused: none + * 1. inactive -> inactive: none + * 2. inactive -> running: EVENT_STARTED + * 3. inactive -> paused: EVENT_STARTED, EVENT_PAUSED + * 4. running -> inactive: EVENT_STOPPED + * 5. running -> running: none + * 6. running -> paused: EVENT_PAUSED + * 7. paused -> inactive: EVENT_STOPPED + * 8. paused -> running: EVENT_RESUMED + * 9. paused -> paused: none + * 10. pmsuspended -> pmsuspended: none + * 11. pmsuspended -> inactive: EVENT_STOPPED + * 12. pmsuspended -> running: EVENT_RESUMED + * 13. pmsuspended -> paused: EVENT_PAUSED + * 14. inactive -> pmsuspended: EVENT_STARTED, EVENT_PMSUSPENDED + * 15. paused -> pmsuspended: EVENT_PMSUSPENDED + * 16. running -> pmsuspended: EVENT_PMSUSPENDED * Also, several transitions occur even if we fail partway through, * and use of FORCE can cause multiple transitions. */ @@ -14075,6 +14082,8 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, if (!(vm = qemuDomObjFromSnapshot(snapshot))) return -1; + oldState = vm->state.state; + cfg = virQEMUDriverGetConfig(driver); if (virDomainRevertToSnapshotEnsureACL(snapshot->domain->conn, vm->def) < 0) @@ -14127,6 +14136,14 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, } } + if (snap->def->state == VIR_DOMAIN_PMSUSPENDED && + (flags & VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING || + flags & VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED)) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("snapshot of a VM in PMSUSPENDED state cannot be " + "reverted to a running or paused state")); + goto cleanup; + } if (vm->current_snapshot) { vm->current_snapshot->def->current = false; @@ -14156,14 +14173,17 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, switch ((virDomainState) snap->def->state) { case VIR_DOMAIN_RUNNING: case VIR_DOMAIN_PAUSED: - /* Transitions 2, 3, 5, 6, 8, 9 */ + case VIR_DOMAIN_PMSUSPENDED: + /* New state of the domain requires a running qemu process */ + /* Transitions 2, 3, 5, 6, 8, 9, 10, 12, 13, 14, 15, 16 */ + /* When using the loadvm monitor command, qemu does not know * whether to pause or run the reverted domain, and just stays * in the same state as before the monitor command, whether * that is paused or running. We always pause before loadvm, * to have finer control. */ if (virDomainObjIsActive(vm)) { - /* Transitions 5, 6, 8, 9 */ + /* Transitions 5, 6, 8, 9, 10, 12, 13, 15, 16 */ /* Check for ABI compatibility. We need to do this check against * the migratable XML or it will always fail otherwise */ if (config && !qemuDomainDefCheckABIStability(driver, vm->def, config)) { @@ -14189,7 +14209,7 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, priv = vm->privateData; if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { - /* Transitions 5, 6 */ + /* Transitions 5, 6, 16*/ if (qemuProcessStopCPUs(driver, vm, VIR_DOMAIN_PAUSED_FROM_SNAPSHOT, QEMU_ASYNC_JOB_NONE) < 0) @@ -14212,7 +14232,7 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, if (config) virDomainObjAssignDef(vm, config, false, NULL); } else { - /* Transitions 2, 3 */ + /* Transitions 2, 3, 14 */ load: if (config) virDomainObjAssignDef(vm, config, false, NULL); @@ -14234,6 +14254,16 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, /* Transitions 3, 6, 9 */ virDomainObjSetState(vm, VIR_DOMAIN_PAUSED, VIR_DOMAIN_PAUSED_FROM_SNAPSHOT); + } else if (snap->def->state == VIR_DOMAIN_PMSUSPENDED) { + /* actually start the CPUs, as they are suspended differently */ + rc = qemuProcessStartCPUs(driver, vm, snapshot->domain->conn, + VIR_DOMAIN_RUNNING_FROM_SNAPSHOT, + QEMU_ASYNC_JOB_NONE); + if (rc < 0) + goto endjob; + /* now fix up the state to suspended */ + virDomainObjSetState(vm, VIR_DOMAIN_PMSUSPENDED, + VIR_DOMAIN_PMSUSPENDED_FROM_SNAPSHOT); } else { /* Transitions 2, 5, 8 */ if (!virDomainObjIsActive(vm)) { @@ -14252,16 +14282,15 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, case VIR_DOMAIN_SHUTDOWN: case VIR_DOMAIN_SHUTOFF: case VIR_DOMAIN_CRASHED: - /* XXX: The following one is clearly wrong! */ - case VIR_DOMAIN_PMSUSPENDED: - /* Transitions 1, 4, 7 */ + /* qemu will not run after the snapshot revertion */ + /* Transitions 1, 4, 7, 11 */ + /* Newer qemu -loadvm refuses to revert to the state of a snapshot * created by qemu-img snapshot -c. If the domain is running, we * must take it offline; then do the revert using qemu-img. */ - if (virDomainObjIsActive(vm)) { - /* Transitions 4, 7 */ + /* Transitions 4, 7, 11 */ qemuProcessStop(driver, vm, VIR_DOMAIN_SHUTOFF_FROM_SNAPSHOT, 0); virDomainAuditStop(vm, "from-snapshot"); qemuDomainSnapshotEmitLifecycleEvent(driver, vm, vm->state.state, -- 2.0.0

On 07/18/2014 10:11 AM, Peter Krempa wrote:
PMSUSPENDED state implies the qemu process running, so we should treat it that way. This increases the possible state space of transitions that may happen during the snapshot revert so this patch documents them as well.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1079162 --- src/qemu/qemu_driver.c | 67 ++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 48 insertions(+), 19 deletions(-)
I think this needs some rework - current qemu treats it as reverting to a pm wakeup event, and the guest will start running again (unless the user requested to revert as paused).
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 1f98f4a..f93e0fd 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -14052,22 +14052,29 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, virDomainDefPtr config = NULL; virQEMUDriverConfigPtr cfg = NULL; virCapsPtr caps = NULL; - int oldState = vm->state.state; + int oldState;
virCheckFlags(VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING | VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED | VIR_DOMAIN_SNAPSHOT_REVERT_FORCE, -1);
/* We have the following transitions, which create the following events: - * 1. inactive -> inactive: none - * 2. inactive -> running: EVENT_STARTED - * 3. inactive -> paused: EVENT_STARTED, EVENT_PAUSED - * 4. running -> inactive: EVENT_STOPPED - * 5. running -> running: none - * 6. running -> paused: EVENT_PAUSED - * 7. paused -> inactive: EVENT_STOPPED - * 8. paused -> running: EVENT_RESUMED - * 9. paused -> paused: none + * 1. inactive -> inactive: none + * 2. inactive -> running: EVENT_STARTED + * 3. inactive -> paused: EVENT_STARTED, EVENT_PAUSED + * 4. running -> inactive: EVENT_STOPPED + * 5. running -> running: none + * 6. running -> paused: EVENT_PAUSED + * 7. paused -> inactive: EVENT_STOPPED + * 8. paused -> running: EVENT_RESUMED + * 9. paused -> paused: none + * 10. pmsuspended -> pmsuspended: none
this state is not possible with current qemu
+ * 11. pmsuspended -> inactive: EVENT_STOPPED + * 12. pmsuspended -> running: EVENT_RESUMED + * 13. pmsuspended -> paused: EVENT_PAUSED + * 14. inactive -> pmsuspended: EVENT_STARTED, EVENT_PMSUSPENDED + * 15. paused -> pmsuspended: EVENT_PMSUSPENDED + * 16. running -> pmsuspended: EVENT_PMSUSPENDED
and these three states are not possible
* Also, several transitions occur even if we fail partway through, * and use of FORCE can cause multiple transitions. */ @@ -14075,6 +14082,8 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, if (!(vm = qemuDomObjFromSnapshot(snapshot))) return -1;
+ oldState = vm->state.state; + cfg = virQEMUDriverGetConfig(driver);
if (virDomainRevertToSnapshotEnsureACL(snapshot->domain->conn, vm->def) < 0) @@ -14127,6 +14136,14 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, } }
+ if (snap->def->state == VIR_DOMAIN_PMSUSPENDED && + (flags & VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING || + flags & VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED)) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("snapshot of a VM in PMSUSPENDED state cannot be " + "reverted to a running or paused state"));
actually, I think that we need to tackle it differently - we can't CREATE a snapshot in the pmsuspended state (just as we can't migrate in that state - the mere act of migration is a wakeup event). So, if you eliminate all snapshots in the pmsuspended state as invalid, the set of state transitions is smaller. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 07/19/14 01:12, Eric Blake wrote:
On 07/18/2014 10:11 AM, Peter Krempa wrote:
PMSUSPENDED state implies the qemu process running, so we should treat it that way. This increases the possible state space of transitions that may happen during the snapshot revert so this patch documents them as well.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1079162 --- src/qemu/qemu_driver.c | 67 ++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 48 insertions(+), 19 deletions(-)
I think this needs some rework - current qemu treats it as reverting to a pm wakeup event, and the guest will start running again (unless the user requested to revert as paused).
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 1f98f4a..f93e0fd 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -14052,22 +14052,29 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, virDomainDefPtr config = NULL; virQEMUDriverConfigPtr cfg = NULL; virCapsPtr caps = NULL; - int oldState = vm->state.state; + int oldState;
virCheckFlags(VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING | VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED | VIR_DOMAIN_SNAPSHOT_REVERT_FORCE, -1);
/* We have the following transitions, which create the following events: - * 1. inactive -> inactive: none - * 2. inactive -> running: EVENT_STARTED - * 3. inactive -> paused: EVENT_STARTED, EVENT_PAUSED - * 4. running -> inactive: EVENT_STOPPED - * 5. running -> running: none - * 6. running -> paused: EVENT_PAUSED - * 7. paused -> inactive: EVENT_STOPPED - * 8. paused -> running: EVENT_RESUMED - * 9. paused -> paused: none + * 1. inactive -> inactive: none + * 2. inactive -> running: EVENT_STARTED + * 3. inactive -> paused: EVENT_STARTED, EVENT_PAUSED + * 4. running -> inactive: EVENT_STOPPED + * 5. running -> running: none + * 6. running -> paused: EVENT_PAUSED + * 7. paused -> inactive: EVENT_STOPPED + * 8. paused -> running: EVENT_RESUMED + * 9. paused -> paused: none + * 10. pmsuspended -> pmsuspended: none
this state is not possible with current qemu
+ * 11. pmsuspended -> inactive: EVENT_STOPPED + * 12. pmsuspended -> running: EVENT_RESUMED + * 13. pmsuspended -> paused: EVENT_PAUSED + * 14. inactive -> pmsuspended: EVENT_STARTED, EVENT_PMSUSPENDED + * 15. paused -> pmsuspended: EVENT_PMSUSPENDED + * 16. running -> pmsuspended: EVENT_PMSUSPENDED
and these three states are not possible
* Also, several transitions occur even if we fail partway through, * and use of FORCE can cause multiple transitions. */ @@ -14075,6 +14082,8 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, if (!(vm = qemuDomObjFromSnapshot(snapshot))) return -1;
+ oldState = vm->state.state; + cfg = virQEMUDriverGetConfig(driver);
if (virDomainRevertToSnapshotEnsureACL(snapshot->domain->conn, vm->def) < 0) @@ -14127,6 +14136,14 @@ static int qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, } }
+ if (snap->def->state == VIR_DOMAIN_PMSUSPENDED && + (flags & VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING || + flags & VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED)) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("snapshot of a VM in PMSUSPENDED state cannot be " + "reverted to a running or paused state"));
actually, I think that we need to tackle it differently - we can't CREATE a snapshot in the pmsuspended state (just as we can't migrate in that state - the mere act of migration is a wakeup event). So, if you eliminate all snapshots in the pmsuspended state as invalid, the set of state transitions is smaller.
I've posted a series that forbids snapshot of pmsuspended guests. My initiative to post this full rework was so that libvirt would do the right thing right after the problem in qemu is fixed. With forbidding of the snapshot we'll need to re-visit this stuff once that qemu is fixed. Peter
participants (2)
-
Eric Blake
-
Peter Krempa