
On Fri, Jan 22, 2016 at 04:17:42PM +0100, Jiri Denemark wrote:
On Fri, Jan 22, 2016 at 15:07:04 +0000, Daniel P. Berrange wrote:
On Thu, Jan 21, 2016 at 11:20:46AM +0100, Jiri Denemark wrote:
VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY and VIR_DOMAIN_PAUSED_POSTCOPY are used on the source host once migration enters post-copy mode (which means the domain gets paused on the source). After the destination host takes over the execution of the domain, its virtual CPUs are resumed, the domain enters the VIR_DOMAIN_RUNNING_POSTCOPY state, and the VIR_DOMAIN_EVENT_RESUMED_POSTCOPY event is emitted.
In case migration fails during post-copy mode and neither host has the complete state of the domain, both domains will remain paused with the VIR_DOMAIN_PAUSED_POSTCOPY_FAILED reason and an upper layer may decide what to do.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
@@ -2380,6 +2383,8 @@ typedef enum {
     VIR_DOMAIN_EVENT_SUSPENDED_RESTORED = 4, /* Restored from paused state file */
     VIR_DOMAIN_EVENT_SUSPENDED_FROM_SNAPSHOT = 5, /* Restored from paused snapshot */
     VIR_DOMAIN_EVENT_SUSPENDED_API_ERROR = 6, /* suspended after failure during libvirt API call */
+    VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY = 7, /* suspended for post-copy migration */
+    VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY_FAILED = 8, /* suspended after failed post-copy */
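For illustration only, a minimal sketch of how a client might tell the two proposed detail values apart when it receives a SUSPENDED lifecycle event. The enum below is a local stand-in that copies the numeric values 7 and 8 from the hunk; real code would take the constants from the libvirt header instead. The sketch_ names and the helper function are hypothetical and not part of the patch.

```c
#include <stddef.h>

/* Local stand-ins for the two values proposed in the hunk. Real code
 * would use the constants from the libvirt header rather than these. */
enum sketch_suspended_detail {
    SKETCH_SUSPENDED_POSTCOPY = 7,
    SKETCH_SUSPENDED_POSTCOPY_FAILED = 8
};

/* Hypothetical helper: map a suspended-event detail to the description
 * given in the patch comments, or NULL for details this sketch does
 * not cover. */
static const char *
sketch_suspended_detail_str(int detail)
{
    switch (detail) {
    case SKETCH_SUSPENDED_POSTCOPY:
        return "suspended for post-copy migration";
    case SKETCH_SUSPENDED_POSTCOPY_FAILED:
        return "suspended after failed post-copy";
    default:
        return NULL;
    }
}
```

A caller would pass the detail argument of its lifecycle callback to the helper and fall back to its existing handling when NULL is returned.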
Presumably the POSTCOPY_FAILED event can only be emitted on the target, since the source will already be suspended when we see a failure, and it doesn't make sense to issue a suspended event when we're already suspended.
But would it cause any harm? I figured it might be better to emit the event and set the state to POSTCOPY_FAILED even on the source so that apps/users don't have to guess whether POSTCOPY means it's still running or if it already failed.
The lifecycle events are supposed to be implementing a state machine, and we're not changing state in this case. I think applications that are currently using libvirt would reasonably consider it an error if libvirt issues an event for a state it is already in, and I could see it causing them to mistakenly run some logic twice if they get two SUSPEND events for the same domain in a row.

Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
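To make the concern about repeated SUSPEND events concrete, here is a rough sketch of a client that tracks domain state from lifecycle events. It does not use any libvirt API; the two-state enum, the struct, and both handlers are hypothetical simplifications invented for this example. The naive handler runs its suspend logic on every event, while the guarded one treats an event for the state it is already in as a no-op.

```c
#include <stdbool.h>

/* Hypothetical, simplified lifecycle states tracked by a client. */
enum sketch_state {
    SKETCH_RUNNING,
    SKETCH_SUSPENDED
};

struct sketch_domain {
    enum sketch_state state;
    int suspend_actions;    /* how many times the suspend logic ran */
};

/* Naive handler: assumes every SUSPENDED event is a real transition,
 * so a second event in a row runs the suspend logic again. */
static void
sketch_on_suspended_naive(struct sketch_domain *dom)
{
    dom->state = SKETCH_SUSPENDED;
    dom->suspend_actions++;
}

/* Guarded handler: ignores an event for the state the domain is
 * already in. Returns true only when a transition actually happened. */
static bool
sketch_on_suspended_guarded(struct sketch_domain *dom)
{
    if (dom->state == SKETCH_SUSPENDED)
        return false;

    dom->state = SKETCH_SUSPENDED;
    dom->suspend_actions++;
    return true;
}
```

With two SUSPENDED events in a row, the naive handler leaves suspend_actions at 2 and the guarded one at 1. Existing applications cannot be assumed to carry such a guard, which is the reason for not emitting an event when the state does not change.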