On Wed, Nov 03, 2010 at 04:51:18PM +0200, Dan Kenigsberg wrote:
> On Tue, Nov 02, 2010 at 01:08:53PM +0000, Daniel P. Berrange wrote:
> > This patch attempts to introduce a version 3 that uses the
> > improved 5 step sequence
> >
> > * Src: Begin
> >   - Generate XML to pass to dst
> >   - Generate optional cookie to pass to dst
> >
> > * Dst: Prepare
> >   - Get ready to accept incoming VM
> >   - Generate optional cookie to pass to src
> >
> > * Src: Perform
> >   - Start migration and wait for send completion
> >   - Generate optional cookie to pass to dst
> >
> > * Dst: Finish
> >   - Wait for recv completion and check status
> >   - Kill off VM if failed, resume if success
> >   - Generate optional cookie to pass to src
> >
> > * Src: Confirm
> >   - Kill off VM if success, resume if failed
>
> What happens if no confirmation is received (destination silently died)?
> Will there be a timeout (bad idea), or at least a means by which higher
> level management can cancel migration?
> Either way, I did not get the suggestion to contact divinity.
In a protocol of this kind, you always have the problem of what to do if you
get an error in the very last step. You can't keep adding further steps,
so at some point you have to decide that you've done enough that the
consequences of failure are not critical. If the 'Confirm' step fails
this means we've either failed to kill off the paused VM, or failed
to unpause the VM CPUs. Neither of those scenarios is a showstopper.
It just means a VM is left around. The app can try a destroy again later
to take care of it. The lock manager plugins should ensure
that the VM can't do anything bad.
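To make the failure case concrete, here is a rough sketch of the five
steps and of a Confirm that fails. It is only an illustration in Python:
Host, migrate_v3 and cleanup_leftover are made-up names, not libvirt APIs,
the hosts are in-memory stand-ins, cookies are left out, and I am assuming
the source VM is paused during Perform.

```python
# Hypothetical sketch of the five-step (v3) migration sequence.
# None of these names are libvirt APIs; hosts are in-memory stand-ins.

class Host:
    def __init__(self, name):
        self.name = name
        self.vms = {}  # VM name -> "running" or "paused"


def migrate_v3(src, dst, vm, perform_ok=True, confirm_ok=True):
    """Run Begin, Prepare, Perform, Finish, Confirm and return a step log."""
    log = []

    # Src: Begin - generate the XML to pass to the destination.
    xml = "<domain><name>%s</name></domain>" % vm
    log.append("begin")

    # Dst: Prepare - get ready to accept the incoming VM described by xml.
    assert vm in xml
    dst.vms[vm] = "paused"
    log.append("prepare")

    # Src: Perform - send the VM state (assumed to leave the source paused).
    src.vms[vm] = "paused"
    log.append("perform:ok" if perform_ok else "perform:failed")

    # Dst: Finish - resume on success, kill off the incoming VM on failure.
    if perform_ok:
        dst.vms[vm] = "running"
    else:
        del dst.vms[vm]
    log.append("finish")

    # Src: Confirm - kill off on success, resume on failure.
    if not confirm_ok:
        # The very last step failed: a paused VM is simply left around.
        log.append("confirm:failed")
        return log
    if perform_ok:
        del src.vms[vm]
    else:
        src.vms[vm] = "running"
    log.append("confirm:ok")
    return log


def cleanup_leftover(src, dst, vm):
    """Later retry by the app: destroy a stale paused copy on the source,
    but only when the destination copy is already running."""
    if src.vms.get(vm) == "paused" and dst.vms.get(vm) == "running":
        del src.vms[vm]
        return True
    return False
```

With confirm_ok=False the source keeps a paused copy while the destination
runs, and a later cleanup_leftover(src, dst, vm) call plays the role of the
app retrying the destroy.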
Daniel
--
|: Red Hat, Engineering, London    -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org        -o-        http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|