Daniel,
did you have a chance to look at the change I proposed as part
of your new V3 migration API?
(https://www.redhat.com/archives/libvir-list/2011-April/msg00978.html)
Is there any plan to address the limitation raised in this thread,
namely that the src host does not know when the dst host (qemu)
has received all the data in the perform step?
Also, what about the change I proposed, which would allow the admin
to tell libvirt what initializations (=features) must succeed on the
dst host for the migration to be considered successful?
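As a rough illustration of what I mean (Python used only as pseudocode; the
function name and the feature names below are made up for this example and
are not part of libvirt or of any patch), the dst host would report the
outcome of each initialization, and the migration would count as successful
only if every admin-required one succeeded:

```python
# Hypothetical sketch, NOT libvirt code: all names here are invented.
from typing import Iterable, Mapping


def migration_succeeded(required: Iterable[str],
                        init_results: Mapping[str, bool]) -> bool:
    """True only if every admin-required initialization succeeded on dst.

    An initialization that dst did not report at all is treated as failed.
    """
    return all(init_results.get(feature, False) for feature in required)


# Example: the admin requires two (made-up) features; a third is optional.
required_features = ["start_cpus", "network_setup"]
dst_report = {"start_cpus": True, "network_setup": False, "optional_x": True}
ok = migration_succeeded(required_features, dst_report)
```

With the report above, `ok` is False because a required initialization failed,
even though the optional one succeeded.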
(I'll be happy to help if needed)
/Chris
-----Original Message-----
From: Christian Benvenuti (benve)
Sent: Thursday, April 21, 2011 6:35 PM
To: 'Stefan Berger'; Daniel P. Berrange
Cc: eblake(a)redhat.com; laine(a)laine.org; chrisw(a)redhat.com;
libvir-list(a)redhat.com; David Wang (dwang2); Roopa Prabhu (roprabhu);
Gerhard Stenzel; Jens Osterkamp; Anthony Liguori
Subject: RE: [libvirt] [PATCH 3/6] Introduce yet another migration
version in API.
> -----Original Message-----
> From: Stefan Berger [mailto:stefanb@linux.vnet.ibm.com]
> Sent: Thursday, April 21, 2011 6:34 AM
> To: Daniel P. Berrange
> Cc: Christian Benvenuti (benve); eblake(a)redhat.com; laine(a)laine.org;
> chrisw(a)redhat.com; libvir-list(a)redhat.com; David Wang (dwang2); Roopa
> Prabhu (roprabhu); Gerhard Stenzel; Jens Osterkamp; Anthony Liguori
> Subject: Re: [libvirt] [PATCH 3/6] Introduce yet another migration
> version in API.
>
> On 04/21/2011 07:43 AM, Daniel P. Berrange wrote:
> > On Thu, Apr 21, 2011 at 07:37:30AM -0400, Stefan Berger wrote:
> >>
> >> and simply doesn't start the VM. After this function is called all
> >> sockets are closed and the communication with the source host is
> >> cut. I don't think it allows for fall-back at this point.
> > Sure it does. As long as the destination QEMU CPUs have not been
> > started, you can fallback by simply killing the dest QEMU and
> > restarting CPUs on the src QEMU.
> >
> FWIW, I did a test and disabled the starting of the CPUs on the
> destination side and
> did a sleep() instead. Before the sleep() was over the Qemu on the
> source side had already disappeared.
Did you test this with migration V2 or migration V3?
I think what you describe is the V2 behavior (V3 now works differently):
- Migration V2

     SRC HOST                          DST HOST
        |
        |- dump XML
        |
        | (PREPARE)
        +------------------------------> Start empty VM
        |
        | (PERFORM)
        |- migrate cmd to monitor
        |- kill VM
        |
        | (FINISH)
        +------------------------------> Start CPU
- Migration V3

     SRC HOST                          DST HOST
        |
        | (BEGIN)
        |- dumpxml
        |
        | (PREPARE)
        +------------------------------> Start empty VM
        |
        | (PERFORM)
        |- migrate cmd to monitor
        |  (src CPU is now paused)
        |
        | (FINISH)
        +------------------------------> Start CPU
        |
        | (CONFIRM)
        |- if FINISH succeeded: Kill src VM
        |- if FINISH failed   : Run src VM
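
To make the difference concrete, here is a small Python model of the two
sequences. It is only a sketch of the phase ordering shown in the diagrams:
the Host class, the state names and the functions are invented for this
example and do not correspond to libvirt APIs. Killing the dst VM when FINISH
fails is taken from Daniel's description of the fallback, not from the
diagrams themselves.

```python
# Illustrative model only: these names are hypothetical, not libvirt APIs.
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    vm_state: str = "absent"  # absent | empty | running | paused | killed
    log: list = field(default_factory=list)

    def set_state(self, state: str, phase: str) -> None:
        self.vm_state = state
        self.log.append(f"{phase}: {self.name} VM -> {state}")


def migrate_v2(src: Host, dst: Host, finish_ok: bool) -> bool:
    dst.set_state("empty", "PREPARE")        # dst starts an empty VM
    src.set_state("killed", "PERFORM")       # data is sent, then src VM is killed
    if finish_ok:
        dst.set_state("running", "FINISH")   # dst starts the CPUs
    else:
        dst.set_state("killed", "FINISH")    # assumed cleanup; src is already gone
    return finish_ok


def migrate_v3(src: Host, dst: Host, finish_ok: bool) -> bool:
    # BEGIN only dumps the XML on src, so there is no state change to model.
    dst.set_state("empty", "PREPARE")        # dst starts an empty VM
    src.set_state("paused", "PERFORM")       # data is sent, src CPUs stay paused
    if finish_ok:
        dst.set_state("running", "FINISH")   # dst starts the CPUs
        src.set_state("killed", "CONFIRM")   # FINISH succeeded: kill src VM
    else:
        dst.set_state("killed", "FINISH")    # fallback: kill the dst VM ...
        src.set_state("running", "CONFIRM")  # ... and run the src VM again
    return finish_ok


# Same failure in FINISH, different outcome on the source side.
v2_src, v2_dst = Host("src", "running"), Host("dst")
migrate_v2(v2_src, v2_dst, finish_ok=False)

v3_src, v3_dst = Host("src", "running"), Host("dst")
migrate_v3(v3_src, v3_dst, finish_ok=False)
```

In this model a failed FINISH leaves v2_src.vm_state at "killed" while
v3_src.vm_state is back to "running", which is the fallback that V2 cannot
offer.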
/Christian