On Wed, Aug 11, 2010 at 05:19:27PM -0400, David Teigland wrote:
On Wed, Aug 11, 2010 at 04:53:20PM -0400, Chris Lalancette wrote:
> > 1. sm-S holds the lease, and is monitoring qemu
> > 2. migration begins from S to D
> > 3. libvirt-D runs sm-D: sync_manager -c qemu with the addition of a new
> > sync_manager option --receive-lease
> > 4. sm-D writes its hostid D to the lease area signaling sm-S that it wants
> > to be the lease owner when S is done with it
> > 5. sm-D begins monitoring the lease owner on disk (which is still S)
> > 6. sm-D forks qemu-D
> > 7. sm-S sees that D wants the lease
> > 8. qemu-S exits with success
> > 9. sm-S sees qemu-S exit with success
> > 10. sm-S writes D as the lease owner into the lease area and exits
> > (in the non-migration/transfer case, sm-S writes owner=LEASE_FREE)
> > 11. sm-D (still monitoring the lease owner) sees that it has become the
> > owner, and begins renewing the lease
> > 12. qemu-D runs fully
>
> Unfortunately, this is not how migration works in qemu/kvm. Using your
> nomenclature above, it's more like the following:
>
> A guest is running on S. A migration is then initiated, at which point D
> fires up a qemu process with a -incoming argument.
libvirt starts qemu -incoming on D, right? So with sync_manager, libvirt
would start: sync_manager --receive-lease -c qemu -incoming
Yes, that is correct.
> This is sort of
> a container process that will receive all of the migration data. Crucially
> for sync-manager, though, qemu completely starts up and "attaches" to all of
> the resources (including disks) *while* qemu at S is still running. Then it
> enters a sort of paused state (where the guest cannot run), and receives
> all of the migration data.
That should all be fine.
> Once all of the migration data has been received, the guest on S is destroyed,
ok, sm-S sees qemu-S exit at that point.
> and the guest on D is unpaused.
The critical bit would be ensuring that sm-S has written owner=D into
the lease area before qemu-D is unpaused. Hooking into the sequence at
that point in time might be too difficult or ugly, I don't know.
The main hard bit here is that QEMU gives us no indication that migration
has completed. We 'detect' it by issuing a 'cont' command to unpause the
CPUs - this command blocks until migration is done. Clearly this won't
work for SM, but this isn't SM's problem. We need to fix this in QEMU
so that we get an async notification of migration completion, so we can
then tell SM to upgrade the lease, before we then issue 'cont' to start
CPUs.
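
To make the ordering concrete, here is a rough sketch of the destination-side
sequence described above: wait for the migration-complete notification, upgrade
the lease, and only then issue 'cont'. Everything in it is an assumption for
illustration: the class and method names are hypothetical, the async
notification does not exist in QEMU yet, and the lease upgrade and 'cont' calls
are placeholders that only record what would happen.

```python
import threading


class DestinationSide:
    """Hypothetical model of the sequencing on host D (not libvirt code)."""

    def __init__(self):
        # Stands in for the async "migration completed" notification
        # that QEMU would need to provide.
        self.migration_done = threading.Event()
        self.log = []

    def on_migration_complete(self):
        # Would be called when the (assumed) notification arrives.
        self.log.append("migration-complete")
        self.migration_done.set()

    def upgrade_lease(self):
        # Placeholder for telling SM to take ownership of the lease.
        self.log.append("lease-upgraded")

    def cont(self):
        # Placeholder for the monitor 'cont' command that starts the CPUs.
        self.log.append("cont")

    def finish_incoming(self, timeout=5.0):
        # Block until migration is known to be complete, then upgrade the
        # lease strictly before the guest CPUs are started.
        if not self.migration_done.wait(timeout):
            raise TimeoutError("no migration-complete notification")
        self.upgrade_lease()
        self.cont()


dest = DestinationSide()
notifier = threading.Thread(target=dest.on_migration_complete)
notifier.start()
dest.finish_incoming()
notifier.join()
print(dest.log)
```

The point of the sketch is only the ordering: 'cont' is never reached unless
the lease upgrade has already happened.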
> That's why Dan
> mentioned that we need two hosts to access the disk at once.
It would be easiest, of course, if the lease owner always represented where
qemu was running, but that obviously won't work with migration. So we have
to settle for the lease owner always representing where qemu is unpaused.
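
As a rough illustration of that invariant, the lease area can be modelled as an
owner plus a pending requester, following steps 4 and 10 of the sequence quoted
at the top. This is a toy in-memory model, not the sync_manager on-disk format;
the class, field and function names are hypothetical, and LEASE_FREE is
represented as None purely for the sketch.

```python
LEASE_FREE = None  # assumption: stands in for the on-disk "free" owner value


class LeaseArea:
    """Toy in-memory stand-in for the lease area on shared storage."""

    def __init__(self, owner=LEASE_FREE):
        self.owner = owner            # host where qemu may run unpaused
        self.next_owner = LEASE_FREE  # host that asked to receive the lease


def request_lease(lease, host):
    # Step 4: the destination records that it wants the lease next.
    lease.next_owner = host


def release_lease(lease, host):
    # Step 10: on clean qemu exit the current owner hands the lease to the
    # requester, or marks it free if nobody asked for it.
    if lease.owner != host:
        raise RuntimeError(f"{host} does not own the lease")
    lease.owner = lease.next_owner
    lease.next_owner = LEASE_FREE


def may_unpause(lease, host):
    # The invariant: qemu may only be unpaused on the lease owner.
    return lease.owner == host


lease = LeaseArea(owner="S")
request_lease(lease, "D")
print(may_unpause(lease, "D"))  # D has asked, but S still owns the lease
release_lease(lease, "S")
print(may_unpause(lease, "D"))  # S has handed over, D may now unpause
```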
I think my other mail is in fact describing the same thing as you are,
I was just using different terminology :-)
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|