On 08/11/10 - 03:37:12PM, David Teigland wrote:
> There are complications around migration we need to consider too.
> During migration, you actually need QEMU running on two hosts at
> once. IIRC the idea is that before starting the migration operation,
> we'd have to tell sync-manager to mark the lease as shared with a
> specific host. The destination QEMU would have to startup in shared
> mode, and upgrade this to an exclusive lock when migration completes,
> or quit when migration fails.
sync_manager leases can only be exclusive, so it's a matter of transferring
ownership of the exclusive lock from source host to destination host. We
have not yet added lease transfer capabilities to sync_manager, but it
might look something like this:
S = source host, sm-S = sync_manager on S, ...
D = destination host, sm-D = sync_manager on D, ...
1. sm-S holds the lease, and is monitoring qemu
2. migration begins from S to D
3. libvirt-D runs sm-D: sync_manager -c qemu with the addition of a new
   sync_manager option --receive-lease
4. sm-D writes its hostid D to the lease area signaling sm-S that it wants
   to be the lease owner when S is done with it
5. sm-D begins monitoring the lease owner on disk (which is still S)
6. sm-D forks qemu-D
7. sm-S sees that D wants the lease
8. qemu-S exits with success
9. sm-S sees qemu-S exit with success
10. sm-S writes D as the lease owner into the lease area and exits
    (in the non-migration/transfer case, sm-S writes owner=LEASE_FREE)
11. sm-D (still monitoring the lease owner) sees that it has become the
    owner, and begins renewing the lease
12. qemu-D runs fully
Unfortunately, this is not how migration works in qemu/kvm. Using your
nomenclature above, it's more like the following:
A guest is running on S. A migration is then initiated, at which point D
fires up a qemu process with a -incoming argument. This is sort of
a container process that will receive all of the migration data. Crucially
for sync-manager, though, qemu completely starts up and "attaches" to all of
the resources (including disks) *while* qemu at S is still running. Then it
enters a sort of paused state (where the guest cannot run), and receives
all of the migration data. Once all of the migration data has been received,
the guest on S is destroyed, and the guest on D is unpaused. That's why Dan
mentioned that we need two hosts to access the disk at once.
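
The sequence above can be written down as a small Python sketch that lists,
for each phase, which hosts have the disks open and where the guest is
actually running. The phase names and both function names are made up for
illustration; this models only the ordering described here, not any qemu or
libvirt interface.

```python
def migration_timeline():
    """Return (phase, hosts_with_disks_open, hosts_running_guest) tuples."""
    return [
        ("guest running on S", {"S"}, {"S"}),
        # D starts qemu with -incoming and attaches to the disks,
        # but its guest stays paused.
        ("qemu -incoming started on D", {"S", "D"}, {"S"}),
        ("migration data being received on D", {"S", "D"}, {"S"}),
        ("guest on S destroyed, guest on D unpaused", {"D"}, {"D"}),
    ]


def overlapping_phases(timeline):
    """Phases in which more than one host has the disks open."""
    return [phase for phase, disks_open, _ in timeline if len(disks_open) > 1]


if __name__ == "__main__":
    for phase in overlapping_phases(migration_timeline()):
        print("both hosts have the disks open during:", phase)
```

The guest never runs on two hosts at once, but the disks are open on both
for the whole transfer, which is the part an exclusive-only lease cannot
express.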
--
Chris Lalancette