Re: [libvirt] [PATCH 3/6] Introduce yet another migration version in API.

21 Apr 2011


      ...
-----Original Message-----
From: Daniel P. Berrange [mailto:berrange@redhat.com]
Sent: Thursday, April 21, 2011 4:44 AM
To: Stefan Berger
Cc: Christian Benvenuti (benve); eblake@redhat.com; laine@laine.org;
chrisw@redhat.com; libvir-list@redhat.com; David Wang (dwang2); Roopa
Prabhu (roprabhu); Gerhard Stenzel; Jens Osterkamp; Anthony Liguori
Subject: Re: [libvirt] [PATCH 3/6] Introduce yet another migration
version in API.
On Thu, Apr 21, 2011 at 07:37:30AM -0400, Stefan Berger wrote:
On 04/20/2011 11:38 PM, Christian Benvenuti (benve) wrote:
On 04/20/2011 05:28 PM, Christian Benvenuti (benve) wrote:
Daniel,
   I looked at the patch-set you sent out on the 2/9/11
[libvirt] [PATCH 0/6] Introduce a new migration protocol
                         to QEMU driver
   http://www.mail-archive.com/libvir-list@redhat.com/msg33223.html
What is the status of this new migration protocol?
Is there any pending issue blocking its integration?
I would like to propose an RFC enhancement to the migration
algorithm.
Here is a quick summary of the proposal/idea.
- finer control on migration result
- possibility of specifying what features cannot fail
  their initialization on the dst host during migration.
  Migration should not succeed if any of them fails.
  - optional: each one of those features should be able to
    provide a deinit function to clean up resources
    on the dst host if migration fails.
This functionality would be useful for the (NIC) set port
profile feature VDP (802.1Qbg/1Qbh), but what I propose is
a generic config option / API that can be used by any feature.
And now the details.
----------------------------------------------
enhancement: finer control on migration result
----------------------------------------------
There are different reasons why a VM may need (or be forced) to
migrate.
You can also classify the types of migration based on
different semantics.
For simplicity I'll classify them into two categories, based on
how important it is for the VM to migrate as fast as possible:
(1) It IS important
    In this case, whether the VM will (temporarily) be unable to
    make use of certain resources (for example the network) on
    the dst host is not that important, because the completion of
    the migration is considered higher priority.
    A possible scenario could be a server that must migrate ASAP
    because of a disaster/emergency.
(2) It IS NOT important
    I can think of a VM whose applications/servers need a network
    connection in order to work properly. Losing such network
    connectivity as a consequence of a migration would not be
    acceptable (or would be highly undesirable).
Given case (2) above, I have a comment about the Finish
step, with regard to the port profile (VDP) codepath.
The call to
qemuMigrationVPAssociatePortProfile
in
     qemuMigrationFinish
can fail, but its result (success or failure) does not influence
the result of the migration Finish step (it was already like this
in migration V2).
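To make the proposal more concrete, here is a minimal sketch of how "must not fail" features with an optional deinit hook could be driven during the Finish step. All names (MigFeature, mig_features_init, the demo_* callbacks) are hypothetical and do not exist in libvirt; the real qemuMigrationVPAssociatePortProfile would only be one possible init callback. It is an illustration of the idea, not a proposed patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor for a feature set up on the dst host.
 * None of these names exist in libvirt; they only illustrate the idea. */
typedef struct {
    const char *name;
    bool must_succeed;            /* true: migration fails if init fails */
    int (*init)(void *opaque);    /* 0 on success, -1 on failure */
    void (*deinit)(void *opaque); /* optional cleanup, may be NULL */
    bool initialized;             /* set by mig_features_init() */
} MigFeature;

/* Initialize every feature in order. Returns 0 if the migration may
 * proceed, or -1 after undoing (in reverse order) what was set up. */
int mig_features_init(MigFeature *f, size_t n, void *opaque)
{
    for (size_t i = 0; i < n; i++) {
        f[i].initialized = (f[i].init(opaque) == 0);
        if (f[i].initialized || !f[i].must_succeed)
            continue; /* success, or a failure we are allowed to ignore */

        /* A must-succeed feature failed: clean up the earlier ones. */
        while (i-- > 0) {
            if (f[i].initialized && f[i].deinit)
                f[i].deinit(opaque);
            f[i].initialized = false;
        }
        return -1;
    }
    return 0;
}

/* Demo callbacks: opaque points to an int counting live resources. */
static int demo_init_ok(void *opaque)   { ++*(int *)opaque; return 0; }
static int demo_init_fail(void *opaque) { (void)opaque; return -1; }
static void demo_deinit(void *opaque)   { --*(int *)opaque; }
```

With must_succeed set to false for every feature, this degenerates to today's behaviour, where a failed port profile association does not change the result of Finish.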
I *believe* the underlying problem is Qemu's switch-over. Once Qemu
decides that the migration was successful, Qemu on the source side
dies and continues running on the destination side. I don't think
there are more handshakes foreseen with higher layers that this could
be reversed or the switch-over delayed, but correct me if I am wrong...
Actually I think this is not what happens in migration V3.
My understanding is this:
- the qemu cmdline built by Libvirt on the dst host during Prepare3
  includes the "-S" option (ie no autostart)
- the VM on the dst host does not start running until libvirt
  calls qemuProcessStartCPUs in the Finish3 step.
  This fn simply sends the "cont" cmd to the monitor to
  start the VM/CPUs.
That's correct, but it's doing this already in v2. The non-autostart
(-S) corresponds to Qemu's autostart here (migration.c):
void process_incoming_migration(QEMUFile *f)
{
    if (qemu_loadvm_state(f) < 0) {
        fprintf(stderr, "load of migration failed\n");
        exit(0);
    }
    qemu_announce_self();
    DPRINTF("successfully loaded vm state\n");

    incoming_expected = false;

    if (autostart)
        vm_start();
}
and simply doesn't start the VM. After this function is called all
sockets are closed and the communication with the source host is
cut. I don't think it allows for fall-back at this point.
Sure it does. As long as the destination QEMU CPUs have not been
started, you can fallback by simply killing the dest QEMU and
restarting CPUs on the src QEMU.
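The rule stated above can be written down as a small decision function. This is only a sketch: FinishAction, finish_action and the two boolean inputs are made-up names for illustration, not libvirt or QEMU API, and "dst_setup_ok" stands for whatever destination-side setup (for example the port profile association) is required to succeed.

```c
#include <stdbool.h>

/* Hypothetical outcome of the Finish step; not a libvirt type. */
typedef enum {
    FINISH_START_DST,        /* send "cont" to the dst QEMU monitor */
    FINISH_FALL_BACK_TO_SRC, /* kill dst QEMU, restart CPUs on src QEMU */
    FINISH_NOTHING_TO_DO,    /* dst already running and healthy */
    FINISH_NO_FALLBACK       /* dst CPUs already started: too late */
} FinishAction;

/* Fallback is only possible while the dst CPUs have not been started. */
FinishAction finish_action(bool dst_cpus_started, bool dst_setup_ok)
{
    if (dst_setup_ok)
        return dst_cpus_started ? FINISH_NOTHING_TO_DO : FINISH_START_DST;
    return dst_cpus_started ? FINISH_NO_FALLBACK : FINISH_FALL_BACK_TO_SRC;
}
```

The interesting case is (dst_cpus_started == false, dst_setup_ok == false): because the dst QEMU was launched with -S, the source can still take over again.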
...
Rather, we may need a 'wait' option for migration: before the
qemu_put_byte(f, QEMU_VM_EOF);
in qemu_savevm_state_complete(), sync with the monitor and wait
for something like migrate_finish or migrate_cancel.
The real problem is that while we can tell from 'info migrate'
on the src when the src has finished sending all data, there is
no way to ask the dest QEMU when it has finished receiving all
data.
So libvirt assumes that 'src finished sending' == success, and
will attempt to start the dst QEMU CPUs. As raised many times
in the past, we need 'info migrate' to work on the destination
too, in order to query success/fail. And ideally we need async
events emitted when migration completes, so we don't have to
poll on 'info migrate' every 50ms.
Why was this point ('info migrate' on the dst host) raised many
times in the past but never implemented?
Is there any technical reason?
Assuming the interval between the moment src host finishes sending
and the dst host finishes receiving is not too big (which is
a fair assumption I guess), libvirt on the dst host could block
on that condition (ie wait for 'info migrate' to say "rx all" in the
dst host) at the beginning of Finish3. Is it doable?
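
To illustrate what I mean by blocking at the beginning of Finish3, here is a rough sketch of a bounded wait. It assumes a destination-side query exists, which is exactly the missing piece discussed above, so rx_done is a hypothetical callback and not a real monitor command. The sleep function is injected as well, only to keep the sketch portable; wait_dst_rx_done and the demo_* helpers are made-up names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical query: "has the dst QEMU received all migration data?" */
typedef bool (*RxDoneFn)(void *opaque);
/* Injected sleep, so the sketch does not depend on a platform API. */
typedef void (*SleepMsFn)(long ms, void *opaque);

/* Poll rx_done every interval_ms, for at most timeout_ms.
 * Returns 0 once the dst reports "rx all", -1 on timeout. */
int wait_dst_rx_done(RxDoneFn rx_done, SleepMsFn sleep_ms, void *opaque,
                     long interval_ms, long timeout_ms)
{
    long waited = 0;

    for (;;) {
        if (rx_done(opaque))
            return 0;
        if (waited >= timeout_ms)
            return -1; /* caller should fail Finish3 and fall back */
        sleep_ms(interval_ms, opaque);
        waited += interval_ms;
    }
}

/* Demo state: the dst reports completion after a number of polls. */
typedef struct {
    int polls_until_done;
    long slept_ms;
} DemoDst;

static bool demo_rx_done(void *opaque)
{
    DemoDst *d = opaque;
    if (d->polls_until_done == 0)
        return true;
    d->polls_until_done--;
    return false;
}

static void demo_sleep_ms(long ms, void *opaque)
{
    DemoDst *d = opaque;
    d->slept_ms += ms; /* record instead of really sleeping */
}
```

On success Finish3 would go on to start the dst CPUs; on timeout it would report failure so that the src can be resumed. An async event from QEMU would of course make the polling unnecessary.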

/Christian