On 04/20/2011 05:28 PM, Christian Benvenuti (benve) wrote:
> Daniel,
> I looked at the patch-set you sent out on 2/9/11
>
> [libvirt] [PATCH 0/6] Introduce a new migration protocol
> to QEMU driver
>
> http://www.mail-archive.com/libvir-list@redhat.com/msg33223.html
>
> What is the status of this new migration protocol?
> Is there any pending issue blocking its integration?
>
> I would like to propose an RFC enhancement to the migration
> algorithm.
>
> Here is a quick summary of the proposal/idea.
>
> - finer control on migration result
>
> - the ability to specify which features must not fail
> their initialization on the dst host during migration.
> The migration should not succeed if any of them fails.
> - optional: each one of those features should be able to
> provide a deinit function to cleanup resources
> on the dst host if migration fails.
>
> This functionality would be useful for the (NIC) port profile
> setting feature (VDP, 802.1Qbg/1Qbh), but what I propose is
> a generic config option / API that can be used by any feature;
> see the sketch below.
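
For illustration, a rough sketch of the kind of per-feature hook table
such a generic API could boil down to. All names below are hypothetical
(nothing like this exists in libvirt today); the point is just the
init/deinit pairing and the unwind on failure:

#include <stddef.h>

/* Hypothetical per-feature hook: init runs on the dst host and must
 * not fail; deinit is the optional cleanup if the migration aborts. */
typedef struct _qemuMigrationCriticalFeature qemuMigrationCriticalFeature;
struct _qemuMigrationCriticalFeature {
    const char *name;               /* e.g. "vdp-port-profile" */
    int  (*init)(void *opaque);     /* < 0 means the migration must fail */
    void (*deinit)(void *opaque);   /* optional cleanup on the dst host */
    void *opaque;
};

/* Initialize every "cannot fail" feature on the destination.  On the
 * first failure, unwind the ones already initialized and return -1 so
 * the migration as a whole is reported as failed. */
static int
qemuMigrationInitCriticalFeatures(qemuMigrationCriticalFeature *features,
                                  size_t nfeatures)
{
    size_t i;

    for (i = 0; i < nfeatures; i++) {
        if (features[i].init &&
            features[i].init(features[i].opaque) < 0) {
            while (i-- > 0) {
                if (features[i].deinit)
                    features[i].deinit(features[i].opaque);
            }
            return -1;
        }
    }
    return 0;
}
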
>
> And now the details.
>
> ----------------------------------------------
> enhancement: finer control on migration result
> ----------------------------------------------
>
> There are different reasons why a VM may need (or be forced) to
> migrate.
> You can also classify migrations based on different semantics.
> For simplicity I'll classify them into two categories, based on
> how important it is for the VM to migrate as fast as possible:
>
> (1) It IS important
>
> In this case, the fact that the VM may temporarily be unable to
> make use of certain resources (for example the network) on the
> dst host is not that important, because completing the
> migration is considered the higher priority.
> A possible scenario could be a server that must migrate ASAP
> because of a disaster/emergency.
>
> (2) It IS NOT important
>
> I can think of a VM whose applications/servers need a network
> connection in order to work properly. Losing such network
> connectivity as a consequence of a migration would be
> unacceptable (or at least highly undesirable).
>
> Given the case (2) above, I have a comment about the Finish
> step, with regards to the port profile (VDP) codepath.
>
> The call to
>
> qemuMigrationVPAssociatePortProfile
>
> in
> qemuMigrationFinish
>
> can fail, but its result (success or failure) does not influence
> the result of the migration Finish step (it was already like this
> in migration V2).
I *believe* the underlying problem is Qemu's switch-over. Once Qemu
decides that the migration was successful, the Qemu on the source side
dies and the VM continues running on the destination side. I don't
think any further handshakes with higher layers are foreseen by which
this could be reversed or the switch-over delayed, but correct me if
I am wrong...
Actually I think this is not what happens in migration V3.
My understanding is this:
- the qemu cmdline built by Libvirt on the dst host during Prepare3
  includes the "-S" option (i.e. no autostart)
- the VM on the dst host does not start running until libvirt
  calls qemuProcessStartCPUs in the Finish3 step.
  This function simply sends the "cont" command to the monitor to
  start the VM/CPUs.
If I am right, libvirt does have full control over how/when to start
the CPUs on the dst host; it is not QEMU that does it.
The only thing libvirt does not control is when to pause the VM
on the src host: QEMU does that during stage 2 of the live RAM copy,
based on the max_downtime config.
However I do not think this represents a problem.
Can someone confirm my understanding of the algorithm?
Stefan, if this is correct, I guess the algorithm allows us to
abort the migration at any time based on the success of the port
profile configuration, and it would make the implementation of the
policies (1)/(2) relatively easy.
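Something along these lines, as a very rough sketch. The two helpers
are hypothetical stand-ins for qemuMigrationVPAssociatePortProfile()
and qemuProcessStartCPUs(); only the ordering and the error handling
matter:

#include <stdbool.h>

int associate_port_profiles(void);   /* returns < 0 on failure */
int start_guest_cpus(void);          /* sends "cont" to the monitor */

/* Sketch of the ordering Finish3 could use under a strict
 * "network must not break" policy. */
int
finish_phase(bool network_is_critical)
{
    /* Associate the port profile while the guest is still paused
     * (the dst qemu was started with -S, so nothing runs yet). */
    if (associate_port_profiles() < 0) {
        if (network_is_critical)
            return -1;    /* policy (2): abort, the src copy can resume */
        /* policy (1): log it and migrate anyway */
    }

    /* Only now let the guest run on the destination. */
    if (start_guest_cpus() < 0)
        return -1;

    return 0;
}
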
/Christian
So now, whatever we do, we'd have to associate the port profile before
the actual switch-over if we wanted to do something better than what is
there now and have the opportunity to terminate the migration before
Qemu's switch-over happens in case of a failure to associate the
profiles.
The problem is to know when the switch-over happens, or when the
migration goes into the final phase where the source side doesn't run
anymore. That would allow us to not associate the ports right at the
beginning of the migration, but maybe towards the time when, for
example in live migration, the source is not running anymore *and* we
also have the result of the association before Qemu on the source dies
for good.
I think some additional coordination between libvirt and Qemu would be
necessary, so that if higher-layer operations fail before the resume on
the destination side happens, Qemu can still fall back to the source
side. I believe what could happen now is that a VM could be transferred
too fast (by the Qemu process) while the association (in libvirt)
happens, Qemu on the source side dies, and then we only get the
negative result of the association. Maybe the simplest solution would
be if Qemu on the source side waited for a command before transferring
the last packet, so we still have a chance to cancel and Qemu doesn't
just 'run away' underneath libvirt's feet ;-).
I assume that two associations with the same profile are possible with
802.1Qbg and Qbh. Both are also going through a Pre-associate state
now. Are there any side-effects of associating twice on the same
switch, such as no packets being able to be sent on the source side or
something like that? Obviously this would be bad if it happened early
during live migration, and we'd want to push the association close to
the 'final migration phase', which in turn may require more
coordination with Qemu (I don't know whether the final phase can be
determined now -- maybe via polling Qemu's monitor).
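
Just to illustrate the polling idea, a small sketch; the two query_*
helpers are hypothetical stand-ins for issuing "info migrate" /
"query-migrate" and parsing the reply, and the threshold is made up:

#include <string.h>
#include <unistd.h>

const char *query_migrate_status(void);           /* "active", "completed", ... */
unsigned long long query_migrate_remaining(void); /* bytes still to be sent */

/* Wait until the RAM copy is nearly done (or the migration left the
 * "active" state) before triggering the port-profile association. */
void
wait_for_final_phase(unsigned long long threshold_bytes)
{
    for (;;) {
        if (strcmp(query_migrate_status(), "active") != 0)
            break;                      /* completed, failed or cancelled */
        if (query_migrate_remaining() <= threshold_bytes)
            break;                      /* close enough to the switch-over */
        usleep(50 * 1000);              /* poll every 50 ms */
    }
}

Even with such a poll, the race described above remains: the last
iteration can complete while the association is still in flight, so
some hold-off on the Qemu side would still be needed.
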
Regards,
Stefan