Re: [libvirt] [RFC v1 0/6] Live Migration with ephemeral host NIC devices

13 May 2015


* Laine Stump (laine@redhat.com) wrote:
...
On 05/13/2015 04:28 AM, Peter Krempa wrote:
...
On Wed, May 13, 2015 at 09:08:39 +0100, Dr. David Alan Gilbert wrote:
...
* Peter Krempa (pkrempa@redhat.com) wrote:
...
On Wed, May 13, 2015 at 11:36:26 +0800, Chen Fan wrote:
...
My main goal is to add support for migration with host NIC
passthrough devices while keeping network connectivity.
This patch series is based on Shradha's patches at
https://www.redhat.com/archives/libvir-list/2012-November/msg01324.html
which add migration support for host passthrough devices.
1) unplug the ephemeral devices before migration
2) do native migration
3) when migration finished, hotplug the ephemeral devices
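The three steps above could be sketched roughly as follows. This is only an illustration: the callables `unplug`, `migrate` and `hotplug` are hypothetical stand-ins supplied by the caller, not libvirt APIs, and each is assumed to raise an exception on failure.

```python
from typing import Callable, Iterable, List


def migrate_with_ephemeral_devices(
    devices: Iterable[str],
    unplug: Callable[[str], None],
    migrate: Callable[[], None],
    hotplug: Callable[[str], None],
) -> List[str]:
    """Run the three steps in order and return the devices re-attached.

    All three callables are hypothetical placeholders; each is expected
    to raise an exception if its step fails.
    """
    removed: List[str] = []

    # 1) unplug the ephemeral devices before migration
    for dev in devices:
        unplug(dev)
        removed.append(dev)

    # 2) do native migration
    migrate()

    # 3) when migration finished, hotplug the ephemeral devices
    for dev in removed:
        hotplug(dev)

    return removed
```

The sketch deliberately has no error handling: any step that raises leaves the guest in whatever state it had reached at that point.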
IMHO this algorithm is something that an upper layer management app
should do. The device unplug operation is complex and it might not
succeed which will make the current migration thread hang or fail in an
intermediate state that will not be recoverable.
However you wouldn't want each of the upper layer management apps implementing
their own hacks for this; so something somewhere needs to standardise
what the guest sees.
The guest will still see a PCI device unplug request and will have to
respond to it, then will be paused and after resume a new PCI device
will appear. This is standardised. The nonstandardised part (which can't
really be standardised) is how the bonding or other guest-dependent
stuff will be handled, but that is up to the guest OS to handle.
From libvirt's perspective this is only something that will trigger the
device unplug and plug the devices back. And there are a lot of issues
here:
1) the destination of the migration might not have the desired devices
    This will trigger a lot of problems as we will not be able to guarantee
    that the devices reappear on the destination and if we wanted to check
    we'd need a new migration protocol AFAIK.
2) The guest OS might refuse to detach the PCI device (it might be stuck
    before PCI code is loaded)
    In that case the migration will be stuck forever and abort attempts
    will make the domain state basically undefined depending on the
    phase where it failed.
Since we can't guarantee that the unplug of the PCI host devices will be
atomic or that it will succeed we basically can't guarantee in any way
in which state the VM will end up later after (a possibly failed)
migration. To recover such state there are too many options that could be
desired by the user that would be hard to implement in a way that would
be flexible enough.
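To illustrate the second problem: a PCI unplug is only a request to the guest, so the caller has to wait for the device to actually disappear and decide what to do when it does not. A minimal sketch, assuming hypothetical `request_unplug` and `is_detached` callables (neither is a libvirt API) and an arbitrary timeout:

```python
import time
from typing import Callable


def unplug_with_deadline(
    request_unplug: Callable[[], None],
    is_detached: Callable[[], bool],
    timeout: float = 30.0,
    poll_interval: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Ask the guest to release a device and wait until it does.

    Returns True once is_detached() reports success, or False if the
    guest has not released the device when the timeout expires.
    request_unplug and is_detached are hypothetical placeholders.
    """
    request_unplug()
    deadline = clock() + timeout
    while True:
        if is_detached():
            return True
        if clock() >= deadline:
            # The guest never answered (for example, it is stuck before
            # its PCI code is loaded); the caller must decide what to do.
            return False
        sleep(poll_interval)
```

A `False` return is the case described above: the device is still attached, and the caller is left to choose between retrying and abandoning the migration.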
In the past I've been on the side of having libvirt automatically do the
device detach and reattach (but definitely on the side of the guest
agent and libvirt keeping their hands off of network configuration in
the guest), with the thinking that 1) libvirt is in a well situated spot
to do it, and 2) this would eliminate duplicate code in the upper level
management.
However, Peter's points above made me consider the failure cases more
closely, in particular this one:
* the destination claims to have the resources required (right type of
PCI device, enough RAM), so migration is started.
* device detached on source, guest memory migrated to destination,
* guest started - no problems. (At this point, since the guest has been
restarted, it's not really possible for libvirt to fail the migration in
a recoverable manner (unless you want to implement some sort of
"unmigration" so that the guest state on the source is updated with
whatever execution occurred on the destination, and I don't think
*anyone* wants to go there))
* libvirt finds the device still available and attempts to attach it but
(for some odd reason) fails.
Now libvirt can't tell the application that the migration has succeeded,
because it didn't (unless the device was marked as "optional"), but it
also can't fail the migration except to say "this is such a monumental
failure that your guest has simply died".
If, on the other hand, the detach and re-attach are implemented in a
higher layer (ovirt/openstack), they will at least have the guest in a
state they can deal with - it won't be pretty, but they could for
example migrate the guest to another host (maybe back to the source) and
re-attach there.
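As a rough illustration of what such a higher layer might do after a successful migration: try the attach on the host the guest landed on and, if that fails, move the guest to other candidate hosts (possibly including the source) until an attach succeeds. The `attach` and `migrate_to` callables and the host names are hypothetical, not oVirt, OpenStack or libvirt APIs.

```python
from typing import Callable, Optional, Sequence


def reattach_with_fallback(
    current_host: str,
    fallback_hosts: Sequence[str],
    attach: Callable[[str], bool],
    migrate_to: Callable[[str], bool],
) -> Optional[str]:
    """Return the host where the device ended up attached, or None.

    attach(host) and migrate_to(host) are hypothetical placeholders that
    return True on success and False on failure.
    """
    if attach(current_host):
        return current_host

    for host in fallback_hosts:
        # The guest keeps running without the device while it is moved.
        if migrate_to(host) and attach(host):
            return host

    # No host accepted the device; the guest is still running, but
    # without its passthrough NIC.
    return None
```

Even in the `None` case the guest is alive and in a known state, which is the point of handling this above libvirt.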
So this one message from Peter has nicely pointed out the error in my
thinking, and I now agree that auto-detach/reattach shouldn't be
implemented in libvirt - it would work nicely in an error-free world,
but would crumble in the face of some errors. (I just wish I had
considered the particular failure mode above a year or two ago, so I
could have been more discouraging in my emails then :-)
It's a shame to limit the utility of this by dealing with an error case
that's not a fatal error.  Does libvirt not have a way of dealing with
non-fatal errors?

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK