
On 05/13/2015 10:02 PM, Chen Fan wrote:
On 05/13/2015 10:30 PM, Laine Stump wrote:
On 05/13/2015 05:57 AM, Daniel P. Berrange wrote:
On Wed, May 13, 2015 at 11:36:30AM +0800, Chen Fan wrote:
add migration support for ephemeral host devices, introduce two 'detach' and 'restore' functions to unplug/plug host devices during migration.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
---
 src/qemu/qemu_migration.c | 171 ++++++++++++++++++++++++++++++++++++++++++++--
 src/qemu/qemu_migration.h |   9 +++
 src/qemu/qemu_process.c   |  11 +++
 3 files changed, 187 insertions(+), 4 deletions(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 56112f9..d5a698f 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
+void
+qemuMigrationRestoreEphemeralDevices(virQEMUDriverPtr driver,
+                                     virConnectPtr conn,
+                                     virDomainObjPtr vm,
+                                     bool live)
+{
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+    virDomainDeviceDefPtr dev;
+    int ret = -1;
+    size_t i;
+
+    VIR_DEBUG("Run domain restore ephemeral devices");
+
+    for (i = 0; i < priv->nEphemeralDevices; i++) {
+        dev = priv->ephemeralDevices[i];
+
+        switch ((virDomainDeviceType) dev->type) {
+        case VIR_DOMAIN_DEVICE_NET:
+            if (live) {
+                ret = qemuDomainAttachNetDevice(conn, driver, vm,
+                                                dev->data.net);
+            } else {
+                ret = virDomainNetInsert(vm->def, dev->data.net);
+            }
+
+            if (!ret)
+                dev->data.net = NULL;
+            break;
+        case VIR_DOMAIN_DEVICE_HOSTDEV:
+            if (live) {
+                ret = qemuDomainAttachHostDevice(conn, driver, vm,
+                                                 dev->data.hostdev);
+            } else {
+                ret = virDomainHostdevInsert(vm->def, dev->data.hostdev);
+            }

This re-attach step is where we actually have far far far worse problems than with detach. This is blindly assuming that the guest on the target host can use the same hostdev that it was using on the source host.
(kind of pointless to comment on, since pkrempa has changed my opinion by forcing me to think about the "failure to reattach" condition, but could be useful info for others)
For a <hostdev>, yes, but not for <interface type='network'> (which would point to a libvirt network pool of VFs).
This is essentially useless in the real world.
Agreed (for plain <hostdev>)
Even if the same vendor/model device is available on the target host, it is very unlikely to be available at the same bus/slot/function that it was on the source. It is quite likely necessary to allocate a completely different NIC, or if using SRIOV allocate a different function. It is also not uncommon to have different vendor/models, so a completely different NIC may be required.
In the case of a network device, a different brand/model of NIC at a different PCI address using a different guest driver shouldn't be a problem for the guest, as long as the MAC address is the same (for a Linux guest anyway; not sure what a Windows guest would do with a NIC that had the same MAC but used a different driver). This points out the folly of trying to do migration with attached hostdevs (managed at *any* level), for anything other than SRIOV VFs (which can have their MAC address set before attach, unlike non-SRIOV NICs).
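To make the SRIOV point concrete, here is a rough sketch (not libvirt code) of what "allocate a different function, then set the MAC before attach" amounts to on the target host. The struct vf type, the claim_vf() helper, and the idea of a flat array as the VF pool are all hypothetical and exist only for illustration; the only thing taken from the discussion above is that the guest-visible MAC, not the PCI address, is what has to stay the same.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of one SR-IOV virtual function on the target host. */
struct vf {
    const char *pci_addr;   /* host-specific; will differ from the source */
    bool in_use;
    char mac[18];           /* "aa:bb:cc:dd:ee:ff" plus terminating NUL */
};

/*
 * Claim the first free VF in the pool and program it with the guest's
 * MAC address before it would be attached. The PCI address of the VF
 * used on the source host is deliberately not consulted.
 * Returns the claimed VF, or NULL if every VF is already in use.
 */
static struct vf *claim_vf(struct vf *pool, size_t n, const char *guest_mac)
{
    assert(guest_mac != NULL);

    for (size_t i = 0; i < n; i++) {
        if (pool[i].in_use)
            continue;
        pool[i].in_use = true;
        snprintf(pool[i].mac, sizeof(pool[i].mac), "%s", guest_mac);
        return &pool[i];
    }
    return NULL;
}
```

A caller would pass the target host's pool and the MAC from the guest's interface definition; a NULL return is exactly the "no usable device on the target" case that makes the re-attach step fragile.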
So should we focus on implementing the feature that supports migration with SRIOV VFs at first?
Not "at first", but "only". Adding the requirement of dealing properly with MAC address change to the guest adds a lot of complexity to that code with not much real gain. And based on my newfound realization of the horrible situation that would be created by a failure to re-attach after migration was complete (see my response to Peter Krempa yesterday), I now agree with Dan that this shouldn't be implemented in libvirt, but in the higher level management, which will be able to more easily/realistically deal with such a failure. (and by the way, I think I should apologize for leading you down the road of the ephemeral patches in response to your earlier RFC. If only I'd fully considered the post-migration re-attach failure case, and the difficulty libvirt would have recovering from that prior to Peter pointing it out so eloquently yesterday :-/)
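For anyone wondering what "deal with such a failure" in the management layer could look like, here is a minimal sketch under stated assumptions. The struct mig_ops hooks, the enum mig_result values, and migrate_with_vf() are all hypothetical names invented for this example; none of them is a libvirt or QEMU API. The point it tries to show is only the ordering: once migration has completed there is no rollback, so a failed re-attach has to be reported as its own outcome rather than as a failed migration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Outcomes a management application would have to distinguish. */
enum mig_result {
    MIG_OK,            /* guest migrated and the VF was re-attached */
    MIG_ABORTED,       /* failed before switchover; guest is still on the source */
    MIG_OK_NO_DEVICE   /* guest runs on the target, but without its VF */
};

/* Hypothetical hooks; a real tool would call its hypervisor API here. */
struct mig_ops {
    bool (*detach_vf)(void *ctx);
    bool (*migrate)(void *ctx);
    bool (*attach_vf_on_target)(void *ctx);
    bool (*reattach_vf_on_source)(void *ctx);
};

static enum mig_result migrate_with_vf(const struct mig_ops *ops, void *ctx)
{
    assert(ops != NULL);

    if (!ops->detach_vf(ctx))
        return MIG_ABORTED;

    if (!ops->migrate(ctx)) {
        /* Guest is still on the source: best-effort attempt to give the
         * device back. The result is ignored in this sketch. */
        (void)ops->reattach_vf_on_source(ctx);
        return MIG_ABORTED;
    }

    /* From here on the guest lives on the target; nothing can be undone. */
    if (!ops->attach_vf_on_target(ctx))
        return MIG_OK_NO_DEVICE;

    return MIG_OK;
}

/* Trivial stubs for experimenting with the control flow. */
static bool always_ok(void *ctx)   { (void)ctx; return true; }
static bool always_fail(void *ctx) { (void)ctx; return false; }
```

With the stubs, filling attach_vf_on_target with always_fail yields MIG_OK_NO_DEVICE, which is the case a management application can act on (retry with another VF, fall back to an emulated NIC, alert an operator).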