On Fri, 24 Jan 2020, 17:54 Laine Stump, <laine@redhat.com> wrote:
V1: https://www.redhat.com/archives/libvir-list/2020-January/msg00813.html

This all used different names in V1 - in that incarnation the
configuration was done using "failover" and "backupAlias" attributes
added to the <driver> subelement of <interface>. But the resulting
code was cumbersome and had little bits scattered all over the place
due to needing it in both hostdev and interface parsing/formatting.

In his review of V1, danpb suggested just adding a new subelement for
this configuration to free ourselves from the constraints of <driver>
parsing/formatting. This ended up dramatically simplifying the code
(hence the lack of V1's refactoring patches in V2, and a decrease in
patch count from 12 to 6).

During further discussion in email and on IRC, we decided that naming
the element <failover> was too limiting, as it implied the behavior of
what is, to libvirt, just two network devices that should be
teamed/bonded together - it's completely up to the hypervisor and
guest what is done with this information. In light of that, we decided
to name the new subelement <teaming>, and to specify the two
interfaces as "persistent" (an interface that will always remain
plugged in) and "transient" (an interface that may be periodically
unplugged; during migration, in the case of QEMU). So the virtio
device will have

   <teaming type='persistent'/>

and the hostdev device will have

   <teaming type='transient' persistent='ua-myvirtio'/>

(note that the persistent interface must have <alias name='ua-myvirtio'/>)
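
As an illustration only (not part of the patches), the sketch below puts
the two fragments above into a complete pair of <interface> elements and
checks them from a client's point of view: the transient interface's
"persistent" attribute must name the alias of a persistent interface, and
the two MAC addresses must match, per the guest driver requirement
described later in this letter. The network name, MAC address, PCI
address, and the check_teaming() helper are all made up for the example;
whether libvirt itself enforces these checks is not stated here.

```python
import xml.etree.ElementTree as ET

# Hypothetical example values: network name, MAC and PCI address are made up.
DEVICES_XML = """
<devices>
  <interface type='network'>
    <mac address='52:54:00:11:22:33'/>
    <source network='mybridge'/>
    <model type='virtio'/>
    <teaming type='persistent'/>
    <alias name='ua-myvirtio'/>
  </interface>
  <interface type='hostdev' managed='yes'>
    <mac address='52:54:00:11:22:33'/>
    <source>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x07' function='0x1'/>
    </source>
    <teaming type='transient' persistent='ua-myvirtio'/>
  </interface>
</devices>
"""


def _mac(iface):
    """Return the interface's MAC address in lower case, or None if absent."""
    mac = iface.find("mac")
    addr = mac.get("address") if mac is not None else None
    return addr.lower() if addr else None


def check_teaming(devices_xml):
    """Client-side sanity check (hypothetical helper, not libvirt code).

    Returns a list of human-readable problems; an empty list means the
    teaming pair looks consistent.
    """
    root = ET.fromstring(devices_xml)
    interfaces = list(root.iter("interface"))
    problems = []

    # Pass 1: collect alias -> MAC for every persistent interface.
    persistent_macs = {}
    for iface in interfaces:
        teaming = iface.find("teaming")
        if teaming is None or teaming.get("type") != "persistent":
            continue
        alias = iface.find("alias")
        if alias is None or not alias.get("name"):
            problems.append("persistent interface has no <alias name='...'/>")
            continue
        persistent_macs[alias.get("name")] = _mac(iface)

    # Pass 2: every transient interface must point at a known alias
    # and carry the same MAC address as its persistent partner.
    for iface in interfaces:
        teaming = iface.find("teaming")
        if teaming is None or teaming.get("type") != "transient":
            continue
        target = teaming.get("persistent")
        if target not in persistent_macs:
            problems.append(f"transient interface refers to unknown alias {target!r}")
        elif _mac(iface) != persistent_macs[target]:
            problems.append(f"MAC address differs from persistent interface {target!r}")
    return problems


if __name__ == "__main__":
    print(check_teaming(DEVICES_XML))
```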

Given this config, libvirt will add "failover=on" to the device
commandline arg for the virtio device, and
"failover_pair_id=ua-myvirtio" to the arg for the hostdev device (and
when a migration is requested, it will notice if there is a hostdev
that has <teaming type='transient'/> set, and will allow the migration
in this case, but still disallow migrations of domains with normal
hostdevs).
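
The two rules in the paragraph above can be sketched roughly as follows.
This is a simplified model for illustration, not the code in these
patches: the function names and the way the teaming settings are passed
in are assumptions, and only the "failover=on" and "failover_pair_id="
properties come from the description above.

```python
def extra_device_props(teaming_type, persistent_alias=None):
    """Return the extra -device property implied by a <teaming> setting.

    teaming_type is "persistent", "transient", or None (no <teaming>).
    Hypothetical helper; the real command line has many more properties.
    """
    if teaming_type == "persistent":
        return "failover=on"
    if teaming_type == "transient":
        if not persistent_alias:
            raise ValueError("transient teaming needs the persistent alias")
        return f"failover_pair_id={persistent_alias}"
    return ""


def migration_allowed(hostdev_teaming_types):
    """Decide whether migration may proceed.

    hostdev_teaming_types holds one entry per assigned hostdev: its
    teaming type, or None for a normal hostdev. Migration is allowed only
    if every hostdev is a transient teaming device (or there are none).
    """
    return all(t == "transient" for t in hostdev_teaming_types)


if __name__ == "__main__":
    print(extra_device_props("persistent"))
    print(extra_device_props("transient", "ua-myvirtio"))
    print(migration_allowed(["transient"]))
    print(migration_allowed(["transient", None]))
```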

In response to these extra commandline options, QEMU will set some
extra capabilities in the virtio device PCI capabilities data, and
will also automatically unplug/re-plug the hostdev device before and
after migration.

In the guest, the virtio-net driver will notice the extra PCI
capabilities and use this as a clue that it should search for another
device with matching MAC address (NB: the guest driver requires the
two devices to have matching MAC addresses) to join into a bond with
the virtio-net device.

I like the <teaming/> abstraction.

As I wrote earlier, as a virt-manager user I'd like to specify that two interfaces are teamed; I would not want to have to copy the MAC address of one onto the other. I would prefer that libvirt hide this virtio awkwardness by passing the "persistent" MAC address to both QEMU NICs. Could libvirt provide this service for all of its clients?


This bond is hard-wired to always prefer the
hostdev device whenever it is present, and use the virtio device as
backup when the hostdev is unplugged.

----

As mentioned in a followup to the V1 cover letter, there is a
regression in QEMU 4.2.0 that causes QEMU to segv when a hostdev is
unplugged. That bug is fixed with this upstream QEMU patch:

https://git.qemu.org/?p=qemu.git;a=commitdiff;h=0446f8121723b134ca1d1ed0b73e96d4a0a8689d;hp=48008198270e3ebcc9394401d676c54ed5ac139c

Be sure to use a QEMU build with this patch applied, or you may not
even be able to start the guest! Also we've found that the
DEVICE_DELETED event is never sent to libvirt when one of these
hostdevs is manually unplugged, meaning that libvirt keeps the device marked as
"in-use", and it therefore cannot be re-plugged to the guest until
after a complete guest "power cycle". AFAIK there isn't yet a fix for
that bug, so don't expect manual unplug of the device to work.

Laine Stump (6):
  qemu: add capabilities flag for failover feature
  conf: parse/format <teaming> subelement of <interface>
  qemu: support interface <teaming> functionality
  qemu: allow migration with assigned PCI hostdev if <teaming> is set
  qemu: add wait-unplug to qemu migration status enum
  docs: document <interface> subelement <teaming>

 docs/formatdomain.html.in                     | 100 ++++++++++++++++++
 docs/news.xml                                 |  28 +++++
 docs/schemas/domaincommon.rng                 |  19 ++++
 src/conf/domain_conf.c                        |  45 ++++++++
 src/conf/domain_conf.h                        |  14 +++
 src/qemu/qemu_capabilities.c                  |   4 +
 src/qemu/qemu_capabilities.h                  |   3 +
 src/qemu/qemu_command.c                       |   9 ++
 src/qemu/qemu_domain.c                        |  36 ++++++-
 src/qemu/qemu_migration.c                     |  53 +++++++++-
 src/qemu/qemu_monitor.c                       |   1 +
 src/qemu/qemu_monitor.h                       |   1 +
 src/qemu/qemu_monitor_json.c                  |   1 +
 .../caps_4.2.0.aarch64.xml                    |   1 +
 .../caps_4.2.0.x86_64.xml                     |   1 +
 .../net-virtio-teaming-network.xml            |  37 +++++++
 .../qemuxml2argvdata/net-virtio-teaming.args  |  40 +++++++
 tests/qemuxml2argvdata/net-virtio-teaming.xml |  50 +++++++++
 tests/qemuxml2argvtest.c                      |   4 +
 .../net-virtio-teaming-network.xml            |  51 +++++++++
 .../qemuxml2xmloutdata/net-virtio-teaming.xml |  66 ++++++++++++
 tests/qemuxml2xmltest.c                       |   6 ++
 22 files changed, 563 insertions(+), 7 deletions(-)
 create mode 100644 tests/qemuxml2argvdata/net-virtio-teaming-network.xml
 create mode 100644 tests/qemuxml2argvdata/net-virtio-teaming.args
 create mode 100644 tests/qemuxml2argvdata/net-virtio-teaming.xml
 create mode 100644 tests/qemuxml2xmloutdata/net-virtio-teaming-network.xml
 create mode 100644 tests/qemuxml2xmloutdata/net-virtio-teaming.xml

--
2.24.1