BACKGROUND
As live migration of mdev devices is going to be supported in VFIO, a scheme is needed for
deciding whether an mdev can be migrated between the source machine and the destination
machine. This email discusses a possible solution that requires few modifications to
libvirt/VFIO.
The configuration of an mdev is located in the domain XML, which tells libvirt how to find
the mdev and how to generate the command line for QEMU. It basically only includes the UUID
of the mdev. The domain XMLs of the source machine and the destination machine are compared
before the migration actually happens. Each configuration item is compared and checked by
libvirt. If an item on the source machine differs from the corresponding item on the
destination machine, the migration fails. For mdev, there is no such check/match before the
migration happens yet.
The user can use the node device list of libvirt to list the host devices and see the
capabilities of those devices. The current node device code of libvirt is already able to
extract the supported mdev types from a host PCI device, plus some basic information, such
as the maximum number of mdev instances supported by a host PCI device.
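To make the node device part concrete, here is a minimal Python sketch of pulling the
vendor ID, product ID and mdev types out of a node device XML. The embedded XML is a
hypothetical device written in the shape libvirt reports for a PCI device with an
mdev_types capability (for example via "virsh nodedev-dumpxml"); the device name, IDs and
type name are made up, and mdev_info() is an illustrative helper, not a libvirt function.

```python
import xml.etree.ElementTree as ET

# Hypothetical node device XML; names and IDs are placeholders for illustration.
NODEDEV_XML = """
<device>
  <name>pci_0000_00_02_0</name>
  <capability type='pci'>
    <product id='0xbeef'>Example product</product>
    <vendor id='0xdead'>Example vendor</vendor>
    <capability type='mdev_types'>
      <type id='example-type-1'>
        <deviceAPI>vfio-pci</deviceAPI>
        <availableInstances>2</availableInstances>
      </type>
    </capability>
  </capability>
</device>
"""


def mdev_info(nodedev_xml):
    """Return the PCI vendor/product IDs and the supported mdev types."""
    root = ET.fromstring(nodedev_xml)
    pci = root.find("capability[@type='pci']")
    types = {}
    # Each <type> below the mdev_types capability describes one mdev type.
    for t in pci.findall("capability[@type='mdev_types']/type"):
        types[t.get("id")] = {
            "device_api": t.findtext("deviceAPI"),
            "available_instances": int(t.findtext("availableInstances")),
        }
    return {
        "vendorid": pci.find("vendor").get("id"),
        "productid": pci.find("product").get("id"),
        "mdev_types": types,
    }


if __name__ == "__main__":
    print(mdev_info(NODEDEV_XML))
```
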
THE SOLUTION
To strictly check the mdev type and make sure that migration only happens between
compatible mediated devices, three new mandatory elements would be introduced in the domain
XML below the hostdev element:
vendorid: The vendor ID of the mdev, which comes from the host PCI device. A user can
obtain this information from the node device list entry of the host PCI device that
supports mdev.
productid: The product ID of the mdev, which also comes from the host PCI device. A user
can obtain this information in the same way as the vendor ID.
mdevtype: The type of the mdev. As the creation of the mdev is managed by the user, the
user knows the type of the mdev and is responsible for filling in this information.
These three elements are only needed when the device API of an mdev is
"vfio-pci". The mdev configuration example from
https://libvirt.org/formatdomain.html is used here to illustrate the modification:
<devices>
  <hostdev mode='subsystem' type='mdev' model='vfio-pci'>
    <source>
      <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/>
      <!-- The VID of the host PCI device which supports this mdev -->
      <vendorid>0xdead</vendorid>
      <!-- The PID of the host PCI device which supports this mdev -->
      <productid>0xbeef</productid>
      <!-- The vendor-specific mdev type string -->
      <mdevtype>type</mdevtype>
    </source>
  </hostdev>
</devices>
With the newly introduced elements above, the flow for creating a domain XML with an mdev
would be:
1. The user obtains the vendorid/productid from the node device list.
2. The user fills in the vendorid/productid/mdevtype in the domain XML.
3. When a migration happens, libvirt checks these elements. If any of them differs
between the two domain XMLs, the migration fails.
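The check in step 3 can be sketched in Python as below. This only illustrates the
proposed rule and is not libvirt code: the domain XML template, FIELDS,
mdev_identities() and mdev_compatible() are hypothetical names, and comparing the
hostdev entries in document order is an assumption about how the two XMLs would be paired.

```python
import xml.etree.ElementTree as ET

# The three proposed elements, as named in this email.
FIELDS = ("vendorid", "productid", "mdevtype")

# Hypothetical minimal domain XML used only to exercise the check.
DOMAIN_TEMPLATE = """
<domain type='kvm'>
  <devices>
    <hostdev mode='subsystem' type='mdev' model='vfio-pci'>
      <source>
        <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/>
        <vendorid>{vendorid}</vendorid>
        <productid>{productid}</productid>
        <mdevtype>{mdevtype}</mdevtype>
      </source>
    </hostdev>
  </devices>
</domain>
"""


def mdev_identities(domain_xml):
    """List (vendorid, productid, mdevtype) for each vfio-pci mdev hostdev."""
    root = ET.fromstring(domain_xml)
    identities = []
    for hostdev in root.iter("hostdev"):
        # The elements are only required when the device API is vfio-pci.
        if hostdev.get("type") != "mdev" or hostdev.get("model") != "vfio-pci":
            continue
        source = hostdev.find("source")
        identities.append(
            tuple((source.findtext(name) or "").strip() for name in FIELDS)
        )
    return identities


def mdev_compatible(src_xml, dst_xml):
    """Return True only if every mdev matches on all three elements."""
    return mdev_identities(src_xml) == mdev_identities(dst_xml)


if __name__ == "__main__":
    src = DOMAIN_TEMPLATE.format(vendorid="0xdead", productid="0xbeef", mdevtype="type")
    dst = DOMAIN_TEMPLATE.format(vendorid="0xdead", productid="0xbeef", mdevtype="other")
    print("migratable" if mdev_compatible(src, dst) else "migration fails")
```

In libvirt itself the comparison would live in the C ABI stability code rather than in a
script like this; the sketch is only meant to show which fields are compared.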
POSSIBLE MODIFICATION OF LIBVIRT
1) Introduce the three new elements in the domain XML parsing and processing functions.
2) Extend the function virDomainDeviceInfoCheckABIStability(), which is going to check the
hostdev part of the domain XMLs of the source machine and the destination machine, so that
it fails the migration when it finds that the IDs or the mdev type differ between the
domain XMLs.
PROS
Minor changes in libvirt are enough to achieve mdev type matching during migration.
Modifying VFIO and other mdev components is not necessary.
Thanks,
Zhi.