[libvirt] RFC: Creating mediated devices with libvirt

Hi all,

so there's been an off-list discussion about finally implementing creation of mediated devices with libvirt, and it's more than desirable to get as many opinions on that as possible, so please do share your ideas. This did come up already as part of some older threads ([1] for example), so this will be a respin of those discussions. Long story short, we decided to put device creation off and focus on the introduction of the framework as such first, and build upon that later, i.e. now.

[1] https://www.redhat.com/archives/libvir-list/2017-February/msg00177.html

========================================
PART 1: NODEDEV-DRIVER
========================================

API-wise, device creation through the nodedev driver should be pretty straightforward and without any issues, since virNodeDevCreateXML takes an XML and does support flags. Looking at the current device XML:

<device>
  <name>mdev_0cce8709_0640_46ef_bd14_962c7f73cc6f</name>
  <path>/sys/devices/pci0000:00/.../0cce8709-0640-46ef-bd14-962c7f73cc6f</path>
  <parent>pci_0000_03_00_0</parent>
  <driver>
    <name>vfio_mdev</name>
  </driver>
  <capability type='mdev'>
    <type id='nvidia-11'/>
    <iommuGroup number='13'/>
    <uuid>UUID</uuid> <!-- optional enhancement, see below -->
  </capability>
</device>

We can ignore the <path>, <driver>, and <iommuGroup> elements, since these are useless during creation. We also cannot use <name>, since we don't support arbitrary names and we also can't rely on users providing a name in the correct form, which we would need to further parse in order to get the UUID. So, since the only thing missing to successfully create an mdev from XML is the UUID (if the user doesn't want it to be generated automatically), how about having a <uuid> subelement under <capability>, just like PCI devices have <domain> and friends, USB devices have <bus> & <device>, and interfaces have <address>, to uniquely identify the device even if the name itself is unique?

Removal of a device should work as well, although we might want to consider creating a *Flags version of the API.
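To make that concrete, a minimal creation XML under this proposal would presumably only need the parent, the mdev type and (optionally) the UUID - something along these lines, where the UUID value is purely illustrative and mdev support in virNodeDevCreateXML / virsh nodedev-create is exactly what is being proposed here, not something that exists yet:

<device>
  <parent>pci_0000_03_00_0</parent>
  <capability type='mdev'>
    <type id='nvidia-11'/>
    <uuid>0cce8709-0640-46ef-bd14-962c7f73cc6f</uuid> <!-- optional -->
  </capability>
</device>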
=============================================================
PART 2: DOMAIN XML & DEVICE AUTO-CREATION, NO POLICY INVOLVED!
=============================================================

There were some doubts about auto-creation mentioned in [1], although they weren't specified further. So hopefully, we'll get further in the discussion this time.

From my perspective there are two main reasons/benefits to that:
1) Convenience

For apps like virt-manager, the user will want to add a host device transparently: "hey libvirt, I want an mdev assigned to my VM, can you do that?". Even higher management apps, like oVirt, might not care about the parent device at all times, and considering that they would need to enumerate the parents, pick one, create the device XML and pass it to the nodedev driver, IMHO it would actually be easier and faster for them to just do it directly through sysfs, bypassing libvirt once again...

2) Future domain migration

Suppose now that the mdev-backing physical devices support state dump and reload. Chances are that the corresponding mdev doesn't even exist or has a different UUID on the destination, so libvirt would do its best to handle this before the domain could be resumed.

Following what we already have:

<devices>
  <hostdev mode='subsystem' type='mdev' model='vfio-pci'>
    <source>
      <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/>
    </source>
  </hostdev>
</devices>

Instead of trying to somehow extend the <address> element with more attributes like 'domain', 'slot', 'function', etc., which would render the whole element ambiguous, I was thinking about creating a <parent> element nested under <source> that would basically be a nested definition of another host device, re-using all the elements we already know, i.e. <address> for PCI, and of course others if there happens to be a need for devices other than PCI. So speaking about XML, we'd end up with something like:

<devices>
  <hostdev mode='subsystem' type='mdev' model='vfio-pci'>
    <source>
      <parent>
        <!-- possibly another <source> element - do we really want that? -->
        <address domain='0x0000' bus='0x00' slot='0x00' function='0x00'/>
        <type id='foo'/>
        <!-- end of potential <source> element -->
      </parent>
      <!-- this one takes precedence if it exists, ignoring the parent -->
      <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/>
    </source>
  </hostdev>
</devices>

So, this was the first idea off the top of my head, and I'd appreciate any suggestions and comments, especially from people who have the 'legacy' insight into libvirt and can predict potential pitfalls based on experience :).

Thanks,
Erik
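For reference, "doing it directly through sysfs" under 1) amounts to roughly the following, going by the kernel's mediated-device interface; the parent address and type name are only examples:

# create an mdev of a given type on a given parent
UUID=$(uuidgen)
echo "$UUID" > /sys/class/mdev_bus/0000:03:00.0/mdev_supported_types/nvidia-11/create

# ... and remove it again later
echo 1 > /sys/bus/mdev/devices/$UUID/remove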

On Thu, Jun 15, 2017 at 12:06:43AM +0200, Erik Skultety wrote:
1) Convenience For apps like virt-manager, user will want to add a host device transparently, "hey libvirt, I want an mdev assigned to my VM, can you do that". Even for higher management apps, like oVirt, even they might not care about the parent device at all times and considering that they would need to enumerate the parents, pick one, create the device XML and pass it to the nodedev driver, IMHO it would actually be easier and faster to just do it directly through sysfs, bypassing libvirt once again....
The convenience only works if the policy we've provided in libvirt actually matches the policy the application wants. I think it is quite likely that in cloud deployments the mdevs will be created out of band from the domain startup process. It is possible the app will just have a fixed set of mdevs pre-created when the host starts up. Or the mgmt app may want the domain startup process to be a two-phase setup, where it first allocates the resources needed and only later tries to start the guest. This is why I keep saying that putting this kind of "convenient" policy in libvirt is a bad idea - it is essentially just putting a bit of virt-manager code into libvirt - more advanced apps will need more flexibility in this area.
2) Future domain migration Suppose now that the mdev backing physical devices support state dump and reload. Chances are, that the corresponding mdev doesn't even exist or has a different UUID on the destination, so libvirt would do its best to handle this before the domain could be resumed.
This is not an unusual scenario - there are already many other parts of the device backend config that need to change prior to migration, especially for anything related to host devices, so apps already have support for doing this, which is more flexible & convenient because it doesn't tie creation of the mdevs to running of the migrate command.

IOW, I'm still against adding any kind of automatic creation policy for mdevs in libvirt. Just provide the node device API support.

Regards,
Daniel
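The two-phase setup argued for here would presumably look something like this on the libvirt side once the node device support exists (the file name and guest name are illustrative):

virsh nodedev-create mdev.xml   # phase 1: allocate the mdev out of band
virsh start guest               # phase 2: start the guest whose XML references that UUID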

On Thu, 15 Jun 2017 09:33:01 +0100 "Daniel P. Berrange" <berrange@redhat.com> wrote:
IOW, I'm still against adding any kind of automatic creation policy for mdevs in libvirt. Just provide the node device API support.
I'm not super clear on the extent of what you're against here - is it all forms of device creation, or only a placement policy? Are you against any form of having the XML specify the non-instantiated mdev that it wants?

We've clearly made an important step with libvirt supporting pre-created mdevs, but as a user of that support I find it incredibly tedious. I typically do a dumpxml, copy out the UUID, wonder what type of device it might have been last time, create it, start the domain, and cross my fingers. Pre-creating mdev devices is not really practical; I might have use cases where I want multiple low-end mdev devices and others where I want a single high-end device. Those cannot exist at the same time. Requiring extensive higher-level management tools is not really an option either - I'm not going to install oVirt on my desktop/laptop just so I can launch a GVT-g VM once in a while (no offense). So I really hope that libvirt itself can provide some degree of mdev creation.

Thanks,
Alex
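Spelling out that tedious loop, it is roughly the following today; the domain name, type and UUID are only examples, and the create step has to be a raw sysfs write since libvirt currently only consumes an already-existing mdev:

virsh dumpxml gvtg-vm | grep uuid      # dig the mdev UUID back out of the domain XML
echo c2177883-f1bb-47f0-914d-32a22e3a8804 > \
    /sys/class/mdev_bus/0000:00:02.0/mdev_supported_types/i915-GVTg_V5_4/create
virsh start gvtg-vm                    # and hope the type was the right one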

On 06/15/2017 02:42 PM, Alex Williamson wrote:
I'm not super clear on the extent of what you're against here, is it all forms of device creation or only a placement policy? Are you against any form of having the XML specify the non-instantiated mdev that it wants? We've clearly made an important step with libvirt supporting pre-created mdevs, but as a user of that support I find it incredibly tedious. I typically do a dumpxml, copy out the UUID, wonder what type of device it might have been last time, create it, start the domain and cross my fingers. Pre-creating mdev devices is not really practical, I might have use cases where I want multiple low-end mdev devices and another where I have a single high-end device. Those cannot exist at the same time. Requiring extensive higher level management tools is not really an option either, I'm not going to install oVirt on my desktop/laptop just so I can launch a GVT-g VM once in a while (no offense). So I really hope that libvirt itself can provide some degree of mdev creation.
Maybe there can be something in between the "all child devices must be pre-created" and "a child device will be automatically created on an automatically chosen parent device as needed". In particular, we could forgo the "automatically chosen parent device" part of that. The guest configuration could simply contain the PCI address of the parent and the desired type of the child. If we did this there wouldn't be any policy decision to make - all the variables are determined - but it would make life easier for people running small hosts (i.e. no oVirt/OpenStack, a single mdev parent device).

OpenStack and oVirt (and whoever) would of course be free to ignore this and pre-create pools of devices themselves in the name of more precise control and better predictability (just as, for example, OpenStack ignores libvirt's "pools of hostdev network devices" and instead manages the pool of devices itself and uses <interface type='hostdev'> directly).

Of course this would mean that the "casual" user (one who just uses virsh and virt-manager) would still need to find the mdev parent device and learn its PCI address and what child device types it supports. But those operations could be done once, the results recorded into the guest domain config, and thereafter forgotten about.
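Expressed with the elements Erik proposed earlier, this "parent address plus child type" variant might look roughly as follows; the parent address and the GVT-g type id are purely illustrative:

<hostdev mode='subsystem' type='mdev' model='vfio-pci'>
  <source>
    <parent>
      <address domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
      <type id='i915-GVTg_V5_4'/>
    </parent>
    <!-- no fixed <address uuid='...'/>: libvirt would create the child mdev
         on that parent before the guest starts and remove it afterwards -->
  </source>
</hostdev>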

On Fri, Jun 16, 2017 at 11:32:04AM -0400, Laine Stump wrote:
Maybe there can be something in between the "all child devices must be pre-created" and "a child device will be automatically created on an automatically chosen parent device as needed". In particular, we could forego the "automatically chosen parent device" part of that. The guest configuration could simply contain the PCI address of the parent and the desired type of the child. If we did this there wouldn't be any policy decision to make - all the variables are determined - but it would make life easier for people running small hosts (i.e. no oVirt/Openstack, a single mdev parent device). Openstack and oVirt (and whoever) would of course be free to ignore this and pre-create pools of devices themselves in the name of more precise control and better predictability (just as, for example, OpenStack ignores libvirt's "pools of hostdev network devices" and instead manages the pool of devices itself and uses <interface type='hostdev'> directly).
FWIW, I consider the pools of hostdev network feature a prime example of something we shouldn't repeat. We encoded a specific policy into libvirt, and as a result the feature is largely useless for any non-trivial use case. In retrospect we shouldn't have added that network pools magic, IMHO.

Regards,
Daniel

On Fri, 16 Jun 2017 11:32:04 -0400 Laine Stump <laine@redhat.com> wrote:
Maybe there can be something in between the "all child devices must be pre-created" and "a child device will be automatically created on an automatically chosen parent device as needed". In particular, we could forego the "automatically chosen parent device" part of that. The guest configuration could simply contain the PCI address of the parent and the desired type of the child. If we did this there wouldn't be any policy decision to make - all the variables are determined - but it would make life easier for people running small hosts (i.e. no oVirt/Openstack, a single mdev parent device). Openstack and oVirt (and whoever) would of course be free to ignore this and pre-create pools of devices themselves in the name of more precise control and better predictability (just as, for example, OpenStack ignores libvirt's "pools of hostdev network devices" and instead manages the pool of devices itself and uses <interface type='hostdev'> directly).
This seems not that substantially different from managed='yes' on a vfio hostdev to me. It makes the device available to the VM before it starts and returns it after. In one case that's switching the binding on an existing device, in another it's creating and removing. Once again, I can't tell from Dan's response if he's opposed to this entire idea or just the aspects where libvirt needs to impose a policy decision. For me personally, the functionality difference is quite substantial.
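For comparison, the managed='yes' behaviour being referenced is what libvirt already does for regular PCI assignment - the host driver is unbound and the device bound to vfio before the VM starts, and handed back afterwards (the address below is just an example):

<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
  </source>
</hostdev>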
Of course this would mean that the "casual" user (one who just used virsh and virt-manager) would still need to find the mdev parent device and learn its PCI address and what child device types it supported. But those operations could be done once, the results recorded into the guest domain config, and thereafter forgotten about.
If libvirt had this minimal level of support, virt-manager could 'easily' add an option for adding mdev devices where it lists the available parent devices and allows a list of mdev types to be expanded and selected below each parent. Clicky, clicky.

Thanks,
Alex
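The enumeration such a dialog needs is already exposed in sysfs, so the listing itself is cheap; roughly the following, where the parent address and type name are examples:

ls /sys/class/mdev_bus/
    # -> parent devices capable of hosting mdevs
ls /sys/class/mdev_bus/0000:00:02.0/mdev_supported_types/
    # -> mdev types that parent supports
cat /sys/class/mdev_bus/0000:00:02.0/mdev_supported_types/i915-GVTg_V5_4/available_instances
    # -> how many more instances of that type can still be created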

On Fri, Jun 16, 2017 at 11:02:55AM -0600, Alex Williamson wrote:
This seems not that substantially different from managed='yes' on a vfio hostdev to me. It makes the device available to the VM before it starts and returns it after. In one case that's switching the binding on an existing device, in another it's creating and removing. Once again, I can't tell from Dan's response if he's opposed to this entire idea or just the aspects where libvirt needs to impose a policy decision. For me personally, the functionality difference is quite substantial.
I'm fine with libvirt having APIs in the node device driver to enable create/delete, as well as using managed=yes in the same manner that we do for regular PCI devices (the bind/unbind to vfio or pci-back).

I'm only against the creation/deletion of mdevs as a side effect of starting/stopping the guest.

Regards,
Daniel

On Fri, Jun 16, 2017 at 06:11:17PM +0100, Daniel P. Berrange wrote:
I'm fine with libvirt having APIs in the node device APIs to enable create/delete with libvirt, as well as using managed=yes in the same manner that we do for regular PCI devices (the bind/unbind to vfio or pci-back)
Oh, and we really need to fix the big missing feature in the node device APIs: persistent, inactive configs. E.g. we should be able to record XML configs of mdevs (and NPIV devices too) in /etc/libvirt so they persist across reboots, and can be set up for auto-start on boot too.

Regards,
Daniel
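On the command line, that could plausibly mirror the define/start/autostart split that networks and storage pools already have - these nodedev verbs are hypothetical at this point, and the device name is just the usual mdev_<uuid> form:

virsh nodedev-define mdev.xml                                      # persist under /etc/libvirt, stays inactive
virsh nodedev-start mdev_c2177883_f1bb_47f0_914d_32a22e3a8804      # create it on demand
virsh nodedev-autostart mdev_c2177883_f1bb_47f0_914d_32a22e3a8804  # or have it created on host boot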

On 16/06/17 18:14 +0100, Daniel P. Berrange wrote:
On Fri, Jun 16, 2017 at 06:11:17PM +0100, Daniel P. Berrange wrote:
On Fri, Jun 16, 2017 at 11:02:55AM -0600, Alex Williamson wrote:
On Fri, 16 Jun 2017 11:32:04 -0400 Laine Stump <laine@redhat.com> wrote:
On 06/15/2017 02:42 PM, Alex Williamson wrote:
On Thu, 15 Jun 2017 09:33:01 +0100 "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Thu, Jun 15, 2017 at 12:06:43AM +0200, Erik Skultety wrote: > Hi all, > > so there's been an off-list discussion about finally implementing creation of > mediated devices with libvirt and it's more than desired to get as many opinions > on that as possible, so please do share your ideas. This did come up already as > part of some older threads ([1] for example), so this will be a respin of the > discussions. Long story short, we decided to put device creation off and focus > on the introduction of the framework as such first and build upon that later, > i.e. now. > > [1] https://www.redhat.com/archives/libvir-list/2017-February/msg00177.html > > ======================================== > PART 1: NODEDEV-DRIVER > ======================================== > > API-wise, device creation through the nodedev driver should be pretty > straightforward and without any issues, since virNodeDevCreateXML takes an XML > and does support flags. Looking at the current device XML: > > <device> > <name>mdev_0cce8709_0640_46ef_bd14_962c7f73cc6f</name> > <path>/sys/devices/pci0000:00/.../0cce8709-0640-46ef-bd14-962c7f73cc6f</path> > <parent>pci_0000_03_00_0</parent> > <driver> > <name>vfio_mdev</name> > </driver> > <capability type='mdev'> > <type id='nvidia-11'/> > <iommuGroup number='13'/> > <uuid>UUID<uuid> <!-- optional enhancement, see below --> > </capability> > </device> > > We can ignore <path>,<driver>,<iommugroup> elements, since these are useless > during creation. We also cannot use <name> since we don't support arbitrary > names and we also can't rely on users providing a name in correct form which we > would need to further parse in order to get the UUID. > So since the only thing missing to successfully use create an mdev using XML is > the UUID (if user doesn't want it to be generated automatically), how about > having a <uuid> subelement under <capability> just like PCIs have <domain> and > friends, USBs have <bus> & <device>, interfaces have <address> to uniquely > identify the device even if the name itself is unique. > Removal of a device should work as well, although we might want to > consider creating a *Flags version of the API. > > ============================================================= > PART 2: DOMAIN XML & DEVICE AUTO-CREATION, NO POLICY INVOLVED! > ============================================================= > > There were some doubts about auto-creation mentioned in [1], although they > weren't specified further. So hopefully, we'll get further in the discussion > this time. > > From my perspective there are two main reasons/benefits to that: > > 1) Convenience > For apps like virt-manager, user will want to add a host device transparently, > "hey libvirt, I want an mdev assigned to my VM, can you do that". Even for > higher management apps, like oVirt, even they might not care about the parent > device at all times and considering that they would need to enumerate the > parents, pick one, create the device XML and pass it to the nodedev driver, IMHO > it would actually be easier and faster to just do it directly through sysfs, > bypassing libvirt once again....
The convenience only works if the policy we've provided in libvirt actually matches the policy the application wants. I think it is quite likely that with cloud the mdevs will be created out of band from the domain startup process. It is possible the app will just have a fixed set of mdevs pre-created when the host starts up. Or that the mgmt app wants the domain startup process to be a two phase setup, where it first allocates the resources needed, and later then tries to start the guest. This is why I keep saying that putting this kind of "convenient" policy in libvirt is a bad idea - it is essentially just putting a bit of virt-manager code into libvirt - more advanced apps will need more flexibility in this area.
> 2) Future domain migration > Suppose now that the mdev backing physical devices support state dump and > reload. Chances are, that the corresponding mdev doesn't even exist or has a > different UUID on the destination, so libvirt would do its best to handle this > before the domain could be resumed.
This is not an unusual scenario - there are already many other parts of the device backend config that need to change prior to migration, especially for anything related to host devices, so apps already have support for doing this, which is more flexible & convenient because it doesn't tie creation of the mdevs to running of the migrate command.
IOW, I'm still against adding any kind of automatic creation policy for mdevs in libvirt. Just provide the node device API support.
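To make the node-device-API-only approach concrete, a minimal sketch using the existing virNodeDeviceCreateXML() call could look like the following; the <capability type='mdev'> creation XML is the schema proposed in Part 1 of this RFC and therefore still hypothetical, the parent and type id are placeholders, and error handling is trimmed:

  #include <stdio.h>
  #include <libvirt/libvirt.h>

  int main(void)
  {
      /* Hypothetical creation XML per the Part 1 proposal: parent + mdev type;
       * the UUID is left out so libvirt would generate one. */
      const char *xml =
          "<device>"
          "  <parent>pci_0000_03_00_0</parent>"
          "  <capability type='mdev'>"
          "    <type id='nvidia-11'/>"
          "  </capability>"
          "</device>";
      virConnectPtr conn = virConnectOpen("qemu:///system");
      virNodeDevicePtr dev = NULL;

      if (!conn)
          return 1;

      /* virNodeDeviceCreateXML() already exists and takes flags, so plain
       * creation would not need any new API. */
      if ((dev = virNodeDeviceCreateXML(conn, xml, 0)))
          printf("created %s\n", virNodeDeviceGetName(dev));

      if (dev)
          virNodeDeviceFree(dev);
      virConnectClose(conn);
      return dev ? 0 : 1;
  }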
I'm not super clear on the extent of what you're against here, is it all forms of device creation or only a placement policy? Are you against any form of having the XML specify the non-instantiated mdev that it wants? We've clearly made an important step with libvirt supporting pre-created mdevs, but as a user of that support I find it incredibly tedious. I typically do a dumpxml, copy out the UUID, wonder what type of device it might have been last time, create it, start the domain and cross my fingers. Pre-creating mdev devices is not really practical, I might have use cases where I want multiple low-end mdev devices and another where I have a single high-end device. Those cannot exist at the same time. Requiring extensive higher level management tools is not really an option either, I'm not going to install oVirt on my desktop/laptop just so I can launch a GVT-g VM once in a while (no offense). So I really hope that libvirt itself can provide some degree of mdev creation.
Maybe there can be something in between the "all child devices must be pre-created" and "a child device will be automatically created on an automatically chosen parent device as needed". In particular, we could forego the "automatically chosen parent device" part of that. The guest configuration could simply contain the PCI address of the parent and the desired type of the child. If we did this there wouldn't be any policy decision to make - all the variables are determined - but it would make life easier for people running small hosts (i.e. no oVirt/Openstack, a single mdev parent device). Openstack and oVirt (and whoever) would of course be free to ignore this and pre-create pools of devices themselves in the name of more precise control and better predictability (just as, for example, OpenStack ignores libvirt's "pools of hostdev network devices" and instead manages the pool of devices itself and uses <interface type='hostdev'> directly).
This seems not that substantially different from managed='yes' on a vfio hostdev to me. It makes the device available to the VM before it starts and returns it after. In one case that's switching the binding on an existing device, in another it's creating and removing. Once again, I can't tell from Dan's response if he's opposed to this entire idea or just the aspects where libvirt needs to impose a policy decision. For me personally, the functionality difference is quite substantial.
I'm fine with libvirt having APIs in the node device APIs to enable create/delete with libvirt, as well as using managed=yes in the same manner that we do for regular PCI devices (the bind/unbind to vfio or pci-back)
Oh, and we really need to fix the big missing feature in the node device APIs of persistent, inactive configs. eg we should be able to record XML configs of mdevs (and npiv devices too), in /etc/libvirt so they persist across reboots, and can be setup for auto-start on boot too.
That doesn't help mdev in any way though. It doesn't make sense to generate a new UUID for a given VM at each start. So in the case of a single host, the persistent file is redundant to the domain XML (as long as uuid+parent is in the xml), and in the case of a cluster we'd have to copy all possible VM mdev definitions to all the hosts. The idea works nicely if you had such definitions accessible in the cluster and could define a group of devices (gpu+soundcard, single mdev, single vf, ...) that would later be assigned to a VM (let's hope kubevirt can get there). As for automatic creation, I think it's on the "nice to have" level. So far libvirt is close to useless when working with mdevs, as all the data is in the same sysfs place where the create/delete endpoints are - as mentioned earlier, we can just get the data and do everything directly from there instead of dealing with XML and a bunch of new API calls. Having at least some *configurable* auto-create policy might add some value for the time being.
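For comparison, the "do everything directly from there" sysfs path mentioned above is essentially a single write to the kernel's mdev create attribute; a rough sketch follows (parent address, type and UUID are placeholders, and the path layout is the one documented for the kernel's mediated device framework):

  #include <stdio.h>

  int main(void)
  {
      const char *parent = "0000:03:00.0";                       /* placeholder */
      const char *type = "nvidia-11";                            /* placeholder */
      const char *uuid = "c2177883-f1bb-47f0-914d-32a22e3a8804"; /* placeholder */
      char path[256];
      FILE *fp;

      /* Writing a UUID into the type's "create" attribute instantiates the
       * mdev; writing 1 to /sys/bus/mdev/devices/<uuid>/remove removes it. */
      snprintf(path, sizeof(path),
               "/sys/class/mdev_bus/%s/mdev_supported_types/%s/create",
               parent, type);
      if (!(fp = fopen(path, "w")))
          return 1;
      fprintf(fp, "%s\n", uuid);
      return fclose(fp) == 0 ? 0 : 1;
  }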

On Thu, Jun 22, 2017 at 10:41:13AM +0200, Martin Polednik wrote:
On 16/06/17 18:14 +0100, Daniel P. Berrange wrote:
On Fri, Jun 16, 2017 at 06:11:17PM +0100, Daniel P. Berrange wrote:
On Fri, Jun 16, 2017 at 11:02:55AM -0600, Alex Williamson wrote:
On Fri, 16 Jun 2017 11:32:04 -0400 Laine Stump <laine@redhat.com> wrote:
On 06/15/2017 02:42 PM, Alex Williamson wrote:
On Thu, 15 Jun 2017 09:33:01 +0100 "Daniel P. Berrange" <berrange@redhat.com> wrote:
[...]
Oh, and we really need to fix the big missing feature in the node device APIs of persistent, inactive configs. eg we should be able to record XML configs of mdevs (and npiv devices too), in /etc/libvirt so they persist across reboots, and can be setup for auto-start on boot too.
That doesn't help mdev in any way though. It doesn't make sense to generate new UUID for given VM at each start. So in case of
What statement does this^^ refer to? Why would you generate a new UUID for a VM at each start? You'd generate it only once and then store it, the same way domain UUIDs work.
single host, the persistent file is redundant to the domain XML (as long as uuid+parent is in the xml) and in case of cluster we'd have to
Right now you don't have any info about the parent device in the domain XML and such data would only exist in the XML if we all agreed on auto-creating mdevs, in which case persistent configs in nodedev would be unnecessary and vice-versa.
copy all possible VM mdev definitions to all the hosts.
^For mdev configs, you might be better off creating them explicitly than copying configs, simply because given the information the XML has, you might conflict with UUIDs between hosts, so you'd have to take care of that. Parents have different PCI addresses that most probably wouldn't match across hosts, so from an automation point of view, I think writing a stub recreating the whole set of devices/configs might actually be easier than copying & handling them (solely because the 2 things left in the XML - after the ones I mentioned - are the vgpu type and the IOMMU group number, which AFAIK cannot be requested explicitly).
The idea works nicely if you had such definitions accessible in the cluster and could define a group of devices (gpu+soundcard, single mdev, single vf, ...) that would later be assigned to a VM (let's hope kubevirt can get there).
As for automatic creation, I think it's on the "nice to have" level. So far libvirt is close to useless when working with mdevs as all the data is in the same sysfs place where create/delete endpoints are - as mentioned earlier, we can just get the data and do everything directly from there instead of dealing with XML and bunch of new API calls. Having at least some *configurable* auto create policy might add some
^this is the thing we constantly keep discussing as everyone has a slightly different angle of view - libvirt does not implement any kind of policy, therefore the only "configuration" would be the PCI parent placement - you say what to do and we do it, no logic in it, that's it. Now, I don't understand why taking care of the guesswork for the user in the simplest manner possible is seen as policy rather than as a mere convenience, be it just for developers and testers, but apparently even that might be perceived as policy and therefore be unacceptable. I still stand by the idea of having auto-creation, as unfortunately I sort of still fail to understand what the negative implications of having it are - is it that it would get unnecessarily complex to maintain in the future and we would regret it, that we'd get a huge amount of follow-up requests for extending the feature, or is it simply the interpretation of auto-create == policy? Thanks, Erik

On 22/06/17 14:05 +0200, Erik Skultety wrote:
On Thu, Jun 22, 2017 at 10:41:13AM +0200, Martin Polednik wrote:
On 16/06/17 18:14 +0100, Daniel P. Berrange wrote:
On Fri, Jun 16, 2017 at 06:11:17PM +0100, Daniel P. Berrange wrote:
On Fri, Jun 16, 2017 at 11:02:55AM -0600, Alex Williamson wrote:
On Fri, 16 Jun 2017 11:32:04 -0400 Laine Stump <laine@redhat.com> wrote:
[...]
As for automatic creation, I think it's on the "nice to have" level. So far libvirt is close to useless when working with mdevs as all the data is in the same sysfs place where create/delete endpoints are - as mentioned earlier, we can just get the data and do everything directly from there instead of dealing with XML and bunch of new API calls. Having at least some *configurable* auto create policy might add some
^this is the thing we constantly keep discussing as everyone has a slightly different angle of view - libvirt does not implement any kind of policy, therefore the only "configuration" would be the PCI parent placement - you say what to do and we do it, no logic in it, that's it. Now, I don't understand taking care of the guesswork for the user in the simplest manner possible as policy rather as a mere convenience, be it just for developers and testers, but even that might apparently be perceived as a policy and therefore unacceptable.
I still stand by idea of having auto-creation as unfortunately, I sort of still fail to understand what the negative implications of having it are - is that it would get just unnecessarily too complex to maintain in the future that we would regret it or that we'd get a huge amount of follow-up requests for extending the feature or is it just that simply the interpretation of auto-create == policy?
Optional creation is fine. It could also be helpful in the future to carry over some device information that we would otherwise have to push into metadata. Possible device XML idea:

<devices>
  <hostdev mode='subsystem' type='mdev' model='vfio-pci'>
    <source>
      <address uuid='$uuid'>
      <autocreate/>
      <address type='pci' ...> <!-- parent? -->
      <mdev type=''> <!-- mdev_type? -->
    </source>
  </hostdev>
</devices>

This change gives us a) convenient auto-creation and b) data that will be needed for cloud use cases anyway. Such a solution doesn't enforce any policy on management software but would definitely be a strong point for libvirt usage (instead of sysfs access).
Thanks, Erik

On Thu, Jun 22, 2017 at 02:05:26PM +0200, Erik Skultety wrote:
On Thu, Jun 22, 2017 at 10:41:13AM +0200, Martin Polednik wrote:
On 16/06/17 18:14 +0100, Daniel P. Berrange wrote:
On Fri, Jun 16, 2017 at 06:11:17PM +0100, Daniel P. Berrange wrote:
[...]
copy all possible VM mdev definitions to all the hosts.
^For mdev configs, you might be better off with creating them explicitly than copying configs, simply because given the information the XML has, you might conflict with UUIDs between hosts, so you'd have to take care for that. Parents have different PCI addresses that most probably wouldn't match across hosts, so from automation point of view, I think writing a stub recreating the whole set of devices/configs might actually be easier than copying & handling them (solely because the 2 things left - after the ones I mentioned - in the XML are the vgpu type and IOMMU group number which AFAIK cannot be requested explicitly).
Yep, separating the mdev config from the domain config is a significant benefit, as it makes the domain config independent of the particular device you've attached to, which can vary across hosts.
The idea works nicely if you had such definitions accessible in the cluster and could define a group of devices (gpu+soundcard, single mdev, single vf, ...) that would later be assigned to a VM (let's hope kubevirt can get there).
As for automatic creation, I think it's on the "nice to have" level. So far libvirt is close to useless when working with mdevs as all the data is in the same sysfs place where create/delete endpoints are - as mentioned earlier, we can just get the data and do everything directly from there instead of dealing with XML and bunch of new API calls. Having at least some *configurable* auto create policy might add some
^this is the thing we constantly keep discussing as everyone has a slightly different angle of view - libvirt does not implement any kind of policy, therefore the only "configuration" would be the PCI parent placement - you say what to do and we do it, no logic in it, that's it. Now, I don't understand taking care of the guesswork for the user in the simplest manner possible as policy rather as a mere convenience, be it just for developers and testers, but even that might apparently be perceived as a policy and therefore unacceptable.
I still stand by idea of having auto-creation as unfortunately, I sort of still fail to understand what the negative implications of having it are - is that it would get just unnecessarily too complex to maintain in the future that we would regret it or that we'd get a huge amount of follow-up requests for extending the feature or is it just that simply the interpretation of auto-create == policy?
The increasing complexity of the qemu driver is a significant concern with adding policy based logic to the code. Thinking about this though, if we provide the inactive node device feature, then we can avoid essentially all new code and complexity in the QEMU driver, and still support auto-create.

i.e. in the domain XML we just continue to have the exact same XML that we already have today for mdevs, but with a single new attribute autocreate=yes|no:

<devices>
  <hostdev mode='subsystem' type='mdev' model='vfio-pci' autocreate="yes">
    <source>
      <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'>
    </source>
  </hostdev>
</devices>

In the QEMU driver, the only change required is then

  if (def->autocreate)
      virNodeDeviceCreate(dev)

and the opposite in shutdown. This also avoids pulling the node device XML schema into the domain XML schema, which is something I dislike about the previous proposals. The inactive node device concept is also more broadly useful than just this mdev scenario - it's been something we would have liked for NPIV in the past too, and it gives users a nice way to have a set of mdevs precreated on nodes independently of VM usage, so it solves multiple use cases / scenarios at once.

Regards, Daniel
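A rough sketch of how such a hook might look on the QEMU driver side; note that virNodeDeviceLookupByName() and virNodeDeviceDestroy() exist today, while virNodeDeviceCreate(), the autocreate flag and the mapping of the hostdev UUID to a node device name are all assumptions based on the proposal above:

  #include <stdbool.h>
  #include <libvirt/libvirt.h>

  /* devname would be the node device name matching the hostdev's UUID,
   * e.g. "mdev_c2177883_f1bb_47f0_914d_32a22e3a8804" (assumption). */
  static int
  qemuMdevAutocreate(virConnectPtr conn, const char *devname, bool autocreate)
  {
      virNodeDevicePtr dev;
      int ret = -1;

      if (!autocreate)
          return 0;

      /* Existing API: find the (predefined, inactive) node device. */
      if (!(dev = virNodeDeviceLookupByName(conn, devname)))
          return -1;

      /* Proposed API: instantiate the device from its persistent config. */
      if (virNodeDeviceCreate(dev) == 0)
          ret = 0;

      virNodeDeviceFree(dev);
      return ret;
  }

  /* On domain shutdown the opposite would happen, e.g. a call to the
   * existing virNodeDeviceDestroy() on the same device. */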

[...]
^this is the thing we constantly keep discussing as everyone has a slightly different angle of view - libvirt does not implement any kind of policy, therefore the only "configuration" would be the PCI parent placement - you say what to do and we do it, no logic in it, that's it. Now, I don't understand taking care of the guesswork for the user in the simplest manner possible as policy rather as a mere convenience, be it just for developers and testers, but even that might apparently be perceived as a policy and therefore unacceptable.
I still stand by idea of having auto-creation as unfortunately, I sort of still fail to understand what the negative implications of having it are - is that it would get just unnecessarily too complex to maintain in the future that we would regret it or that we'd get a huge amount of follow-up requests for extending the feature or is it just that simply the interpretation of auto-create == policy?
The increasing complexity of the qemu driver is a significant concern with adding policy based logic to the code. Thinking about this though, if we provide the inactive node device feature, then we can avoid essentially all new code and complexity in the QEMU driver, and still support auto-create.
ie, in the domain XML we just continue to have the exact same XML that we already have today for mdevs, but with a single new attribute autocreate=yes|no
<devices> <hostdev mode='subsystem' type='mdev' model='vfio-pci' autocreate="yes"> <source> <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'>
So, just for clarification of the concept, the device with ^this UUID will have had to be defined by the nodedev API by the time we start to edit the domain XML in this manner, in which case the only thing autocreate=yes would do is actually create the mdev according to the nodedev config, right? Continuing with that thought, if the UUID doesn't refer to any of the inactive configs, it will be an error, I suppose? What about the fact that only one vgpu type can live on the GPU? Even if you can successfully identify a device using the UUID in this way, you'll still face the problem that other types might currently be occupying the GPU and need to be torn down first - will this be automated as well in what you suggest? I assume not.
</source> </hostdev> </devices>
In the QEMU driver, then the only change required is
if (def->autocreate) virNodeDeviceCreate(dev)
Aha, so if a device gets torn down on shutdown, we won't face the problem of some other devices being active - all of them will have to be in the inactive state because they got torn down during the last shutdown. That would work. Erik

On Thu, 22 Jun 2017 17:14:48 +0200 Erik Skultety <eskultet@redhat.com> wrote:
[...]
I'm not familiar with how inactive devices would be defined in the nodedev API, would someone mind explaining or providing an example please? I don't understand where the metadata is stored that describes the what and where of a given UUID. Thanks, Alex

On Thu, Jun 22, 2017 at 09:28:57AM -0600, Alex Williamson wrote:
On Thu, 22 Jun 2017 17:14:48 +0200 Erik Skultety <eskultet@redhat.com> wrote:
[...]
I'm not familiar with how inactive devices would be defined in the nodedev API, would someone mind explaining or providing an example please? I don't understand where the metadata is stored that describes the what and where of a given UUID. Thanks,
It would basically copy what we do for domains. Currently there is virNodeDeviceCreateXML(), which takes the XML definition and creates a new active node device, and virNodeDeviceDestroy(), which takes as argument an object of an existing active node device.

We would extend the functionality with new APIs:

- virNodeDeviceCreate(), which would take as argument an object of an existing inactive node device.
- virNodeDeviceDefineXML(), which would define the node device as inactive.

With virNodeDeviceDefineXML() you would create a list of predefined inactive devices, which could be obtained by virConnectListAllNodeDevices() for example. Internally we would store the XML files the same way as we do for domains, somewhere in "/etc/libvirt/...", and like with domains the APIs would work with these files.

In virsh terms there would be a similar analogy to the domain commands: "virsh nodedev-start" could simply map to virNodeDeviceCreate() and would work like "virsh start" for domains, and "virsh nodedev-define" would map to virNodeDeviceDefineXML() and work the same way as "virsh define". You could simply list the predefined mdev devices using "virsh nodedev-list", get the UUID of an existing mdev device and use it in a domain.

In virt-manager there could be a new type of hostdev device where you could select one of the existing mdev devices from a drop-down list; virt-manager would show nice user-friendly descriptions of the mdev devices but under the hood it would put the UUID in the domain XML.

Pavel
Alex
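A hedged sketch of the client-side flow Pavel outlines above: virNodeDeviceDefineXML() and virNodeDeviceCreate() are only proposed in this thread, so their exact signatures below are a guess; the mdev XML follows the Part 1 proposal and the parent/type/UUID values are placeholders:

  #include <stdio.h>
  #include <libvirt/libvirt.h>

  int main(void)
  {
      /* Hypothetical persistent mdev definition (Part 1 schema proposal). */
      const char *xml =
          "<device>"
          "  <parent>pci_0000_03_00_0</parent>"
          "  <capability type='mdev'>"
          "    <type id='nvidia-11'/>"
          "    <uuid>c2177883-f1bb-47f0-914d-32a22e3a8804</uuid>"
          "  </capability>"
          "</device>";
      virConnectPtr conn = virConnectOpen("qemu:///system");
      virNodeDevicePtr dev = NULL;

      if (!conn)
          return 1;

      /* Proposed: persist the config under /etc/libvirt without creating the
       * device yet - the nodedev equivalent of "virsh define". */
      dev = virNodeDeviceDefineXML(conn, xml, 0);

      /* Proposed: start the predefined, inactive device - the nodedev
       * equivalent of "virsh start" (i.e. "virsh nodedev-start"). */
      if (dev && virNodeDeviceCreate(dev) == 0)
          printf("mdev %s is now active\n", virNodeDeviceGetName(dev));

      if (dev)
          virNodeDeviceFree(dev);
      virConnectClose(conn);
      return 0;
  }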

On 06/22/2017 11:52 AM, Pavel Hrdina wrote:
On Thu, Jun 22, 2017 at 09:28:57AM -0600, Alex Williamson wrote:
On Thu, 22 Jun 2017 17:14:48 +0200 Erik Skultety <eskultet@redhat.com> wrote:
[...]
It would basically copy what we do for domains. Currently there is virNodeDeviceCreateXML() which takes the XML definitions and creates a new active node device and virNodeDeviceDestroy() which takes as argument an object of existing active node device.
FWIW: (Just in case someone doesn't know yet...) The only current CreateXML consumer is for NPIV/vHBA devices. As I've pointed out before, I see a lot of similarities w/ mdev because they both have a dependency on "something else" in order for proper creation. NPIV/vHBA requires an HBA (scsi_hostN) that has a sysfs structure with a vport_create function to create the vHBA. The HBA scsi_hostN is instantiated during udevEnumerateDevices processing, while the vHBA scsi_hostM is created during udevEventHandleCallback. The CreateXML provides an essentially 'transient' model to describe a(the) vHBA device(s). After a host reboot, one would have to run virsh nodedev-create file.xml in order to recreate their vHBA. In order to create more permanent vHBAs, it's possible to define a storage pool that would create the vHBA when the storage pool is started. So while there's no DefineXML support, there is a model that does provide a mechanism to have persistence without needing to have a DefineXML for node devices.
We would extend the functionality with new APIs:
- virNodeDeviceCreate() which would take as argument an object of existing inactive node device.
- virNodeDeviceDefineXML() would define the node device as inactive.
With the virNodeDeviceDefineXML() you would create a list of predefined inactive devices which could be obtained by virConnectListAllNodeDevices() for example.
Given various experiences with HBA/vHBA, I wonder if we should just let udev (and its predecessor HAL) be the only thing that "defines" what a node device is (keeping vHBA for historical purposes). Of perhaps related concern/interest - there was a recent series on list related to mdev and some underlying udev/systemd/kernel issue that results in "inconsistent" failures. The proposed fix involved wait loops. I pointed out to Erik that a prior concern over any wait loop I would add for problems with vHBA initialization was that they could cause unnecessary waits during libvirtd startup processing. Additionally, if we added processing of the define'd XMLs to node device startup, would that then run into trouble and cause startup failures? Do we ignore failures? Do we continue to add wait threads to get specific data that wasn't present at some point in time but will be soon? The node device initialization is fairly early on (network, interface, storage, node device, ...). And as I've seen written by Erik before - I'll reply to the top level with another idea rather than just looking like a long complaint ;-). John

On 06/22/2017 11:28 AM, Alex Williamson wrote:
On Thu, 22 Jun 2017 17:14:48 +0200 Erik Skultety <eskultet@redhat.com> wrote:
[...]
I'm not familiar with how inactive devices would be defined in the nodedev API, would someone mind explaining or providing an example please? I don't understand where the metadata is stored that describes the what and where of a given UUID. Thanks,
You don't understand it because it doesn't exist yet :-)

The idea is essentially the same that we've talked about, except that all the information about parent PCI address, desired type of child, and anything else (is there anything else?) is stored in some not-yet-specified persistent node device config rather than directly in the domain XML. Maybe something like:

<nodedevice>
  <uuid>BobLobLaw</uuid>
  <parent>
    <address type='pci' .... />
  </parent>
  <child type='MoreBlah'/>
</nodedevice>

I haven't thought about how it would show the difference between active and inactive - didn't get enough coffee today and I have a headache.

The advantage of this is that it uncouples the specifics of the child device from the domain XML - the only thing in the domain XML is the uuid. So a device config with that uuid would need to exist on every host where you wanted to run a particular guest, but the details could be different, yet you wouldn't need to edit the domain XML. This is a similar concept to the idea of creating libvirt networks that are just an indirect pointer to a bridge device (which may have a different name on each host) or to an SRIOV PF (yeah, I know Dan doesn't like that feature, but I find it very useful, and unobtrusive if management chooses not to use it).

So from your point of view (I'm talking to Alex here), implementing it this way would mean that you would need to create the child device definitions in the nodedev driver once (and possibly/hopefully the uuid of the devices would be autogenerated, same as we do for uuids in other parts of libvirt config), then copy that uuid to the domain config one time. But after doing that once, you would be able to start and stop domains and the host without any extra action. You could also define different nodedevices that used the same parent for different child types, and reference them from different domain definitions, as long as you never tried to start more than one of them at a time (I'm thinking about Nvidia mdevs here, where you can only have one child type active on a particular parent at any time - if you did try to do this, libvirt would of course log an error and refuse to start the domain).

I like this idea. I think it gives both you and me what we want for small/dev/testing purposes, and may also be of use to larger management applications, but it won't get in anyone's way if they don't need/want/like it. The only downsides are:

1) It will take more effort to implement, since the nodedev driver doesn't yet understand the concept of persistent config. (But doing it is a *very good* thing, so it's worthwhile.)

2) It makes it pointless for me to finally hit send on the response to this thread that I started typing all the way last Saturday, but haven't sent because, as usual, I changed my mind 4 or 5 times in the interim based on various discussions and "shower thoughts" :-P

... okay, another "shower thought" is coming in... One deficiency of this comes to mind - since the domain config references the device by uuid, and an existing child device's uuid can't be changed, the unique uuid used by a particular domain must be defined on all of the hosts that the domain might be moved to. And since other domains can't share that uuid (unless you're 100% sure they'll never be active at the same time), you won't be able to implement the alternate idea of "pre-create all the devices, then assign them to domains as needed"; instead, you'll be forced to use the "create-on-demand" model.
For pre-created devices to work, you really need an extra layer of indirection - a named pool of devices, and a domain config that references the pool name rather than the uuid of a specific device. Maybe this can be a later addition (or alternatively we require management to modify the domain config each time the domain is started, and keep track themselves of which devices are currently in use - that seems a bit haphazard, especially if you consider the possibility of multiple management applications on one host).

On Thu, Jun 22, 2017 at 12:33:16PM -0400, Laine Stump wrote:
On 06/22/2017 11:28 AM, Alex Williamson wrote:
On Thu, 22 Jun 2017 17:14:48 +0200 Erik Skultety <eskultet@redhat.com> wrote:
[...]
^this is the thing we constantly keep discussing, as everyone has a slightly different angle of view - libvirt does not implement any kind of policy, therefore the only "configuration" would be the PCI parent placement - you say what to do and we do it, no logic in it, that's it. Now, I don't see taking care of the guesswork for the user in the simplest manner possible as policy, but rather as a mere convenience, be it just for developers and testers - but even that might apparently be perceived as a policy and therefore unacceptable.
I still stand by the idea of having auto-creation, as unfortunately I sort of still fail to understand what the negative implications of having it are - is it that it would get just unnecessarily too complex to maintain in the future and we would regret it, or that we'd get a huge amount of follow-up requests for extending the feature, or is it simply the interpretation of auto-create == policy?
The increasing complexity of the qemu driver is a significant concern with adding policy-based logic to the code. Thinking about this though, if we provide the inactive node device feature, then we can avoid essentially all new code and complexity in the QEMU driver, and still support auto-create.
ie, in the domain XML we just continue to have the exact same XML that we already have today for mdevs, but with a single new attribute autocreate=yes|no
<devices>
  <hostdev mode='subsystem' type='mdev' model='vfio-pci' autocreate="yes">
    <source>
      <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'>
So, just for clarification of the concept: the device with ^this UUID will have had to be defined by the nodedev API by the time we start to edit the domain XML in this manner, in which case the only thing the autocreate=yes would do is to actually create the mdev according to the nodedev config, right? Continuing with that thought, if the UUID doesn't refer to any of the inactive configs, it will be an error, I suppose? What about the fact that only one vgpu type can live on the GPU? Even if you can successfully identify a device using the UUID in this way, you'll still face the problem that other types might be currently occupying the GPU and need to be torn down first - will this be automated as well in what you suggest? I assume not.
    </source>
  </hostdev>
</devices>
In the QEMU driver, then the only change required is
if (def->autocreate) virNodeDeviceCreate(dev)
Aha, so if a device gets torn down on shutdown, we won't face the problem of some other devices being active - all of them will have to be in the inactive state because they got torn down during the last shutdown. That would work.
I'm not familiar with how inactive devices would be defined in the nodedev API - would someone mind explaining or providing an example, please? I don't understand where the metadata is stored that describes the what and where of a given UUID. Thanks,
You don't understand it because it doesn't exist yet :-)
The idea is essentially the same that we've talked about, except that all the information about parent PCI address, desired type of child, and anything else (is there anything else?) is stored in some not-yet-specified persistent node device config rather than directly in the domain XML. Maybe something like:
<nodedevice>
  <uuid>BobLobLaw</uuid>
  <parent>
    <address type='pci' .... />
  </parent>
  <child type='MoreBlah'/>
</nodedevice>
I haven't thought about how it would show the difference between active and inactive - didn't get enough coffee today and I have a headache.
The XML doesn't need to show the difference between active & inactive. That distinction is something you filter on when querying the list of devices. We'd want to add a virNodeDeviceIsActive() API like we have for other objects too, so you can query it afterwards too.
... okay, another "shower thought" is coming in... One deficiency of this comes to mind - since the domain config references the device by uuid, and an existing child device's uuid can't be changed, the unique uuid used by a particular domain must be defined on all of the hosts that the domain might be moved to. And since other domains can't share that uuid (unless you're 100% sure they'll never be active at the same time), you won't be able to implement the alternate idea of "pre-create all the devices, then assign them to domains as needed"; instead, you'll be forced to use the "create-on-demand" model.
You can still pre-create them all, as you still have the option of providing updated XML when you migrate VMs across hosts, so that it refers to a different UUID on the target host. Also, since you're not actually starting them all at once, you can have the option of precreating more vGPU definitions than you can actually concurrently support - you're only limited when you go to start them. Though you probably wouldn't want to do that beyond a certain scale - just changing the XML on migrate is simpler.

Regards, Daniel

-- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Thu, Jun 22, 2017 at 05:14:48PM +0200, Erik Skultety wrote:
[...]
^this is the thing we constantly keep discussing, as everyone has a slightly different angle of view - libvirt does not implement any kind of policy, therefore the only "configuration" would be the PCI parent placement - you say what to do and we do it, no logic in it, that's it. Now, I don't see taking care of the guesswork for the user in the simplest manner possible as policy, but rather as a mere convenience, be it just for developers and testers - but even that might apparently be perceived as a policy and therefore unacceptable.
I still stand by the idea of having auto-creation, as unfortunately I sort of still fail to understand what the negative implications of having it are - is it that it would get just unnecessarily too complex to maintain in the future and we would regret it, or that we'd get a huge amount of follow-up requests for extending the feature, or is it simply the interpretation of auto-create == policy?
The increasing complexity of the qemu driver is a significant concern with adding policy-based logic to the code. Thinking about this though, if we provide the inactive node device feature, then we can avoid essentially all new code and complexity in the QEMU driver, and still support auto-create.
ie, in the domain XML we just continue to have the exact same XML that we already have today for mdevs, but with a single new attribute autocreate=yes|no
<devices>
  <hostdev mode='subsystem' type='mdev' model='vfio-pci' autocreate="yes">
    <source>
      <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'>
So, just for clarification of the concept: the device with ^this UUID will have had to be defined by the nodedev API by the time we start to edit the domain XML in this manner, in which case the only thing the autocreate=yes would do is to actually create the mdev according to the nodedev config, right? Continuing with that thought, if the UUID doesn't refer to any of the inactive configs, it will be an error, I suppose? What about the fact that only one vgpu type can live on the GPU? Even if you can successfully identify a device using the UUID in this way, you'll still face the problem that other types might be currently occupying the GPU and need to be torn down first - will this be automated as well in what you suggest? I assume not.
Technically we shouldn't need the node device to exist at the time we define the XML - only at the time we start the guest does the node device have to exist. E.g. the same way you list a virtual network as the source of a guest NIC, but that virtual network doesn't have to actually have been defined & started until the guest starts.

If there are constraints that a pGPU can only support a certain combination of vGPUs at any single point in time, doesn't the kernel already enforce that when you try to create the vGPU in sysfs? IOW, we merely need to try to create the vGPU, and if the kernel mdev driver doesn't allow you to mix that with the other vGPUs that already exist, then we'd just report an error from virNodeDeviceCreate, and that'd get propagated back as the error for the virDomainCreate call.
    </source>
  </hostdev>
</devices>
In the QEMU driver, then the only change required is
if (def->autocreate) virNodeDeviceCreate(dev)
Aha, so if a device gets torn down on shutdown, we won't face the problem of some other devices being active - all of them will have to be in the inactive state because they got torn down during the last shutdown. That would work.
I'm not sure why the relationship with other active devices is relevant here. The virNodeDevicePtr we're accessing here is a single vGPU - if other running guests have further vGPUs on the same pGPU, that's not really relevant. Each vGPU is created/deleted as required.

Regards, Daniel

-- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
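Putting the two pieces of the proposal together: the inactive node device config (sketched earlier in the thread) holds the parent and type, and the guest opts into auto-instantiation with the proposed autocreate attribute. A purely illustrative, assembled version of the domain-side XML - autocreate does not exist yet - would be:

  <devices>
    <hostdev mode='subsystem' type='mdev' model='vfio-pci' autocreate='yes'>
      <source>
        <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/>
      </source>
    </hostdev>
  </devices>

On guest start the QEMU driver would only call virNodeDeviceCreate() for that uuid; any constraint violation (e.g. a conflicting vGPU type already active on the parent) would surface as the kernel's error from that call and fail the domain start, exactly as described above.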

On 06/22/2017 12:15 PM, Daniel P. Berrange wrote:
On Thu, Jun 22, 2017 at 05:14:48PM +0200, Erik Skultety wrote:
[...]
^this is the thing we constantly keep discussing, as everyone has a slightly different angle of view - libvirt does not implement any kind of policy, therefore the only "configuration" would be the PCI parent placement - you say what to do and we do it, no logic in it, that's it. Now, I don't see taking care of the guesswork for the user in the simplest manner possible as policy, but rather as a mere convenience, be it just for developers and testers - but even that might apparently be perceived as a policy and therefore unacceptable.
I still stand by the idea of having auto-creation, as unfortunately I sort of still fail to understand what the negative implications of having it are - is it that it would get just unnecessarily too complex to maintain in the future and we would regret it, or that we'd get a huge amount of follow-up requests for extending the feature, or is it simply the interpretation of auto-create == policy?
The increasing complexity of the qemu driver is a significant concern with adding policy-based logic to the code. Thinking about this though, if we provide the inactive node device feature, then we can avoid essentially all new code and complexity in the QEMU driver, and still support auto-create.
ie, in the domain XML we just continue to have the exact same XML that we already have today for mdevs, but with a single new attribute autocreate=yes|no
<devices>
  <hostdev mode='subsystem' type='mdev' model='vfio-pci' autocreate="yes">
    <source>
      <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'>
So, just for clarification of the concept: the device with ^this UUID will have had to be defined by the nodedev API by the time we start to edit the domain XML in this manner, in which case the only thing the autocreate=yes would do is to actually create the mdev according to the nodedev config, right? Continuing with that thought, if the UUID doesn't refer to any of the inactive configs, it will be an error, I suppose? What about the fact that only one vgpu type can live on the GPU? Even if you can successfully identify a device using the UUID in this way, you'll still face the problem that other types might be currently occupying the GPU and need to be torn down first - will this be automated as well in what you suggest? I assume not.
Technically we shouldn't need the node device to exist at the time we define the XML - only at the time we start the guest does the node device have to exist. E.g. the same way you list a virtual network as the source of a guest NIC, but that virtual network doesn't have to actually have been defined & started until the guest starts.
If there are constraints that a pGPU can only support a certain combination of vGPUs at any single point in time, doesn't the kernel already enforce that when you try to create the vGPU in sysfs? IOW, we merely need to try to create the vGPU, and if the kernel mdev driver doesn't allow you to mix that with the other vGPUs that already exist, then we'd just report an error from virNodeDeviceCreate, and that'd get propagated back as the error for the virDomainCreate call.
    </source>
  </hostdev>
</devices>
In the QEMU driver, then the only change required is
if (def->autocreate) virNodeDeviceCreate(dev)
Aha, so if a device gets torn down on shutdown, we won't face the problem of some other devices being active - all of them will have to be in the inactive state because they got torn down during the last shutdown. That would work.
I'm not sure why the relationship with other active devices is relevant here. The virNodeDevicePtr we're accessing here is a single vGPU - if other running guests have further vGPUs on the same pGPU, that's not really relevant. Each vGPU is created/deleted as required.
I think he's talking about devices that were previously used by other domains that are no longer active. Since they're also automatically destroyed, they're not a problem.

[...]
So, just for clarification of the concept: the device with ^this UUID will have had to be defined by the nodedev API by the time we start to edit the domain XML in this manner, in which case the only thing the autocreate=yes would do is to actually create the mdev according to the nodedev config, right? Continuing with that thought, if the UUID doesn't refer to any of the inactive configs, it will be an error, I suppose? What about the fact that only one vgpu type can live on the GPU? Even if you can successfully identify a device using the UUID in this way, you'll still face the problem that other types might be currently occupying the GPU and need to be torn down first - will this be automated as well in what you suggest? I assume not.
Technically we shouldn't need the node device to exist at the time we define the XML - only at the time we start the guest does the node device have to exist. E.g. the same way you list a virtual network as the source of a guest NIC, but that virtual network doesn't have to actually have been defined & started until the guest starts.
If there are constraints that a pGPU can only support a certain combination of vGPUs at any single point in time, doesn't the kernel already enforce that when you try to create the vGPU in sysfs? IOW, we merely need to try to create the vGPU, and if the kernel mdev driver doesn't allow you to mix that with the other vGPUs that already exist, then we'd just report an error from virNodeDeviceCreate, and that'd get propagated back as the error for the virDomainCreate call.
    </source>
  </hostdev>
</devices>
In the QEMU driver, then the only change required is
if (def->autocreate) virNodeDeviceCreate(dev)
Aha, so if a device gets torn down on shutdown, we won't face the problem of some other devices being active - all of them will have to be in the inactive state because they got torn down during the last shutdown. That would work.
I'm not sure why the relationship with other active devices is relevant here. The virNodeDevicePtr we're accessing here is a single vGPU - if other running guests have further vGPUs on the same pGPU, that's not really relevant. Each vGPU is created/deleted as required.
I think he's talking about devices that were previously used by other domains that are no longer active. Since they're also automatically destroyed, they're not a problem.
Yes, that was exactly my point. Anyhow, it seems like I got a grasp of Dan's proposal then, great.

Erik

On Thu, Jun 22, 2017 at 10:41:13AM +0200, Martin Polednik wrote:
On 16/06/17 18:14 +0100, Daniel P. Berrange wrote:
On Fri, Jun 16, 2017 at 06:11:17PM +0100, Daniel P. Berrange wrote:
On Fri, Jun 16, 2017 at 11:02:55AM -0600, Alex Williamson wrote:
On Fri, 16 Jun 2017 11:32:04 -0400 Laine Stump <laine@redhat.com> wrote:
On 06/15/2017 02:42 PM, Alex Williamson wrote:
On Thu, 15 Jun 2017 09:33:01 +0100 "Daniel P. Berrange" <berrange@redhat.com> wrote:
[...]
IOW, I'm still against adding any kind of automatic creation policy for mdevs in libvirt. Just provide the node device API support.
I'm not super clear on the extent of what you're against here - is it all forms of device creation or only a placement policy? Are you against any form of having the XML specify the non-instantiated mdev that it wants? We've clearly made an important step with libvirt supporting pre-created mdevs, but as a user of that support I find it incredibly tedious. I typically do a dumpxml, copy out the UUID, wonder what type of device it might have been last time, create it, start the domain and cross my fingers. Pre-creating mdev devices is not really practical; I might have use cases where I want multiple low-end mdev devices and another where I have a single high-end device. Those cannot exist at the same time. Requiring extensive higher-level management tools is not really an option either; I'm not going to install oVirt on my desktop/laptop just so I can launch a GVT-g VM once in a while (no offense). So I really hope that libvirt itself can provide some degree of mdev creation.
Maybe there can be something in between the "all child devices must be pre-created" and "a child device will be automatically created on an automatically chosen parent device as needed". In particular, we could forego the "automatically chosen parent device" part of that. The guest configuration could simply contain the PCI address of the parent and the desired type of the child. If we did this there wouldn't be any policy decision to make - all the variables are determined - but it would make life easier for people running small hosts (i.e. no oVirt/Openstack, a single mdev parent device). Openstack and oVirt (and whoever) would of course be free to ignore this and pre-create pools of devices themselves in the name of more precise control and better predictability (just as, for example, OpenStack ignores libvirt's "pools of hostdev network devices" and instead manages the pool of devices itself and uses <interface type='hostdev'> directly).
This seems not that substantially different from managed='yes' on a vfio hostdev to me. It makes the device available to the VM before it starts and returns it after. In one case that's switching the binding on an existing device, in another it's creating and removing. Once again, I can't tell from Dan's response if he's opposed to this entire idea or just the aspects where libvirt needs to impose a policy decision. For me personally, the functionality difference is quite substantial.
I'm fine with libvirt having APIs in the node device APIs to enable create/delete with libvirt, as well as using managed=yes in the same manner that we do for regular PCI devices (the bind/unbind to vfio or pci-back)
Oh, and we really need to fix the big missing feature in the node device APIs of persistent, inactive configs. E.g. we should be able to record XML configs of mdevs (and npiv devices too) in /etc/libvirt so they persist across reboots, and can be set up for auto-start on boot too.
That doesn't help mdev in any way though. It doesn't make sense to generate a new UUID for a given VM at each start. So in the case of a single host, the persistent file is redundant with the domain XML (as long as uuid+parent is in the xml), and in the case of a cluster we'd have to copy all possible VM mdev definitions to all the hosts.
Copying all mdev definitions to all hosts would be madness. You would just set up the mdevs that are actually needed by the guest you're about to run, or the device you're wanting to hotplug. This is the same as dealing with other libvirt objects, such as secrets - you wouldn't set up all secrets on all hosts, just the ones needed.
As for automatic creation, I think it's on the "nice to have" level. So far libvirt is close to useless when working with mdevs, as all the data is in the same sysfs place where the create/delete endpoints are - as mentioned earlier, we can just get the data and do everything directly from there instead of dealing with XML and a bunch of new API calls.
Saying that if libvirt doesn't implement the auto-create usage policy for mdev then it is useless is really nonsense. That is ignoring the core aim & benefit of libvirt, which is to provide a standardized, stable API for virtualization host management to applications that insulates them from implementation-specific details.

Regards, Daniel

-- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Fri, 16 Jun 2017 18:11:17 +0100 "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Fri, Jun 16, 2017 at 11:02:55AM -0600, Alex Williamson wrote:
On Fri, 16 Jun 2017 11:32:04 -0400 Laine Stump <laine@redhat.com> wrote:
On 06/15/2017 02:42 PM, Alex Williamson wrote:
On Thu, 15 Jun 2017 09:33:01 +0100 "Daniel P. Berrange" <berrange@redhat.com> wrote:
[...]
1) Convenience For apps like virt-manager, user will want to add a host device transparently, "hey libvirt, I want an mdev assigned to my VM, can you do that". Even for higher management apps, like oVirt, even they might not care about the parent device at all times and considering that they would need to enumerate the parents, pick one, create the device XML and pass it to the nodedev driver, IMHO it would actually be easier and faster to just do it directly through sysfs, bypassing libvirt once again....
The convenience only works if the policy we've provided in libvirt actually matches the policy the application wants. I think it is quite likely that with cloud the mdevs will be created out of band from the domain startup process. It is possible the app will just have a fixed set of mdevs pre-created when the host starts up. Or that the mgmt app wants the domain startup process to be a two phase setup, where it first allocates the resources needed, and later then tries to start the guest. This is why I keep saying that putting this kind of "convenient" policy in libvirt is a bad idea - it is essentially just putting a bit of virt-manager code into libvirt - more advanced apps will need more flexibility in this area.
2) Future domain migration Suppose now that the mdev backing physical devices support state dump and reload. Chances are, that the corresponding mdev doesn't even exist or has a different UUID on the destination, so libvirt would do its best to handle this before the domain could be resumed.
This is not an unusual scenario - there are already many other parts of the device backend config that need to change prior to migration, especially for anything related to host devices, so apps already have support for doing this, which is more flexible & convenient because it doesn't tie creation of the mdevs to running of the migrate command.
IOW, I'm still against adding any kind of automatic creation policy for mdevs in libvirt. Just provide the node device API support.
I'm not super clear on the extent of what you're against here - is it all forms of device creation or only a placement policy? Are you against any form of having the XML specify the non-instantiated mdev that it wants? We've clearly made an important step with libvirt supporting pre-created mdevs, but as a user of that support I find it incredibly tedious. I typically do a dumpxml, copy out the UUID, wonder what type of device it might have been last time, create it, start the domain and cross my fingers. Pre-creating mdev devices is not really practical; I might have use cases where I want multiple low-end mdev devices and another where I have a single high-end device. Those cannot exist at the same time. Requiring extensive higher-level management tools is not really an option either; I'm not going to install oVirt on my desktop/laptop just so I can launch a GVT-g VM once in a while (no offense). So I really hope that libvirt itself can provide some degree of mdev creation.
Maybe there can be something in between the "all child devices must be pre-created" and "a child device will be automatically created on an automatically chosen parent device as needed". In particular, we could forego the "automatically chosen parent device" part of that. The guest configuration could simply contain the PCI address of the parent and the desired type of the child. If we did this there wouldn't be any policy decision to make - all the variables are determined - but it would make life easier for people running small hosts (i.e. no oVirt/Openstack, a single mdev parent device). Openstack and oVirt (and whoever) would of course be free to ignore this and pre-create pools of devices themselves in the name of more precise control and better predictability (just as, for example, OpenStack ignores libvirt's "pools of hostdev network devices" and instead manages the pool of devices itself and uses <interface type='hostdev'> directly).
This seems not that substantially different from managed='yes' on a vfio hostdev to me. It makes the device available to the VM before it starts and returns it after. In one case that's switching the binding on an existing device, in another it's creating and removing. Once again, I can't tell from Dan's response if he's opposed to this entire idea or just the aspects where libvirt needs to impose a policy decision. For me personally, the functionality difference is quite substantial.
I'm fine with libvirt having APIs in the node device APIs to enable create/delete with libvirt, as well as using managed=yes in the same manner that we do for regular PCI devices (the bind/unbind to vfio or pci-back)
I'm only against the creation/deletion of mdevs, as a side effect of starting/stopping the guest.
But this is exactly the useful case, and as Laine describes above, it can be done without any policy decisions on the part of libvirt. The XML defines a parent device and mdev type; libvirt tries to create it, just as it might a tap device into a bridge - either it works and the VM is started, or it doesn't and we get an error. libvirt doesn't require tap devices to exist prior to the VM starting. Thanks,

Alex

On 06/14/2017 06:06 PM, Erik Skultety wrote:
Hi all,
so there's been an off-list discussion about finally implementing creation of mediated devices with libvirt and it's more than desired to get as many opinions on that as possible, so please do share your ideas. This did come up already as part of some older threads ([1] for example), so this will be a respin of the discussions. Long story short, we decided to put device creation off and focus on the introduction of the framework as such first and build upon that later, i.e. now.
[1] https://www.redhat.com/archives/libvir-list/2017-February/msg00177.html
======================================== PART 1: NODEDEV-DRIVER ========================================
API-wise, device creation through the nodedev driver should be pretty straightforward and without any issues, since virNodeDevCreateXML takes an XML and does support flags. Looking at the current device XML:
<device> <name>mdev_0cce8709_0640_46ef_bd14_962c7f73cc6f</name> <path>/sys/devices/pci0000:00/.../0cce8709-0640-46ef-bd14-962c7f73cc6f</path> <parent>pci_0000_03_00_0</parent> <driver> <name>vfio_mdev</name> </driver> <capability type='mdev'> <type id='nvidia-11'/> <iommuGroup number='13'/> <uuid>UUID<uuid> <!-- optional enhancement, see below --> </capability> </device>
We can ignore <path>,<driver>,<iommugroup> elements, since these are useless during creation. We also cannot use <name> since we don't support arbitrary names and we also can't rely on users providing a name in correct form which we would need to further parse in order to get the UUID. So since the only thing missing to successfully use create an mdev using XML is the UUID (if user doesn't want it to be generated automatically), how about having a <uuid> subelement under <capability> just like PCIs have <domain> and friends, USBs have <bus> & <device>, interfaces have <address> to uniquely identify the device even if the name itself is unique. Removal of a device should work as well, although we might want to consider creating a *Flags version of the API.
Has any thought been put towards creating an mdev pool modeled after the Storage Pool? Similar to how vHBAs are created from a Storage Pool XML definition. That way XML could be defined to keep track of a lot of different things that you may need, and would require only starting the pool in order to access them.

Placed "appropriately", the mdevs could already be available by the time node device state initialization occurs too, since the pool would conceivably have been created/defined using data from the physical device and the calls to create the virtual devices would have occurred. Much easier to add logic to a new driver/pool mgmt to handle whatever considerations there are than adding logic into the existing node device driver.

Of course if there's only ever going to be a 1-to-1 relationship between whatever the mdev parent is and an mdev child, then it's probably overkill to go with a pool model; however, I was under the impression that an mdev parent could have many mdev children with various different configuration options depending on multiple factors. Thus:

  <gpu_pool type='mdev'>
    <name>Happy</name>
    <uuid>UUID</uuid>
    <source>
      <parent uuid='0cce8709-0640-46ef-bd14-962c7f73cc6f'/>
      ...
    </source>
    ...
  </gpu_pool>

where the parent is then "found" in node device via "mdev_%s" from the <parent uuid='...'/> value. One could then create (ahem) <vgpu> XML that would define specific "formats" that could be used and made active/inactive. A bit different than <volume> XML, which is output only, based on what's found in the storage pool source.

My recollection of the whole framework is not up to par with the latest information, but I recall there being multiple different ways to have "something" defined that could then be used by the guest based on one parent mdev. What those things are is a combination of what the mdev could support, and there could be 1 or many depending on the resultant vGPU. Maybe we need a virtual white board to help describe the things ;-)

If you wait long enough, or perhaps if the review pace picks up, maybe creating a new driver and vir*obj infrastructure will be easier with a common virObject instance. Oh, and this has a "uuid" and "name" for searches, so it fits nicely.
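To illustrate the storage-pool analogy a little further, the "volume"-like object alluded to above might look something along these lines - a purely hypothetical sketch, since neither <gpu_pool> nor <vgpu> exist anywhere in libvirt, and the element and attribute names are made up for illustration only:

  <vgpu>
    <name>vgpu0</name>
    <uuid>c2177883-f1bb-47f0-914d-32a22e3a8804</uuid>
    <type id='nvidia-11'/>        <!-- one of the parent's mdev_supported_types -->
    <state>inactive</state>       <!-- becomes 'active' once the mdev is created -->
  </vgpu>

In the analogy, the gpu_pool plays the role of the storage pool (backed by the parent PCI device) and each <vgpu> plays the role of a volume that can be instantiated or torn down on demand.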
============================================================= PART 2: DOMAIN XML & DEVICE AUTO-CREATION, NO POLICY INVOLVED! =============================================================
There were some doubts about auto-creation mentioned in [1], although they weren't specified further. So hopefully, we'll get further in the discussion this time.
From my perspective there are two main reasons/benefits to that:
1) Convenience For apps like virt-manager, user will want to add a host device transparently, "hey libvirt, I want an mdev assigned to my VM, can you do that". Even for higher management apps, like oVirt, even they might not care about the parent device at all times and considering that they would need to enumerate the parents, pick one, create the device XML and pass it to the nodedev driver, IMHO it would actually be easier and faster to just do it directly through sysfs, bypassing libvirt once again....
Using "pool" methodology borrows on existing storage technology except applying it to "gpu_pool" - a pool of vGPU's would be like a storage pool of volumes. Picking out a volume from a list would seem to be a mostly simple exercise. Especially if the XML/data for the vGPU can be queried to return something specific. My domain needs a XXX type vGPU - please find or create for me.
2) Future domain migration Suppose now that the mdev backing physical devices support state dump and reload. Chances are, that the corresponding mdev doesn't even exist or has a different UUID on the destination, so libvirt would do its best to handle this before the domain could be resumed. Following what we already have:
<devices> <hostdev mode='subsystem' type='mdev' model='vfio-pci'> <source> <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'> </source> </hostdev> </devices>
I guess it's not clear which UUID that is for/from? Is this the one you were considering in the <capability>? Or, in my terminology, the child of the parent from above with UUID=0cce8709-0640-46ef-bd14-962c7f73cc6f.
Instead of trying to somehow extend the <address> element using more attributes like 'domain', 'slot', 'function', etc. that would render the whole element ambiguous, I was thinking about creating a <parent> element nested under <source> that would be basically just a nested definition of another host device re-using all the element we already know, i.e. <address> for PCI, and of course others if there happens to be a need for devices other than PCI. So speaking about XML, we'd end up with something like:
<devices> <hostdev mode='subsystem' type='mdev' model='vfio-pci'> <source> <parent> <!-- possibly another <source> element - do we really want that? --> <address domain='0x0000' bus='0x00' slot='0x00' function='0x00'> <type id='foo'/> <!-- end of potential <source> element --> </parent> <!-- this one takes precedence if it exists, ignoring the parent --> <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'> </source> </hostdev> </devices>
Migration makes things a bit more tricky, but from bz 1404964, which describes some thoughts Paolo had about vHBA migration - how about a way to somehow define multiple UUIDs - primary/secondary... or just a "list" of <parent uuid='xxx'/>'s from which "the first" found on any given host is used. By first found I assume there's a "physical" card with a UUID on the host which has a node device with name "mdev_%s" (UUID w/ _ instead of -). Using a gpu_pool-type XML you could ship that around rather than trying to somehow ship across nodedev XML to define something on the migration target.

John

Maybe I'm lost in the weeds somewhere too ;-)
So, this was the first idea off the top of my head, so I'd appreciate any suggestions, comments, especially from people who have got the 'legacy' insight into libvirt and can predict potential pitfalls based on experience :).
Thanks, Erik

On Thu, Jun 22, 2017 at 05:57:34PM -0400, John Ferlan wrote:
[...]
Has any thought been put towards creating an mdev pool modeled after the Storage Pool? Similar to how vHBAs are created from a Storage Pool XML definition.
That way XML could be defined to keep track of a lot of different things that you may need, and would require only starting the pool in order to access them.
Placed "appropriately", the mdevs could already be available by the time node device state initialization occurs too, since the pool would conceivably have been created/defined using data from the physical device and the calls to create the virtual devices would have occurred. Much easier to add logic to a new driver/pool mgmt to handle whatever considerations there are than adding logic into the existing node device driver.
All those things you describe are possible with the node device API, once we add the inactive object concept that other APIs have. It is also more flexible to use the node device concept, because it seamlessly integrates with the physical PCI device management. We've already seen with SRIOV NICs that mgmt apps needed the flexibility to choose between assigning the physical NIC vs assigning individual functions. I expect the same to be true of mdevs, where you choose between assigning the GPU PCI device vs one of the mdev vGPUs.

In OpenStack what I'm expecting is that the existing PCI device / SRIOV device mgmt code (that is based on the node device APIs) is genericised to cover arbitrary types of node device, not simply those with the pci capability. Thus we'd expect mdev mgmt to be part of the node device API framework, not split off in a separate set of pool APIs.

Regards, Daniel

-- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
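For reference, the flexibility being described is visible in the two existing domain XML forms (the PCI address below is just an example): assigning the whole physical GPU as a PCI hostdev, or assigning a single mdev vGPU carved out of it - both expressed through the same hostdev machinery:

  <!-- whole physical GPU -->
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </source>
  </hostdev>

  <!-- one mdev vGPU on that GPU -->
  <hostdev mode='subsystem' type='mdev' model='vfio-pci'>
    <source>
      <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/>
    </source>
  </hostdev>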