
On 10/7/19 7:41 PM, Alex Williamson wrote:
On Mon, 7 Oct 2019 18:11:32 -0300 Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
(--- long post warning ---)
This work derives from the discussions I had with Laine Stump and Alex Williamson in [1]. I'll provide a quick gist below.
----------
Today, Libvirt does not have proper support for partial assignment of functions of passed-through PCI multifunction devices (hostdev with VFIO-PCI). By partial assignment I mean the guest being able to use just some, not all, virtual functions of the device. Even if the functions themselves become useless in the host, some functions might not be safe to be used by the guest, thus the user should be able to limit it.

Not safe in what way? Patch 2/4 says some devices might be "security sensitive", but the fact that this patch is necessary implies that the host kernel already considers the devices non-isolated. They must be in the same IOMMU group to have this issue. Is there a concrete example of a device where a user would want this configuration? The case I can think of is not a security issue, but a functional one where GPU and audio functions are grouped together and maybe the audio function doesn't work well when assigned, or maybe we just want the guest to default to another audio device and it's easier if we just don't expose this on-card audio.
The audio card example is the one I was thinking of (and I believe it was brought up in the [1] thread as well) when writing about the need for this work. But in the end, I believe my use of the 'security issue' wording is wrong - I am implying that there might be a case in which isolated devices, in the same IOMMU group, can present security risks for each other or something like that when assigned to the guest. I can't make such a strong claim. These patches are more about enhancing a functional use, like you said. Thanks for pointing this out. I'll rearrange the discourse in the next spins.
I mentioned 'proper' because today it is possible to get this done in Libvirt if we use managed='no' in the hostdevs. If the user makes the proper setup (i.e. detaching all devices in the IOMMU group) and uses managed='no', Libvirt will launch the guest just with the functions declared in the XML. The technical reason for this is simple: in virHostdevPreparePCIDevices() we do not take into account that multifunction PCI devices require the whole IOMMU group to be detached, not just the devices being declared in def->hostdevs. As a result, managed='yes' will not work in this scenario, causing errors at QEMU launch.
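To make the IOMMU group constraint concrete, here is a minimal Python sketch (not Libvirt code) that lists the functions sharing an IOMMU group with the declared hostdevs but missing from the XML. It assumes the usual Linux sysfs layout, where /sys/bus/pci/devices/<addr>/iommu_group/devices holds one entry per group member. The function names and the sysfs_root parameter are made up for illustration.

```python
import os


def iommu_group_members(pci_addr, sysfs_root="/sys"):
    """Return the sorted PCI addresses that share an IOMMU group with pci_addr."""
    group_dir = os.path.join(
        sysfs_root, "bus", "pci", "devices", pci_addr, "iommu_group", "devices"
    )
    return sorted(os.listdir(group_dir))


def undeclared_group_members(declared, sysfs_root="/sys"):
    """Return group members that are not in the declared hostdev list.

    These are the functions that would also have to be detached from the
    host before the declared ones can be handed to the guest.
    """
    declared = set(declared)
    members = set()
    for addr in declared:
        members.update(iommu_group_members(addr, sysfs_root))
    return sorted(members - declared)
```

With a GPU at 0000:01:00.0 and its audio function at 0000:01:00.1 in the same group, declaring only the GPU would report the audio function as still needing to be detached.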
The discussion I started in [1] was motivated by my attempt at automatically detaching the whole IOMMU group inside the prepare function with managed='yes' devices. Laine discarded this idea, arguing that the concept of partial assignment will cause user confusion if Libvirt starts to handle things without the user being fully aware. One possible approach discussed in [1] was to declare the functions that won't be assigned to the guest in the XML, forcing the user to be aware that these functions will be lost in the host.
-----------
This series tries to solve the partial assignment of multifunction hostdev PCI devices by introducing a new hostdev attribute called 'assigned'. This is how it works:
- it is a boolean value that will be effective just for multifunction hostdev PCI devices, since there's no other occurrence for this kind of use in Libvirt. Trying to declare assigned='yes|no' in any other PCI hostdev device will cause parse errors;
- default value if the attribute is not present is 'assigned=yes';
- <address> element will be forbidden if the hostdev is declared with assigned='no'. This is to make it more evident to the user that this is a function that the guest will NOT be using, with a bonus that we will not need to calculate an address that won't be used;

It seems more intuitive to me to use the guest <address> element to expose this. libvirt often makes use of 'none' to declare empty devices, so maybe <address type='none'/> would be more in line with precedent. Thanks,
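For reference, the three rules listed for the proposed 'assigned' attribute can be sketched as a small standalone check. This is only an illustration in Python, not the Libvirt parser: the function name is hypothetical, the is_multifunction flag is assumed to be supplied by the caller (the XML alone does not say whether the host device is multifunction), and the attribute itself exists only in this proposal.

```python
import xml.etree.ElementTree as ET


def hostdev_is_assigned(xml_text, is_multifunction):
    """Apply the proposed 'assigned' rules to one <hostdev> element.

    Returns True if the function would be given to the guest, False if it
    is only declared so that it gets detached from the host.
    Raises ValueError where the proposal calls for a parse error.
    """
    hostdev = ET.fromstring(xml_text)
    assigned = hostdev.get("assigned")

    # Rule 2: a missing attribute means assigned='yes'.
    if assigned is None:
        return True
    if assigned not in ("yes", "no"):
        raise ValueError("assigned must be 'yes' or 'no'")

    # Rule 1: the attribute is only valid on multifunction PCI hostdevs.
    if not is_multifunction:
        raise ValueError("'assigned' is only valid for multifunction devices")

    # Rule 3: no guest <address> (a direct child of <hostdev>) when
    # assigned='no'. The host address lives under <source> and is not
    # matched by find("address").
    if assigned == "no" and hostdev.find("address") is not None:
        raise ValueError("<address> is forbidden with assigned='no'")

    return assigned == "yes"
```

A hostdev carrying assigned='no' and only a <source> address passes the check and returns False, while the same element with a guest <address> child is rejected.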
If <address type='none'> is not being used by anything else (it doesn't appear to be, at least in a quick look at libvirt.org docs), this is a good idea indeed. It also spares us from adding more documentation for a new attribute. I'll wait to see if more people want to comment on this work and, unless someone presents a good reason not to, I'll see if I can make this <address type='none'> happen. Thanks, DHB
Alex