On Mon, Jun 24, 2013 at 05:54:49AM -0400, Laine Stump wrote:
> When I first put in support for VFIO device assignment, I didn't
> realize that groups of devices were quite as common as they actually
> are. In particular, I didn't know that often multiple
> seemingly-unrelated devices can end up in the same VFIO iommu group
> due to unlucky circumstances of hardware - they may share a dma
> controller which means that the devices can't truly be isolated from
> each other, and thus should not be simultaneously assigned to
> different guests (or even used by the host) - all of the devices in a
> group should be either assigned to the same guest or, if not assigned
> to the guest, should be isolated off in a driver to prevent them
> from being used by the host.
>
> The following set of patches makes setting that up easier to deal
> with. The end result of all the patches is the following:
>
> 1) The virNodeDevice API will be able to detach or re-attach all the
>    devices in a particular group with a single API call.
>
> 2) <hostdev managed='yes'>, <interface type='hostdev' managed='yes'>,
>    and <interface type='network' managed='yes'> devices (where the
>    network is itself a pool of SRIOV Virtual Functions) can specify:
>
>      <driver name='vfio' group='auto'/>
>
>    and libvirt will automatically detach (and bind to the 'vfio-pci'
>    driver for assignment/isolation) all devices in the same group as
>    the device being assigned. Likewise, when the device is detached
>    from the guest, a check will be made and, if none of the devices in
>    the same group as the device being detached is still in use by a guest

I am concerned that group='auto' is an incredibly dangerous setting
from the POV of operation of the host OS.

I can just imagine forum postings / docs saying to use group='auto'
and people blindly following it without much inkling as to what will
happen. They will be trying to assign a spare NIC to their guest;
they'll get an error saying it can't be done since it is part of a
group; they'll search Google and find a recommendation to use
group='auto' to "fix" the problem. libvirt will see that their SATA
controller & graphics card are part of the same group as the NIC and
automatically detach them both from the host OS. Kaboom, the user is
screwed.

With traditional configs, even with managed='yes', you could be sure
that only the single device in the XML would ever be touched. If
there was a conflict due to other devices being on the same PCI
bridge without FLR, then the device would safely fail to be assigned
until the user had explicitly disconnected the other devices from the
host. We never attempted to automatically disconnect anything that
was not part of the XML.

Following on from that, how does an application determine what
other devices are present in the group associated with the device
being assigned? Are we exposing group membership info in the node
device XML anywhere?

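To make the question concrete, here is a rough sketch of how a tool could read group membership directly from the host, independent of whatever libvirt exposes. It assumes the Linux sysfs layout in which each PCI device directory has an iommu_group link whose devices/ subdirectory lists every member of the group. The function name and the sysfs_root parameter are hypothetical, chosen only for illustration.

```python
import os


def iommu_group_members(pci_addr, sysfs_root="/sys"):
    """Return the sorted PCI addresses sharing an IOMMU group with pci_addr.

    Assumes the Linux sysfs layout
    <sysfs_root>/bus/pci/devices/<addr>/iommu_group/devices/.
    Returns an empty list if the device has no IOMMU group
    (for example, when the IOMMU is disabled).
    """
    group_dir = os.path.join(
        sysfs_root, "bus", "pci", "devices", pci_addr, "iommu_group", "devices"
    )
    if not os.path.isdir(group_dir):
        return []
    # Each entry is named after the PCI address of one group member,
    # including pci_addr itself.
    return sorted(os.listdir(group_dir))
```

Calling iommu_group_members("0000:03:00.0") on a host would show, before anything is detached, which other devices would be dragged along with that NIC.
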
I'm not sure what else to suggest, other than to say we should not
add this attribute, and should instead require that the
application/user explicitly disconnect any other devices in the same
group from the host OS. Any other option I can think of just sounds
too dangerous.

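The "fail safely rather than detach" behaviour could look roughly like the sketch below: before assigning a device, report every other member of its group that is still bound to a host driver, and refuse to proceed if there are any. This is only an illustration, not libvirt code. It assumes the Linux sysfs layout (iommu_group/devices/ and a driver symlink per device); the function name, the sysfs_root parameter and the set of drivers treated as "already isolated" are all assumptions.

```python
import os

# Assumption: drivers that mean a device is already isolated from the host.
SAFE_DRIVERS = {"vfio-pci", "pci-stub"}


def host_bound_group_members(pci_addr, sysfs_root="/sys"):
    """Map each other group member still bound to a host driver to that driver.

    An empty result means no other device in the group is in use by
    the host, so assigning pci_addr would not disturb anything else.
    """
    devices_dir = os.path.join(sysfs_root, "bus", "pci", "devices")
    group_dir = os.path.join(devices_dir, pci_addr, "iommu_group", "devices")
    blockers = {}
    for member in sorted(os.listdir(group_dir)):
        if member == pci_addr:
            continue  # the device named in the XML is handled separately
        driver_link = os.path.join(devices_dir, member, "driver")
        if not os.path.lexists(driver_link):
            continue  # no driver bound, nothing on the host is using it
        driver = os.path.basename(os.path.realpath(driver_link))
        if driver not in SAFE_DRIVERS:
            blockers[member] = driver
    return blockers
```

A caller would treat a non-empty result as an error and print the offending devices and drivers, leaving it to the user to detach them explicitly.
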
Regards,
Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|