On 03/15/2016 01:00 PM, Daniel P. Berrange wrote:
On Mon, Mar 14, 2016 at 03:41:48PM -0400, Laine Stump wrote:
> Suggested by Alex Williamson.
>
> If you plan to assign a GPU to a virtual machine, but that GPU happens
> to be the host system console, you likely want it to start out using
> the host driver (so that boot messages/etc will be displayed), then
> later have the host driver replaced with vfio-pci for assignment to
> the virtual machine.
>
> However, in at least some cases (e.g. Intel i915) once the device has
> been detached from the host driver and attached to vfio-pci, attempts
> to reattach to the host driver only lead to "grief" (ask Alex for
> details). This means that simply using "managed='yes'" in libvirt
> won't work.
>
> And if you set "managed='no'" in libvirt then either you have to
> manually run virsh nodedev-detach prior to the first start of the
> guest, or you have to have a management application intelligent enough
> to know that it should detach from the host driver, but never reattach
> to it.
>
> This patch makes it simple/automatic to deal with such a case - it
> adds a third "managed" mode for assigned PCI devices, called
> "detach". It will detach ("unbind" in driver parlance) the device from
> the host driver prior to assigning it to the guest, but when the guest
> is finished with the device, will leave it bound to vfio-pci. This
> allows re-using the device for another guest, without requiring
> initial out-of-band intervention to unbind the host driver.
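
To make the three modes concrete, here is a minimal sketch of the
binding lifecycle described above. It is only a model, not libvirt
code: run_guest(), HOST and VFIO are hypothetical names, and treating
managed='yes' as "restore whatever driver was bound before" is a
simplifying assumption.

```python
# Toy model of the driver a PCI device is bound to across one guest
# start/stop cycle. Hypothetical names; nothing here touches sysfs.
HOST = "host-driver"   # placeholder for e.g. i915
VFIO = "vfio-pci"


def run_guest(mode: str, driver: str) -> str:
    """Return the driver bound to the device after the guest stops."""
    if mode not in ("yes", "no", "detach"):
        raise ValueError(f"unknown managed mode: {mode!r}")

    original = driver
    if driver != VFIO:
        if mode == "no":
            # managed='no': someone must have detached the device
            # beforehand (e.g. with virsh nodedev-detach).
            raise RuntimeError("device is still bound to the host driver")
        # managed='yes' and managed='detach' both unbind the host driver.
        driver = VFIO

    # ... the guest runs with the device bound to vfio-pci ...

    if mode == "yes":
        # Assumption: re-attach restores the driver seen before start.
        driver = original
    # managed='no' and managed='detach' leave the device on vfio-pci.
    return driver


if __name__ == "__main__":
    for mode in ("yes", "detach"):
        print(mode, "->", run_guest(mode, HOST))
```

In this model, managed='detach' differs from managed='yes' only in the
last step: the device stays on vfio-pci, so a second guest can reuse it
without any further driver change.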

You say that managed=yes causes pain upon re-attachment and that
apps should use managed=detach to avoid it, but how do management
apps know which devices are going to cause pain? Libvirt isn't
providing any info on whether a particular device id needs to
use managed=yes vs managed=detach, and we don't want to be asking
the user to choose between modes in openstack/ovirt IMHO. I think
that's a fundamental problem with inventing a new value for managed
here.

My suspicion is that in many/most cases users don't actually need the
device to be re-bound to the host driver after the guest is finished
with it, because they're only going to use the device to assign to a
different guest anyway. But because managed='yes' is what's supplied and
is the easiest way to get it set up for assignment to a guest, that's
what they use.

As a matter of fact, all this extra churn of changing the driver back
and forth for devices that are only actually used when they're bound to
vfio-pci just wastes time, and makes it more likely that libvirt and its
users will reveal and get caught up in the effects of some strange
kernel driver loading/unloading bug (there was recently a bug reported
like this; unfortunately the BZ record had customer info in it, so it's
not publicly accessible :-( ).

So beyond the cases where this behavior is absolutely necessary, I think
it is also useful in other cases, at the user's discretion (and as I
implied above, I think that if they understood the function and the
tradeoffs, most people would choose to use managed='detach' rather than
managed='yes').

(Alternatively, we could come back to the discussion of having persistent
nodedevice config, with one of the configurables being which devices
should be bound to vfio-pci when libvirtd is started. Did we maybe even
talk about exactly that in the past? I can't remember... That would of
course preclude the use case where someone 1) normally wanted to use the
device for the host, but 2) occasionally wanted to use it for a guest,
after which 3) they were well aware that they would need to reboot the
host before they could use the device on the host again. I know, I know
- "odd edge cases", and in particular "odd edge cases only encountered
by people who know other ways of working around the problem" :-))

Can you provide more details about the problems with detaching?
Is this inherent to all VGA cards, or is it specific to the Intel
i915, or specific to a kernel version or something else?

I feel like this is something where libvirt should "do the right
thing", since that's really what managed=yes is all about.
E.g., if we have managed=yes and we see an i915, we should
automatically skip re-attach for that device.

Alex can give a much better description of that than I can (I had told
git to Cc him on the original patch, but it seems it didn't do that; I'm
trying again). But what if there is such a behavior now for a certain
set of VGA cards, and it gets fixed in the future? Would we continue to
force libvirt to skip re-attach for the device? I understand the allure
of always doing the right thing without requiring config (and the dislike
of adding new seemingly esoteric options), but I don't know that libvirt
has (or can get) the necessary info to make the correct decision in all
cases.