On 03/14/2016 03:41 PM, Laine Stump wrote:
Suggested by Alex Williamson.
If you plan to assign a GPU to a virtual machine, but that GPU happens
to be the host system console, you likely want it to start out using
the host driver (so that boot messages/etc will be displayed), then
later have the host driver replaced with vfio-pci for assignment to
the virtual machine.
However, in at least some cases (e.g. Intel i915) once the device has
been detached from the host driver and attached to vfio-pci, attempts
to reattach to the host driver only lead to "grief" (ask Alex for
details). This means that simply using "managed='yes'" in libvirt
won't work.
And if you set "managed='no'" in libvirt then either you have to
manually run virsh nodedev-detach prior to the first start of the
guest, or you have to have a management application intelligent enough
to know that it should detach from the host driver, but never reattach
to it.
This patch makes it simple/automatic to deal with such a case - it
adds a third "managed" mode for assigned PCI devices, called
"detach". It will detach ("unbind" in driver parlance) the device from
the host driver prior to assigning it to the guest, but when the guest
is finished with the device, will leave it bound to vfio-pci. This
allows re-using the device for another guest, without requiring
initial out-of-band intervention to unbind the host driver.
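
For illustration only, the three modes can be compared by which driver
owns the device while the guest runs and after it stops. This is a
minimal sketch, not libvirt code: the function guest_lifecycle() and the
HOST placeholder are hypothetical names, and the managed='yes' branch is
simplified to "always reattach".

```python
HOST = "host-driver"  # placeholder for whatever host driver owns the GPU (e.g. i915)
VFIO = "vfio-pci"


def guest_lifecycle(managed, bound_to):
    """Return (driver while the guest runs, driver after the guest stops).

    Hypothetical model of the behaviour described above, not libvirt's
    actual implementation.
    """
    if managed == "no":
        # libvirt does nothing: the device must already have been detached,
        # e.g. with "virsh nodedev-detach", before the guest starts.
        if bound_to != VFIO:
            raise RuntimeError("device must be detached out-of-band first")
        return VFIO, VFIO
    if managed == "yes":
        # Detach before use, reattach to the host driver afterwards
        # (simplified: this model always reattaches).
        return VFIO, HOST
    if managed == "detach":
        # Proposed mode: detach before use, then leave bound to vfio-pci.
        return VFIO, VFIO
    raise ValueError("unknown managed mode: %r" % (managed,))
```

With the device initially bound to the host driver, only "detach" both
starts without manual intervention and avoids the later reattach.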
---
I'm sending this with the "RFC" tag because I'm concerned it might be
considered "feature creep" by some (although I think it makes at least
as much sense as "managed='yes'") and also because, even once (if) it
is ACKed, I wouldn't want to push it until abologna is finished
hacking around with the driver bind/unbind code - he has enough grief
to deal with without me causing a bunch of merge conflicts :-)
[...]
Rather than burying this in one of the 3 other conversations that have
been taking place, I'll respond at the top level. I have been following
the conversation, but not in any great depth... I'm going to leave the
should we have "managed='detach'" conversation alone (at least for
now).
Not that I necessarily want to dip my toes into these shark-infested
waters; however, one thing keeps gnawing at my fingers as I hang onto
the "don't get involved in this one" ledge ;-)... To me, the problem is
less libvirt's ability to manage whether the devices are or aren't
detached from the host, and more that the lower layers (e.g. the
kernel) aren't happy with the frequency of such requests. Thrashing
isn't fun for any system, but it's a lot easier to tell someone else to
stop doing it "since it hurts when you do that". It's tough to find a
happy medium between forcing the user to detach rather than letting
libvirt manage it, and letting libvirt be the culprit causing angst for
the lower layers.
So, would it work to have some intermediary handle this thrashing by
creating some sort of "blob" that will accept responsibility for
reattaching the device to the host "at some point in time" as long as no
one else has requested to use it?
Why not add an attribute (e.g. delay='n'), valid only when
managed='yes', which means that rather than immediately reattaching
the device to the host when the guest is destroyed, libvirt will delay
the reattach by 'n' seconds? That way someone who knows they're going
to have a device used by multiple guests, which could otherwise thrash
heavily in the detach -> reattach -> detach -> reattach -> etc. loop,
would be able to make use of an optimization of sorts that just places
the device back on the inactive list (as if it had been detached
manually), then starts a thread that will reawaken when a timer fires
to handle the reattach. The thread would be destroyed in the event that
something comes along and uses the device (e.g. places it back on the
active list).
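
To make the idea concrete, here is a rough sketch of the timer logic in
Python rather than libvirt's C. Everything in it is an assumption made
for illustration: the class DelayedReattach, its method names, and the
reattach callback are hypothetical, and the "inactive list" is reduced
to a dictionary of pending timers.

```python
import logging
import threading


class DelayedReattach:
    """Hypothetical sketch of delay='n': postpone the host reattach."""

    def __init__(self, reattach):
        self._reattach = reattach  # callable(device) doing the real rebind
        self._pending = {}         # device -> (timer, token); the "inactive list"
        self._lock = threading.Lock()

    def guest_stopped(self, device, delay):
        """Park the device and schedule its reattach in `delay` seconds."""
        token = object()  # identifies this particular scheduling
        timer = threading.Timer(delay, self._fire, args=(device, token))
        timer.daemon = True
        with self._lock:
            old = self._pending.pop(device, None)
            self._pending[device] = (timer, token)
        if old is not None:
            old[0].cancel()
        timer.start()

    def guest_starting(self, device):
        """Cancel a pending reattach; True means a rebind cycle was avoided."""
        with self._lock:
            entry = self._pending.pop(device, None)
        if entry is None:
            return False
        entry[0].cancel()
        return True

    def _fire(self, device, token):
        # Runs in the timer thread. The lock is held across the reattach so
        # that a guest start cannot race with a rebind already in progress.
        with self._lock:
            entry = self._pending.get(device)
            if entry is None or entry[1] is not token:
                return  # cancelled or rescheduled in the meantime
            del self._pending[device]
            try:
                self._reattach(device)
            except Exception:
                # Nobody is waiting on this thread; the log is the only outlet.
                logging.exception("delayed reattach of %s failed", device)
```

A guest that starts within the delay window calls guest_starting(), gets
True, and can use the device without a detach; if nothing claims the
device in time, the timer thread performs the reattach by itself.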
The caveats that come quickly to mind are that devices that were being
managed could be left on the inactive list if the daemon dies or is
restarted, but I think that is detectable at restart, so it may not be
totally bad. Also, a failure to reattach happens in some thread which
has no one to report to other than the libvirtd log files. Both would
have to be noted in any description of this.
Not all devices will want this delay logic and I think it's been pointed
out that there is a "known list" of them. In the long run it allows
some control by a user to decide how much rope they'd like to have to
hang themselves.
John
Not sure if Gerd, Alex, and Martin are regular libvir-list readers, so I
did CC them just in case so that it's easier for them to respond if they
so desire since they were a part of the discussions in this thread.