On 10/06/2016 12:10 PM, Daniel P. Berrange wrote:
On Thu, Oct 06, 2016 at 11:57:17AM -0400, Laine Stump wrote:
> On 10/06/2016 11:31 AM, Daniel P. Berrange wrote:
>> On Thu, Oct 06, 2016 at 12:58:51PM +0200, Andrea Bolognani wrote:
>>> On Wed, 2016-10-05 at 18:36 +0100, Richard W.M. Jones wrote:
>>>>>> (b) It would be nice to turn the whole thing off for people who
>>>>>> don't care about / need hotplugging.
>>>>> I had contemplated having an "availablePCIeSlots" (or something like
>>>>> that) that was either an attribute of the config, or an option in
>>>>> qemu.conf or libvirtd.conf. If we had such a setting, it could be
>>>>> set to "0".
>>> I remember some pushback when this was proposed. Maybe we
>>> should just give up on the idea of providing spare
>>> hotpluggable PCIe slots by default and ask the user to add
>>> them explicitly after all.
>>>
>>>> Note that changes to libvirt conf files are not usable by libguestfs.
>>>> The setting would need to go into the XML, and please also make it
>>>> possible to determine if $random version of libvirt supports the
>>>> setting, either by a version check or something in capabilities.
>>> Note that you can avoid using any PCIe root port at all by
>>> assigning PCI addresses manually. It looks like the overhead
>>> for the small (I'm assuming) number of devices a libguestfs
>>> appliance will use is low enough that you will probably not
>>> want to open that can of worms, though.
>> For most apps the performance impact of the PCI enumeration
>> is not a big deal. So having libvirt ensure there's enough
>> available hotpluggable PCIe slots is reasonable, as long as
>> we leave a get-out clause for libguestfs.
>>
>> This could be as simple as declaring that *if* we see one
>> or more <controller type="pci"> in the input XML, then libvirt
>> will honour those and not try to add new controllers to the
>> guest.
>>
>> That way, by default libvirt will just "do the right thing"
>> and auto-create a suitable number of controllers needed to
>> boot the guest.
>>
>> Apps that want strict control though, can specify the
>> <controllers> elements themselves. Libvirt can still
>> auto-assign device addresses onto these controllers.
>> It simply wouldn't add any further controllers itself
>> at that point.
Even if it was adding offline, and there wasn't any place to put a new
device? (i.e., the operation would fail without adding a controller, and
libvirt was able to add it). Or am I taking your statement beyond its
intent (I'm good at that :-)
>> NB I'm talking cold-boot here. So libguestfs
>> would specify <controller> itself to the minimal set it wants
>> to optimize its boot performance.
> That works for the initial definition of the domain, but as soon as you've
> saved it once, there will be controllers explicitly in the config, and since
> we don't have any way of differentiating between auto-added controllers and
> those specifically requested by the user, we have to assume they were
> explicitly added, so such a check is then meaningless because you will
> *always* have PCI controllers.
Ok, so coldplug was probably the wrong word to use. What I actually
meant was "at time of initial define", since that's when libvirt
actually does its controller auto-creation. If you later add more
devices to the guest, whether it is online or offline, libvirt
would still auto-add more controllers if required (and if
possible). I was not expecting libvirt to remember whether we
were auto-adding controllers the first time or not.
> Say you create a domain definition with no controllers, you would get enough
> for the devices in the initial config, plus "N" more empty root ports. Let's
> say you then add 4 more devices (either hotplug or coldplug, doesn't
> matter). Those devices are placed on the existing unused pcie-root-ports.
> But now all your ports are full, and since you have PCI controllers in the
> config, libvirt is going to say "Ah, this user knows what they want to do,
> so I'm not going to add any extras! I'm so smart!". This would be especially
> maddening in the case of "coldplug", where libvirt could have easily added a
> new controller to accommodate the new device, but didn't.
>
> Unless we don't care what happens after the initial definition (and then
> adding of "N" new devices), trying to behave properly purely based on
> whether or not there are any PCI controllers present in the config isn't
> going to work.
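
To make the scenario above concrete, here is a toy Python model of the
"controllers present means hands off" rule. It is only a sketch: the names
define() and add_device(), the dict layout, and SPARE_PORTS = 4 are
assumptions made for illustration, not anything in libvirt.

```python
# Toy model only: names and the value of N are assumptions, not libvirt code.
SPARE_PORTS = 4  # the "N" extra root ports discussed in the thread


def define(devices):
    """Initial define with no controllers in the XML: add one root port
    per device plus SPARE_PORTS empty ones."""
    return {"devices": list(devices),
            "root_ports": len(devices) + SPARE_PORTS}


def add_device(config, device):
    """Add a device under the 'controllers present means hands off' rule.

    Returns True if the device got a slot, False otherwise.
    """
    if len(config["devices"]) < config["root_ports"]:
        config["devices"].append(device)
        return True
    # Every saved config already lists controllers, so this rule can
    # never tell auto-added ports from user-requested ones and gives up.
    return False


config = define(["disk", "nic"])
results = [add_device(config, f"extra{i}") for i in range(5)]
print(results)  # [True, True, True, True, False]
```

The first four additions land on the spare ports; the fifth fails even when
the guest is offline, which is the case where adding a controller would have
been easy.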
I think that's fine.

Let's stop talking about coldplug since that's very misleading.

Do you mean use of the term "coldplug" at all, or talking about what
happens when you add a device to the persistent config of the domain but
not to the currently running guest itself?

What I mean is that...

 1. When initially defining a guest

    If no controllers are present, auto-add controllers implied
    by the machine type, sufficient to deal with all currently
    listed devices, plus "N" extra spare ports.

    Else, simply assign devices to the controllers listed in
    the XML config. If there are no extra spare ports after
    doing this, so be it. It was the application's choice
    to have not listed enough controllers to allow later
    addition of more devices.

 2. When adding further devices (whether to an offline or online
    guest)

    If there aren't enough slots left, add further controllers
    to host the devices.

Right. That works great for adding devices "offline" (since you don't
like the term "coldplug" :-). It's just in the case of hotplug that it's
problematic, because you can't add new PCI controllers to a running system.

(Big digression here; skip if you like. It *could* be possible to hotplug
an upstream port plus some number of downstream ports, but qemu doesn't
support it because it attaches devices one by one, and the guest has to be
notified of the entire contraption at once when the upstream port is
attached. So you would have to send the attach for all the downstream ports
first, then the upstream, but you *can't* do it in that order because then
qemu doesn't yet know about the id (alias) you're going to give to the
upstream port at the time you're attaching the downstreams. Anyway, even if
and when qemu does support hotplugging upstream+downstream ports, you can't
add more downstream ports to an existing upstream afterwards, so you would
end up with some horrid scheme where you had to always make sure you had at
least one open downstream port or root-port every time a device was added.)

If there were not enough slots left
to allow adding further controllers, that must be due to
the initial application decision at time of defining the
original XML.

Or it's because there have already been "N" new devices added since the
domain was defined, and they're now trying to *hotplug* device "N+1".
I'm fine with that behavior, I just want to make sure everyone
understands this restriction beforehand.
So here's a rewording of your description (with a couple additional
conditions) to see if I understand everything correctly:
1) During initial domain definition:
A) If there are *no pci controllers at all* (not even a pci-root or
pcie-root) *and there are any unaddressed devices that need a PCI slot*
then auto-add enough controllers for the requested devices, *and* make
sure there are enough empty slots for "N" (do we stick with 4? or make
it 3?) devices to be added later without needing more controllers. (So,
if the domain has no PCI devices, we don't add anything extra, and also
if it only has PCI devices that already have addresses, then we also
don't add anything extra).
B) If there is at least one pci controller specified in the XML, and
there are any unused slots in the pci controllers in the provided XML,
then use them for the unaddressed devices. If there are more devices
that need an address at this time, also add controllers for them, but no
*extra* controllers.
(Note to Rich: libguestfs could avoid the extra controllers either by
adding a pci-root/pcie-root to the XML, or by manually addressing the
devices. The latter would actually be better, since it would avoid the
need for any pcie-root-ports).
2) When adding a device to the persistent config (i.e. offline): if
there is an empty slot on a controller, use it. If not, add a controller
for that device *but no extra controllers*
3) When adding a device to the guest machine (i.e. hotplug / online), if
there is an empty slot on a controller, use it. If not, then fail.
The differences I see from what (I think) you suggested are:
* if there aren't any unaddressed pci devices (even if there are no
controllers in the config), then we also don't add any extra controllers
(although we will of course add the pci-root or pcie-root, to
acknowledge it is there).
* if another controller is needed for adding a device offline, it's okay
to add it.
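
For anyone who wants to poke at the rules in my rewording, here is a minimal
Python sketch of how I read them. Everything named here is an assumption for
illustration (the function slots_to_add(), the stage strings, and
SPARE_SLOTS = 4, since "N" isn't settled); it is not libvirt code. It also
simplifies by treating every added controller as providing exactly one slot,
the way a pcie-root-port does.

```python
# Sketch under stated assumptions; not libvirt's implementation.
SPARE_SLOTS = 4  # "N" in the thread; 3 has also been suggested


def slots_to_add(num_controllers, free_slots, unaddressed, stage):
    """Return how many single-slot controllers would be auto-added.

    num_controllers: PCI controllers in the XML (pci-root/pcie-root count)
    free_slots:      empty slots on those controllers
    unaddressed:     devices still needing a PCI address
    stage:           "define", "offline" or "hotplug"
    """
    shortfall = max(0, unaddressed - free_slots)
    if stage == "define":
        if num_controllers == 0:
            # 1A: extras only if something actually needs a slot
            return unaddressed + SPARE_SLOTS if unaddressed else 0
        # 1B: cover the devices, but add no *extra* controllers
        return shortfall
    if stage == "offline":
        # 2: add a controller for the device, but no extras
        return shortfall
    if stage == "hotplug":
        # 3: controllers can't be added to a running guest
        if shortfall:
            raise RuntimeError("no free PCI slot available for hotplug")
        return 0
    raise ValueError(f"unknown stage: {stage}")


print(slots_to_add(0, 0, 3, "define"))   # 7
print(slots_to_add(1, 2, 3, "define"))   # 1
print(slots_to_add(5, 0, 1, "offline"))  # 1
```

The only case that can fail is hotplug with no free slot, which is the
restriction I want everyone to be aware of beforehand.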