On 08/08/2016 04:56 AM, Laine Stump wrote:
When faced with a guest device that requires a PCI address but doesn't
have one manually assigned in the config, libvirt has always insisted
(well... *tried* to insist) on auto-assigning an address that is on a
PCI controller that supports hotplug. One big problem with this is
that it prevents automatic use of a Q35 (or aarch64/virt) machine's
pcie-root (since the PCIe root complex doesn't support hotplug).
In order to promote simpler domain configs (more devices on pcie-root
rather than on a pci-bridge), this patch adds a new sub-element to all
guest devices that have a PCI address and support hotplug:
<hotplug require='no'/>
For devices that have hotplug require='no', we turn off the
VIR_PCI_CONNECT_HOTPLUGGABLE bit in the devFlags when searching for an
available PCI address. Since pcie-root now allows standard PCI
devices, this results in those devices being placed on pcie-root
rather than pci-bridge.
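
To make the effect of that flag concrete, here is a minimal sketch in
Python (not libvirt's C code). Only the idea of clearing the
hotpluggable bit before the search comes from the patch description;
the flag values, the two-entry bus table, and the helper name pick_bus
are assumptions made up for illustration.

```python
from enum import IntFlag


class Connect(IntFlag):
    # Hypothetical stand-ins for libvirt's VIR_PCI_CONNECT_* bits.
    # The numeric values are illustrative, not libvirt's.
    HOTPLUGGABLE = 1
    TYPE_PCI_DEVICE = 2
    TYPE_PCIE_DEVICE = 4


# Hypothetical bus table: which connection flags each controller accepts.
# pcie-root takes PCI and PCIe devices but does not support hotplug.
BUSES = [
    ("pcie-root", Connect.TYPE_PCI_DEVICE | Connect.TYPE_PCIE_DEVICE),
    ("pci-bridge", Connect.HOTPLUGGABLE | Connect.TYPE_PCI_DEVICE),
]


def pick_bus(dev_flags, hotplug_required=True):
    """Return the first bus that satisfies every bit in dev_flags."""
    if not hotplug_required:
        # Equivalent of <hotplug require='no'/>: drop the hotplug demand.
        dev_flags &= ~Connect.HOTPLUGGABLE
    for name, bus_flags in BUSES:
        if (dev_flags & bus_flags) == dev_flags:
            return name
    return None


default_flags = Connect.HOTPLUGGABLE | Connect.TYPE_PCI_DEVICE
print(pick_bus(default_flags))
print(pick_bus(default_flags, hotplug_required=False))
```

With the bit set, the search skips pcie-root and settles on pci-bridge;
with it cleared, pcie-root is the first bus that matches.
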
I've been playing around with this and, by itself, it works very well.
With this solved, combined with taking advantage of PCIe for virtio when
available, it's very easy to create Q35 domains that use no legacy PCI
without needing to resort to manually assigning addresses.
However, there is still another item that we need to be able to
configure - stating a preference of legacy PCI vs. PCIe when both are
available for a device (again, the aim is to do this *without* needing
to manually assign an address). The following devices have this choice:
1) vfio assigned devices
2) virtio devices
3) the nec-xhci USB controller
You might think that it would always be preferable to use PCIe if it's
available, but especially in these "early days" of using PCIe in guests
it would be useful to have the ability to *easily* force use of a legacy
PCI slot in case some PCIe-related bug is encountered (in particular,
people have pointed out in discussions about vfio device assignment that
it could be possible for a guest OS to misbehave when presented with a
device's PCIe configuration block (which hasn't been visible in the past
because the device was attached to a legacy PCI slot)).
In order to maintain functionality while any such bugs are figured out and
fixed, we need to be able to force the device onto a PCI slot. There are
two ways of doing this:
1) manually specify the full PCI address of a legacy PCI slot in the config
2) provide an option in the config that simply says "use any PCI slot"
or "use any PCIe slot".
Assuming that (1) is too cumbersome, we need to come up with a
reasonable name/location for a config option (providing the backend for
it will be trivial). Some possible places:
2a) add a new attribute to the <address> element
I don't like this option because that makes it impossible to easily
force re-addressing of the devices in a domain by simply removing all
the <address> lines. (Yes, I know that's a non-issue in production,
especially when there is some other management system (OpenStack, oVirt)
sitting on top of libvirt. But it is a *big* help for developers who are
messing around with it).
2b) Add a new attribute to an existing subelement, e.g.
<target preferredBus='pci'/>
This makes parsing and formatting cumbersome, because every device type
has its own code to parse/format its <target> subelement. Also, in the
case of <interface>, the <target> subelement is being mis-used to hold
the name *on the host* of the tap device, and it would be confusing to
see something like this:
<target dev='vnet1' preferredBus='pci'/>
2c) add a new subelement just for this, e.g. <bus prefer='pci'/> or ???
I don't like this because it adds to the toplevel clutter in the
devices. XML's hierarchical structure is useful to organize attributes
so they are easier to comprehend, and we should take advantage of that
as much as possible.
2d) Try to find a common subelement that can be used for *all* address
assignment preferences/restrictions, including hotplug.
This is the option that has prompted my writing this message in response
to my own patch mail. What if, instead of:
<hotplug require='no'/>
<bus prefer='pci'/> (or whatever)
we had something like this?
<addressPreferences hotplug='no' bus='pci'/>
(*PLEASE* think of a better name!)
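
To show how little a single combined element would need on the parsing
side, here is a rough Python sketch (again, not libvirt code). The
element name is just the placeholder proposed above; the
AddressPreferences class, the parse_preferences helper, and the
defaults chosen for missing attributes are all hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class AddressPreferences:
    # Assumed defaults: hotplug required (current behavior), no bus preference.
    hotplug_required: bool = True
    bus: Optional[str] = None


def parse_preferences(device_xml: str) -> AddressPreferences:
    """Read the proposed <addressPreferences> subelement of one device."""
    device = ET.fromstring(device_xml)
    prefs = AddressPreferences()
    elem = device.find("addressPreferences")
    if elem is None:
        # Element absent: keep the defaults.
        return prefs
    prefs.hotplug_required = elem.get("hotplug", "yes") != "no"
    prefs.bus = elem.get("bus")
    return prefs


example = """
<interface type='network'>
  <addressPreferences hotplug='no' bus='pci'/>
</interface>
"""
print(parse_preferences(example))
```

Because the element is the same for every device type, one parser like
this could be shared, rather than touching each device's own <target>
handling as in (2b).
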
In my mind, the choice is between (1) and (2d) - if everyone thinks this is
something only needed during a short transitional stage, maybe (1) is an
adequate solution. If not, then we should decide now on the name for
this option, and potentially rename the hotplug option accordingly.
What are your opinions?
(BTW, just to throw another wrench into the works - I think it would
also be useful to be able to specify a numa node for devices, so that a
device could be placed on a particular numa node in the guest (i.e. on a
particular pci[e]-expander-bus or one of its subordinate buses) without
needing to know the full PCI address. That could be done by specifying
it the same way it's done in the pci[e]-expander-bus itself:
<hostdev ....>
  <target>
    <node>2</node>
  </target>
  ...
</hostdev>
or it could be made a part of this new proposed element:
<addressPreferences hotplug='no' bus='pci' numa='2'/>
This is something that we will want in the long term (not just a
temporary method of working around potential bugs), so if we're going to
want it in a separate element rather than in <target>, we'll need to
consider it *now* in order to avoid giving the wrong name to the new
hotplug option defined in the parent of this message.)