On 10/11/2016 05:34 AM, Andrea Bolognani wrote:
On Mon, 2016-10-10 at 15:43 -0400, Laine Stump wrote:
>>> 3) Although it is strongly discouraged, it is legal for a pci-bridge
>>> to be directly plugged into pcie-root, and we don't want to
>>> auto-add a dmi-to-pci-bridge if there is already a pci-bridge
>>> that's been forced directly into pcie-root. Finally, although I
>>> fail to see the utility of it, it is legal to have something like
>>> this in the xml:
>>>
>>> <controller type='pci' model='pcie-root'
index='0'/>
>>> <controller type='pci' model='pci-bridge'
index='2'/>
>>>
>>> and that will lead to an automatically added dmi-to-pci-bridge at
>>> index=1 (to give the unaddressed pci-bridge a proper place to plug
>>> in):
>>>
>>> <controller type='pci' model='dmi-to-pci-bridge'
index='1'/>
>>>
>>> (for example, see the existing test case
>>> "usb-controller-default-q35"). This is handled in
>>> qemuDomainPCIAddressSetCreate() when it's adding in controllers
to
>>> fill holes in the indexes.
>>
>> I wonder how this "feature" came to be... It seems to be the
>> reason for quite a bit of convoluted code down below, which
>> we could avoid if this had never worked at all. As is so
>> often the case, too late for that :(
>
> Maybe not. The only place I ever saw that was in the above test case,
> and another named "usb-controller-explicit-q35", and the author of both
> of those tests was (wait for it!), no, not Albert Einstein. Andrea
> Bolognani!
Oh, *that* guy! It's always *that* guy, isn't it? :P
> The only reason it worked in the past was because we always
> automatically added the dmi-to-pci-bridge very early in the post-parse
> processing. This implies that any saved config anywhere will already
> have the necessary dmi-to-pci-bridge at index='1', so we only need to
> preserve the behavior for new domain definitions that have a pci-bridge
> at index='2' but nothing at index='1'.
>
> Since you're the only person documented to have ever created a config
> like that, and it would only be problematic if someone tried to create
> another new config, maybe we should just stop accidentally supporting it
> and count it as a bug that's being fixed. What's your opinion?
Given the evidence you're presenting, I'm all for getting
rid of it, especially since it will make the code below
much simpler and hence more maintainable.
Considering how critical that part of libvirt is, anything
we can do to make it leaner and meaner is going to be a huge
win in the long run.
I'm looking back through the code and wondering how to simplify it - it
may be that the alternate method I had initially used (which failed that
one test) is just as complicated as what I have now; unfortunately I
didn't save it.
>>> @@ -82,6 +82,30 @@
virDomainPCIControllerModelToConnectType(virDomainControllerModelPCI model)
>>> return 0;
>>> }
>>>
>>> +
>>> +static int
>> s/int/virDomainControllerModelPCI/
>
> But then we can't return -1 when there isn't a perfect match (that's why
> I made it int)
Right. Disregard my comments then :)
>> Alternatively you could turn it into
>>
>> if ((flags & VIR_PCI_CONNECT_PCIE_DEVICE) ||
>> (flags & VIR_PCI_CONNECT_PCIE_SWITCH_UPSTREAM_PORT))
>>
>> which is more obviously correct and also nicer to look at :)
>
> ....but takes two operations instead of one.
I hardly think this would turn out to be a bottleneck, but
feel free to stick to the original implementation - after
fixing it, of course :)
>>> + if (lowestDMIToPCIBridge > idx)
>>> + lowestDMIToPCIBridge = idx;
>>> + } else if (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE)
{
>>> + if (virDeviceInfoPCIAddressWanted(&cont->info)) {
>>> + if (lowestUnaddressedPCIBridge > idx)
>>> + lowestUnaddressedPCIBridge = idx;
>>> + } else {
>>> + if (lowestAddressedPCIBridge > idx)
>>> + lowestAddressedPCIBridge = idx;
>>> + }
>>> }
>>> + }
>>> +
>>> + if (nbuses > 0 && !addrs->buses[0].model) {
>>> + if (virDomainPCIAddressBusSetModel(&addrs->buses[0],
>>> +
VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) < 0)
>>> + goto error;
>>> + }
>>
>> Shouldn't we use either PCI_ROOT or PCIE_ROOT based on the
>> machine type here? And set hasPCIeRoot *after* doing so?
>>
>> Sorry for the questions, I guess this is the point in the
>> patch where I got a bit lost :(
I'm afraid you missed this question :)
Sorry about the omission. I've tried to limit the code that decides
whether or not there should be a pci-root or a pcie-root to the one
place when default controllers are being added (I forget the name of the
function right now), and after that only decide based on whether a
pci-root or pcie-root really is there in the config. My subconscious
reasoning for this is that the extra calisthenics around supporting
aarch64/virt with PCI(e) vs with mmio has made me wonder if there might
be machinetypes in the future that could support one type of root bus or
another (or none) according to what is in the config, rather than having
a root bus just builtin to the machine. I don't know if that would ever
happen, but if it did, this code would work without change - only the
function adding the default controllers would need to be changed.
(Note that I used the same logic when deciding how to right
qemuDomainMachineHasPCI[e]Root())(still not sure about that name, but it
can always be changed later to remove the "Machine" if someone doesn't
like it)
>>> + else if (lowestUnaddressedPCIBridge < MIN(lowestAddressedPCIBridge,
>>> + lowestDMIToPCIBridge))
>>> + defaultModel = VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE;
>>
>> Again, a bit lost here, sorry :(
>>
>> Basically if we've found a PCI bridge without address that
>> can't be plugged into any existing PCI bridge or DMI-to-PCI
>> bridge, we want to add a DMI-to-PCI bridge so that we have
>> somewhere to plug it in, right?
And I'm pretty sure I got it right here, but just to be on
the safe side, it would be great if you could confirm or
deny :)
>>> + else
>>> + defaultModel = VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT;
>>> +
>>> + for (i = 1; i < addrs->nbuses; i++) {
>>> +
>>> + if (addrs->buses[i].model)
>>> + continue;
>>> +
>>> + if (virDomainPCIAddressBusSetModel(&addrs->buses[i],
defaultModel) < 0)
>>> + goto error;
>>> +
>>> + VIR_DEBUG("Auto-adding <controller type='pci'
model='%s' index='%zu'/>",
>>> + virDomainControllerModelPCITypeToString(defaultModel),
i);
>>> + /* only add a single dmi-to-pci-bridge, then add pcie-root-port
>>> + * for any other unspecified controller indexes.
>>> + */
>>> + if (hasPCIeRoot)
>>> + defaultModel = VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT;
>>
>> Okay, so
>>
>> * if the machine has a PCIe Root Bus and we have a PCI
>> bridge that we don't know where to plug (no existing
>> legacy PCI hierarchy below it), we add a single
>> DMI-to-PCI bridge and then plug PCIe Root Ports into
>> PCIe Root Ports until we have as many buses as we need
>
> You mean "plug pcie-root-ports into pcie-root".
No, I actually meant what I wrote, but I think that thanks to
your remark I see my error now.
What I thought was going on was that whese additional buses
were *chained* to each other, eg. the first PCIe Root Port
(bus 1) would be plugged into the PCIe Root Bus (bus 0), the
second PCIe Root Port (bus 2) would be plugged into the first
PCIe Root Port (bus 1) and so on.
That wouldn't work because 1) pcie-root-port can only plug into
pcie-root or pcie-expander-bus, and 2) even if it could plug into
another pcie-root-port, there is only a single slot so you would have to
plug the pcie-root-port and the endpoint device into different functions
of the single slot, so hotplug would be impossible.
What happens instead is that the first PCIe Root Port (bus 1)
is plugged into the PCIe Root Bus (bus 0), and so is the
second PCIe Root Port (bus 2) and all subsequent ones.
Basically for a second there I forgot that the bus number
doesn't increase only when increasing the depth of the PCI
hierarchy, but also when increasing its breadth.
>> * if the machine has a PCIe Root Bus and we're not in the
>> situation above when it comes to PCI bridges (there is
>> an existing legacy PCI hierarchy below it), we plug PCIe
>> Root Ports into PCIe Root Ports until we have as many
>> buses as we need
>>
>> * otherwise (no PCIe Root Bus) we just plug PCI bridges
>> into PCI bridges until we have as many buses as we need
>>
>> Does that sound about right?
>
> Yep.
The same as above applies, though.
On the other hand, the virDomainPCIAddressSet structure
doesn't really store any information about the relationship
between controllers: whether eg. the PCIe Root Port providing
bus 5 is plugged directly into the PCIe Root Bus, or into
another PCIe Root Port, or into a PCIe Switch Downstream
Port, is something that we just don't know at this stage.
Yes, which may be necessary at some point - e.g. I've been thinking that
we really don't want the buses that are children of a
pci[e]-expander-bus to be used for auto-assigned addresses (although
maybe that could just be solved with a flag that's set at the time the
AddressSet is created, rather than needing to teach it the full
intracacies of the topology.)
>>> + <controller type='pci' index='1'
model='dmi-to-pci-bridge'>
>>> + <model name='i82801b11-bridge'/>
>>> + <address type='pci' domain='0x0000'
bus='0x00' slot='0x1e' function='0x0'/>
>>> + </controller>
>>
>> A DMI-to-PCI bridge with index='1' has been added
>> automatically here...
>>
>> ... but it's not being used by any of the devices. So why
>> would it be added in the first place?
>
> That is a *very* good question!
Can't wait to know the answer! ;)
Oh, now that I've looked in context of the patch ordering, I undestand:
it's because patch 16/18 hasn't been applied yet. I'd forgotten the
ordering...