
On 10/11/2016 05:34 AM, Andrea Bolognani wrote:
On Mon, 2016-10-10 at 15:43 -0400, Laine Stump wrote:
3) Although it is strongly discouraged, it is legal for a pci-bridge to be directly plugged into pcie-root, and we don't want to auto-add a dmi-to-pci-bridge if there is already a pci-bridge that's been forced directly into pcie-root. Finally, although I fail to see the utility of it, it is legal to have something like this in the xml:
<controller type='pci' model='pcie-root' index='0'/> <controller type='pci' model='pci-bridge' index='2'/>
and that will lead to an automatically added dmi-to-pci-bridge at index=1 (to give the unaddressed pci-bridge a proper place to plug in):
<controller type='pci' model='dmi-to-pci-bridge' index='1'/>
(for example, see the existing test case "usb-controller-default-q35"). This is handled in qemuDomainPCIAddressSetCreate() when it's adding in controllers to fill holes in the indexes.
I wonder how this "feature" came to be... It seems to be the reason for quite a bit of convoluted code down below, which we could avoid if this had never worked at all. As is so often the case, too late for that :(
Maybe not. The only place I ever saw that was in the above test case, and another named "usb-controller-explicit-q35", and the author of both of those tests was (wait for it!), no, not Albert Einstein. Andrea Bolognani! Oh, *that* guy! It's always *that* guy, isn't it? :P
The only reason it worked in the past was because we always automatically added the dmi-to-pci-bridge very early in the post-parse processing. This implies that any saved config anywhere will already have the necessary dmi-to-pci-bridge at index='1', so we only need to preserve the behavior for new domain definitions that have a pci-bridge at index='2' but nothing at index='1'.
Since you're the only person documented to have ever created a config like that, and it would only be problematic if someone tried to create another new config, maybe we should just stop accidentally supporting it and count it as a bug that's being fixed. What's your opinion? Given the evidence you're presenting, I'm all for getting rid of it, especially since it will make the code below much simpler and hence more maintainable.
Considering how critical that part of libvirt is, anything we can do to make it leaner and meaner is going to be a huge win in the long run.
I'm looking back through the code and wondering how to simplify it - it may be that the alternate method I had initially used (which failed that one test) is just as complicated as what I have now; unfortunately I didn't save it.
@@ -82,6 +82,30 @@ virDomainPCIControllerModelToConnectType(virDomainControllerModelPCI model) return 0; }
+ +static int s/int/virDomainControllerModelPCI/
But then we can't return -1 when there isn't a perfect match (that's why I made it int) Right. Disregard my comments then :)
Alternatively you could turn it into
if ((flags & VIR_PCI_CONNECT_PCIE_DEVICE) || (flags & VIR_PCI_CONNECT_PCIE_SWITCH_UPSTREAM_PORT))
which is more obviously correct and also nicer to look at :)
....but takes two operations instead of one. I hardly think this would turn out to be a bottleneck, but feel free to stick to the original implementation - after fixing it, of course :)
+ if (lowestDMIToPCIBridge > idx) + lowestDMIToPCIBridge = idx; + } else if (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE) { + if (virDeviceInfoPCIAddressWanted(&cont->info)) { + if (lowestUnaddressedPCIBridge > idx) + lowestUnaddressedPCIBridge = idx; + } else { + if (lowestAddressedPCIBridge > idx) + lowestAddressedPCIBridge = idx; + } } + } + + if (nbuses > 0 && !addrs->buses[0].model) { + if (virDomainPCIAddressBusSetModel(&addrs->buses[0], + VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) < 0) + goto error; + }
Shouldn't we use either PCI_ROOT or PCIE_ROOT based on the machine type here? And set hasPCIeRoot *after* doing so?
Sorry for the questions, I guess this is the point in the patch where I got a bit lost :( I'm afraid you missed this question :)
Sorry about the omission. I've tried to limit the code that decides whether or not there should be a pci-root or a pcie-root to the one place when default controllers are being added (I forget the name of the function right now), and after that only decide based on whether a pci-root or pcie-root really is there in the config. My subconscious reasoning for this is that the extra calisthenics around supporting aarch64/virt with PCI(e) vs with mmio has made me wonder if there might be machinetypes in the future that could support one type of root bus or another (or none) according to what is in the config, rather than having a root bus just builtin to the machine. I don't know if that would ever happen, but if it did, this code would work without change - only the function adding the default controllers would need to be changed. (Note that I used the same logic when deciding how to right qemuDomainMachineHasPCI[e]Root())(still not sure about that name, but it can always be changed later to remove the "Machine" if someone doesn't like it)
+ else if (lowestUnaddressedPCIBridge < MIN(lowestAddressedPCIBridge, + lowestDMIToPCIBridge)) + defaultModel = VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE;
Again, a bit lost here, sorry :(
Basically if we've found a PCI bridge without address that can't be plugged into any existing PCI bridge or DMI-to-PCI bridge, we want to add a DMI-to-PCI bridge so that we have somewhere to plug it in, right? And I'm pretty sure I got it right here, but just to be on the safe side, it would be great if you could confirm or deny :)
+ else + defaultModel = VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT; + + for (i = 1; i < addrs->nbuses; i++) { + + if (addrs->buses[i].model) + continue; + + if (virDomainPCIAddressBusSetModel(&addrs->buses[i], defaultModel) < 0) + goto error; + + VIR_DEBUG("Auto-adding <controller type='pci' model='%s' index='%zu'/>", + virDomainControllerModelPCITypeToString(defaultModel), i); + /* only add a single dmi-to-pci-bridge, then add pcie-root-port + * for any other unspecified controller indexes. + */ + if (hasPCIeRoot) + defaultModel = VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT;
Okay, so
* if the machine has a PCIe Root Bus and we have a PCI bridge that we don't know where to plug (no existing legacy PCI hierarchy below it), we add a single DMI-to-PCI bridge and then plug PCIe Root Ports into PCIe Root Ports until we have as many buses as we need
You mean "plug pcie-root-ports into pcie-root". No, I actually meant what I wrote, but I think that thanks to your remark I see my error now.
What I thought was going on was that whese additional buses were *chained* to each other, eg. the first PCIe Root Port (bus 1) would be plugged into the PCIe Root Bus (bus 0), the second PCIe Root Port (bus 2) would be plugged into the first PCIe Root Port (bus 1) and so on.
That wouldn't work because 1) pcie-root-port can only plug into pcie-root or pcie-expander-bus, and 2) even if it could plug into another pcie-root-port, there is only a single slot so you would have to plug the pcie-root-port and the endpoint device into different functions of the single slot, so hotplug would be impossible.
What happens instead is that the first PCIe Root Port (bus 1) is plugged into the PCIe Root Bus (bus 0), and so is the second PCIe Root Port (bus 2) and all subsequent ones.
Basically for a second there I forgot that the bus number doesn't increase only when increasing the depth of the PCI hierarchy, but also when increasing its breadth.
* if the machine has a PCIe Root Bus and we're not in the situation above when it comes to PCI bridges (there is an existing legacy PCI hierarchy below it), we plug PCIe Root Ports into PCIe Root Ports until we have as many buses as we need
* otherwise (no PCIe Root Bus) we just plug PCI bridges into PCI bridges until we have as many buses as we need
Does that sound about right?
Yep. The same as above applies, though.
On the other hand, the virDomainPCIAddressSet structure doesn't really store any information about the relationship between controllers: whether eg. the PCIe Root Port providing bus 5 is plugged directly into the PCIe Root Bus, or into another PCIe Root Port, or into a PCIe Switch Downstream Port, is something that we just don't know at this stage.
Yes, which may be necessary at some point - e.g. I've been thinking that we really don't want the buses that are children of a pci[e]-expander-bus to be used for auto-assigned addresses (although maybe that could just be solved with a flag that's set at the time the AddressSet is created, rather than needing to teach it the full intracacies of the topology.)
+ <controller type='pci' index='1' model='dmi-to-pci-bridge'> + <model name='i82801b11-bridge'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' function='0x0'/> + </controller>
A DMI-to-PCI bridge with index='1' has been added automatically here...
... but it's not being used by any of the devices. So why would it be added in the first place?
That is a *very* good question! Can't wait to know the answer! ;)
Oh, now that I've looked in context of the patch ordering, I undestand: it's because patch 16/18 hasn't been applied yet. I'd forgotten the ordering...