[libvirt] [PATCH 0/4] Fix a couple PCI controller/addressing problems

1 - makes an error message less specific because it was misleading/incorrect in many/most cases 2+3 - log a more informative message when you try to put a pcie-root in a 440fx domain, or a pci-root in a Q35 (or arm/virt) domain 4 - makes it possible to create a Q35 domain in libvirt that doesn't have a dmi-to-pci-bridge controller (as long as you have something else in its place). Laine Stump (4): qemu: fix error log in qemuAssignPCIAddresses() conf: make virDomainDefAddController() public conf: log error when incorrect PCI root controller is added to domain qemu: don't be as insistent about adding dmi-to-pci-bridge or pci-bridge src/conf/domain_conf.c | 2 +- src/conf/domain_conf.h | 2 ++ src/libvirt_private.syms | 1 + src/qemu/qemu_domain.c | 55 +++++++++++++++++++++++++++++++----------- src/qemu/qemu_domain_address.c | 7 +++--- 5 files changed, 48 insertions(+), 19 deletions(-) -- 2.5.5

This error message was too specific, based on the incorrect assumption that any error was cause by auto-added bridges: failed to create PCI bridge on bus 2: too many devices with fixed addresses In practice you can't know if a bridge with an index <= the bus it's connecting to was added automatically, or if it was a mistake in explicit config, and the auto-add problem is going to be dealt with in a different way in an upcoming patch. The new message is this: PCI Controller at index 1 (0x01) has " bus='0x02', but bus must be <= index (note that index is given in both decimal and hex because it is formatted as decimal in the XML, but bus is formatted as hex, and displaying the hex value of index makes it easier to see the problem when index > 9 (which will often be the case with PCIe, since most controllers only have a single port, not 32 slots as with standard PCI)). --- src/qemu/qemu_domain_address.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index bb8c16c..f4d5533 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -1609,10 +1609,9 @@ qemuDomainAssignPCIAddresses(virDomainDefPtr def, if (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE && idx <= addr->bus) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("failed to create PCI bridge " - "on bus %d: too many devices with fixed " - "addresses"), - addr->bus); + _("PCI controller at index %d (0x%02x) has " + "bus='0x%02x', but bus must be <= index"), + idx, idx, addr->bus); goto cleanup; } } -- 2.5.5

On 04/21/2016 02:48 PM, Laine Stump wrote:
This error message was too specific, based on the incorrect assumption that any error was cause by auto-added bridges:
failed to create PCI bridge on bus 2: too many devices with fixed addresses
In practice you can't know if a bridge with an index <= the bus it's connecting to was added automatically, or if it was a mistake in explicit config, and the auto-add problem is going to be dealt with in a different way in an upcoming patch. The new message is this:
PCI Controller at index 1 (0x01) has " bus='0x02', but bus must be <= index
(note that index is given in both decimal and hex because it is formatted as decimal in the XML, but bus is formatted as hex, and displaying the hex value of index makes it easier to see the problem when index > 9 (which will often be the case with PCIe, since most controllers only have a single port, not 32 slots as with standard PCI)). --- src/qemu/qemu_domain_address.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index bb8c16c..f4d5533 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -1609,10 +1609,9 @@ qemuDomainAssignPCIAddresses(virDomainDefPtr def, if (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE && idx <= addr->bus) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("failed to create PCI bridge " - "on bus %d: too many devices with fixed " - "addresses"), - addr->bus); + _("PCI controller at index %d (0x%02x) has " + "bus='0x%02x', but bus must be <= index"), + idx, idx, addr->bus); goto cleanup; } }
ACK - Cole

This will be needed by the qemu driver in an upcoming patch. --- src/conf/domain_conf.c | 2 +- src/conf/domain_conf.h | 2 ++ src/libvirt_private.syms | 1 + 3 files changed, 4 insertions(+), 1 deletion(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index db567f5..b1e6084 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -14555,7 +14555,7 @@ virDomainEmulatorPinDefParseXML(xmlNodePtr node) } -static virDomainControllerDefPtr +virDomainControllerDefPtr virDomainDefAddController(virDomainDefPtr def, int type, int idx, int model) { virDomainControllerDefPtr cont; diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index fd540ed..add8e6e 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -3093,6 +3093,8 @@ VIR_ENUM_DECL(virDomainCpuPlacementMode) VIR_ENUM_DECL(virDomainStartupPolicy) +virDomainControllerDefPtr +virDomainDefAddController(virDomainDefPtr def, int type, int idx, int model); int virDomainDefAddUSBController(virDomainDefPtr def, int idx, int model); int diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 2b55369..019d2e1 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -201,6 +201,7 @@ virDomainControllerRemove; virDomainControllerTypeToString; virDomainCpuPlacementModeTypeFromString; virDomainCpuPlacementModeTypeToString; +virDomainDefAddController; virDomainDefAddImplicitDevices; virDomainDefAddUSBController; virDomainDefCheckABIStability; -- 2.5.5

On 04/21/2016 02:48 PM, Laine Stump wrote:
This will be needed by the qemu driver in an upcoming patch. --- src/conf/domain_conf.c | 2 +- src/conf/domain_conf.h | 2 ++ src/libvirt_private.syms | 1 + 3 files changed, 4 insertions(+), 1 deletion(-)
ACK - Cole

libvirt may automatically add a pci-root or pcie-root controller to a domain, depending on the arch/machinetype, and it hopefully always makes the right decision about which to add (since in all cases these controllers are an implicit part of the virtual machine). But it's always possible that someone will create a config that explicitly supplies the wrong type of PCI controller for the selected machinetype. In the past that would lead to an error later when libvirt was trying to assign addresses to other devices, for example: XML error: PCI bus is not compatible with the device at 0000:00:02.0. Device requires a PCI Express slot, which is not provided by bus 0000:00 (that's the error message that appears if you replace the pcie-root controller in a Q35 domain with a pci-root controller). This patch adds a check at the same place that the implicit controllers are added (to ensure that the same logic is used to check which type of pci root is correct). If a pci controller with index='0' is already present, we verify that it is of the model that we would have otherwise added automatically; if not, an error is logged: The PCI controller with index='0' must be " model='pcie-root' for this machine type, " but model='pci-root' was found instead. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1004602 --- src/qemu/qemu_domain.c | 47 +++++++++++++++++++++++++++++++++++------------ 1 file changed, 35 insertions(+), 12 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 51d4830..86b7d13 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1437,6 +1437,7 @@ qemuDomainDefAddDefaultDevices(virDomainDefPtr def, { bool addDefaultUSB = true; int usbModel = -1; /* "default for machinetype" */ + int pciRoot; /* index within def->controllers */ bool addImplicitSATA = false; bool addPCIRoot = false; bool addPCIeRoot = false; @@ -1538,14 +1539,26 @@ qemuDomainDefAddDefaultDevices(virDomainDefPtr def, def, VIR_DOMAIN_CONTROLLER_TYPE_SATA, 0, -1) < 0) goto cleanup; + pciRoot = virDomainControllerFind(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0); + /* NB: any machine that sets addPCIRoot to true must also return * true from the function qemuDomainSupportsPCI(). */ - if (addPCIRoot && - virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0, - VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) < 0) - goto cleanup; + if (addPCIRoot) { + if (pciRoot >= 0) { + if (def->controllers[pciRoot]->model != VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) { + virReportError(VIR_ERR_XML_ERROR, + _("The PCI controller with index='0' must be " + "model='pci-root' for this machine type, " + "but model='%s' was found instead"), + virDomainControllerModelPCITypeToString(def->controllers[pciRoot]->model)); + goto cleanup; + } + } else if (!virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0, + VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT)) { + goto cleanup; + } + } /* When a machine has a pcie-root, make sure that there is always * a dmi-to-pci-bridge controller added as bus 1, and a pci-bridge @@ -1555,15 +1568,25 @@ qemuDomainDefAddDefaultDevices(virDomainDefPtr def, * true from the function qemuDomainSupportsPCI(). */ if (addPCIeRoot) { + if (pciRoot >= 0) { + if (def->controllers[pciRoot]->model != VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT) { + virReportError(VIR_ERR_XML_ERROR, + _("The PCI controller with index='0' must be " + "model='pcie-root' for this machine type, " + "but model='%s' was found instead"), + virDomainControllerModelPCITypeToString(def->controllers[pciRoot]->model)); + goto cleanup; + } + } else if (!virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0, + VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT)) { + goto cleanup; + } if (virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0, - VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT) < 0 || - virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1, - VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE) < 0 || + def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1, + VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE) < 0 || virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2, - VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE) < 0) { + def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2, + VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE) < 0) { goto cleanup; } } -- 2.5.5

On 04/21/2016 02:48 PM, Laine Stump wrote:
libvirt may automatically add a pci-root or pcie-root controller to a domain, depending on the arch/machinetype, and it hopefully always makes the right decision about which to add (since in all cases these controllers are an implicit part of the virtual machine).
But it's always possible that someone will create a config that explicitly supplies the wrong type of PCI controller for the selected machinetype. In the past that would lead to an error later when libvirt was trying to assign addresses to other devices, for example:
XML error: PCI bus is not compatible with the device at 0000:00:02.0. Device requires a PCI Express slot, which is not provided by bus 0000:00
(that's the error message that appears if you replace the pcie-root controller in a Q35 domain with a pci-root controller).
This patch adds a check at the same place that the implicit controllers are added (to ensure that the same logic is used to check which type of pci root is correct). If a pci controller with index='0' is already present, we verify that it is of the model that we would have otherwise added automatically; if not, an error is logged:
The PCI controller with index='0' must be " model='pcie-root' for this machine type, " but model='pci-root' was found instead.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1004602 --- src/qemu/qemu_domain.c | 47 +++++++++++++++++++++++++++++++++++------------ 1 file changed, 35 insertions(+), 12 deletions(-)
Given that this is a non-trivial error condition, I think we should have a qemuxml2xml test case that checks for this error - Cole

Previously there was no way to have a Q35 domain that didn't have these two controllers. This patch skips their creation as long as there are some other kinds of pci controllers at index 1 and 2 (e.g. some pcie-root-port controllers). I'm hoping that soon we won't add them at all, plugging all devices into auto-added pcie-*-port ports instead, but in the meantime this makes it easier to experiment with alternative bus hierarchies. --- src/qemu/qemu_domain.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 86b7d13..0b342e2 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1581,14 +1581,18 @@ qemuDomainDefAddDefaultDevices(virDomainDefPtr def, VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT)) { goto cleanup; } - if (virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1, - VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE) < 0 || - virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2, - VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE) < 0) { + /* add a dmi-to-pci-bridge and a pci-bridge if there are no pci controllers + * other than the pcie-root. This is so that there will be hot-pluggable + * PCI slots available + */ + if (virDomainControllerFind(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1) < 0 && + !virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1, + VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE)) + goto cleanup; + if (virDomainControllerFind(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2) < 0 && + !virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2, + VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE)) goto cleanup; - } } if (addDefaultMemballoon && !def->memballoon) { -- 2.5.5

On 04/21/2016 02:48 PM, Laine Stump wrote:
Previously there was no way to have a Q35 domain that didn't have these two controllers. This patch skips their creation as long as there are some other kinds of pci controllers at index 1 and 2 (e.g. some pcie-root-port controllers).
I'm hoping that soon we won't add them at all, plugging all devices into auto-added pcie-*-port ports instead, but in the meantime this makes it easier to experiment with alternative bus hierarchies. --- src/qemu/qemu_domain.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 86b7d13..0b342e2 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1581,14 +1581,18 @@ qemuDomainDefAddDefaultDevices(virDomainDefPtr def, VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT)) { goto cleanup; } - if (virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1, - VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE) < 0 || - virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2, - VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE) < 0) { + /* add a dmi-to-pci-bridge and a pci-bridge if there are no pci controllers + * other than the pcie-root. This is so that there will be hot-pluggable + * PCI slots available + */ + if (virDomainControllerFind(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1) < 0 && + !virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1, + VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE)) + goto cleanup; + if (virDomainControllerFind(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2) < 0 && + !virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2, + VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE)) goto cleanup; - } }
if (addDefaultMemballoon && !def->memballoon) {
Sounds like another qemuxml2xml test case candidate - Cole

On 04/23/2016 12:46 PM, Cole Robinson wrote:
On 04/21/2016 02:48 PM, Laine Stump wrote:
Previously there was no way to have a Q35 domain that didn't have these two controllers. This patch skips their creation as long as there are some other kinds of pci controllers at index 1 and 2 (e.g. some pcie-root-port controllers).
I'm hoping that soon we won't add them at all, plugging all devices into auto-added pcie-*-port ports instead, but in the meantime this makes it easier to experiment with alternative bus hierarchies. --- src/qemu/qemu_domain.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 86b7d13..0b342e2 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1581,14 +1581,18 @@ qemuDomainDefAddDefaultDevices(virDomainDefPtr def, VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT)) { goto cleanup; } - if (virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1, - VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE) < 0 || - virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2, - VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE) < 0) { + /* add a dmi-to-pci-bridge and a pci-bridge if there are no pci controllers + * other than the pcie-root. This is so that there will be hot-pluggable + * PCI slots available + */ + if (virDomainControllerFind(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1) < 0 && + !virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1, + VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE)) + goto cleanup; + if (virDomainControllerFind(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2) < 0 && + !virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2, + VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE)) goto cleanup; - } }
if (addDefaultMemballoon && !def->memballoon) {
Sounds like another qemuxml2xml test case candidate
... and it turns out this patch doesn't go far enough to have any useful effect. The problem is that the code that automatically assigns PCI addresses does its best to always be sure that there is at least one empty hot-pluggable "standard PCI" slot on the system, so even if we don't explicitly add any pci-bridge to a Q35 domain, the address-assignment code will automatically add one anyway; and since the pci-bridge requires a standard PCI slot itself, we still need a dmi-to-pci-bridge anyway (which provides 32 non-hotpluggable standard PCI slots). I had been aiming to re-do how libvirt sets up the PCI controllers for Q35, eventually eliminating the current dmi-to-pci-bridge + pci-bridge in favor of a pure PCIe setup, using pcie-root-ports and pcie-switch-(up|down)stream-ports, and this was going to be the first step towards that. But after discussing that idea more with Alex Williamson on Friday, I think we *shouldn't* do it, but just leave the PCI controller setup essentially as it is (one possible change - allow connecting pci-bridge directly into a pcie-root port, thus eliminating the dmi-to-pci-bridge) What prompted the idea to change to pure PCIe controllers was that qemu allows any PCI device to be plugged into a PCIe slot and it apparently functions just fine, and there had been occasional complaints that plugging everything into a pci-bridge was somehow problematic (at one point someone suggested that virtio-net didn't work correctly on ARM if plugged into a non-Express slot, but I think that was later disproved). The problem with PCIe-only controllers, as Alex pointed out, is that when a non-Express device is plugged into a PCIe slot, the guest OS will see an apparently PCIe device that has no PCIe capabilities, and while so far this hasn't caused any problem, there is no guarantee that it won't - PCIe devices are supposed to have PCIe capabilities. Since the only emulated qemu device that does this is the NEC UHCI USB3 controller, it sees that we'll need to keep setting up the pci-bridge and plugging all the rest of the devices in there. Still, it would be nice to allow *those who really want to* to have a "pure PCIe" controller setup. We could do that if we did two things: 1) modify the PCI address auto-assignment code to not insist on always having at least one hot-pluggable standard PCI slot available. and one of the following: 2a) don't always add a dmi-to-pci-bridge, but instead only do so *if necessary* in order to plug in a pci-bridge (which would only be added if necessary). or 2b) permit auto-assigning a pci-bridge to plug into a port of pcie-root, thus eliminating the need for dmi-to-pci-bridge completely. Does anyone have opinions about (1) or (2b)? (NB: in the long run it might be nice if qemu would automatically advertise PCIe capabilities for any of its emulated devices that was plugged into a PCIe slot - this would make it possible to get rid of all non-Express controllers without violating any of the PCI specs. However, according to Alex this would have 0 practical effect (no better performance / actual capabilities of the devices / use of PCI address space), and it would also require that qemu gave us some way to understand it was going to do this (i.e. introspection) so I don't know how useful it would be to go to all that trouble.)

On 04/25/2016 05:28 PM, Laine Stump wrote:
On 04/23/2016 12:46 PM, Cole Robinson wrote:
On 04/21/2016 02:48 PM, Laine Stump wrote:
Previously there was no way to have a Q35 domain that didn't have these two controllers. This patch skips their creation as long as there are some other kinds of pci controllers at index 1 and 2 (e.g. some pcie-root-port controllers).
I'm hoping that soon we won't add them at all, plugging all devices into auto-added pcie-*-port ports instead, but in the meantime this makes it easier to experiment with alternative bus hierarchies. --- src/qemu/qemu_domain.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 86b7d13..0b342e2 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1581,14 +1581,18 @@ qemuDomainDefAddDefaultDevices(virDomainDefPtr def, VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT)) { goto cleanup; } - if (virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1, - VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE) < 0 || - virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2, - VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE) < 0) { + /* add a dmi-to-pci-bridge and a pci-bridge if there are no pci controllers + * other than the pcie-root. This is so that there will be hot-pluggable + * PCI slots available + */ + if (virDomainControllerFind(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1) < 0 && + !virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1, + VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE)) + goto cleanup; + if (virDomainControllerFind(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2) < 0 && + !virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2, + VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE)) goto cleanup; - } } if (addDefaultMemballoon && !def->memballoon) {
Sounds like another qemuxml2xml test case candidate
... and it turns out this patch doesn't go far enough to have any useful effect. The problem is that the code that automatically assigns PCI addresses does its best to always be sure that there is at least one empty hot-pluggable "standard PCI" slot on the system, so even if we don't explicitly add any pci-bridge to a Q35 domain, the address-assignment code will automatically add one anyway; and since the pci-bridge requires a standard PCI slot itself, we still need a dmi-to-pci-bridge anyway (which provides 32 non-hotpluggable standard PCI slots).
I had been aiming to re-do how libvirt sets up the PCI controllers for Q35, eventually eliminating the current dmi-to-pci-bridge + pci-bridge in favor of a pure PCIe setup, using pcie-root-ports and pcie-switch-(up|down)stream-ports, and this was going to be the first step towards that. But after discussing that idea more with Alex Williamson on Friday, I think we *shouldn't* do it, but just leave the PCI controller setup essentially as it is (one possible change - allow connecting pci-bridge directly into a pcie-root port, thus eliminating the dmi-to-pci-bridge)
Connecting the pci-bridge directly into a pcie-root port is a little odd, but I suppose it would work. Since I am late to the party, can you please explain why do you want to eliminate the dmi-to-pci controller ? Because it doesn't support hotplug? I have some bad news in that direction. Even the pci-bridge (devices behind it) does not support hotplug on Q35, it is on my todo list.
What prompted the idea to change to pure PCIe controllers was that qemu allows any PCI device to be plugged into a PCIe slot and it apparently functions just fine, and there had been occasional complaints that plugging everything into a pci-bridge was somehow problematic (at one point someone suggested that virtio-net didn't work correctly on ARM if plugged into a non-Express slot, but I think that was later disproved).
The problem with PCIe-only controllers, as Alex pointed out, is that when a non-Express device is plugged into a PCIe slot, the guest OS will see an apparently PCIe device that has no PCIe capabilities, and while so far this hasn't caused any problem, there is no guarantee that it won't - PCIe devices are supposed to have PCIe capabilities. Since the only emulated qemu device that does this is the NEC UHCI USB3 controller, it sees that we'll need to keep setting up the pci-bridge and plugging all the rest of the devices in there.
Actually all virtio devices are now PCI Express devices if virtio-1 is enabled. The QEMU command-line is -device virtio-<dev>-pci,disable-modern=false. The only requirement is that the virtio device would not be connected directly to pcie.0 root bus, but to a pcie root port or switch.
Still, it would be nice to allow *those who really want to* to have a "pure PCIe" controller setup. We could do that if we did two things:
Sure, pure PCIe setup would be nice.
1) modify the PCI address auto-assignment code to not insist on always having at least one hot-pluggable standard PCI slot available.
and one of the following:
2a) don't always add a dmi-to-pci-bridge, but instead only do so *if necessary* in order to plug in a pci-bridge (which would only be added if necessary).
or
2b) permit auto-assigning a pci-bridge to plug into a port of pcie-root, thus eliminating the need for dmi-to-pci-bridge completely.
Does anyone have opinions about (1) or (2b)?
I personally don't like 2b because would not work in 'real' world, so 1 and 2a seems fine to me. Bottom line, maybe we can tackle the dmi-to-pci bridge to cover your requirements. Thanks, Marcel
(NB: in the long run it might be nice if qemu would automatically advertise PCIe capabilities for any of its emulated devices that was plugged into a PCIe slot - this would make it possible to get rid of all non-Express controllers without violating any of the PCI specs. However, according to Alex this would have 0 practical effect (no better performance / actual capabilities of the devices / use of PCI address space), and it would also require that qemu gave us some way to understand it was going to do this (i.e. introspection) so I don't know how useful it would be to go to all that trouble.)

On 04/25/2016 10:53 AM, Marcel Apfelbaum wrote:
On 04/25/2016 05:28 PM, Laine Stump wrote:
On 04/23/2016 12:46 PM, Cole Robinson wrote:
On 04/21/2016 02:48 PM, Laine Stump wrote:
Previously there was no way to have a Q35 domain that didn't have these two controllers. This patch skips their creation as long as there are some other kinds of pci controllers at index 1 and 2 (e.g. some pcie-root-port controllers).
I'm hoping that soon we won't add them at all, plugging all devices into auto-added pcie-*-port ports instead, but in the meantime this makes it easier to experiment with alternative bus hierarchies. --- src/qemu/qemu_domain.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 86b7d13..0b342e2 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1581,14 +1581,18 @@ qemuDomainDefAddDefaultDevices(virDomainDefPtr def, VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT)) { goto cleanup; } - if (virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1, - VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE) < 0 || - virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2, - VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE) < 0) { + /* add a dmi-to-pci-bridge and a pci-bridge if there are no pci controllers + * other than the pcie-root. This is so that there will be hot-pluggable + * PCI slots available + */ + if (virDomainControllerFind(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1) < 0 && + !virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1, + VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE)) + goto cleanup; + if (virDomainControllerFind(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2) < 0 && + !virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2, + VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE)) goto cleanup; - } } if (addDefaultMemballoon && !def->memballoon) {
Sounds like another qemuxml2xml test case candidate
... and it turns out this patch doesn't go far enough to have any useful effect. The problem is that the code that automatically assigns PCI addresses does its best to always be sure that there is at least one empty hot-pluggable "standard PCI" slot on the system, so even if we don't explicitly add any pci-bridge to a Q35 domain, the address-assignment code will automatically add one anyway; and since the pci-bridge requires a standard PCI slot itself, we still need a dmi-to-pci-bridge anyway (which provides 32 non-hotpluggable standard PCI slots).
I had been aiming to re-do how libvirt sets up the PCI controllers for Q35, eventually eliminating the current dmi-to-pci-bridge + pci-bridge in favor of a pure PCIe setup, using pcie-root-ports and pcie-switch-(up|down)stream-ports, and this was going to be the first step towards that. But after discussing that idea more with Alex Williamson on Friday, I think we *shouldn't* do it, but just leave the PCI controller setup essentially as it is (one possible change - allow connecting pci-bridge directly into a pcie-root port, thus eliminating the dmi-to-pci-bridge)
Connecting the pci-bridge directly into a pcie-root port is a little odd, but I suppose it would work. Since I am late to the party, can you please explain why do you want to eliminate the dmi-to-pci controller ? Because it doesn't support hotplug?
I guess just because it's an extra device that we use only for the purpose of connecting the pci-bridge. But definitely the entire reason for putting in a pci-bridge is to have non-Express slots that support hotplug.
I have some bad news in that direction. Even the pci-bridge (devices behind it) does not support hotplug on Q35, it is on my todo list.
I now remember somebody telling me that within the last several weeks (possibly you?) So if hotplug doesn't work for pci-bridge slots on Q35, then we currently wouldn't be losing anything if we just used the dmi-to-pci-bridge slots directly (in either case, hotplug won't work). Alex had suggested maybe the dmi-to-pci-bridge could be enhanced to support hotplug, or possibly a similar but generic controller could be added that supported hotplug. What is the difficulty of that vs. fixing hotplug support on pci-bridge? (getting both would be best, of course). Would it be better to make the slots of the current dmi-to-pci-bridge (i82801b11-bridge) hotpluggable? Or to create a new device?
What prompted the idea to change to pure PCIe controllers was that qemu allows any PCI device to be plugged into a PCIe slot and it apparently functions just fine, and there had been occasional complaints that plugging everything into a pci-bridge was somehow problematic (at one point someone suggested that virtio-net didn't work correctly on ARM if plugged into a non-Express slot, but I think that was later disproved).
The problem with PCIe-only controllers, as Alex pointed out, is that when a non-Express device is plugged into a PCIe slot, the guest OS will see an apparently PCIe device that has no PCIe capabilities, and while so far this hasn't caused any problem, there is no guarantee that it won't - PCIe devices are supposed to have PCIe capabilities. Since the only emulated qemu device that does this is the NEC UHCI USB3 controller, it sees that we'll need to keep setting up the pci-bridge and plugging all the rest of the devices in there.
Actually all virtio devices are now PCI Express devices if virtio-1 is enabled. The QEMU command-line is -device virtio-<dev>-pci,disable-modern=false.
So that defaults to true? How does this relate to the experimental x-disable-pcie setting? Are they the same thing? What happens if you set this and then plug it into a non-Express slot? Does setting this flag change anything else for the device that would, e.g., require a different guest driver or other changes to the qemu commandline? It sounds like I can use this - just check for disable-modern on each virtio device when getting qemu capabilities, then prefer a PCIe slot for any unaddressed device that has it.
The only requirement is that the virtio device would not be connected directly to pcie.0 root bus, but to a pcie root port or switch.
So they have to be connected to a *-port even if you don't care about hotplug? The entire deal with the ports on the root complex are confusing to me - Alex had said the other day that it *does* happen that non-Express devices end up connected directly to the root complex in real hardware (although probably this violates the spec), so it may be reasonable to expect guest OSes to deal with that (that, and dgilbert's success doing it, are what gave me the courage to suggest plugging pci-bridge directly into a root complex port).
Still, it would be nice to allow *those who really want to* to have a "pure PCIe" controller setup. We could do that if we did two things:
Sure, pure PCIe setup would be nice.
1) modify the PCI address auto-assignment code to not insist on always having at least one hot-pluggable standard PCI slot available.
and one of the following:
2a) don't always add a dmi-to-pci-bridge, but instead only do so *if necessary* in order to plug in a pci-bridge (which would only be added if necessary).
or
2b) permit auto-assigning a pci-bridge to plug into a port of pcie-root, thus eliminating the need for dmi-to-pci-bridge completely.
Does anyone have opinions about (1) or (2b)?
I personally don't like 2b because would not work in 'real' world, so 1 and 2a seems fine to me.
Bottom line, maybe we can tackle the dmi-to-pci bridge to cover your requirements.
I guess you mean "make the dmi-to-pci-bridge slots hotpluggable" by this, right?

On 04/25/2016 07:09 PM, Laine Stump wrote:
On 04/25/2016 10:53 AM, Marcel Apfelbaum wrote:
On 04/25/2016 05:28 PM, Laine Stump wrote:
On 04/23/2016 12:46 PM, Cole Robinson wrote:
On 04/21/2016 02:48 PM, Laine Stump wrote:
Previously there was no way to have a Q35 domain that didn't have these two controllers. This patch skips their creation as long as there are some other kinds of pci controllers at index 1 and 2 (e.g. some pcie-root-port controllers).
I'm hoping that soon we won't add them at all, plugging all devices into auto-added pcie-*-port ports instead, but in the meantime this makes it easier to experiment with alternative bus hierarchies. --- src/qemu/qemu_domain.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 86b7d13..0b342e2 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1581,14 +1581,18 @@ qemuDomainDefAddDefaultDevices(virDomainDefPtr def, VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT)) { goto cleanup; } - if (virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1, - VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE) < 0 || - virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2, - VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE) < 0) { + /* add a dmi-to-pci-bridge and a pci-bridge if there are no pci controllers + * other than the pcie-root. This is so that there will be hot-pluggable + * PCI slots available + */ + if (virDomainControllerFind(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1) < 0 && + !virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1, + VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE)) + goto cleanup; + if (virDomainControllerFind(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2) < 0 && + !virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2, + VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE)) goto cleanup; - } } if (addDefaultMemballoon && !def->memballoon) {
Sounds like another qemuxml2xml test case candidate
... and it turns out this patch doesn't go far enough to have any useful effect. The problem is that the code that automatically assigns PCI addresses does its best to always be sure that there is at least one empty hot-pluggable "standard PCI" slot on the system, so even if we don't explicitly add any pci-bridge to a Q35 domain, the address-assignment code will automatically add one anyway; and since the pci-bridge requires a standard PCI slot itself, we still need a dmi-to-pci-bridge anyway (which provides 32 non-hotpluggable standard PCI slots).
I had been aiming to re-do how libvirt sets up the PCI controllers for Q35, eventually eliminating the current dmi-to-pci-bridge + pci-bridge in favor of a pure PCIe setup, using pcie-root-ports and pcie-switch-(up|down)stream-ports, and this was going to be the first step towards that. But after discussing that idea more with Alex Williamson on Friday, I think we *shouldn't* do it, but just leave the PCI controller setup essentially as it is (one possible change - allow connecting pci-bridge directly into a pcie-root port, thus eliminating the dmi-to-pci-bridge)
Connecting the pci-bridge directly into a pcie-root port is a little odd, but I suppose it would work. Since I am late to the party, can you please explain why do you want to eliminate the dmi-to-pci controller ? Because it doesn't support hotplug?
I guess just because it's an extra device that we use only for the purpose of connecting the pci-bridge. But definitely the entire reason for putting in a pci-bridge is to have non-Express slots that support hotplug.
Understood.
I have some bad news in that direction. Even the pci-bridge (devices behind it) does not support hotplug on Q35, it is on my todo list.
I now remember somebody telling me that within the last several weeks (possibly you?)
Possibly me. I remember talking about it.
So if hotplug doesn't work for pci-bridge slots on Q35, then we currently wouldn't be losing anything if we just used the dmi-to-pci-bridge slots directly (in either case, hotplug won't work).
Correct.
Alex had suggested maybe the dmi-to-pci-bridge could be enhanced to support hotplug, or possibly a similar but generic controller could be added that supported hotplug. What is the difficulty of that vs. fixing hotplug support on pci-bridge? (getting both would be best, of course). Would it be better to make the slots of the current dmi-to-pci-bridge (i82801b11-bridge) hotpluggable? Or to create a new device?
I'll speak with Michael about it (I cc-ed him to this mail thread), but I think dmi-to-pci-bridge supporting hot-plug would be the better way in my opinion. But only in case the real i82801b11-bridge supports hotplug. Otherwise a generic dmi-to-pci bridge would be preferable. Regarding the amount of work, I can't say before digging a little deeper.
What prompted the idea to change to pure PCIe controllers was that qemu allows any PCI device to be plugged into a PCIe slot and it apparently functions just fine, and there had been occasional complaints that plugging everything into a pci-bridge was somehow problematic (at one point someone suggested that virtio-net didn't work correctly on ARM if plugged into a non-Express slot, but I think that was later disproved).
The problem with PCIe-only controllers, as Alex pointed out, is that when a non-Express device is plugged into a PCIe slot, the guest OS will see an apparently PCIe device that has no PCIe capabilities, and while so far this hasn't caused any problem, there is no guarantee that it won't - PCIe devices are supposed to have PCIe capabilities. Since the only emulated qemu device that does this is the NEC UHCI USB3 controller, it sees that we'll need to keep setting up the pci-bridge and plugging all the rest of the devices in there.
Actually all virtio devices are now PCI Express devices if virtio-1 is enabled. The QEMU command-line is -device virtio-<dev>-pci,disable-modern=false.
So that defaults to true?
Yes. How does this relate to the experimental x-disable-pcie setting? PCIe is a prerequisite to virtio-1 (modern) Are they the same thing? No. A device can be PCIe but not supporting virtio-1. What happens if you set this and then plug it into a non-Express slot? It will remain a PCI device (no express capability) Does setting
this flag change anything else for the device that would, e.g., require a different guest driver or other changes to the qemu commandline?
Not really
It sounds like I can use this - just check for disable-modern on each virtio device when getting qemu capabilities, then prefer a PCIe slot for any unaddressed device that has it.
This is definitely a good idea. Just be sure to connect it to a root port or to a switch downstream port and not to pcie.0 bus (not an integrated end point)
The only requirement is that the virtio device would not be connected directly to pcie.0 root bus, but to a pcie root port or switch.
So they have to be connected to a *-port even if you don't care about hotplug?
Well, let's say that is preferable to connect them to a root/downstream port if they have disable-modern=false. (you don't have to) ======================================= By the way, even if is not directly related, another property you should look for is "disable-legacy". If is "on" for all devices behind a switch or for the device behind a root port, no IO space would be reserved for them. This is important because otherwise you can have max ~ 15 switches/root ports in the system. (The same as for pci-bridges in a PC machine) The problem here is that the flag does affect the guest, if the guest drivers support only legacy devices you have a problem. By the way, how can you know in advance what kind of drivers the guest has? And again, this is another discussion for another time, but I wanted you to know at least the existence of this flag. ===================================== The entire deal with the ports on the root complex are confusing to me - Alex had said the other day that it *does*
happen that non-Express devices end up connected directly to the root complex in real hardware (although probably this violates the spec), so it may be reasonable to expect guest OSes to deal with that (that, and dgilbert's success doing it, are what gave me the courage to suggest plugging pci-bridge directly into a root complex port).
OK, so the PCIe spec does treat Integrated Root Points (devices on bus pcie.0) differently by allowing the devices that are "inside/part of" the Root Complex to be legacy devices (PCI). That means that attaching the PCI bridge directly to the root complex it should be a valid (but odd) configuration as opposed to attaching the PCI bridge to a root port/downstream-port which is clearly a spec violation.
Still, it would be nice to allow *those who really want to* to have a "pure PCIe" controller setup. We could do that if we did two things:
Sure, pure PCIe setup would be nice.
1) modify the PCI address auto-assignment code to not insist on always having at least one hot-pluggable standard PCI slot available.
and one of the following:
2a) don't always add a dmi-to-pci-bridge, but instead only do so *if necessary* in order to plug in a pci-bridge (which would only be added if necessary).
or
2b) permit auto-assigning a pci-bridge to plug into a port of pcie-root, thus eliminating the need for dmi-to-pci-bridge completely.
Does anyone have opinions about (1) or (2b)?
I personally don't like 2b because would not work in 'real' world, so 1 and 2a seems fine to me.
Bottom line, maybe we can tackle the dmi-to-pci bridge to cover your requirements.
I guess you mean "make the dmi-to-pci-bridge slots hotpluggable" by this, right?
Yes, it seems that this is the direction, I am really hoping that the real hardware spec allows this. Otherwise we need a generic dmi-to-pci bridge, or we could go the Frankenstein way again. :) Thanks, Marcel
participants (3)
-
Cole Robinson
-
Laine Stump
-
Marcel Apfelbaum