[libvirt] [PATCH 00/13] PCIe fixes + new PCI controllers w/RFC

The first 4 patches are bugfixes/reorganizations that have no controversy. The sets of 5-7, 8-10, and 11-13 each implement a new model of PCI controller: 5-7 - <controller type='pci' model='pcie-root-port'/> This is based on qemu's ioh3420. 8-10 - <controller type='pci' model='pcie-switch-upstream-port'/> Based on qemu's x3130-upstream 11-13 - <controller type='pci' model='pcie-switch-downstream-port'/> (xio3130-downstream) The first patch of each set adds a capability bit for qemu (again non-controversial), the 2nd adds the new pci controller model, and the 3rd implements that model in qemu (by checking for the capability and adding a commandline arg or failing). The "controversial"/RFC bit is this - talking to Alex Williamson (after I'd rwritten these patches, which is why I'm presenting them in a form that I *don't* want to push) about the possibility of qemu adding generic root-port, switch-upstream-port, and switch-downstream-port controllers, and possibly also a generic dmi-to-pci-bridge (i.e. controllers not tied to any particular hardware implementation as with those currently available), I'm realizing that, while it was a correct decision to make all of the existing pci controllers "type='pci'" (since they share an address space), using the "model" attribute to set the kind of controller was probably a mistake. The problem - if: <controller type='pci' model='dmi-to-pci-bridge'/> currently means to add an i82801b11-bridge controller to the domain, once qemu implements a generic dmi-to-pci-bridge, how will *that* be denoted, and how will we avoid replacing the existing i81801b11-bridge in a particular domain with the generic version when a guest is restarted following a qemu/libvirt upgrade? In hindsight, it probably would have been better to do something like this with the four existing pci controllers: <controller type='pci' subType='dmi-to-pci-bridge' model='i82801b11-bridge'/> <controller type='pci' subType='pci-bridge' model='pci-bridge'/> (or maybe blank?) <controller type='pci' subType='pci-root'/> (again maybe model is blank) <controller type='pci' subType='pcie-root'/>(and again) (instead, what is shown above as "subType" is currently placed in the "model" attribute). Then we could add the 3 new types like this: <controller type='pci' subType='pcie-root-port' model='ioh3420'/> <controller type='pci' subType='pcie-switch-upstream-port' model='x3130-upstream/> <controller type='pci' subType='pcie-switch-downstream-port' model='xio3130-downstream/> and we would easily be able to add support for new generic controllers that behaved identically, by just adding a new model. But we haven't done that, and anything we do in the future must be backwards compatible with what's already there (right?). I'm trying to think of how to satisfy backward compatibility while making things work better in the future. Some ideas, in no particular order: === Idea 1: multiplex the meaning of the "model" attribute, so we currently have: <controller type='pci' model='dmi-to-pci-bridge'/> which means "add an i82801b11-bridge device" and when we add a generic version of this type of controller, we would do it with something like: <controller type='pci' model='generic-dmi-to-pci-bridge'/> and for another vendor's mythical controller: <controller type='pci' model='xyz-dmi-to-pci-bridge'/> Cons: This will make for ugliness in switch statements where a new case will have to be added whenever a different controller with similar behavior/usage is supported. And it's generally not a good idea to have a single attribute be used for two different functions. === Idea 2: implement new controllers as suggested in "20/20 hindsight" above. For controllers in existing domains (dmi-to-pci-bridge, pic-bridge, pci-root, and pcie-root) imply it into the controller definition of an existing domain when only model has been given (but don't write it out that way, to preserve the ability to downgrade). So this: [1] <controller type='pci' model='dmi-to-pci-bridge'/> would internally mean this: [2] <controller type='pci' subType='dmi-to-pci-bridge' model='i82801b11-bridge'/> (but would remain as [1] when config is rewritten/migrated) while this: [3] <controller type='pci' subType='dmi-to-pci-bridge' model='anything whatsoever/> would mean exactly what it says. Cons: Keeping this straight would mean having some sort of "oldStyleCompat" flag in the controller object, to be sure that [1] wasn't sent in migration status as [2] (since the destination might not recognize it). It would also mean keeping the code in the parser and formatter to deal with this flag. Forever. === Idea 3: interpret controllers with missing subType as above, but actually write it out to the config/migration/etc in the new modified format. Cons: This would prevent downgrading libvirt or migrating from a host with newer libvirt to one with older libvirt. (Although preserving compatibility at some level when downgrading may be a stated requirement of some downstream distros' builds of libvirt, I think for upstream it is only a "best effort"; I'm just not certain how much "best" is considered to be :-) === Idea 4: Unlike other uses of "model" in libvirt, for pci controllers, continue to use "model" to denote the subtype/class/whatever of controller, and create a new attribute to denote the different specific implementations of each one. So for example: [4] <controller type='pci' model='dmi-to-pci-bridge'/> would become: [5] <controller type='pci' model='dmi-to-pci-bridge' implementation='i82801b11-bridge'/> (or some other name in place of "implementation" - ideas? I'm horrible at thinkin up names) Pros: wouldn't create compatibility problems when downgrading or migrating cross version. Cons: Is inconsistent with every other use of "model" attribute in libvirt, and each new addition of a PCI controller further propagates this misuse. ==== I currently like either option 2 or 3 (depending on how good we want to be about supporting downgrade/intra-version migration), but as is obvious by the fact that it was me that suggested putting the type of pci controller into "model", I am very good at making the wrong decision on matters like this. Whatever everyone thinks is best, patches 5-13 of this series would be changed accordingly, and possibly a couple new patches would be made to adjust the 4 existing controller types. Note that this will also effect the libvirt support for the upcoming qemu "pxb" controller, which is a PCI root bus that can be placed in its own numa node (my description may be incorrect, but I think it gets the idea across). Anyway, with idea 2 or 3 it would be: <controller type='pci' subType='pci-root' model='pxb'/> (along with some options to setup numa info). So, along with any comments on the individual patches (including whether the specific args added to the qemu commandline are correct - I'm looking at you, qemu people!), I would appreciate opinions on how I can save us from this misuse of "model" that I've created. Laine Stump (13): qemu: refactor qemuBuildControllerDevStr to eliminate future duplicate code qemu: always permit PCI devices to be manually assigned to a PCIe bus qemu: ignore assumptions about hotplug requirement when address is from config docs: document when pcie-root/dmi-to-pci-bridge support was added qemu: add capabilities bit for device ioh3420 conf: new pci controller model "pcie-root-port" qemu: support new pci controller model "pcie-root-port" qemu: add capabilities bit for device x3130-upstream conf: new pci controller model "pcie-switch-upstream-port" qemu: support new pci controller model "pcie-switch-upstream-port" qemu: add capabilities bit for device xio3130-downstream conf: new pcie-controller model "pcie-switch-downstream-port" qemu: support new pci controller model "pcie-switch-downstream-port" docs/formatdomain.html.in | 48 +++++++++--- docs/schemas/domaincommon.rng | 3 + src/conf/domain_addr.c | 47 ++++++++++-- src/conf/domain_addr.h | 23 ++++-- src/conf/domain_conf.c | 5 +- src/conf/domain_conf.h | 3 + src/qemu/qemu_capabilities.c | 8 +- src/qemu/qemu_capabilities.h | 5 +- src/qemu/qemu_command.c | 86 +++++++++++++++++----- tests/qemucapabilitiesdata/caps_1.2.2-1.caps | 3 + tests/qemucapabilitiesdata/caps_1.3.1-1.caps | 3 + tests/qemucapabilitiesdata/caps_1.4.2-1.caps | 3 + tests/qemucapabilitiesdata/caps_1.5.3-1.caps | 3 + tests/qemucapabilitiesdata/caps_1.6.0-1.caps | 3 + tests/qemucapabilitiesdata/caps_1.6.50-1.caps | 3 + tests/qemucapabilitiesdata/caps_2.1.1-1.caps | 3 + tests/qemuhelptest.c | 10 ++- .../qemuxml2argv-pcie-root-port.args | 10 +++ .../qemuxml2argv-pcie-root-port.xml | 33 +++++++++ .../qemuxml2argv-pcie-switch-downstream-port.args | 13 ++++ .../qemuxml2argv-pcie-switch-downstream-port.xml | 36 +++++++++ .../qemuxml2argv-pcie-switch-upstream-port.args | 9 +++ .../qemuxml2argv-pcie-switch-upstream-port.xml | 32 ++++++++ tests/qemuxml2argvtest.c | 25 +++++++ tests/qemuxml2xmltest.c | 3 + 25 files changed, 375 insertions(+), 45 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.xml -- 2.1.0

The PCI case of the switch statement in this function contains another switch statement with a case for each model. Currently every model except pci-root and pcie-root have a check for index > 0 (since only those two can have index==0), and the function should never be called for those two anyway. If we move the check for !pci[e]-root to the top of the pci case, then we can move the check for index > 0 out of the individual model cases. This will save repeating that check for the three new controller models about to be added. --- src/qemu/qemu_command.c | 27 ++++++++++++--------------- 1 file changed, 12 insertions(+), 15 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 3886b4f..5e5cfe6 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -4581,13 +4581,20 @@ qemuBuildControllerDevStr(virDomainDefPtr domainDef, break; case VIR_DOMAIN_CONTROLLER_TYPE_PCI: + if (def->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT || + def->model == VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("wrong function called for pci-root/pcie-root")); + return NULL; + } + if (def->idx == 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("index for pci controllers of model '%s' must be > 0"), + virDomainControllerModelPCITypeToString(def->model)); + goto error; + } switch (def->model) { case VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE: - if (def->idx == 0) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("PCI bridge index should be > 0")); - goto error; - } virBufferAsprintf(&buf, "pci-bridge,chassis_nr=%d,id=%s", def->idx, def->info.alias); break; @@ -4598,18 +4605,8 @@ qemuBuildControllerDevStr(virDomainDefPtr domainDef, "controller is not supported in this QEMU binary")); goto error; } - if (def->idx == 0) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("dmi-to-pci-bridge index should be > 0")); - goto error; - } virBufferAsprintf(&buf, "i82801b11-bridge,id=%s", def->info.alias); break; - case VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT: - case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT: - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("wrong function called for pci-root/pcie-root")); - return NULL; } break; -- 2.1.0

On Mon, Jun 22, 2015 at 02:44:06PM -0400, Laine Stump wrote:
The PCI case of the switch statement in this function contains another switch statement with a case for each model. Currently every model except pci-root and pcie-root have a check for index > 0 (since only
every model has
those two can have index==0), and the function should never be called for those two anyway. If we move the check for !pci[e]-root to the top of the pci case, then we can move the check for index > 0 out of the individual model cases. This will save repeating that check for the three new controller models about to be added. --- src/qemu/qemu_command.c | 27 ++++++++++++--------------- 1 file changed, 12 insertions(+), 15 deletions(-)
ACK Jan

When support for the pcie-root and dmi-to-pci-bridge buses on a Q35 machinetype was added, I was concerned that even though qemu at the time allowed plugging a PCI device into a PCIe port, that it might not be supported in the future. To prevent painful backtracking in the possible future where this happened, I disallowed such connections except in a few specific cases requested by qemu developers (indicated in the code with the flag VIR_PCI_CONNEcT_TYPE_EITHER_IF_CONFIG). Now that a couple years have passed, there is a clear message from qemu that there is no danger in allowing PCI devices to be plugged into PCIe ports. This patch eliminates VIR_PCI_CONNECT_TYPE_EITHER_IF_CONFIG and changes the code to always allow PCI->PCIe or PCIe->PCI connection *when the PCI address is specified in the config. (For newly added devices that haven't yet been given a PCI address, the auto-placement still prefers using the correct type of bus). --- docs/formatdomain.html.in | 18 +++++++++++------- src/conf/domain_addr.c | 18 +++++++++++------- src/conf/domain_addr.h | 16 ++++++++++------ src/qemu/qemu_command.c | 6 ++---- 4 files changed, 34 insertions(+), 24 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 4e85b51..1dca2ac 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3034,17 +3034,21 @@ bus (for example, the machine types based on the Q35 chipset), the pcie-root controller with index=0 is auto-added to the domain's configuration. pcie-root has also no address, provides - 31 slots (numbered 1-31) and can only be used to attach PCIe - devices. In order to connect standard PCI devices on a system - which has a pcie-root controller, a pci controller + 31 slots (numbered 1-31) that can be used to attach PCIe or PCI + devices (although libvirt will never auto-assign a PCI device to + a PCIe slot, it will allow manual specification of such an + assignment). Devices connected to pcie-root cannot be + hotplugged. In order to make standard PCI slots available on a + system which has a pcie-root controller, a pci controller with <code>model='dmi-to-pci-bridge'</code> is automatically - added. A dmi-to-pci-bridge controller plugs into a PCIe slot (as - provided by pcie-root), and itself provides 31 standard PCI - slots (which are not hot-pluggable). In order to have + added, usually at the defacto standard location of slot=0x1e. A + dmi-to-pci-bridge controller plugs into a PCIe slot (as provided + by pcie-root), and itself provides 31 standard PCI slots (which + also do not support device hotplug). In order to have hot-pluggable PCI slots in the guest system, a pci-bridge controller will also be automatically created and connected to one of the slots of the auto-created dmi-to-pci-bridge - controller; all guest devices with PCI addresses that are + controller; all guest PCI devices with addresses that are auto-determined by libvirt will be placed on this pci-bridge device. (<span class="since">since 1.1.2</span>). </p> diff --git a/src/conf/domain_addr.c b/src/conf/domain_addr.c index d657c7f..93c6043 100644 --- a/src/conf/domain_addr.c +++ b/src/conf/domain_addr.c @@ -1,7 +1,7 @@ /* * domain_addr.c: helper APIs for managing domain device addresses * - * Copyright (C) 2006-2014 Red Hat, Inc. + * Copyright (C) 2006-2015 Red Hat, Inc. * Copyright (C) 2006 Daniel P. Berrange * * This library is free software; you can redistribute it and/or @@ -42,16 +42,21 @@ virDomainPCIAddressFlagsCompatible(virDevicePCIAddressPtr addr, { virErrorNumber errType = (fromConfig ? VIR_ERR_XML_ERROR : VIR_ERR_INTERNAL_ERROR); - virDomainPCIConnectFlags flagsMatchMask = VIR_PCI_CONNECT_TYPES_MASK; - if (fromConfig) - flagsMatchMask |= VIR_PCI_CONNECT_TYPE_EITHER_IF_CONFIG; + if (fromConfig) { + /* If the requested connection was manually specified in + * config, allow a PCI device to connect to a PCIe slot, or + * vice versa. + */ + if (busFlags & VIR_PCI_CONNECT_TYPES_ENDPOINT) + busFlags |= VIR_PCI_CONNECT_TYPES_ENDPOINT; + } /* If this bus doesn't allow the type of connection (PCI * vs. PCIe) required by the device, or if the device requires * hot-plug and this bus doesn't have it, return false. */ - if (!(devFlags & busFlags & flagsMatchMask)) { + if (!(devFlags & busFlags & VIR_PCI_CONNECT_TYPES_MASK)) { if (reportError) { if (devFlags & VIR_PCI_CONNECT_TYPE_PCI) { virReportError(errType, @@ -174,8 +179,7 @@ virDomainPCIAddressBusSetModel(virDomainPCIAddressBusPtr bus, * specified in user config *and* the particular device being * attached also allows it */ - bus->flags = (VIR_PCI_CONNECT_TYPE_PCIE | - VIR_PCI_CONNECT_TYPE_EITHER_IF_CONFIG); + bus->flags = VIR_PCI_CONNECT_TYPE_PCIE; bus->minSlot = 1; bus->maxSlot = VIR_PCI_ADDRESS_SLOT_LAST; break; diff --git a/src/conf/domain_addr.h b/src/conf/domain_addr.h index c18e130..dcfd2e5 100644 --- a/src/conf/domain_addr.h +++ b/src/conf/domain_addr.h @@ -1,7 +1,7 @@ /* * domain_addr.h: helper APIs for managing domain device addresses * - * Copyright (C) 2006-2014 Red Hat, Inc. + * Copyright (C) 2006-2015 Red Hat, Inc. * Copyright (C) 2006 Daniel P. Berrange * * This library is free software; you can redistribute it and/or @@ -39,10 +39,6 @@ typedef enum { /* PCI devices can connect to this bus */ VIR_PCI_CONNECT_TYPE_PCIE = 1 << 3, /* PCI Express devices can connect to this bus */ - VIR_PCI_CONNECT_TYPE_EITHER_IF_CONFIG = 1 << 4, - /* PCI *and* PCIe devices allowed, if the address - * was specified in the config by the user - */ } virDomainPCIConnectFlags; typedef struct { @@ -70,12 +66,20 @@ struct _virDomainPCIAddressSet { typedef struct _virDomainPCIAddressSet virDomainPCIAddressSet; typedef virDomainPCIAddressSet *virDomainPCIAddressSetPtr; -/* a combination of all bit that describe the type of connections +/* a combination of all bits that describe the type of connections * allowed, e.g. PCI, PCIe, switch */ # define VIR_PCI_CONNECT_TYPES_MASK \ (VIR_PCI_CONNECT_TYPE_PCI | VIR_PCI_CONNECT_TYPE_PCIE) +/* combination of all bits that could be used to connect a normal + * endpoint device (i.e. excluding the connection possible between an + * upstream and downstream switch port, or a PCIe root port and a PCIe + * port) + */ +# define VIR_PCI_CONNECT_TYPES_ENDPOINT \ + (VIR_PCI_CONNECT_TYPE_PCI | VIR_PCI_CONNECT_TYPE_PCIE) + char *virDomainPCIAddressAsString(virDevicePCIAddressPtr addr) ATTRIBUTE_NONNULL(1); diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 5e5cfe6..4952797 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1621,8 +1621,7 @@ qemuCollectPCIAddress(virDomainDefPtr def ATTRIBUTE_UNUSED, case VIR_DOMAIN_CONTROLLER_MODEL_USB_ICH9_UHCI2: case VIR_DOMAIN_CONTROLLER_MODEL_USB_ICH9_UHCI3: case VIR_DOMAIN_CONTROLLER_MODEL_USB_VT82C686B_UHCI: - flags = (VIR_PCI_CONNECT_TYPE_PCI | - VIR_PCI_CONNECT_TYPE_EITHER_IF_CONFIG); + flags = VIR_PCI_CONNECT_TYPE_PCI; break; case VIR_DOMAIN_CONTROLLER_MODEL_USB_NEC_XHCI: /* should this be PCIE-only? Or do we need to allow PCI @@ -1643,8 +1642,7 @@ qemuCollectPCIAddress(virDomainDefPtr def ATTRIBUTE_UNUSED, switch (device->data.sound->model) { case VIR_DOMAIN_SOUND_MODEL_ICH6: case VIR_DOMAIN_SOUND_MODEL_ICH9: - flags = (VIR_PCI_CONNECT_TYPE_PCI | - VIR_PCI_CONNECT_TYPE_EITHER_IF_CONFIG); + flags = VIR_PCI_CONNECT_TYPE_PCI; break; } break; -- 2.1.0

On Mon, Jun 22, 2015 at 02:44:07PM -0400, Laine Stump wrote:
When support for the pcie-root and dmi-to-pci-bridge buses on a Q35 machinetype was added, I was concerned that even though qemu at the time allowed plugging a PCI device into a PCIe port, that it might not be supported in the future. To prevent painful backtracking in the possible future where this happened, I disallowed such connections except in a few specific cases requested by qemu developers (indicated in the code with the flag VIR_PCI_CONNEcT_TYPE_EITHER_IF_CONFIG).
s/CONNEcT/CONNECT/g
Now that a couple years have passed, there is a clear message from qemu that there is no danger in allowing PCI devices to be plugged into PCIe ports. This patch eliminates VIR_PCI_CONNECT_TYPE_EITHER_IF_CONFIG and changes the code to always allow PCI->PCIe or PCIe->PCI connection *when the PCI address is specified in the config. (For newly added devices that haven't yet been given a PCI address, the auto-placement still prefers using the correct type of bus). --- docs/formatdomain.html.in | 18 +++++++++++------- src/conf/domain_addr.c | 18 +++++++++++------- src/conf/domain_addr.h | 16 ++++++++++------ src/qemu/qemu_command.c | 6 ++---- 4 files changed, 34 insertions(+), 24 deletions(-)
ACK Jan

Certain PCI buses don't support hotplug, and when automatically assigning PCI addresses for devices, libvirt is very concervative in its assumptions about whether or not a device will need to be hotplugged/unplugged in the future. But if the user manually assigns an address, they likely are aware of any hotplug requirements of the device (or at least they should be). In short, after this patch, automatically PCI address assignment will assume that the device must be pluggedin ot a hot-pluggable slot, but manually assignment can place the device in any bus that is compatible, regardless of whether or not it supports hotplug. If the user makes a mistake and plugs the device into a bus that doesn't support hotplug, then later tries to do a hot-unplug, qemu will give an appropriate error. (in the future we may want to add a "hotpluggable" attribute to all devices, with default being "yes" for autoassign, and "no" for manual assign). --- src/conf/domain_addr.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/src/conf/domain_addr.c b/src/conf/domain_addr.c index 93c6043..2be98c5 100644 --- a/src/conf/domain_addr.c +++ b/src/conf/domain_addr.c @@ -50,6 +50,12 @@ virDomainPCIAddressFlagsCompatible(virDevicePCIAddressPtr addr, */ if (busFlags & VIR_PCI_CONNECT_TYPES_ENDPOINT) busFlags |= VIR_PCI_CONNECT_TYPES_ENDPOINT; + /* Also allow manual specification of bus to override + * libvirt's assumptions about whether or not hotplug + * capability will be required. + */ + if (devFlags & VIR_PCI_CONNECT_HOTPLUGGABLE) + busFlags |= VIR_PCI_CONNECT_HOTPLUGGABLE; } /* If this bus doesn't allow the type of connection (PCI -- 2.1.0

On Mon, Jun 22, 2015 at 02:44:08PM -0400, Laine Stump wrote:
Certain PCI buses don't support hotplug, and when automatically assigning PCI addresses for devices, libvirt is very concervative in
conservative
its assumptions about whether or not a device will need to be hotplugged/unplugged in the future. But if the user manually assigns an address, they likely are aware of any hotplug requirements of the device (or at least they should be).
In short, after this patch, automatically PCI address assignment will assume that the device must be pluggedin ot a hot-pluggable slot, but
plugged in to
manually assignment can place the device in any bus that is compatible, regardless of whether or not it supports hotplug. If the user makes a mistake and plugs the device into a bus that doesn't support hotplug, then later tries to do a hot-unplug, qemu will give an appropriate error.
(in the future we may want to add a "hotpluggable" attribute to all devices, with default being "yes" for autoassign, and "no" for manual assign). --- src/conf/domain_addr.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/src/conf/domain_addr.c b/src/conf/domain_addr.c index 93c6043..2be98c5 100644 --- a/src/conf/domain_addr.c +++ b/src/conf/domain_addr.c @@ -50,6 +50,12 @@ virDomainPCIAddressFlagsCompatible(virDevicePCIAddressPtr addr, */ if (busFlags & VIR_PCI_CONNECT_TYPES_ENDPOINT) busFlags |= VIR_PCI_CONNECT_TYPES_ENDPOINT;
Here, it's the bus that allows plugging both PCI and PCIe devices in.
+ /* Also allow manual specification of bus to override + * libvirt's assumptions about whether or not hotplug + * capability will be required. + */ + if (devFlags & VIR_PCI_CONNECT_HOTPLUGGABLE) + busFlags |= VIR_PCI_CONNECT_HOTPLUGGABLE;
But the bus might not be hotpluggable here. We just don't care about being able to hotplug the device. How about: devFlags ^= VIR_PCI_CONNECT_HOTPLUGGABLE; ACK either way Jan

Also move the mention of version numbers for the various PCI controller models up to the end of the sentence where they are first given, to avoid confusion. --- docs/formatdomain.html.in | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 1dca2ac..fb3e2fb 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -2998,6 +2998,9 @@ PCI controllers have an optional <code>model</code> attribute with possible values <code>pci-root</code>, <code>pcie-root</code>, <code>pci-bridge</code>, or <code>dmi-to-pci-bridge</code>. + (pci-root and pci-bridge <span class="since">since 1.0.5</span>, + pcie-root and dmi-to-pci-bridge <span class="since">since + 1.1.2</span>) The root controllers (<code>pci-root</code> and <code>pcie-root</code>) have an optional <code>pcihole64</code> element specifying how big (in kilobytes, or in the unit specified by <code>pcihole64</code>'s @@ -3017,7 +3020,6 @@ only refer to PCI buses provided by already specified PCI controllers. Leaving gaps in the PCI controller indexes might lead to an invalid configuration. - (pci-root and pci-bridge <span class="since">since 1.0.5</span>) </p> <pre> ... -- 2.1.0

On Mon, Jun 22, 2015 at 02:44:09PM -0400, Laine Stump wrote:
Also move the mention of version numbers for the various PCI controller models up to the end of the sentence where they are first given, to avoid confusion. --- docs/formatdomain.html.in | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
ACK Jan

This is a PCIE "root port". It connects only to a port of the integrated pcie.0 bus of a Q35 machine (can't be hotplugged), and provides a single PCIe port that can have PCI or PCIe devices hotplugged into it. This device will be used to implement the "pcie-root-port" model of pci controller. --- src/qemu/qemu_capabilities.c | 4 +++- src/qemu/qemu_capabilities.h | 3 ++- tests/qemucapabilitiesdata/caps_1.2.2-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.3.1-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.4.2-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.5.3-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.6.0-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.6.50-1.caps | 1 + tests/qemucapabilitiesdata/caps_2.1.1-1.caps | 1 + tests/qemuhelptest.c | 6 ++++-- 10 files changed, 16 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 27a632a..a9a19e1 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -1,7 +1,7 @@ /* * qemu_capabilities.c: QEMU capabilities generation * - * Copyright (C) 2006-2014 Red Hat, Inc. + * Copyright (C) 2006-2015 Red Hat, Inc. * Copyright (C) 2006 Daniel P. Berrange * * This library is free software; you can redistribute it and/or @@ -287,6 +287,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "aarch64-off", "vhost-user-multiqueue", /* 190 */ + "ioh3420", ); @@ -1566,6 +1567,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = { { "ivshmem", QEMU_CAPS_DEVICE_IVSHMEM }, { "pc-dimm", QEMU_CAPS_DEVICE_PC_DIMM }, { "pci-serial", QEMU_CAPS_DEVICE_PCI_SERIAL }, + { "ioh3420", QEMU_CAPS_DEVICE_IOH3420 }, }; static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsVirtioBlk[] = { diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index 30aa504..d38505e 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -1,7 +1,7 @@ /* * qemu_capabilities.h: QEMU capabilities generation * - * Copyright (C) 2006-2014 Red Hat, Inc. + * Copyright (C) 2006-2015 Red Hat, Inc. * Copyright (C) 2006 Daniel P. Berrange * * This library is free software; you can redistribute it and/or @@ -230,6 +230,7 @@ typedef enum { QEMU_CAPS_DEVICE_PCI_SERIAL = 188, /* -device pci-serial */ QEMU_CAPS_CPU_AARCH64_OFF = 189, /* -cpu ...,aarch64=off */ QEMU_CAPS_VHOSTUSER_MULTIQUEUE = 190, /* vhost-user with -netdev queues= */ + QEMU_CAPS_DEVICE_IOH3420 = 191, /* -device ioh3420 */ QEMU_CAPS_LAST, /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/tests/qemucapabilitiesdata/caps_1.2.2-1.caps b/tests/qemucapabilitiesdata/caps_1.2.2-1.caps index 30239df..a1fafa6 100644 --- a/tests/qemucapabilitiesdata/caps_1.2.2-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.2.2-1.caps @@ -120,4 +120,5 @@ <flag name='vmware-svga.vgamem_mb'/> <flag name='qxl.vgamem_mb'/> <flag name='qxl-vga.vgamem_mb'/> + <flag name='ioh3420'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_1.3.1-1.caps b/tests/qemucapabilitiesdata/caps_1.3.1-1.caps index ea3d850..824ef02 100644 --- a/tests/qemucapabilitiesdata/caps_1.3.1-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.3.1-1.caps @@ -135,4 +135,5 @@ <flag name='qxl.vgamem_mb'/> <flag name='qxl-vga.vgamem_mb'/> <flag name='pci-serial'/> + <flag name='ioh3420'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_1.4.2-1.caps b/tests/qemucapabilitiesdata/caps_1.4.2-1.caps index 2c19ddc..7ef5199 100644 --- a/tests/qemucapabilitiesdata/caps_1.4.2-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.4.2-1.caps @@ -136,4 +136,5 @@ <flag name='qxl.vgamem_mb'/> <flag name='qxl-vga.vgamem_mb'/> <flag name='pci-serial'/> + <flag name='ioh3420'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_1.5.3-1.caps b/tests/qemucapabilitiesdata/caps_1.5.3-1.caps index aadccd5..8c3d48e 100644 --- a/tests/qemucapabilitiesdata/caps_1.5.3-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.5.3-1.caps @@ -145,4 +145,5 @@ <flag name='qxl.vgamem_mb'/> <flag name='qxl-vga.vgamem_mb'/> <flag name='pci-serial'/> + <flag name='ioh3420'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_1.6.0-1.caps b/tests/qemucapabilitiesdata/caps_1.6.0-1.caps index 3e81cbf..72f7625 100644 --- a/tests/qemucapabilitiesdata/caps_1.6.0-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.6.0-1.caps @@ -151,4 +151,5 @@ <flag name='qxl.vgamem_mb'/> <flag name='qxl-vga.vgamem_mb'/> <flag name='pci-serial'/> + <flag name='ioh3420'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_1.6.50-1.caps b/tests/qemucapabilitiesdata/caps_1.6.50-1.caps index 84c357f..d81c80c 100644 --- a/tests/qemucapabilitiesdata/caps_1.6.50-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.6.50-1.caps @@ -151,4 +151,5 @@ <flag name='qxl.vgamem_mb'/> <flag name='qxl-vga.vgamem_mb'/> <flag name='pci-serial'/> + <flag name='ioh3420'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_2.1.1-1.caps b/tests/qemucapabilitiesdata/caps_2.1.1-1.caps index b1ee8df..1a39dce 100644 --- a/tests/qemucapabilitiesdata/caps_2.1.1-1.caps +++ b/tests/qemucapabilitiesdata/caps_2.1.1-1.caps @@ -167,4 +167,5 @@ <flag name='qxl-vga.vgamem_mb'/> <flag name='pc-dimm'/> <flag name='pci-serial'/> + <flag name='ioh3420'/> </qemuCaps> diff --git a/tests/qemuhelptest.c b/tests/qemuhelptest.c index 507831c..6211640 100644 --- a/tests/qemuhelptest.c +++ b/tests/qemuhelptest.c @@ -753,7 +753,8 @@ mymain(void) QEMU_CAPS_DEVICE_USB_KBD, QEMU_CAPS_DEVICE_USB_STORAGE, QEMU_CAPS_SPLASH_TIMEOUT, - QEMU_CAPS_DEVICE_IVSHMEM); + QEMU_CAPS_DEVICE_IVSHMEM, + QEMU_CAPS_DEVICE_IOH3420); DO_TEST("qemu-1.1.0", 1001000, 0, 0, QEMU_CAPS_VNC_COLON, QEMU_CAPS_NO_REBOOT, @@ -853,7 +854,8 @@ mymain(void) QEMU_CAPS_DEVICE_USB_STORAGE, QEMU_CAPS_OBJECT_USB_AUDIO, QEMU_CAPS_SPLASH_TIMEOUT, - QEMU_CAPS_DEVICE_IVSHMEM); + QEMU_CAPS_DEVICE_IVSHMEM, + QEMU_CAPS_DEVICE_IOH3420); DO_TEST_FULL("qemu-1.2.0", 1002000, 0, 0, VIR_ERR_CONFIG_UNSUPPORTED, QEMU_CAPS_LAST); DO_TEST_FULL("qemu-kvm-1.2.0", 1002000, 1, 0, VIR_ERR_CONFIG_UNSUPPORTED, -- 2.1.0

This controller can be connected (at domain startup time only - not hotpluggable) only to a port on the pcie root complex ("pcie-root" in libvirt config), hence the new connect type VIR_PCI_CONNECT_TYPE_PCIE_ROOT. It provides a hotpluggable port that will accept any PCI or PCIe device. --- docs/formatdomain.html.in | 16 +++++++++-- docs/schemas/domaincommon.rng | 1 + src/conf/domain_addr.c | 10 +++++-- src/conf/domain_addr.h | 5 +++- src/conf/domain_conf.c | 3 +- src/conf/domain_conf.h | 1 + .../qemuxml2argv-pcie-root-port.xml | 33 ++++++++++++++++++++++ tests/qemuxml2xmltest.c | 1 + 8 files changed, 64 insertions(+), 6 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.xml diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index fb3e2fb..e6c140a 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -2997,10 +2997,11 @@ <p> PCI controllers have an optional <code>model</code> attribute with possible values <code>pci-root</code>, <code>pcie-root</code>, - <code>pci-bridge</code>, or <code>dmi-to-pci-bridge</code>. + <code>pcie-root-port</code>, <code>pci-bridge</code>, + or <code>dmi-to-pci-bridge</code>. (pci-root and pci-bridge <span class="since">since 1.0.5</span>, pcie-root and dmi-to-pci-bridge <span class="since">since - 1.1.2</span>) + 1.1.2</span>, pcie-root-port <span class="since">since 1.2.17</span>) The root controllers (<code>pci-root</code> and <code>pcie-root</code>) have an optional <code>pcihole64</code> element specifying how big (in kilobytes, or in the unit specified by <code>pcihole64</code>'s @@ -3054,6 +3055,17 @@ auto-determined by libvirt will be placed on this pci-bridge device. (<span class="since">since 1.1.2</span>). </p> + <p> + Domains with an implicit pcie-root can also add controllers + with <code>model='pcie-root-port'</code>. This is a simple type of + bridge device that can connect only to one of the 31 slots on + the pcie-root bus on its upstream side, and makes a single + (PCIe, hotpluggable) port (at slot='0') available on the + downstream side. This controller can be used to provide a single + slot to later hotplug a PCIe device (but is not itself + hotpluggable - it must be in the configuration when the domain + is started). (<span class="since">since 1.2.17</span>) + </p> <pre> ... <devices> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index f0f7400..6139dfb 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -1751,6 +1751,7 @@ <choice> <value>pci-bridge</value> <value>dmi-to-pci-bridge</value> + <value>pcie-root-port</value> </choice> </attribute> </group> diff --git a/src/conf/domain_addr.c b/src/conf/domain_addr.c index 2be98c5..4b5e81e 100644 --- a/src/conf/domain_addr.c +++ b/src/conf/domain_addr.c @@ -183,9 +183,9 @@ virDomainPCIAddressBusSetModel(virDomainPCIAddressBusPtr bus, case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT: /* slots 1 - 31, no hotplug, PCIe only unless the address was * specified in user config *and* the particular device being - * attached also allows it + * attached also allows it. */ - bus->flags = VIR_PCI_CONNECT_TYPE_PCIE; + bus->flags = VIR_PCI_CONNECT_TYPE_PCIE | VIR_PCI_CONNECT_TYPE_PCIE_ROOT; bus->minSlot = 1; bus->maxSlot = VIR_PCI_ADDRESS_SLOT_LAST; break; @@ -196,6 +196,12 @@ virDomainPCIAddressBusSetModel(virDomainPCIAddressBusPtr bus, bus->minSlot = 1; bus->maxSlot = VIR_PCI_ADDRESS_SLOT_LAST; break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT: + /* provides one slot which is pcie and hotpluggable */ + bus->flags = VIR_PCI_CONNECT_TYPE_PCIE | VIR_PCI_CONNECT_HOTPLUGGABLE; + bus->minSlot = 1; + bus->maxSlot = 1; + break; default: virReportError(VIR_ERR_INTERNAL_ERROR, _("Invalid PCI controller model %d"), model); diff --git a/src/conf/domain_addr.h b/src/conf/domain_addr.h index dcfd2e5..2a0ff96 100644 --- a/src/conf/domain_addr.h +++ b/src/conf/domain_addr.h @@ -39,6 +39,8 @@ typedef enum { /* PCI devices can connect to this bus */ VIR_PCI_CONNECT_TYPE_PCIE = 1 << 3, /* PCI Express devices can connect to this bus */ + VIR_PCI_CONNECT_TYPE_PCIE_ROOT = 1 << 4, + /* for devices that can only connect to pcie-root (i.e. root-port) */ } virDomainPCIConnectFlags; typedef struct { @@ -70,7 +72,8 @@ typedef virDomainPCIAddressSet *virDomainPCIAddressSetPtr; * allowed, e.g. PCI, PCIe, switch */ # define VIR_PCI_CONNECT_TYPES_MASK \ - (VIR_PCI_CONNECT_TYPE_PCI | VIR_PCI_CONNECT_TYPE_PCIE) + (VIR_PCI_CONNECT_TYPE_PCI | VIR_PCI_CONNECT_TYPE_PCIE | \ + VIR_PCI_CONNECT_TYPE_PCIE_ROOT) /* combination of all bits that could be used to connect a normal * endpoint device (i.e. excluding the connection possible between an diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index ca55981..0bdd51e 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -323,7 +323,8 @@ VIR_ENUM_IMPL(virDomainControllerModelPCI, VIR_DOMAIN_CONTROLLER_MODEL_PCI_LAST, "pci-root", "pcie-root", "pci-bridge", - "dmi-to-pci-bridge") + "dmi-to-pci-bridge", + "pcie-root-port") VIR_ENUM_IMPL(virDomainControllerModelSCSI, VIR_DOMAIN_CONTROLLER_MODEL_SCSI_LAST, "auto", diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index ba17a8d..946210e 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -752,6 +752,7 @@ typedef enum { VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT, VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE, VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE, + VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT, VIR_DOMAIN_CONTROLLER_MODEL_PCI_LAST } virDomainControllerModelPCI; diff --git a/tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.xml b/tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.xml new file mode 100644 index 0000000..a1fc354 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.xml @@ -0,0 +1,33 @@ +<domain type='qemu'> + <name>q35-test</name> + <uuid>11dbdcdd-4c3b-482b-8903-9bdb8c0a2774</uuid> + <memory unit='KiB'>2097152</memory> + <currentMemory unit='KiB'>2097152</currentMemory> + <vcpu placement='static' cpuset='0-1'>2</vcpu> + <os> + <type arch='x86_64' machine='q35'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/libexec/qemu-kvm</emulator> + <disk type='block' device='disk'> + <source dev='/dev/HostVG/QEMUGuest1'/> + <target dev='sda' bus='sata'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='pci' index='0' model='pcie-root'/> + <controller type='pci' index='1' model='dmi-to-pci-bridge'/> + <controller type='pci' index='2' model='pci-bridge'/> + <controller type='pci' index='3' model='pcie-root-port'/> + <controller type='pci' index='4' model='pcie-root-port'/> + <controller type='sata' index='0'/> + <video> + <model type='qxl' ram='65536' vram='32768' vgamem='8192' heads='1'/> + </video> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 44b388c..2acbae1 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -566,6 +566,7 @@ mymain(void) DO_TEST_DIFFERENT("pci-autoadd-idx"); DO_TEST_DIFFERENT("pcie-root"); DO_TEST_DIFFERENT("q35"); + DO_TEST("pcie-root-port"); DO_TEST("hostdev-scsi-lsi"); DO_TEST("hostdev-scsi-virtio-scsi"); -- 2.1.0

This is backed by the qemu device ioh3420. --- src/qemu/qemu_command.c | 20 ++++++++++++++++++++ .../qemuxml2argv-pcie-root-port.args | 10 ++++++++++ tests/qemuxml2argvtest.c | 7 +++++++ 3 files changed, 37 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.args diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 4952797..01ac690 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1598,6 +1598,12 @@ qemuCollectPCIAddress(virDomainDefPtr def ATTRIBUTE_UNUSED, */ flags = VIR_PCI_CONNECT_TYPE_PCIE; break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT: + /* pcie-root-port can only connect to pcie-root, isn't + * hot-pluggable + */ + flags = VIR_PCI_CONNECT_TYPE_PCIE_ROOT; + break; default: break; } @@ -2345,6 +2351,10 @@ qemuAssignDevicePCISlots(virDomainDefPtr def, */ flags = VIR_PCI_CONNECT_TYPE_PCIE; break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT: + /* pcie-root-port can only plug into pcie-root */ + flags = VIR_PCI_CONNECT_TYPE_PCIE_ROOT; + break; default: flags = VIR_PCI_CONNECT_HOTPLUGGABLE | VIR_PCI_CONNECT_TYPE_PCI; break; @@ -4605,6 +4615,16 @@ qemuBuildControllerDevStr(virDomainDefPtr domainDef, } virBufferAsprintf(&buf, "i82801b11-bridge,id=%s", def->info.alias); break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT: + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_IOH3420)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("The pcie-root-port (ioh3420) controller " + "is not supported in this QEMU binary")); + goto error; + } + virBufferAsprintf(&buf, "ioh3420,port=%d,chassis=%d,id=%s", + def->idx, def->idx, def->info.alias); + break; } break; diff --git a/tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.args b/tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.args new file mode 100644 index 0000000..30bec85 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.args @@ -0,0 +1,10 @@ +LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \ +/usr/libexec/qemu-kvm -S -M q35 -m 2048 -smp 2 -nographic -nodefaults \ +-monitor unix:/tmp/test-monitor,server,nowait -no-acpi -boot c \ +-device i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e \ +-device pci-bridge,chassis_nr=2,id=pci.2,bus=pci.1,addr=0x1 \ +-device ioh3420,port=3,chassis=3,id=pci.3,bus=pcie.0,addr=0x2 \ +-device ioh3420,port=4,chassis=4,id=pci.4,bus=pcie.0,addr=0x3 \ +-drive file=/dev/HostVG/QEMUGuest1,if=none,id=drive-sata0-0-0 \ +-device ide-drive,bus=ide.0,drive=drive-sata0-0-0,id=sata0-0-0 \ +-vga qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=33554432 diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index a90f9a6..999edff 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1461,6 +1461,13 @@ mymain(void) QEMU_CAPS_DRIVE, QEMU_CAPS_ICH9_AHCI, QEMU_CAPS_VGA, QEMU_CAPS_DEVICE_VIDEO_PRIMARY, QEMU_CAPS_VGA, QEMU_CAPS_VGA_QXL, QEMU_CAPS_DEVICE_QXL); + DO_TEST("pcie-root-port", + QEMU_CAPS_DEVICE, QEMU_CAPS_DEVICE_PCI_BRIDGE, + QEMU_CAPS_DEVICE_DMI_TO_PCI_BRIDGE, + QEMU_CAPS_DEVICE_IOH3420, + QEMU_CAPS_DRIVE, QEMU_CAPS_ICH9_AHCI, + QEMU_CAPS_VGA, QEMU_CAPS_DEVICE_VIDEO_PRIMARY, + QEMU_CAPS_VGA, QEMU_CAPS_VGA_QXL, QEMU_CAPS_DEVICE_QXL); DO_TEST("hostdev-scsi-lsi", QEMU_CAPS_DRIVE, QEMU_CAPS_DEVICE, QEMU_CAPS_DRIVE, -- 2.1.0

On Mon, 2015-06-22 at 14:44 -0400, Laine Stump wrote:
This is backed by the qemu device ioh3420. --- src/qemu/qemu_command.c | 20 ++++++++++++++++++++ .../qemuxml2argv-pcie-root-port.args | 10 ++++++++++ tests/qemuxml2argvtest.c | 7 +++++++ 3 files changed, 37 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.args
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 4952797..01ac690 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1598,6 +1598,12 @@ qemuCollectPCIAddress(virDomainDefPtr def ATTRIBUTE_UNUSED, */ flags = VIR_PCI_CONNECT_TYPE_PCIE; break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT: + /* pcie-root-port can only connect to pcie-root, isn't + * hot-pluggable + */ + flags = VIR_PCI_CONNECT_TYPE_PCIE_ROOT; + break; default: break; } @@ -2345,6 +2351,10 @@ qemuAssignDevicePCISlots(virDomainDefPtr def, */ flags = VIR_PCI_CONNECT_TYPE_PCIE; break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT: + /* pcie-root-port can only plug into pcie-root */ + flags = VIR_PCI_CONNECT_TYPE_PCIE_ROOT; + break; default: flags = VIR_PCI_CONNECT_HOTPLUGGABLE | VIR_PCI_CONNECT_TYPE_PCI; break; @@ -4605,6 +4615,16 @@ qemuBuildControllerDevStr(virDomainDefPtr domainDef, } virBufferAsprintf(&buf, "i82801b11-bridge,id=%s", def->info.alias); break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT: + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_IOH3420)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("The pcie-root-port (ioh3420) controller " + "is not supported in this QEMU binary")); + goto error; + } + virBufferAsprintf(&buf, "ioh3420,port=%d,chassis=%d,id=%s", + def->idx, def->idx, def->info.alias);
So you didn't like the idea of using the flat devfn as the port number for root ports? I'm not a fan of QEMU's use of chassis_nr for bridges and with an 8bit limit on port numbers, using idx we'll run out of bits at some point. Thanks, Alex
+ break; } break;
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.args b/tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.args new file mode 100644 index 0000000..30bec85 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.args @@ -0,0 +1,10 @@ +LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \ +/usr/libexec/qemu-kvm -S -M q35 -m 2048 -smp 2 -nographic -nodefaults \ +-monitor unix:/tmp/test-monitor,server,nowait -no-acpi -boot c \ +-device i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e \ +-device pci-bridge,chassis_nr=2,id=pci.2,bus=pci.1,addr=0x1 \ +-device ioh3420,port=3,chassis=3,id=pci.3,bus=pcie.0,addr=0x2 \ +-device ioh3420,port=4,chassis=4,id=pci.4,bus=pcie.0,addr=0x3 \ +-drive file=/dev/HostVG/QEMUGuest1,if=none,id=drive-sata0-0-0 \ +-device ide-drive,bus=ide.0,drive=drive-sata0-0-0,id=sata0-0-0 \ +-vga qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=33554432 diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index a90f9a6..999edff 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1461,6 +1461,13 @@ mymain(void) QEMU_CAPS_DRIVE, QEMU_CAPS_ICH9_AHCI, QEMU_CAPS_VGA, QEMU_CAPS_DEVICE_VIDEO_PRIMARY, QEMU_CAPS_VGA, QEMU_CAPS_VGA_QXL, QEMU_CAPS_DEVICE_QXL); + DO_TEST("pcie-root-port", + QEMU_CAPS_DEVICE, QEMU_CAPS_DEVICE_PCI_BRIDGE, + QEMU_CAPS_DEVICE_DMI_TO_PCI_BRIDGE, + QEMU_CAPS_DEVICE_IOH3420, + QEMU_CAPS_DRIVE, QEMU_CAPS_ICH9_AHCI, + QEMU_CAPS_VGA, QEMU_CAPS_DEVICE_VIDEO_PRIMARY, + QEMU_CAPS_VGA, QEMU_CAPS_VGA_QXL, QEMU_CAPS_DEVICE_QXL);
DO_TEST("hostdev-scsi-lsi", QEMU_CAPS_DRIVE, QEMU_CAPS_DEVICE, QEMU_CAPS_DRIVE,

On 06/23/2015 11:31 AM, Alex Williamson wrote:
On Mon, 2015-06-22 at 14:44 -0400, Laine Stump wrote:
This is backed by the qemu device ioh3420. --- src/qemu/qemu_command.c | 20 ++++++++++++++++++++ .../qemuxml2argv-pcie-root-port.args | 10 ++++++++++ tests/qemuxml2argvtest.c | 7 +++++++ 3 files changed, 37 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.args
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 4952797..01ac690 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1598,6 +1598,12 @@ qemuCollectPCIAddress(virDomainDefPtr def ATTRIBUTE_UNUSED, */ flags = VIR_PCI_CONNECT_TYPE_PCIE; break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT: + /* pcie-root-port can only connect to pcie-root, isn't + * hot-pluggable + */ + flags = VIR_PCI_CONNECT_TYPE_PCIE_ROOT; + break; default: break; } @@ -2345,6 +2351,10 @@ qemuAssignDevicePCISlots(virDomainDefPtr def, */ flags = VIR_PCI_CONNECT_TYPE_PCIE; break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT: + /* pcie-root-port can only plug into pcie-root */ + flags = VIR_PCI_CONNECT_TYPE_PCIE_ROOT; + break; default: flags = VIR_PCI_CONNECT_HOTPLUGGABLE | VIR_PCI_CONNECT_TYPE_PCI; break; @@ -4605,6 +4615,16 @@ qemuBuildControllerDevStr(virDomainDefPtr domainDef, } virBufferAsprintf(&buf, "i82801b11-bridge,id=%s", def->info.alias); break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT: + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_IOH3420)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("The pcie-root-port (ioh3420) controller " + "is not supported in this QEMU binary")); + goto error; + } + virBufferAsprintf(&buf, "ioh3420,port=%d,chassis=%d,id=%s", + def->idx, def->idx, def->info.alias); So you didn't like the idea of using the flat devfn as the port number for root ports? I'm not a fan of QEMU's use of chassis_nr for bridges and with an 8bit limit on port numbers, using idx we'll run out of bits at some point.
No, I just didn't understand your point until you re-explained on IRC a little while ago. I'm changing it as you suggest. In the meantime, I'm thinking about the other point that you made on IRC - that we're putting logic into libvirt to derive some of these parameters (port, chassis, chassis_nr) but in the future may decide that we need to derive them a different way, and if we do that to an existing machine during migration, things will get "seriously messed up". Even if this change takes place while the guest isn't running, the guest OS could get upset about such a change. All this implies that we really need to be saving the derived values in the config to avoid such surprises.

This is the upstream part of a PCIe switch. It connects to a PCIe port (but not PCI) on the upstream side, and can have up to 31 xio3130-downstream controllers (but no other types of devices) connected to its downstream side. This device will be used to implement the "pcie-switch" model of pci controller. --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + tests/qemucapabilitiesdata/caps_1.2.2-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.3.1-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.4.2-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.5.3-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.6.0-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.6.50-1.caps | 1 + tests/qemucapabilitiesdata/caps_2.1.1-1.caps | 1 + tests/qemuhelptest.c | 6 ++++-- 10 files changed, 14 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index a9a19e1..5a33efb 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -288,6 +288,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "vhost-user-multiqueue", /* 190 */ "ioh3420", + "x3130-upstream", ); @@ -1568,6 +1569,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = { { "pc-dimm", QEMU_CAPS_DEVICE_PC_DIMM }, { "pci-serial", QEMU_CAPS_DEVICE_PCI_SERIAL }, { "ioh3420", QEMU_CAPS_DEVICE_IOH3420 }, + { "x3130-upstream", QEMU_CAPS_DEVICE_X3130_UPSTREAM }, }; static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsVirtioBlk[] = { diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index d38505e..3d28941 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -231,6 +231,7 @@ typedef enum { QEMU_CAPS_CPU_AARCH64_OFF = 189, /* -cpu ...,aarch64=off */ QEMU_CAPS_VHOSTUSER_MULTIQUEUE = 190, /* vhost-user with -netdev queues= */ QEMU_CAPS_DEVICE_IOH3420 = 191, /* -device ioh3420 */ + QEMU_CAPS_DEVICE_X3130_UPSTREAM = 192, /* -device x3130-upstream */ QEMU_CAPS_LAST, /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/tests/qemucapabilitiesdata/caps_1.2.2-1.caps b/tests/qemucapabilitiesdata/caps_1.2.2-1.caps index a1fafa6..78d7b82 100644 --- a/tests/qemucapabilitiesdata/caps_1.2.2-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.2.2-1.caps @@ -121,4 +121,5 @@ <flag name='qxl.vgamem_mb'/> <flag name='qxl-vga.vgamem_mb'/> <flag name='ioh3420'/> + <flag name='x3130-upstream'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_1.3.1-1.caps b/tests/qemucapabilitiesdata/caps_1.3.1-1.caps index 824ef02..7cec7f9 100644 --- a/tests/qemucapabilitiesdata/caps_1.3.1-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.3.1-1.caps @@ -136,4 +136,5 @@ <flag name='qxl-vga.vgamem_mb'/> <flag name='pci-serial'/> <flag name='ioh3420'/> + <flag name='x3130-upstream'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_1.4.2-1.caps b/tests/qemucapabilitiesdata/caps_1.4.2-1.caps index 7ef5199..f5f0034 100644 --- a/tests/qemucapabilitiesdata/caps_1.4.2-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.4.2-1.caps @@ -137,4 +137,5 @@ <flag name='qxl-vga.vgamem_mb'/> <flag name='pci-serial'/> <flag name='ioh3420'/> + <flag name='x3130-upstream'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_1.5.3-1.caps b/tests/qemucapabilitiesdata/caps_1.5.3-1.caps index 8c3d48e..9f0461a 100644 --- a/tests/qemucapabilitiesdata/caps_1.5.3-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.5.3-1.caps @@ -146,4 +146,5 @@ <flag name='qxl-vga.vgamem_mb'/> <flag name='pci-serial'/> <flag name='ioh3420'/> + <flag name='x3130-upstream'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_1.6.0-1.caps b/tests/qemucapabilitiesdata/caps_1.6.0-1.caps index 72f7625..1b23b82 100644 --- a/tests/qemucapabilitiesdata/caps_1.6.0-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.6.0-1.caps @@ -152,4 +152,5 @@ <flag name='qxl-vga.vgamem_mb'/> <flag name='pci-serial'/> <flag name='ioh3420'/> + <flag name='x3130-upstream'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_1.6.50-1.caps b/tests/qemucapabilitiesdata/caps_1.6.50-1.caps index d81c80c..ff0427f 100644 --- a/tests/qemucapabilitiesdata/caps_1.6.50-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.6.50-1.caps @@ -152,4 +152,5 @@ <flag name='qxl-vga.vgamem_mb'/> <flag name='pci-serial'/> <flag name='ioh3420'/> + <flag name='x3130-upstream'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_2.1.1-1.caps b/tests/qemucapabilitiesdata/caps_2.1.1-1.caps index 1a39dce..56b27e5 100644 --- a/tests/qemucapabilitiesdata/caps_2.1.1-1.caps +++ b/tests/qemucapabilitiesdata/caps_2.1.1-1.caps @@ -168,4 +168,5 @@ <flag name='pc-dimm'/> <flag name='pci-serial'/> <flag name='ioh3420'/> + <flag name='x3130-upstream'/> </qemuCaps> diff --git a/tests/qemuhelptest.c b/tests/qemuhelptest.c index 6211640..62b9a0c 100644 --- a/tests/qemuhelptest.c +++ b/tests/qemuhelptest.c @@ -754,7 +754,8 @@ mymain(void) QEMU_CAPS_DEVICE_USB_STORAGE, QEMU_CAPS_SPLASH_TIMEOUT, QEMU_CAPS_DEVICE_IVSHMEM, - QEMU_CAPS_DEVICE_IOH3420); + QEMU_CAPS_DEVICE_IOH3420, + QEMU_CAPS_DEVICE_X3130_UPSTREAM); DO_TEST("qemu-1.1.0", 1001000, 0, 0, QEMU_CAPS_VNC_COLON, QEMU_CAPS_NO_REBOOT, @@ -855,7 +856,8 @@ mymain(void) QEMU_CAPS_OBJECT_USB_AUDIO, QEMU_CAPS_SPLASH_TIMEOUT, QEMU_CAPS_DEVICE_IVSHMEM, - QEMU_CAPS_DEVICE_IOH3420); + QEMU_CAPS_DEVICE_IOH3420, + QEMU_CAPS_DEVICE_X3130_UPSTREAM); DO_TEST_FULL("qemu-1.2.0", 1002000, 0, 0, VIR_ERR_CONFIG_UNSUPPORTED, QEMU_CAPS_LAST); DO_TEST_FULL("qemu-kvm-1.2.0", 1002000, 1, 0, VIR_ERR_CONFIG_UNSUPPORTED, -- 2.1.0

This controller can be connected to any PCIe port, but not to a standard PCI port, which is the reason for the new connect type VIR_PCI_CONNECT_TYPE_PCIE_ONLY. pcie-switch provides 31 ports (slot=1 to slot=31), which can only have pci controllers of model "pcie-switch-downstream-port" plugged into them, which is the reason for the other new connect type VIR_PCI_CONNECT_TYPE_PCIE_SWITCH. --- docs/formatdomain.html.in | 5 ++-- docs/schemas/domaincommon.rng | 1 + src/conf/domain_addr.c | 11 +++++++- src/conf/domain_addr.h | 6 +++- src/conf/domain_conf.c | 3 +- src/conf/domain_conf.h | 1 + .../qemuxml2argv-pcie-switch-upstream-port.xml | 32 ++++++++++++++++++++++ tests/qemuxml2xmltest.c | 1 + 8 files changed, 55 insertions(+), 5 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.xml diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index e6c140a..646c9ee 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -2998,10 +2998,11 @@ PCI controllers have an optional <code>model</code> attribute with possible values <code>pci-root</code>, <code>pcie-root</code>, <code>pcie-root-port</code>, <code>pci-bridge</code>, - or <code>dmi-to-pci-bridge</code>. + <code>dmi-to-pci-bridge</code>, or <code>pcie-switch-upstream-port</code>. (pci-root and pci-bridge <span class="since">since 1.0.5</span>, pcie-root and dmi-to-pci-bridge <span class="since">since - 1.1.2</span>, pcie-root-port <span class="since">since 1.2.17</span>) + 1.1.2</span>, pcie-root-port and + pcie-switch-upstream-port <span class="since">since 1.2.17</span>) The root controllers (<code>pci-root</code> and <code>pcie-root</code>) have an optional <code>pcihole64</code> element specifying how big (in kilobytes, or in the unit specified by <code>pcihole64</code>'s diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 6139dfb..3f74cfc 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -1752,6 +1752,7 @@ <value>pci-bridge</value> <value>dmi-to-pci-bridge</value> <value>pcie-root-port</value> + <value>pcie-switch-upstream-port</value> </choice> </attribute> </group> diff --git a/src/conf/domain_addr.c b/src/conf/domain_addr.c index 4b5e81e..59da745 100644 --- a/src/conf/domain_addr.c +++ b/src/conf/domain_addr.c @@ -57,6 +57,9 @@ virDomainPCIAddressFlagsCompatible(virDevicePCIAddressPtr addr, if (devFlags & VIR_PCI_CONNECT_HOTPLUGGABLE) busFlags |= VIR_PCI_CONNECT_HOTPLUGGABLE; } + /* devices with PCIE_ONLY can't connect to PCI, even if fromConfig */ + if (devFlags & VIR_PCI_CONNECT_TYPE_PCIE_ONLY) + devFlags |= VIR_PCI_CONNECT_TYPE_PCIE; /* If this bus doesn't allow the type of connection (PCI * vs. PCIe) required by the device, or if the device requires @@ -199,8 +202,14 @@ virDomainPCIAddressBusSetModel(virDomainPCIAddressBusPtr bus, case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT: /* provides one slot which is pcie and hotpluggable */ bus->flags = VIR_PCI_CONNECT_TYPE_PCIE | VIR_PCI_CONNECT_HOTPLUGGABLE; + bus->minSlot = 0; + bus->maxSlot = 0; + break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_UPSTREAM_PORT: + /* 31 slots, can only accept pcie-switch-port, no hotplug */ + bus->flags = VIR_PCI_CONNECT_TYPE_PCIE_SWITCH; bus->minSlot = 1; - bus->maxSlot = 1; + bus->maxSlot = 31; break; default: virReportError(VIR_ERR_INTERNAL_ERROR, diff --git a/src/conf/domain_addr.h b/src/conf/domain_addr.h index 2a0ff96..8e9ca32 100644 --- a/src/conf/domain_addr.h +++ b/src/conf/domain_addr.h @@ -41,6 +41,10 @@ typedef enum { /* PCI Express devices can connect to this bus */ VIR_PCI_CONNECT_TYPE_PCIE_ROOT = 1 << 4, /* for devices that can only connect to pcie-root (i.e. root-port) */ + VIR_PCI_CONNECT_TYPE_PCIE_ONLY = 1 << 5, + /* devices that can't connect to pci even if manual config */ + VIR_PCI_CONNECT_TYPE_PCIE_SWITCH = 1 << 6, + /* devices that can only connect to a pcie-switch */ } virDomainPCIConnectFlags; typedef struct { @@ -73,7 +77,7 @@ typedef virDomainPCIAddressSet *virDomainPCIAddressSetPtr; */ # define VIR_PCI_CONNECT_TYPES_MASK \ (VIR_PCI_CONNECT_TYPE_PCI | VIR_PCI_CONNECT_TYPE_PCIE | \ - VIR_PCI_CONNECT_TYPE_PCIE_ROOT) + VIR_PCI_CONNECT_TYPE_PCIE_ROOT | VIR_PCI_CONNECT_TYPE_PCIE_SWITCH) /* combination of all bits that could be used to connect a normal * endpoint device (i.e. excluding the connection possible between an diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 0bdd51e..c0b8c92 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -324,7 +324,8 @@ VIR_ENUM_IMPL(virDomainControllerModelPCI, VIR_DOMAIN_CONTROLLER_MODEL_PCI_LAST, "pcie-root", "pci-bridge", "dmi-to-pci-bridge", - "pcie-root-port") + "pcie-root-port", + "pcie-switch-upstream-port") VIR_ENUM_IMPL(virDomainControllerModelSCSI, VIR_DOMAIN_CONTROLLER_MODEL_SCSI_LAST, "auto", diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 946210e..b7ce0ce 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -753,6 +753,7 @@ typedef enum { VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE, VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE, VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT, + VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_UPSTREAM_PORT, VIR_DOMAIN_CONTROLLER_MODEL_PCI_LAST } virDomainControllerModelPCI; diff --git a/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.xml b/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.xml new file mode 100644 index 0000000..428a4ea --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.xml @@ -0,0 +1,32 @@ +<domain type='qemu'> + <name>q35-test</name> + <uuid>11dbdcdd-4c3b-482b-8903-9bdb8c0a2774</uuid> + <memory unit='KiB'>2097152</memory> + <currentMemory unit='KiB'>2097152</currentMemory> + <vcpu placement='static' cpuset='0-1'>2</vcpu> + <os> + <type arch='x86_64' machine='q35'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/libexec/qemu-kvm</emulator> + <disk type='block' device='disk'> + <source dev='/dev/HostVG/QEMUGuest1'/> + <target dev='sda' bus='sata'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='pci' index='0' model='pcie-root'/> + <controller type='pci' index='1' model='dmi-to-pci-bridge'/> + <controller type='pci' index='2' model='pci-bridge'/> + <controller type='pci' index='3' model='pcie-switch-upstream-port'/> + <controller type='sata' index='0'/> + <video> + <model type='qxl' ram='65536' vram='32768' vgamem='8192' heads='1'/> + </video> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 2acbae1..e5fcc5d 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -567,6 +567,7 @@ mymain(void) DO_TEST_DIFFERENT("pcie-root"); DO_TEST_DIFFERENT("q35"); DO_TEST("pcie-root-port"); + DO_TEST("pcie-switch-upstream-port"); DO_TEST("hostdev-scsi-lsi"); DO_TEST("hostdev-scsi-virtio-scsi"); -- 2.1.0

On Mon, 2015-06-22 at 14:44 -0400, Laine Stump wrote:
This controller can be connected to any PCIe port, but not to a standard PCI port, which is the reason for the new connect type VIR_PCI_CONNECT_TYPE_PCIE_ONLY. pcie-switch provides 31 ports (slot=1 to slot=31), which can only have pci controllers of model "pcie-switch-downstream-port" plugged into them, which is the reason for the other new connect type VIR_PCI_CONNECT_TYPE_PCIE_SWITCH. --- docs/formatdomain.html.in | 5 ++-- docs/schemas/domaincommon.rng | 1 + src/conf/domain_addr.c | 11 +++++++- src/conf/domain_addr.h | 6 +++- src/conf/domain_conf.c | 3 +- src/conf/domain_conf.h | 1 + .../qemuxml2argv-pcie-switch-upstream-port.xml | 32 ++++++++++++++++++++++ tests/qemuxml2xmltest.c | 1 + 8 files changed, 55 insertions(+), 5 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index e6c140a..646c9ee 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -2998,10 +2998,11 @@ PCI controllers have an optional <code>model</code> attribute with possible values <code>pci-root</code>, <code>pcie-root</code>, <code>pcie-root-port</code>, <code>pci-bridge</code>, - or <code>dmi-to-pci-bridge</code>. + <code>dmi-to-pci-bridge</code>, or <code>pcie-switch-upstream-port</code>. (pci-root and pci-bridge <span class="since">since 1.0.5</span>, pcie-root and dmi-to-pci-bridge <span class="since">since - 1.1.2</span>, pcie-root-port <span class="since">since 1.2.17</span>) + 1.1.2</span>, pcie-root-port and + pcie-switch-upstream-port <span class="since">since 1.2.17</span>) The root controllers (<code>pci-root</code> and <code>pcie-root</code>) have an optional <code>pcihole64</code> element specifying how big (in kilobytes, or in the unit specified by <code>pcihole64</code>'s diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 6139dfb..3f74cfc 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -1752,6 +1752,7 @@ <value>pci-bridge</value> <value>dmi-to-pci-bridge</value> <value>pcie-root-port</value> + <value>pcie-switch-upstream-port</value> </choice> </attribute> </group> diff --git a/src/conf/domain_addr.c b/src/conf/domain_addr.c index 4b5e81e..59da745 100644 --- a/src/conf/domain_addr.c +++ b/src/conf/domain_addr.c @@ -57,6 +57,9 @@ virDomainPCIAddressFlagsCompatible(virDevicePCIAddressPtr addr, if (devFlags & VIR_PCI_CONNECT_HOTPLUGGABLE) busFlags |= VIR_PCI_CONNECT_HOTPLUGGABLE; } + /* devices with PCIE_ONLY can't connect to PCI, even if fromConfig */ + if (devFlags & VIR_PCI_CONNECT_TYPE_PCIE_ONLY) + devFlags |= VIR_PCI_CONNECT_TYPE_PCIE;
/* If this bus doesn't allow the type of connection (PCI * vs. PCIe) required by the device, or if the device requires @@ -199,8 +202,14 @@ virDomainPCIAddressBusSetModel(virDomainPCIAddressBusPtr bus, case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT: /* provides one slot which is pcie and hotpluggable */ bus->flags = VIR_PCI_CONNECT_TYPE_PCIE | VIR_PCI_CONNECT_HOTPLUGGABLE; + bus->minSlot = 0; + bus->maxSlot = 0;
Shouldn't this have belonged to a previous patch?
+ break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_UPSTREAM_PORT: + /* 31 slots, can only accept pcie-switch-port, no hotplug */ + bus->flags = VIR_PCI_CONNECT_TYPE_PCIE_SWITCH; bus->minSlot = 1; - bus->maxSlot = 1; + bus->maxSlot = 31;
Why exactly are we reserving slot 0? I know we discussed on IRC that slot 0 is reserved for the host bridge on root buses, but there are no reserved slots downstream of an upstream switch port. If QEMU is limiting this, it's a bug, if libvirt wants to impose this, it should be documented as such. Thanks, Alex
break; default: virReportError(VIR_ERR_INTERNAL_ERROR, diff --git a/src/conf/domain_addr.h b/src/conf/domain_addr.h index 2a0ff96..8e9ca32 100644 --- a/src/conf/domain_addr.h +++ b/src/conf/domain_addr.h @@ -41,6 +41,10 @@ typedef enum { /* PCI Express devices can connect to this bus */ VIR_PCI_CONNECT_TYPE_PCIE_ROOT = 1 << 4, /* for devices that can only connect to pcie-root (i.e. root-port) */ + VIR_PCI_CONNECT_TYPE_PCIE_ONLY = 1 << 5, + /* devices that can't connect to pci even if manual config */ + VIR_PCI_CONNECT_TYPE_PCIE_SWITCH = 1 << 6, + /* devices that can only connect to a pcie-switch */ } virDomainPCIConnectFlags;
typedef struct { @@ -73,7 +77,7 @@ typedef virDomainPCIAddressSet *virDomainPCIAddressSetPtr; */ # define VIR_PCI_CONNECT_TYPES_MASK \ (VIR_PCI_CONNECT_TYPE_PCI | VIR_PCI_CONNECT_TYPE_PCIE | \ - VIR_PCI_CONNECT_TYPE_PCIE_ROOT) + VIR_PCI_CONNECT_TYPE_PCIE_ROOT | VIR_PCI_CONNECT_TYPE_PCIE_SWITCH)
/* combination of all bits that could be used to connect a normal * endpoint device (i.e. excluding the connection possible between an diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 0bdd51e..c0b8c92 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -324,7 +324,8 @@ VIR_ENUM_IMPL(virDomainControllerModelPCI, VIR_DOMAIN_CONTROLLER_MODEL_PCI_LAST, "pcie-root", "pci-bridge", "dmi-to-pci-bridge", - "pcie-root-port") + "pcie-root-port", + "pcie-switch-upstream-port")
VIR_ENUM_IMPL(virDomainControllerModelSCSI, VIR_DOMAIN_CONTROLLER_MODEL_SCSI_LAST, "auto", diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 946210e..b7ce0ce 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -753,6 +753,7 @@ typedef enum { VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE, VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE, VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT, + VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_UPSTREAM_PORT,
VIR_DOMAIN_CONTROLLER_MODEL_PCI_LAST } virDomainControllerModelPCI; diff --git a/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.xml b/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.xml new file mode 100644 index 0000000..428a4ea --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.xml @@ -0,0 +1,32 @@ +<domain type='qemu'> + <name>q35-test</name> + <uuid>11dbdcdd-4c3b-482b-8903-9bdb8c0a2774</uuid> + <memory unit='KiB'>2097152</memory> + <currentMemory unit='KiB'>2097152</currentMemory> + <vcpu placement='static' cpuset='0-1'>2</vcpu> + <os> + <type arch='x86_64' machine='q35'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/libexec/qemu-kvm</emulator> + <disk type='block' device='disk'> + <source dev='/dev/HostVG/QEMUGuest1'/> + <target dev='sda' bus='sata'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='pci' index='0' model='pcie-root'/> + <controller type='pci' index='1' model='dmi-to-pci-bridge'/> + <controller type='pci' index='2' model='pci-bridge'/> + <controller type='pci' index='3' model='pcie-switch-upstream-port'/> + <controller type='sata' index='0'/> + <video> + <model type='qxl' ram='65536' vram='32768' vgamem='8192' heads='1'/> + </video> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 2acbae1..e5fcc5d 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -567,6 +567,7 @@ mymain(void) DO_TEST_DIFFERENT("pcie-root"); DO_TEST_DIFFERENT("q35"); DO_TEST("pcie-root-port"); + DO_TEST("pcie-switch-upstream-port");
DO_TEST("hostdev-scsi-lsi"); DO_TEST("hostdev-scsi-virtio-scsi");

On 06/23/2015 11:23 AM, Alex Williamson wrote:
On Mon, 2015-06-22 at 14:44 -0400, Laine Stump wrote:
diff --git a/src/conf/domain_addr.c b/src/conf/domain_addr.c index 4b5e81e..59da745 100644 --- a/src/conf/domain_addr.c +++ b/src/conf/domain_addr.c @@ -57,6 +57,9 @@ virDomainPCIAddressFlagsCompatible(virDevicePCIAddressPtr addr, if (devFlags & VIR_PCI_CONNECT_HOTPLUGGABLE) busFlags |= VIR_PCI_CONNECT_HOTPLUGGABLE; } + /* devices with PCIE_ONLY can't connect to PCI, even if fromConfig */ + if (devFlags & VIR_PCI_CONNECT_TYPE_PCIE_ONLY) + devFlags |= VIR_PCI_CONNECT_TYPE_PCIE;
/* If this bus doesn't allow the type of connection (PCI * vs. PCIe) required by the device, or if the device requires @@ -199,8 +202,14 @@ virDomainPCIAddressBusSetModel(virDomainPCIAddressBusPtr bus, case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT: /* provides one slot which is pcie and hotpluggable */ bus->flags = VIR_PCI_CONNECT_TYPE_PCIE | VIR_PCI_CONNECT_HOTPLUGGABLE; + bus->minSlot = 0; + bus->maxSlot = 0; Shouldn't this have belonged to a previous patch?
Yes, and I'm pretty certain that it originally was. I must have messed it up when resolving merge conflicts during my myriads of rebases to change names.
+ break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_UPSTREAM_PORT: + /* 31 slots, can only accept pcie-switch-port, no hotplug */ + bus->flags = VIR_PCI_CONNECT_TYPE_PCIE_SWITCH; bus->minSlot = 1; - bus->maxSlot = 1; + bus->maxSlot = 31; Why exactly are we reserving slot 0?
Uh, because I forgot to change it after talking to you? Thanks for catching it!
I know we discussed on IRC that slot 0 is reserved for the host bridge on root buses, but there are no reserved slots downstream of an upstream switch port.
(Btw, I've forgotten - is slot 0 also reserved for pci-bridge and 82801b11-bridge? Or can I send a patch to allow slot 0 on those as well?)

On Tue, 2015-06-23 at 13:47 -0400, Laine Stump wrote:
On 06/23/2015 11:23 AM, Alex Williamson wrote:
On Mon, 2015-06-22 at 14:44 -0400, Laine Stump wrote:
diff --git a/src/conf/domain_addr.c b/src/conf/domain_addr.c index 4b5e81e..59da745 100644 --- a/src/conf/domain_addr.c +++ b/src/conf/domain_addr.c @@ -57,6 +57,9 @@ virDomainPCIAddressFlagsCompatible(virDevicePCIAddressPtr addr, if (devFlags & VIR_PCI_CONNECT_HOTPLUGGABLE) busFlags |= VIR_PCI_CONNECT_HOTPLUGGABLE; } + /* devices with PCIE_ONLY can't connect to PCI, even if fromConfig */ + if (devFlags & VIR_PCI_CONNECT_TYPE_PCIE_ONLY) + devFlags |= VIR_PCI_CONNECT_TYPE_PCIE;
/* If this bus doesn't allow the type of connection (PCI * vs. PCIe) required by the device, or if the device requires @@ -199,8 +202,14 @@ virDomainPCIAddressBusSetModel(virDomainPCIAddressBusPtr bus, case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT: /* provides one slot which is pcie and hotpluggable */ bus->flags = VIR_PCI_CONNECT_TYPE_PCIE | VIR_PCI_CONNECT_HOTPLUGGABLE; + bus->minSlot = 0; + bus->maxSlot = 0; Shouldn't this have belonged to a previous patch?
Yes, and I'm pretty certain that it originally was. I must have messed it up when resolving merge conflicts during my myriads of rebases to change names.
+ break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_UPSTREAM_PORT: + /* 31 slots, can only accept pcie-switch-port, no hotplug */ + bus->flags = VIR_PCI_CONNECT_TYPE_PCIE_SWITCH; bus->minSlot = 1; - bus->maxSlot = 1; + bus->maxSlot = 31; Why exactly are we reserving slot 0?
Uh, because I forgot to change it after talking to you? Thanks for catching it!
I know we discussed on IRC that slot 0 is reserved for the host bridge on root buses, but there are no reserved slots downstream of an upstream switch port.
(Btw, I've forgotten - is slot 0 also reserved for pci-bridge and 82801b11-bridge? Or can I send a patch to allow slot 0 on those as well?)
I don't think there's anything special about slot 0 for either of those bridges, a quick test would verify. However, we're immediately into your case of changing the auto-assignment parameters. Have we always stored the resulting auto-assigned bus addresses for devices so we don't suddenly change the location of devices for existing domains? Thanks, Alex

On 06/23/2015 03:07 PM, Alex Williamson wrote:
I don't think there's anything special about slot 0 for either of those bridges, a quick test would verify. However, we're immediately into your case of changing the auto-assignment parameters. Have we always stored the resulting auto-assigned bus addresses for devices so we don't suddenly change the location of devices for existing domains?
Yes. libvirt always stores the bus/slot (and of course function and domain, which are currently both always 0 in the case of auto-assign) back in the XML. It has been this way for "a very long time".

this is backed by the qemu device x3130-upstream. --- src/qemu/qemu_command.c | 22 ++++++++++++++++++++++ .../qemuxml2argv-pcie-switch-upstream-port.args | 9 +++++++++ tests/qemuxml2argvtest.c | 9 +++++++++ 3 files changed, 40 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.args diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 01ac690..b4df65a 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1604,6 +1604,12 @@ qemuCollectPCIAddress(virDomainDefPtr def ATTRIBUTE_UNUSED, */ flags = VIR_PCI_CONNECT_TYPE_PCIE_ROOT; break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_UPSTREAM_PORT: + /* pcie-switch can only connect to a true + * pcie bus, and can't be hot-plugged. + */ + flags = VIR_PCI_CONNECT_TYPE_PCIE_ONLY; + break; default: break; } @@ -2355,6 +2361,12 @@ qemuAssignDevicePCISlots(virDomainDefPtr def, /* pcie-root-port can only plug into pcie-root */ flags = VIR_PCI_CONNECT_TYPE_PCIE_ROOT; break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_UPSTREAM_PORT: + /* pcie-switch really does need a real PCIe + * port, but it doesn't need to be pcie-root + */ + flags = VIR_PCI_CONNECT_TYPE_PCIE_ONLY; + break; default: flags = VIR_PCI_CONNECT_HOTPLUGGABLE | VIR_PCI_CONNECT_TYPE_PCI; break; @@ -4625,6 +4637,16 @@ qemuBuildControllerDevStr(virDomainDefPtr domainDef, virBufferAsprintf(&buf, "ioh3420,port=%d,chassis=%d,id=%s", def->idx, def->idx, def->info.alias); break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_UPSTREAM_PORT: + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_X3130_UPSTREAM)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("The pcie-switch-upstream-port " + "(x3130-upstream) controller " + "is not supported in this QEMU binary")); + goto error; + } + virBufferAsprintf(&buf, "x3130-upstream,id=%s", def->info.alias); + break; } break; diff --git a/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.args b/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.args new file mode 100644 index 0000000..d737b6b --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.args @@ -0,0 +1,9 @@ +LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \ +/usr/libexec/qemu-kvm -S -M q35 -m 2048 -smp 2 -nographic -nodefaults \ +-monitor unix:/tmp/test-monitor,server,nowait -no-acpi -boot c \ +-device i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e \ +-device pci-bridge,chassis_nr=2,id=pci.2,bus=pci.1,addr=0x1 \ +-device x3130-upstream,id=pci.3,bus=pcie.0,addr=0x2 \ +-drive file=/dev/HostVG/QEMUGuest1,if=none,id=drive-sata0-0-0 \ +-device ide-drive,bus=ide.0,drive=drive-sata0-0-0,id=sata0-0-0 \ +-vga qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=33554432 diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 999edff..b15352e 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1469,6 +1469,15 @@ mymain(void) QEMU_CAPS_VGA, QEMU_CAPS_DEVICE_VIDEO_PRIMARY, QEMU_CAPS_VGA, QEMU_CAPS_VGA_QXL, QEMU_CAPS_DEVICE_QXL); + DO_TEST("pcie-switch-upstream-port", + QEMU_CAPS_DEVICE, QEMU_CAPS_DEVICE_PCI_BRIDGE, + QEMU_CAPS_DEVICE_DMI_TO_PCI_BRIDGE, + QEMU_CAPS_DEVICE_IOH3420, + QEMU_CAPS_DEVICE_X3130_UPSTREAM, + QEMU_CAPS_DRIVE, QEMU_CAPS_ICH9_AHCI, + QEMU_CAPS_VGA, QEMU_CAPS_DEVICE_VIDEO_PRIMARY, + QEMU_CAPS_VGA, QEMU_CAPS_VGA_QXL, QEMU_CAPS_DEVICE_QXL); + DO_TEST("hostdev-scsi-lsi", QEMU_CAPS_DRIVE, QEMU_CAPS_DEVICE, QEMU_CAPS_DRIVE, QEMU_CAPS_VIRTIO_SCSI, QEMU_CAPS_SCSI_LSI, -- 2.1.0

The downstream ports of an x3130-upstream switch can each have one of these plugged into them (and that is the only place they can be connected). Each xio3130-downstream provides a single PCIe port that can have PCI or PCIe devices hotplugged into it. Apparently an entire set of x3130-upstream + several xio3130-downstreams can be hotplugged as a unit, but it's not clear to me yet how that would be done, since qemu only allows attaching a single device at a time. This device will be used to implement the "pcie-switch-port" model of pci controller. --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + tests/qemucapabilitiesdata/caps_1.2.2-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.3.1-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.4.2-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.5.3-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.6.0-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.6.50-1.caps | 1 + tests/qemucapabilitiesdata/caps_2.1.1-1.caps | 1 + tests/qemuhelptest.c | 6 ++++-- 10 files changed, 14 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 5a33efb..4103b6e 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -289,6 +289,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "vhost-user-multiqueue", /* 190 */ "ioh3420", "x3130-upstream", + "xio3130-downstream", ); @@ -1570,6 +1571,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = { { "pci-serial", QEMU_CAPS_DEVICE_PCI_SERIAL }, { "ioh3420", QEMU_CAPS_DEVICE_IOH3420 }, { "x3130-upstream", QEMU_CAPS_DEVICE_X3130_UPSTREAM }, + { "xio3130-downstream", QEMU_CAPS_DEVICE_XIO3130_DOWNSTREAM }, }; static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsVirtioBlk[] = { diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index 3d28941..121e06b 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -232,6 +232,7 @@ typedef enum { QEMU_CAPS_VHOSTUSER_MULTIQUEUE = 190, /* vhost-user with -netdev queues= */ QEMU_CAPS_DEVICE_IOH3420 = 191, /* -device ioh3420 */ QEMU_CAPS_DEVICE_X3130_UPSTREAM = 192, /* -device x3130-upstream */ + QEMU_CAPS_DEVICE_XIO3130_DOWNSTREAM = 193, /* -device xio3130-downstream */ QEMU_CAPS_LAST, /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/tests/qemucapabilitiesdata/caps_1.2.2-1.caps b/tests/qemucapabilitiesdata/caps_1.2.2-1.caps index 78d7b82..ba16635 100644 --- a/tests/qemucapabilitiesdata/caps_1.2.2-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.2.2-1.caps @@ -122,4 +122,5 @@ <flag name='qxl-vga.vgamem_mb'/> <flag name='ioh3420'/> <flag name='x3130-upstream'/> + <flag name='xio3130-downstream'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_1.3.1-1.caps b/tests/qemucapabilitiesdata/caps_1.3.1-1.caps index 7cec7f9..51cd6d9 100644 --- a/tests/qemucapabilitiesdata/caps_1.3.1-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.3.1-1.caps @@ -137,4 +137,5 @@ <flag name='pci-serial'/> <flag name='ioh3420'/> <flag name='x3130-upstream'/> + <flag name='xio3130-downstream'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_1.4.2-1.caps b/tests/qemucapabilitiesdata/caps_1.4.2-1.caps index f5f0034..03d0a3e 100644 --- a/tests/qemucapabilitiesdata/caps_1.4.2-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.4.2-1.caps @@ -138,4 +138,5 @@ <flag name='pci-serial'/> <flag name='ioh3420'/> <flag name='x3130-upstream'/> + <flag name='xio3130-downstream'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_1.5.3-1.caps b/tests/qemucapabilitiesdata/caps_1.5.3-1.caps index 9f0461a..e2f22e4 100644 --- a/tests/qemucapabilitiesdata/caps_1.5.3-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.5.3-1.caps @@ -147,4 +147,5 @@ <flag name='pci-serial'/> <flag name='ioh3420'/> <flag name='x3130-upstream'/> + <flag name='xio3130-downstream'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_1.6.0-1.caps b/tests/qemucapabilitiesdata/caps_1.6.0-1.caps index 1b23b82..874a050 100644 --- a/tests/qemucapabilitiesdata/caps_1.6.0-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.6.0-1.caps @@ -153,4 +153,5 @@ <flag name='pci-serial'/> <flag name='ioh3420'/> <flag name='x3130-upstream'/> + <flag name='xio3130-downstream'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_1.6.50-1.caps b/tests/qemucapabilitiesdata/caps_1.6.50-1.caps index ff0427f..dd3bcda 100644 --- a/tests/qemucapabilitiesdata/caps_1.6.50-1.caps +++ b/tests/qemucapabilitiesdata/caps_1.6.50-1.caps @@ -153,4 +153,5 @@ <flag name='pci-serial'/> <flag name='ioh3420'/> <flag name='x3130-upstream'/> + <flag name='xio3130-downstream'/> </qemuCaps> diff --git a/tests/qemucapabilitiesdata/caps_2.1.1-1.caps b/tests/qemucapabilitiesdata/caps_2.1.1-1.caps index 56b27e5..3ee2d6f 100644 --- a/tests/qemucapabilitiesdata/caps_2.1.1-1.caps +++ b/tests/qemucapabilitiesdata/caps_2.1.1-1.caps @@ -169,4 +169,5 @@ <flag name='pci-serial'/> <flag name='ioh3420'/> <flag name='x3130-upstream'/> + <flag name='xio3130-downstream'/> </qemuCaps> diff --git a/tests/qemuhelptest.c b/tests/qemuhelptest.c index 62b9a0c..8f317d4 100644 --- a/tests/qemuhelptest.c +++ b/tests/qemuhelptest.c @@ -755,7 +755,8 @@ mymain(void) QEMU_CAPS_SPLASH_TIMEOUT, QEMU_CAPS_DEVICE_IVSHMEM, QEMU_CAPS_DEVICE_IOH3420, - QEMU_CAPS_DEVICE_X3130_UPSTREAM); + QEMU_CAPS_DEVICE_X3130_UPSTREAM, + QEMU_CAPS_DEVICE_XIO3130_DOWNSTREAM); DO_TEST("qemu-1.1.0", 1001000, 0, 0, QEMU_CAPS_VNC_COLON, QEMU_CAPS_NO_REBOOT, @@ -857,7 +858,8 @@ mymain(void) QEMU_CAPS_SPLASH_TIMEOUT, QEMU_CAPS_DEVICE_IVSHMEM, QEMU_CAPS_DEVICE_IOH3420, - QEMU_CAPS_DEVICE_X3130_UPSTREAM); + QEMU_CAPS_DEVICE_X3130_UPSTREAM, + QEMU_CAPS_DEVICE_XIO3130_DOWNSTREAM); DO_TEST_FULL("qemu-1.2.0", 1002000, 0, 0, VIR_ERR_CONFIG_UNSUPPORTED, QEMU_CAPS_LAST); DO_TEST_FULL("qemu-kvm-1.2.0", 1002000, 1, 0, VIR_ERR_CONFIG_UNSUPPORTED, -- 2.1.0

This controller can be connected only to a port on a "pcie-switch-upstream-port". It provides a single hotpluggable port that will accept any PCI or PCIe device. --- docs/formatdomain.html.in | 31 +++++++++++++------ docs/schemas/domaincommon.rng | 1 + src/conf/domain_addr.c | 6 ++++ src/conf/domain_conf.c | 3 +- src/conf/domain_conf.h | 1 + .../qemuxml2argv-pcie-switch-downstream-port.xml | 36 ++++++++++++++++++++++ tests/qemuxml2xmltest.c | 1 + 7 files changed, 68 insertions(+), 11 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.xml diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 646c9ee..7e16dc4 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -2998,11 +2998,12 @@ PCI controllers have an optional <code>model</code> attribute with possible values <code>pci-root</code>, <code>pcie-root</code>, <code>pcie-root-port</code>, <code>pci-bridge</code>, - <code>dmi-to-pci-bridge</code>, or <code>pcie-switch-upstream-port</code>. + <code>dmi-to-pci-bridge</code>, <code>pcie-switch-upstream-port</code>, or + <code>pcie-switch-downstream-port</code>. (pci-root and pci-bridge <span class="since">since 1.0.5</span>, pcie-root and dmi-to-pci-bridge <span class="since">since - 1.1.2</span>, pcie-root-port and - pcie-switch-upstream-port <span class="since">since 1.2.17</span>) + 1.1.2</span>, pcie-root-port, pcie-switch-upstream-port, and + pcie-switch-downstream-port <span class="since">since 1.2.30</span>) The root controllers (<code>pci-root</code> and <code>pcie-root</code>) have an optional <code>pcihole64</code> element specifying how big (in kilobytes, or in the unit specified by <code>pcihole64</code>'s @@ -3058,14 +3059,24 @@ </p> <p> Domains with an implicit pcie-root can also add controllers - with <code>model='pcie-root-port'</code>. This is a simple type of - bridge device that can connect only to one of the 31 slots on - the pcie-root bus on its upstream side, and makes a single - (PCIe, hotpluggable) port (at slot='0') available on the - downstream side. This controller can be used to provide a single - slot to later hotplug a PCIe device (but is not itself + with <code>model='pcie-root-port'</code>, + <code>model='pcie-switch-upstream-port'</code>, + and <code>model='pcie-switch-downstream-port'</code>. pcie-root-port + is a simple type of bridge device that can connect only to one + of the 31 slots on the pcie-root bus on its upstream side, and + makes a single (PCIe, hotpluggable) port (at slot='0') available + on the downstream side. This controller can be used to provide a + single slot to later hotplug a PCIe device (but is not itself hotpluggable - it must be in the configuration when the domain - is started). (<span class="since">since 1.2.17</span>) + is started). pcie-switch-upstream-port is a more flexible (but + also more complex) device that can plug into any PCIe port on + the upstream side, and provides 31 ports on the downstream side + that accept only pcie-switch-downstream-port devices; each + pcie-switch-downstream-port device can only plug into a + pcie-switch-upstream-port on its upstream side, and on its + downstream side provides a single hotpluggable pcie port that + can accept any standard pci or pcie device. + (<span class="since">since 1.2.30</span>) </p> <pre> ... diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 3f74cfc..490c0e7 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -1753,6 +1753,7 @@ <value>dmi-to-pci-bridge</value> <value>pcie-root-port</value> <value>pcie-switch-upstream-port</value> + <value>pcie-switch-downstream-port</value> </choice> </attribute> </group> diff --git a/src/conf/domain_addr.c b/src/conf/domain_addr.c index 59da745..b97a62c 100644 --- a/src/conf/domain_addr.c +++ b/src/conf/domain_addr.c @@ -211,6 +211,12 @@ virDomainPCIAddressBusSetModel(virDomainPCIAddressBusPtr bus, bus->minSlot = 1; bus->maxSlot = 31; break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_DOWNSTREAM_PORT: + /* provides one slot which is pcie and hotpluggable */ + bus->flags = VIR_PCI_CONNECT_TYPE_PCIE | VIR_PCI_CONNECT_HOTPLUGGABLE; + bus->minSlot = 0; + bus->maxSlot = 0; + break; default: virReportError(VIR_ERR_INTERNAL_ERROR, _("Invalid PCI controller model %d"), model); diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index c0b8c92..8987e81 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -325,7 +325,8 @@ VIR_ENUM_IMPL(virDomainControllerModelPCI, VIR_DOMAIN_CONTROLLER_MODEL_PCI_LAST, "pci-bridge", "dmi-to-pci-bridge", "pcie-root-port", - "pcie-switch-upstream-port") + "pcie-switch-upstream-port", + "pcie-switch-downstream-port") VIR_ENUM_IMPL(virDomainControllerModelSCSI, VIR_DOMAIN_CONTROLLER_MODEL_SCSI_LAST, "auto", diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index b7ce0ce..1351eef 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -754,6 +754,7 @@ typedef enum { VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE, VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT_PORT, VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_UPSTREAM_PORT, + VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_DOWNSTREAM_PORT, VIR_DOMAIN_CONTROLLER_MODEL_PCI_LAST } virDomainControllerModelPCI; diff --git a/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.xml b/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.xml new file mode 100644 index 0000000..7d30bd1 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.xml @@ -0,0 +1,36 @@ +<domain type='qemu'> + <name>q35-test</name> + <uuid>11dbdcdd-4c3b-482b-8903-9bdb8c0a2774</uuid> + <memory unit='KiB'>2097152</memory> + <currentMemory unit='KiB'>2097152</currentMemory> + <vcpu placement='static' cpuset='0-1'>2</vcpu> + <os> + <type arch='x86_64' machine='q35'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/libexec/qemu-kvm</emulator> + <disk type='block' device='disk'> + <source dev='/dev/HostVG/QEMUGuest1'/> + <target dev='sda' bus='sata'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='pci' index='0' model='pcie-root'/> + <controller type='pci' index='1' model='dmi-to-pci-bridge'/> + <controller type='pci' index='2' model='pci-bridge'/> + <controller type='pci' index='3' model='pcie-switch-upstream-port'/> + <controller type='pci' index='4' model='pcie-switch-downstream-port'/> + <controller type='pci' index='5' model='pcie-switch-downstream-port'/> + <controller type='pci' index='6' model='pcie-switch-downstream-port'/> + <controller type='pci' index='7' model='pcie-switch-downstream-port'/> + <controller type='sata' index='0'/> + <video> + <model type='qxl' ram='65536' vram='32768' vgamem='8192' heads='1'/> + </video> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index e5fcc5d..d121fa0 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -568,6 +568,7 @@ mymain(void) DO_TEST_DIFFERENT("q35"); DO_TEST("pcie-root-port"); DO_TEST("pcie-switch-upstream-port"); + DO_TEST("pcie-switch-downstream-port"); DO_TEST("hostdev-scsi-lsi"); DO_TEST("hostdev-scsi-virtio-scsi"); -- 2.1.0

This is backed by the qemu device xio3130-downstream. It can only be connected to a pcie-switch-upstream-port (x3130-upstream) on the upstream side. --- src/qemu/qemu_command.c | 15 +++++++++++++++ .../qemuxml2argv-pcie-switch-downstream-port.args | 13 +++++++++++++ tests/qemuxml2argvtest.c | 9 +++++++++ 3 files changed, 37 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.args diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index b4df65a..064adcb 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -2367,6 +2367,10 @@ qemuAssignDevicePCISlots(virDomainDefPtr def, */ flags = VIR_PCI_CONNECT_TYPE_PCIE_ONLY; break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_DOWNSTREAM_PORT: + /* pcie-switch-port can only plug into pcie-switch */ + flags = VIR_PCI_CONNECT_TYPE_PCIE_SWITCH; + break; default: flags = VIR_PCI_CONNECT_HOTPLUGGABLE | VIR_PCI_CONNECT_TYPE_PCI; break; @@ -4647,6 +4651,17 @@ qemuBuildControllerDevStr(virDomainDefPtr domainDef, } virBufferAsprintf(&buf, "x3130-upstream,id=%s", def->info.alias); break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_DOWNSTREAM_PORT: + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_XIO3130_DOWNSTREAM)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("The pcie-switch-downstream-port " + "(xio3130-downstream) controller " + "is not supported in this QEMU binary")); + goto error; + } + virBufferAsprintf(&buf, "xio3130-downstream,port=1,chassis=%d,id=%s", + def->idx, def->info.alias); + break; } break; diff --git a/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.args b/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.args new file mode 100644 index 0000000..04b760c --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.args @@ -0,0 +1,13 @@ +LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \ +/usr/libexec/qemu-kvm -S -M q35 -m 2048 -smp 2 -nographic -nodefaults \ +-monitor unix:/tmp/test-monitor,server,nowait -no-acpi -boot c \ +-device i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e \ +-device pci-bridge,chassis_nr=2,id=pci.2,bus=pci.1,addr=0x1 \ +-device x3130-upstream,id=pci.3,bus=pcie.0,addr=0x2 \ +-device xio3130-downstream,port=1,chassis=4,id=pci.4,bus=pci.3,addr=0x1 \ +-device xio3130-downstream,port=1,chassis=5,id=pci.5,bus=pci.3,addr=0x2 \ +-device xio3130-downstream,port=1,chassis=6,id=pci.6,bus=pci.3,addr=0x3 \ +-device xio3130-downstream,port=1,chassis=7,id=pci.7,bus=pci.3,addr=0x4 \ +-drive file=/dev/HostVG/QEMUGuest1,if=none,id=drive-sata0-0-0 \ +-device ide-drive,bus=ide.0,drive=drive-sata0-0-0,id=sata0-0-0 \ +-vga qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=33554432 diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index b15352e..fc16152 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1477,6 +1477,15 @@ mymain(void) QEMU_CAPS_DRIVE, QEMU_CAPS_ICH9_AHCI, QEMU_CAPS_VGA, QEMU_CAPS_DEVICE_VIDEO_PRIMARY, QEMU_CAPS_VGA, QEMU_CAPS_VGA_QXL, QEMU_CAPS_DEVICE_QXL); + DO_TEST("pcie-switch-downstream-port", + QEMU_CAPS_DEVICE, QEMU_CAPS_DEVICE_PCI_BRIDGE, + QEMU_CAPS_DEVICE_DMI_TO_PCI_BRIDGE, + QEMU_CAPS_DEVICE_IOH3420, + QEMU_CAPS_DEVICE_X3130_UPSTREAM, + QEMU_CAPS_DEVICE_XIO3130_DOWNSTREAM, + QEMU_CAPS_DRIVE, QEMU_CAPS_ICH9_AHCI, + QEMU_CAPS_VGA, QEMU_CAPS_DEVICE_VIDEO_PRIMARY, + QEMU_CAPS_VGA, QEMU_CAPS_VGA_QXL, QEMU_CAPS_DEVICE_QXL); DO_TEST("hostdev-scsi-lsi", QEMU_CAPS_DRIVE, QEMU_CAPS_DEVICE, QEMU_CAPS_DRIVE, -- 2.1.0

On Mon, 2015-06-22 at 14:44 -0400, Laine Stump wrote:
This is backed by the qemu device xio3130-downstream. It can only be connected to a pcie-switch-upstream-port (x3130-upstream) on the upstream side. --- src/qemu/qemu_command.c | 15 +++++++++++++++ .../qemuxml2argv-pcie-switch-downstream-port.args | 13 +++++++++++++ tests/qemuxml2argvtest.c | 9 +++++++++ 3 files changed, 37 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.args
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index b4df65a..064adcb 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -2367,6 +2367,10 @@ qemuAssignDevicePCISlots(virDomainDefPtr def, */ flags = VIR_PCI_CONNECT_TYPE_PCIE_ONLY; break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_DOWNSTREAM_PORT: + /* pcie-switch-port can only plug into pcie-switch */ + flags = VIR_PCI_CONNECT_TYPE_PCIE_SWITCH; + break; default: flags = VIR_PCI_CONNECT_HOTPLUGGABLE | VIR_PCI_CONNECT_TYPE_PCI; break; @@ -4647,6 +4651,17 @@ qemuBuildControllerDevStr(virDomainDefPtr domainDef, } virBufferAsprintf(&buf, "x3130-upstream,id=%s", def->info.alias); break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_DOWNSTREAM_PORT: + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_XIO3130_DOWNSTREAM)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("The pcie-switch-downstream-port " + "(xio3130-downstream) controller " + "is not supported in this QEMU binary")); + goto error; + } + virBufferAsprintf(&buf, "xio3130-downstream,port=1,chassis=%d,id=%s", + def->idx, def->info.alias);
And why put all the downstream switch ports on port=1? I had suggested the slot address would match what I see on physical hardware. Is the slot address not available at this point? Thanks, Alex
+ break; } break;
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.args b/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.args new file mode 100644 index 0000000..04b760c --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.args @@ -0,0 +1,13 @@ +LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \ +/usr/libexec/qemu-kvm -S -M q35 -m 2048 -smp 2 -nographic -nodefaults \ +-monitor unix:/tmp/test-monitor,server,nowait -no-acpi -boot c \ +-device i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e \ +-device pci-bridge,chassis_nr=2,id=pci.2,bus=pci.1,addr=0x1 \ +-device x3130-upstream,id=pci.3,bus=pcie.0,addr=0x2 \ +-device xio3130-downstream,port=1,chassis=4,id=pci.4,bus=pci.3,addr=0x1 \ +-device xio3130-downstream,port=1,chassis=5,id=pci.5,bus=pci.3,addr=0x2 \ +-device xio3130-downstream,port=1,chassis=6,id=pci.6,bus=pci.3,addr=0x3 \ +-device xio3130-downstream,port=1,chassis=7,id=pci.7,bus=pci.3,addr=0x4 \ +-drive file=/dev/HostVG/QEMUGuest1,if=none,id=drive-sata0-0-0 \ +-device ide-drive,bus=ide.0,drive=drive-sata0-0-0,id=sata0-0-0 \ +-vga qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=33554432 diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index b15352e..fc16152 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1477,6 +1477,15 @@ mymain(void) QEMU_CAPS_DRIVE, QEMU_CAPS_ICH9_AHCI, QEMU_CAPS_VGA, QEMU_CAPS_DEVICE_VIDEO_PRIMARY, QEMU_CAPS_VGA, QEMU_CAPS_VGA_QXL, QEMU_CAPS_DEVICE_QXL); + DO_TEST("pcie-switch-downstream-port", + QEMU_CAPS_DEVICE, QEMU_CAPS_DEVICE_PCI_BRIDGE, + QEMU_CAPS_DEVICE_DMI_TO_PCI_BRIDGE, + QEMU_CAPS_DEVICE_IOH3420, + QEMU_CAPS_DEVICE_X3130_UPSTREAM, + QEMU_CAPS_DEVICE_XIO3130_DOWNSTREAM, + QEMU_CAPS_DRIVE, QEMU_CAPS_ICH9_AHCI, + QEMU_CAPS_VGA, QEMU_CAPS_DEVICE_VIDEO_PRIMARY, + QEMU_CAPS_VGA, QEMU_CAPS_VGA_QXL, QEMU_CAPS_DEVICE_QXL);
DO_TEST("hostdev-scsi-lsi", QEMU_CAPS_DRIVE, QEMU_CAPS_DEVICE, QEMU_CAPS_DRIVE,

On 06/23/2015 11:35 AM, Alex Williamson wrote:
On Mon, 2015-06-22 at 14:44 -0400, Laine Stump wrote:
This is backed by the qemu device xio3130-downstream. It can only be connected to a pcie-switch-upstream-port (x3130-upstream) on the upstream side. --- src/qemu/qemu_command.c | 15 +++++++++++++++ .../qemuxml2argv-pcie-switch-downstream-port.args | 13 +++++++++++++ tests/qemuxml2argvtest.c | 9 +++++++++ 3 files changed, 37 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.args
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index b4df65a..064adcb 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -2367,6 +2367,10 @@ qemuAssignDevicePCISlots(virDomainDefPtr def, */ flags = VIR_PCI_CONNECT_TYPE_PCIE_ONLY; break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_DOWNSTREAM_PORT: + /* pcie-switch-port can only plug into pcie-switch */ + flags = VIR_PCI_CONNECT_TYPE_PCIE_SWITCH; + break; default: flags = VIR_PCI_CONNECT_HOTPLUGGABLE | VIR_PCI_CONNECT_TYPE_PCI; break; @@ -4647,6 +4651,17 @@ qemuBuildControllerDevStr(virDomainDefPtr domainDef, } virBufferAsprintf(&buf, "x3130-upstream,id=%s", def->info.alias); break; + case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_SWITCH_DOWNSTREAM_PORT: + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_XIO3130_DOWNSTREAM)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("The pcie-switch-downstream-port " + "(xio3130-downstream) controller " + "is not supported in this QEMU binary")); + goto error; + } + virBufferAsprintf(&buf, "xio3130-downstream,port=1,chassis=%d,id=%s", + def->idx, def->info.alias); And why put all the downstream switch ports on port=1?
Just because that's what was shown as an example in docs/q35-chipset.cfg in the qemu source.
I had suggested the slot address would match what I see on physical hardware. Is the slot address not available at this point?
Yes, the slot is available - def->info.addr.pci.slot. I was just in "new information overload" and missed that part. I'll do it that way the next round. Thanks again for the comments.

On Mon, Jun 22, 2015 at 02:44:05PM -0400, Laine Stump wrote:
The first 4 patches are bugfixes/reorganizations that have no controversy.
The sets of 5-7, 8-10, and 11-13 each implement a new model of PCI controller:
5-7 - <controller type='pci' model='pcie-root-port'/> This is based on qemu's ioh3420.
8-10 - <controller type='pci' model='pcie-switch-upstream-port'/> Based on qemu's x3130-upstream
11-13 - <controller type='pci' model='pcie-switch-downstream-port'/> (xio3130-downstream)
The first patch of each set adds a capability bit for qemu (again non-controversial), the 2nd adds the new pci controller model, and the 3rd implements that model in qemu (by checking for the capability and adding a commandline arg or failing).
The "controversial"/RFC bit is this - talking to Alex Williamson (after I'd rwritten these patches, which is why I'm presenting them in a form that I *don't* want to push) about the possibility of qemu adding generic root-port, switch-upstream-port, and switch-downstream-port controllers, and possibly also a generic dmi-to-pci-bridge (i.e. controllers not tied to any particular hardware implementation as with those currently available), I'm realizing that, while it was a correct decision to make all of the existing pci controllers "type='pci'" (since they share an address space), using the "model" attribute to set the kind of controller was probably a mistake. The problem - if:
<controller type='pci' model='dmi-to-pci-bridge'/>
currently means to add an i82801b11-bridge controller to the domain, once qemu implements a generic dmi-to-pci-bridge, how will *that* be denoted, and how will we avoid replacing the existing i81801b11-bridge in a particular domain with the generic version when a guest is restarted following a qemu/libvirt upgrade?
In hindsight, it probably would have been better to do something like this with the four existing pci controllers:
<controller type='pci' subType='dmi-to-pci-bridge' model='i82801b11-bridge'/> <controller type='pci' subType='pci-bridge' model='pci-bridge'/> (or maybe blank?) <controller type='pci' subType='pci-root'/> (again maybe model is blank) <controller type='pci' subType='pcie-root'/>(and again)
(instead, what is shown above as "subType" is currently placed in the "model" attribute).
Then we could add the 3 new types like this:
<controller type='pci' subType='pcie-root-port' model='ioh3420'/> <controller type='pci' subType='pcie-switch-upstream-port' model='x3130-upstream/> <controller type='pci' subType='pcie-switch-downstream-port' model='xio3130-downstream/>
and we would easily be able to add support for new generic controllers that behaved identically, by just adding a new model. But we haven't done that, and anything we do in the future must be backwards compatible with what's already there (right?). I'm trying to think of how to satisfy backward compatibility while making things work better in the future.
Some ideas, in no particular order:
=== Idea 1: multiplex the meaning of the "model" attribute, so we currently have:
<controller type='pci' model='dmi-to-pci-bridge'/>
which means "add an i82801b11-bridge device" and when we add a generic version of this type of controller, we would do it with something like:
<controller type='pci' model='generic-dmi-to-pci-bridge'/>
and for another vendor's mythical controller:
<controller type='pci' model='xyz-dmi-to-pci-bridge'/>
Cons: This will make for ugliness in switch statements where a new case will have to be added whenever a different controller with similar behavior/usage is supported. And it's generally not a good idea to have a single attribute be used for two different functions.
===
Idea 2: implement new controllers as suggested in "20/20 hindsight" above. For controllers in existing domains (dmi-to-pci-bridge, pic-bridge, pci-root, and pcie-root) imply it into the controller definition of an existing domain when only model has been given (but don't write it out that way, to preserve the ability to downgrade). So this:
[1] <controller type='pci' model='dmi-to-pci-bridge'/>
would internally mean this:
[2] <controller type='pci' subType='dmi-to-pci-bridge' model='i82801b11-bridge'/>
(but would remain as [1] when config is rewritten/migrated) while this:
[3] <controller type='pci' subType='dmi-to-pci-bridge' model='anything whatsoever/>
would mean exactly what it says.
Cons: Keeping this straight would mean having some sort of "oldStyleCompat" flag in the controller object, to be sure that [1] wasn't sent in migration status as [2] (since the destination might not recognize it). It would also mean keeping the code in the parser and formatter to deal with this flag. Forever.
=== Idea 3: interpret controllers with missing subType as above, but actually write it out to the config/migration/etc in the new modified format.
Cons: This would prevent downgrading libvirt or migrating from a host with newer libvirt to one with older libvirt. (Although preserving compatibility at some level when downgrading may be a stated requirement of some downstream distros' builds of libvirt, I think for upstream it is only a "best effort"; I'm just not certain how much "best" is considered to be :-)
=== Idea 4: Unlike other uses of "model" in libvirt, for pci controllers, continue to use "model" to denote the subtype/class/whatever of controller, and create a new attribute to denote the different specific implementations of each one. So for example:
[4] <controller type='pci' model='dmi-to-pci-bridge'/>
would become:
[5] <controller type='pci' model='dmi-to-pci-bridge' implementation='i82801b11-bridge'/>
(or some other name in place of "implementation" - ideas? I'm horrible at thinkin up names)
Pros: wouldn't create compatibility problems when downgrading or migrating cross version.
Cons: Is inconsistent with every other use of "model" attribute in libvirt, and each new addition of a PCI controller further propagates this misuse.
I must say that I thought of this ^^ exactly when reading first three ideas. I wouldn't necessarily say it's a "misuse" of the 'model' naming. We can say it's 'model' and 'subModel' or 'driver' or whatever. One thing to add to pros is that if you don't care about the implementation (submodel/driver), then you don't need to increase the size of the XML. Pus the code complexity added is not greater than the benefit gained. Having said that, I don't insist on that, it's merely my two cents that I like the last idea the best.
====
I currently like either option 2 or 3 (depending on how good we want to be about supporting downgrade/intra-version migration), but as is obvious by the fact that it was me that suggested putting the type of pci controller into "model", I am very good at making the wrong decision on matters like this.
Whatever everyone thinks is best, patches 5-13 of this series would be changed accordingly, and possibly a couple new patches would be made to adjust the 4 existing controller types.
Note that this will also effect the libvirt support for the upcoming qemu "pxb" controller, which is a PCI root bus that can be placed in its own numa node (my description may be incorrect, but I think it gets the idea across). Anyway, with idea 2 or 3 it would be:
<controller type='pci' subType='pci-root' model='pxb'/>
(along with some options to setup numa info).
So, along with any comments on the individual patches (including whether the specific args added to the qemu commandline are correct - I'm looking at you, qemu people!), I would appreciate opinions on how I can save us from this misuse of "model" that I've created.
Laine Stump (13): qemu: refactor qemuBuildControllerDevStr to eliminate future duplicate code qemu: always permit PCI devices to be manually assigned to a PCIe bus qemu: ignore assumptions about hotplug requirement when address is from config docs: document when pcie-root/dmi-to-pci-bridge support was added qemu: add capabilities bit for device ioh3420 conf: new pci controller model "pcie-root-port" qemu: support new pci controller model "pcie-root-port" qemu: add capabilities bit for device x3130-upstream conf: new pci controller model "pcie-switch-upstream-port" qemu: support new pci controller model "pcie-switch-upstream-port" qemu: add capabilities bit for device xio3130-downstream conf: new pcie-controller model "pcie-switch-downstream-port" qemu: support new pci controller model "pcie-switch-downstream-port"
docs/formatdomain.html.in | 48 +++++++++--- docs/schemas/domaincommon.rng | 3 + src/conf/domain_addr.c | 47 ++++++++++-- src/conf/domain_addr.h | 23 ++++-- src/conf/domain_conf.c | 5 +- src/conf/domain_conf.h | 3 + src/qemu/qemu_capabilities.c | 8 +- src/qemu/qemu_capabilities.h | 5 +- src/qemu/qemu_command.c | 86 +++++++++++++++++----- tests/qemucapabilitiesdata/caps_1.2.2-1.caps | 3 + tests/qemucapabilitiesdata/caps_1.3.1-1.caps | 3 + tests/qemucapabilitiesdata/caps_1.4.2-1.caps | 3 + tests/qemucapabilitiesdata/caps_1.5.3-1.caps | 3 + tests/qemucapabilitiesdata/caps_1.6.0-1.caps | 3 + tests/qemucapabilitiesdata/caps_1.6.50-1.caps | 3 + tests/qemucapabilitiesdata/caps_2.1.1-1.caps | 3 + tests/qemuhelptest.c | 10 ++- .../qemuxml2argv-pcie-root-port.args | 10 +++ .../qemuxml2argv-pcie-root-port.xml | 33 +++++++++ .../qemuxml2argv-pcie-switch-downstream-port.args | 13 ++++ .../qemuxml2argv-pcie-switch-downstream-port.xml | 36 +++++++++ .../qemuxml2argv-pcie-switch-upstream-port.args | 9 +++ .../qemuxml2argv-pcie-switch-upstream-port.xml | 32 ++++++++ tests/qemuxml2argvtest.c | 25 +++++++ tests/qemuxml2xmltest.c | 3 + 25 files changed, 375 insertions(+), 45 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-root-port.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-downstream-port.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pcie-switch-upstream-port.xml
-- 2.1.0
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 06/23/2015 04:40 AM, Martin Kletzander wrote:
On Mon, Jun 22, 2015 at 02:44:05PM -0400, Laine Stump wrote:
The first 4 patches are bugfixes/reorganizations that have no controversy.
The sets of 5-7, 8-10, and 11-13 each implement a new model of PCI controller:
5-7 - <controller type='pci' model='pcie-root-port'/> This is based on qemu's ioh3420.
8-10 - <controller type='pci' model='pcie-switch-upstream-port'/> Based on qemu's x3130-upstream
11-13 - <controller type='pci' model='pcie-switch-downstream-port'/> (xio3130-downstream)
The first patch of each set adds a capability bit for qemu (again non-controversial), the 2nd adds the new pci controller model, and the 3rd implements that model in qemu (by checking for the capability and adding a commandline arg or failing).
The "controversial"/RFC bit is this - talking to Alex Williamson (after I'd rwritten these patches, which is why I'm presenting them in a form that I *don't* want to push) about the possibility of qemu adding generic root-port, switch-upstream-port, and switch-downstream-port controllers, and possibly also a generic dmi-to-pci-bridge (i.e. controllers not tied to any particular hardware implementation as with those currently available), I'm realizing that, while it was a correct decision to make all of the existing pci controllers "type='pci'" (since they share an address space), using the "model" attribute to set the kind of controller was probably a mistake. The problem - if:
<controller type='pci' model='dmi-to-pci-bridge'/>
currently means to add an i82801b11-bridge controller to the domain, once qemu implements a generic dmi-to-pci-bridge, how will *that* be denoted, and how will we avoid replacing the existing i81801b11-bridge in a particular domain with the generic version when a guest is restarted following a qemu/libvirt upgrade?
In hindsight, it probably would have been better to do something like this with the four existing pci controllers:
<controller type='pci' subType='dmi-to-pci-bridge' model='i82801b11-bridge'/> <controller type='pci' subType='pci-bridge' model='pci-bridge'/> (or maybe blank?) <controller type='pci' subType='pci-root'/> (again maybe model is blank) <controller type='pci' subType='pcie-root'/>(and again)
(instead, what is shown above as "subType" is currently placed in the "model" attribute).
Then we could add the 3 new types like this:
<controller type='pci' subType='pcie-root-port' model='ioh3420'/> <controller type='pci' subType='pcie-switch-upstream-port' model='x3130-upstream/> <controller type='pci' subType='pcie-switch-downstream-port' model='xio3130-downstream/>
and we would easily be able to add support for new generic controllers that behaved identically, by just adding a new model. But we haven't done that, and anything we do in the future must be backwards compatible with what's already there (right?). I'm trying to think of how to satisfy backward compatibility while making things work better in the future.
Some ideas, in no particular order:
=== Idea 1: multiplex the meaning of the "model" attribute, so we currently have:
<controller type='pci' model='dmi-to-pci-bridge'/>
which means "add an i82801b11-bridge device" and when we add a generic version of this type of controller, we would do it with something like:
<controller type='pci' model='generic-dmi-to-pci-bridge'/>
and for another vendor's mythical controller:
<controller type='pci' model='xyz-dmi-to-pci-bridge'/>
Cons: This will make for ugliness in switch statements where a new case will have to be added whenever a different controller with similar behavior/usage is supported. And it's generally not a good idea to have a single attribute be used for two different functions.
===
Idea 2: implement new controllers as suggested in "20/20 hindsight" above. For controllers in existing domains (dmi-to-pci-bridge, pic-bridge, pci-root, and pcie-root) imply it into the controller definition of an existing domain when only model has been given (but don't write it out that way, to preserve the ability to downgrade). So this:
[1] <controller type='pci' model='dmi-to-pci-bridge'/>
would internally mean this:
[2] <controller type='pci' subType='dmi-to-pci-bridge' model='i82801b11-bridge'/>
(but would remain as [1] when config is rewritten/migrated) while this:
[3] <controller type='pci' subType='dmi-to-pci-bridge' model='anything whatsoever/>
would mean exactly what it says.
Cons: Keeping this straight would mean having some sort of "oldStyleCompat" flag in the controller object, to be sure that [1] wasn't sent in migration status as [2] (since the destination might not recognize it). It would also mean keeping the code in the parser and formatter to deal with this flag. Forever.
=== Idea 3: interpret controllers with missing subType as above, but actually write it out to the config/migration/etc in the new modified format.
Cons: This would prevent downgrading libvirt or migrating from a host with newer libvirt to one with older libvirt. (Although preserving compatibility at some level when downgrading may be a stated requirement of some downstream distros' builds of libvirt, I think for upstream it is only a "best effort"; I'm just not certain how much "best" is considered to be :-)
=== Idea 4: Unlike other uses of "model" in libvirt, for pci controllers, continue to use "model" to denote the subtype/class/whatever of controller, and create a new attribute to denote the different specific implementations of each one. So for example:
[4] <controller type='pci' model='dmi-to-pci-bridge'/>
would become:
[5] <controller type='pci' model='dmi-to-pci-bridge' implementation='i82801b11-bridge'/>
(or some other name in place of "implementation" - ideas? I'm horrible at thinkin up names)
Pros: wouldn't create compatibility problems when downgrading or migrating cross version.
Cons: Is inconsistent with every other use of "model" attribute in libvirt, and each new addition of a PCI controller further propagates this misuse.
I must say that I thought of this ^^ exactly when reading first three ideas. I wouldn't necessarily say it's a "misuse" of the 'model' naming. We can say it's 'model' and 'subModel' or 'driver' or whatever. One thing to add to pros is that if you don't care about the implementation (submodel/driver), then you don't need to increase the size of the XML. Pus the code complexity added is not greater than the benefit gained.
Yes, it definitely is the simplest, provides full functionality, and gives no downgrade/version compatibility problems. I'm just a pedant about consistency, and this would bother me *forever*, so I'm either looking for a better way, or a fact that I can use to placate my feelings of regret at having not done it right in the first place :-) In libvirt, "model" is usually used to describe the exact device provided to the guest, in many/most cases by spelling out the exact model number of the real hardware that is being emulated, while "driver" is usually an element that provides details about the internal implementation of that model. So if I did it this way, I don't think "driver" is the right name (and neither is my hastily suggested "implementation"). It might be okay as "subModel" though (but starting from scratch I would still prefer type, subType, and model)

On Mon, Jun 22, 2015 at 02:44:05PM -0400, Laine Stump wrote:
The first 4 patches are bugfixes/reorganizations that have no controversy.
The sets of 5-7, 8-10, and 11-13 each implement a new model of PCI controller:
5-7 - <controller type='pci' model='pcie-root-port'/> This is based on qemu's ioh3420.
8-10 - <controller type='pci' model='pcie-switch-upstream-port'/> Based on qemu's x3130-upstream
11-13 - <controller type='pci' model='pcie-switch-downstream-port'/> (xio3130-downstream)
The first patch of each set adds a capability bit for qemu (again non-controversial), the 2nd adds the new pci controller model, and the 3rd implements that model in qemu (by checking for the capability and adding a commandline arg or failing).
The "controversial"/RFC bit is this - talking to Alex Williamson (after I'd rwritten these patches, which is why I'm presenting them in a form that I *don't* want to push) about the possibility of qemu adding generic root-port, switch-upstream-port, and switch-downstream-port controllers, and possibly also a generic dmi-to-pci-bridge (i.e. controllers not tied to any particular hardware implementation as with those currently available), I'm realizing that, while it was a correct decision to make all of the existing pci controllers "type='pci'" (since they share an address space), using the "model" attribute to set the kind of controller was probably a mistake. The problem - if:
<controller type='pci' model='dmi-to-pci-bridge'/>
currently means to add an i82801b11-bridge controller to the domain, once qemu implements a generic dmi-to-pci-bridge, how will *that* be denoted, and how will we avoid replacing the existing i81801b11-bridge in a particular domain with the generic version when a guest is restarted following a qemu/libvirt upgrade?
In hindsight, it probably would have been better to do something like this with the four existing pci controllers:
<controller type='pci' subType='dmi-to-pci-bridge' model='i82801b11-bridge'/> <controller type='pci' subType='pci-bridge' model='pci-bridge'/> (or maybe blank?) <controller type='pci' subType='pci-root'/> (again maybe model is blank) <controller type='pci' subType='pcie-root'/>(and again)
(instead, what is shown above as "subType" is currently placed in the "model" attribute).
So far, just the dmi-to-pci-bridge does not match the actual device model used by qemu -- pci-bridge is generic and we don't format the -roots.
Then we could add the 3 new types like this:
<controller type='pci' subType='pcie-root-port' model='ioh3420'/> <controller type='pci' subType='pcie-switch-upstream-port' model='x3130-upstream/> <controller type='pci' subType='pcie-switch-downstream-port' model='xio3130-downstream/>
and we would easily be able to add support for new generic controllers that behaved identically, by just adding a new model. But we haven't done that, and anything we do in the future must be backwards compatible with what's already there (right?). I'm trying to think of how to satisfy backward compatibility while making things work better in the future.
Some ideas, in no particular order:
=== Idea 1: multiplex the meaning of the "model" attribute, so we currently have:
<controller type='pci' model='dmi-to-pci-bridge'/>
which means "add an i82801b11-bridge device" and when we add a generic version of this type of controller, we would do it with something like:
<controller type='pci' model='generic-dmi-to-pci-bridge'/>
and for another vendor's mythical controller:
<controller type='pci' model='xyz-dmi-to-pci-bridge'/>
This could be just 'xyz'. We do not have <model type='rtl8193-pci-network-card'/> or <memballoon model='virtio-memballoon'/> For the new ones, it would be model='ioh3420', 'x3130-upstream', etc.
Cons: This will make for ugliness in switch statements where a new case will have to be added whenever a different controller with similar behavior/usage is supported. And it's generally not a good idea to have a single attribute be used for two different functions.
===
Idea 2: implement new controllers as suggested in "20/20 hindsight" above. For controllers in existing domains (dmi-to-pci-bridge, pic-bridge, pci-root, and pcie-root) imply it into the controller definition of an existing domain when only model has been given (but don't write it out that way, to preserve the ability to downgrade). So this:
[1] <controller type='pci' model='dmi-to-pci-bridge'/>
would internally mean this:
[2] <controller type='pci' subType='dmi-to-pci-bridge' model='i82801b11-bridge'/>
If we want to allow forward migration, we need to treat a missing model as 'i82801b11-bridge' anyway.
(but would remain as [1] when config is rewritten/migrated) while this:
[3] <controller type='pci' subType='dmi-to-pci-bridge' model='anything whatsoever/>
would mean exactly what it says.
Cons: Keeping this straight would mean having some sort of "oldStyleCompat" flag in the controller object, to be sure that [1] wasn't sent in migration status as [2] (since the destination might not recognize it). It would also mean keeping the code in the parser and formatter to deal with this flag. Forever.
So when would the 'subType' be formatted? If it's just internal, this seems to be identical with my suggestion to Idea #1.
=== Idea 3: interpret controllers with missing subType as above, but actually write it out to the config/migration/etc in the new modified format.
Cons: This would prevent downgrading libvirt or migrating from a host with newer libvirt to one with older libvirt. (Although preserving compatibility at some level when downgrading may be a stated requirement of some downstream distros' builds of libvirt, I think for upstream it is only a "best effort"; I'm just not certain how much "best" is considered to be :-)
I do not know of any effort done to make downgrading libvirt work. Any machine configs that use new values for old attributes will disappear (and so will running machines, because of new qemu capabilities). Migration is somewhat supported, so the compatible format could be used only when the model is default and VIR_DOMAIN_DEF_FORMAT_MIGRATABLE was specified?
=== Idea 4: Unlike other uses of "model" in libvirt, for pci controllers, continue to use "model" to denote the subtype/class/whatever of controller, and create a new attribute to denote the different specific implementations of each one. So for example:
[4] <controller type='pci' model='dmi-to-pci-bridge'/>
would become:
[5] <controller type='pci' model='dmi-to-pci-bridge' implementation='i82801b11-bridge'/>
(or some other name in place of "implementation" - ideas? I'm horrible at thinkin up names)
device? actualModel? :)
Pros: wouldn't create compatibility problems when downgrading or migrating cross version.
If you tried to migrate to older libvirt with: model='dmi-to-pci-bridge' impl='generic', older libvirt would not parse the impl flag and create a machine with i8201b11-bridge. (Assuming the QEMU would know the machine type).
Cons: Is inconsistent with every other use of "model" attribute in libvirt, and each new addition of a PCI controller further propagates this misuse.
It is consistent with all the other inconsistencies in libvirt :)
====
I currently like either option 2 or 3 (depending on how good we want to be about supporting downgrade/intra-version migration), but as is obvious by the fact that it was me that suggested putting the type of pci controller into "model", I am very good at making the wrong decision on matters like this.
My favorite would be #2 without the subType (AKA #1 without the controller type in new model names) - it is implied from the model anyway. A more structured XML might be user-frendlier, but this way even old libvirt will recognize that it cannot start the machine because of an incompatible model. This way only the dmi-to-pci-bridge -> i82801b11-bridge mapping would be odd. Jan

On 06/23/2015 11:57 AM, Ján Tomko wrote:
On Mon, Jun 22, 2015 at 02:44:05PM -0400, Laine Stump wrote:
The first 4 patches are bugfixes/reorganizations that have no controversy.
The sets of 5-7, 8-10, and 11-13 each implement a new model of PCI controller:
5-7 - <controller type='pci' model='pcie-root-port'/> This is based on qemu's ioh3420.
8-10 - <controller type='pci' model='pcie-switch-upstream-port'/> Based on qemu's x3130-upstream
11-13 - <controller type='pci' model='pcie-switch-downstream-port'/> (xio3130-downstream)
The first patch of each set adds a capability bit for qemu (again non-controversial), the 2nd adds the new pci controller model, and the 3rd implements that model in qemu (by checking for the capability and adding a commandline arg or failing).
The "controversial"/RFC bit is this - talking to Alex Williamson (after I'd rwritten these patches, which is why I'm presenting them in a form that I *don't* want to push) about the possibility of qemu adding generic root-port, switch-upstream-port, and switch-downstream-port controllers, and possibly also a generic dmi-to-pci-bridge (i.e. controllers not tied to any particular hardware implementation as with those currently available), I'm realizing that, while it was a correct decision to make all of the existing pci controllers "type='pci'" (since they share an address space), using the "model" attribute to set the kind of controller was probably a mistake. The problem - if:
<controller type='pci' model='dmi-to-pci-bridge'/>
currently means to add an i82801b11-bridge controller to the domain, once qemu implements a generic dmi-to-pci-bridge, how will *that* be denoted, and how will we avoid replacing the existing i81801b11-bridge in a particular domain with the generic version when a guest is restarted following a qemu/libvirt upgrade?
In hindsight, it probably would have been better to do something like this with the four existing pci controllers:
<controller type='pci' subType='dmi-to-pci-bridge' model='i82801b11-bridge'/> <controller type='pci' subType='pci-bridge' model='pci-bridge'/> (or maybe blank?) <controller type='pci' subType='pci-root'/> (again maybe model is blank) <controller type='pci' subType='pcie-root'/>(and again)
(instead, what is shown above as "subType" is currently placed in the "model" attribute). So far, just the dmi-to-pci-bridge does not match the actual device model used by qemu -- pci-bridge is generic and we don't format the -roots.
Not yet, but as far as I understand, the pxb is another kind of pci-root.
Then we could add the 3 new types like this:
<controller type='pci' subType='pcie-root-port' model='ioh3420'/> <controller type='pci' subType='pcie-switch-upstream-port' model='x3130-upstream/> <controller type='pci' subType='pcie-switch-downstream-port' model='xio3130-downstream/>
and we would easily be able to add support for new generic controllers that behaved identically, by just adding a new model. But we haven't done that, and anything we do in the future must be backwards compatible with what's already there (right?). I'm trying to think of how to satisfy backward compatibility while making things work better in the future.
Some ideas, in no particular order:
=== Idea 1: multiplex the meaning of the "model" attribute, so we currently have:
<controller type='pci' model='dmi-to-pci-bridge'/>
which means "add an i82801b11-bridge device" and when we add a generic version of this type of controller, we would do it with something like:
<controller type='pci' model='generic-dmi-to-pci-bridge'/>
and for another vendor's mythical controller:
<controller type='pci' model='xyz-dmi-to-pci-bridge'/>
This could be just 'xyz'. We do not have <model type='rtl8193-pci-network-card'/> or <memballoon model='virtio-memballoon'/>
because in those cases the type of hardware is already well known and the different models have a similar function on the guest (in form, if not exactly in implementation)
For the new ones, it would be model='ioh3420', 'x3130-upstream', etc.
I like having the XML give a clue to the exact function of the device without needing to look up exactly what each specific device from some random chipset does.
Cons: This will make for ugliness in switch statements where a new case will have to be added whenever a different controller with similar behavior/usage is supported. And it's generally not a good idea to have a single attribute be used for two different functions.
===
Idea 2: implement new controllers as suggested in "20/20 hindsight" above. For controllers in existing domains (dmi-to-pci-bridge, pic-bridge, pci-root, and pcie-root) imply it into the controller definition of an existing domain when only model has been given (but don't write it out that way, to preserve the ability to downgrade). So this:
[1] <controller type='pci' model='dmi-to-pci-bridge'/>
would internally mean this:
[2] <controller type='pci' subType='dmi-to-pci-bridge' model='i82801b11-bridge'/>
If we want to allow forward migration, we need to treat a missing model as 'i82801b11-bridge' anyway.
Correct. Just as we set a network device model of rtl8139 for any <interface> that has no model (because rtl8139 has always been the default up until now, but may not be in the future), we would treat a missing [whatever it is] as "i82801b11-bridge", and also encode it into the XML whenever this domain is re-saved to disk.
(but would remain as [1] when config is rewritten/migrated) while this:
[3] <controller type='pci' subType='dmi-to-pci-bridge' model='anything whatsoever/>
would mean exactly what it says.
Cons: Keeping this straight would mean having some sort of "oldStyleCompat" flag in the controller object, to be sure that [1] wasn't sent in migration status as [2] (since the destination might not recognize it). It would also mean keeping the code in the parser and formatter to deal with this flag. Forever. So when would the 'subType' be formatted? If it's just internal, this seems to be identical with my suggestion to Idea #1.
The idea would be that when it is specified in the XML of define/create, it will also be formatted. But I think I've soured on this idea anyway.
=== Idea 3: interpret controllers with missing subType as above, but actually write it out to the config/migration/etc in the new modified format.
Cons: This would prevent downgrading libvirt or migrating from a host with newer libvirt to one with older libvirt. (Although preserving compatibility at some level when downgrading may be a stated requirement of some downstream distros' builds of libvirt, I think for upstream it is only a "best effort"; I'm just not certain how much "best" is considered to be :-)
I do not know of any effort done to make downgrading libvirt work.
You haven't talked to enough people deploying RHEV/oVirt in production - they want the ability to upgrade some nodes, migrate guests over to them, then migrate the guests back if the upgrade needs to be undone.
Any machine configs that use new values for old attributes will disappear (and so will running machines, because of new qemu capabilities).
Migration is somewhat supported, so the compatible format could be used only when the model is default and VIR_DOMAIN_DEF_FORMAT_MIGRATABLE was specified?
Isn't VIR_DOMAIN_DEF_FORMAT_MIGRATABLE there in part so that migrations from newer libvirt to older libvirt might work? (it doesn't guarantee it, but it does make it work in some situations where it otherwise wouldn't).
=== Idea 4: Unlike other uses of "model" in libvirt, for pci controllers, continue to use "model" to denote the subtype/class/whatever of controller, and create a new attribute to denote the different specific implementations of each one. So for example:
[4] <controller type='pci' model='dmi-to-pci-bridge'/>
would become:
[5] <controller type='pci' model='dmi-to-pci-bridge' implementation='i82801b11-bridge'/>
(or some other name in place of "implementation" - ideas? I'm horrible at thinkin up names)
device? actualModel? :)
hey, I think I might like "device"!
Pros: wouldn't create compatibility problems when downgrading or migrating cross version.
If you tried to migrate to older libvirt with: model='dmi-to-pci-bridge' impl='generic', older libvirt would not parse the impl flag and create a machine with i8201b11-bridge. (Assuming the QEMU would know the machine type).
Well, yeah, this does point out a flaw in my thinking :-/
Cons: Is inconsistent with every other use of "model" attribute in libvirt, and each new addition of a PCI controller further propagates this misuse.
It is consistent with all the other inconsistencies in libvirt :)
====
I currently like either option 2 or 3 (depending on how good we want to be about supporting downgrade/intra-version migration), but as is obvious by the fact that it was me that suggested putting the type of pci controller into "model", I am very good at making the wrong decision on matters like this. My favorite would be #2 without the subType (AKA #1 without the controller type in new model names) - it is implied from the model anyway. A more structured XML might be user-frendlier, but this way even old libvirt will recognize that it cannot start the machine because of an incompatible model.
This way only the dmi-to-pci-bridge -> i82801b11-bridge mapping would be odd.
I have to think about this some more...

On Tue, Jun 23, 2015 at 09:06:24PM -0400, Laine Stump wrote:
On 06/23/2015 11:57 AM, Ján Tomko wrote:
On Mon, Jun 22, 2015 at 02:44:05PM -0400, Laine Stump wrote:
[snip]
=== Idea 3: interpret controllers with missing subType as above, but actually write it out to the config/migration/etc in the new modified format.
Cons: This would prevent downgrading libvirt or migrating from a host with newer libvirt to one with older libvirt. (Although preserving compatibility at some level when downgrading may be a stated requirement of some downstream distros' builds of libvirt, I think for upstream it is only a "best effort"; I'm just not certain how much "best" is considered to be :-)
I do not know of any effort done to make downgrading libvirt work.
You haven't talked to enough people deploying RHEV/oVirt in production - they want the ability to upgrade some nodes, migrate guests over to them, then migrate the guests back if the upgrade needs to be undone.
That is migration, where we pass the "migratable" XML. The domain configs and live status XMLs can contain things that won't be parsable by older libvirt. Under 'downgrade' I imagined simply installing older libvirt packages - this can possibly fail, even if the domains only use features present in the old libvirt too.
Any machine configs that use new values for old attributes will disappear (and so will running machines, because of new qemu capabilities).
Migration is somewhat supported, so the compatible format could be used only when the model is default and VIR_DOMAIN_DEF_FORMAT_MIGRATABLE was specified?
Isn't VIR_DOMAIN_DEF_FORMAT_MIGRATABLE there in part so that migrations from newer libvirt to older libvirt might work? (it doesn't guarantee it, but it does make it work in some situations where it otherwise wouldn't).
Yes, if the domain worked with the old libvirt, we should be able to migrate to the new libvirt and back, thanks to this flag.
=== Idea 4: Unlike other uses of "model" in libvirt, for pci controllers, continue to use "model" to denote the subtype/class/whatever of controller, and create a new attribute to denote the different specific implementations of each one. So for example:
[4] <controller type='pci' model='dmi-to-pci-bridge'/>
would become:
[5] <controller type='pci' model='dmi-to-pci-bridge' implementation='i82801b11-bridge'/>
(or some other name in place of "implementation" - ideas? I'm horrible at thinkin up names)
device? actualModel? :)
hey, I think I might like "device"!
Pros: wouldn't create compatibility problems when downgrading or migrating cross version.
If you tried to migrate to older libvirt with: model='dmi-to-pci-bridge' impl='generic', older libvirt would not parse the impl flag and create a machine with i8201b11-bridge. (Assuming the QEMU would know the machine type).
Well, yeah, this does point out a flaw in my thinking :-/
This happens with all new XML elements - libvirt's parser usually does an XPath query for the elements it knows and ignores the unknown ones. IIRC this is on purpose. Jan

I think the only votes were for option 1 and 4 (interesting how the only ones that were chosen were those that I *didn't* pick personally :-). See comments below. In the meantime the other issue Alex pointed out may cause this to take a slightly different direction. On 06/22/2015 02:44 PM, Laine Stump wrote:
=== Idea 1: multiplex the meaning of the "model" attribute, so we currently have:
<controller type='pci' model='dmi-to-pci-bridge'/>
which means "add an i82801b11-bridge device" and when we add a generic version of this type of controller, we would do it with something like:
<controller type='pci' model='generic-dmi-to-pci-bridge'/>
and for another vendor's mythical controller:
<controller type='pci' model='xyz-dmi-to-pci-bridge'/>
Cons: This will make for ugliness in switch statements where a new case will have to be added whenever a different controller with similar behavior/usage is supported. And it's generally not a good idea to have a single attribute be used for two different functions.
jtomko advocated this one because it would make it easier for an older libvirt to notice an unsupported class+model of controller during a migration attempt from a newer libvirt. An example would be if a newer libvirt had a guest with <controller type='pci' model='xyz'/> (where xyz is some qemu device that implements a dmi-to-pci-bridge) That would fail on an attempt to migrate to older libvirt, but <controller type='pci' model='dmi-to-pci-bridge'/> would succeed. On the other hand, if we did this: <controller type='pci' model='dmi-to-pci-bridge submodel='xyz'/> that would succeed (and create the wrong device) because submodel would be ignored. Since there is currently no alternate implementation of a dmi-to-pci-bridge available in qemu, and it will likely be awhile until that happens, I think if we start filling out the XML now as: <controller type='pci' model='dmi-to-pci-bridge' submodel='i82801b11-bridge'/> by the time we get to the point that we do have an alternate 'xyz' controller, any version of libvirt that doesn't understand "submodel" will be so far in the past that we wouldn't want to be able to migrate back to it anyway.
=== Idea 4: Unlike other uses of "model" in libvirt, for pci controllers, continue to use "model" to denote the subtype/class/whatever of controller, and create a new attribute to denote the different specific implementations of each one. So for example:
[4] <controller type='pci' model='dmi-to-pci-bridge'/>
would become:
[5] <controller type='pci' model='dmi-to-pci-bridge' implementation='i82801b11-bridge'/>
(or some other name in place of "implementation" - ideas? I'm horrible at thinkin up names)
Pros: wouldn't create compatibility problems when downgrading or migrating cross version.
Cons: Is inconsistent with every other use of "model" attribute in libvirt, and each new addition of a PCI controller further propagates this misuse.
mkletzan preferred this one, and danpb was amenable to it in IRC. I think I now agree, but Alex's comments about needing to keep track of what we put in the qemu commandline for port and chassis have me thinking this isn't enough. The problem is that the "ioh3420" device needs its "port" and "chassis" options set; Alex recommends setting port to: (slot << 3)+function Likewise, he recommends setting "port" for the xio3130-downstream device to the same value as slot. So, for the following: <controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> </controller> we would end up with the following commandline: -device ioh3420,chassis=3,port=0x21,id='pcie.3',bus='pcie.0', addr=0x4.0x1 (chassis is the same as "index" in the xml (again per Alex's recommendation), and port is (4 << 3) + 1 = 0x21) *However* he also says that we may change our mind on these in the future, so chassis and port may end up being something different. Since those values are visible to the guest, we can't allow them to change on an existing machine, as it breaks the guest ABI. So we need to store them in the XML. They aren't part of the PCI address though, they are options. So I need to figure out the best way to represent that in the XML, and fill it in when it is auto-generated when the controller is defined. A few possible ways to solve both problems at once: [1] <controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <target type='ioh3420' chassis='3' port='0x21'/> </controller> Precedent: <target> is used to store the port number for serial/console devices, specify guest-side bus and device name for disks, specify guest-side mount location for filesystems, specify the *host*-side name of tap devices for network interfaces, memory size for <memory>. So it seems kind of proper. (slight variation - <target model='ioh3420' .../> :-/ ) [2] <controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <model type='ioh3420' chassis='3' port='0x21'/> </controller> (How's *that* for confusing?!? Both a model attribute *and* a model subelement.) Precedent: This is exactly patterned after the use of <model> in <video> devices though, where the specific card being emulated along with any memory/etc options are put in <model>, so at least there is precedence) [3] <controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <device type='ioh3420' chassis='3' port='0x21'/> </controller> (This one is completely new, just for a fresh start in case people think neither of the others are exactly right) At this point you can see that it all comes down to what name we want to give the subelement; within that, we give the exact name of the qemu device, and the exact name/value of any qemu options that we set to non-default values. Somebody *please* have an opinion on the name for this, because none of these strikes me as better or worse (well, I think I dislike <driver> (P.S. There are other "automagic" qemu options being specified by libvirt that maybe could use the same treatment as these. Two that come to mind are chassis_nr for pci-bridge controllers (set to the controller's index), and "vectors" for virtio-net multiqueue support (set to (queues*2)+2).

On Wed, Jun 24, 2015 at 12:04:33PM -0400, Laine Stump wrote:
I think the only votes were for option 1 and 4 (interesting how the only ones that were chosen were those that I *didn't* pick personally :-). See comments below. In the meantime the other issue Alex pointed out may cause this to take a slightly different direction.
On 06/22/2015 02:44 PM, Laine Stump wrote:
=== Idea 1: multiplex the meaning of the "model" attribute, so we currently have:
<controller type='pci' model='dmi-to-pci-bridge'/>
which means "add an i82801b11-bridge device" and when we add a generic version of this type of controller, we would do it with something like:
<controller type='pci' model='generic-dmi-to-pci-bridge'/>
and for another vendor's mythical controller:
<controller type='pci' model='xyz-dmi-to-pci-bridge'/>
Cons: This will make for ugliness in switch statements where a new case will have to be added whenever a different controller with similar behavior/usage is supported. And it's generally not a good idea to have a single attribute be used for two different functions.
jtomko advocated this one because it would make it easier for an older libvirt to notice an unsupported class+model of controller during a migration attempt from a newer libvirt. An example would be if a newer libvirt had a guest with
<controller type='pci' model='xyz'/>
(where xyz is some qemu device that implements a dmi-to-pci-bridge)
That would fail on an attempt to migrate to older libvirt, but
<controller type='pci' model='dmi-to-pci-bridge'/>
would succeed. On the other hand, if we did this:
<controller type='pci' model='dmi-to-pci-bridge submodel='xyz'/>
that would succeed (and create the wrong device) because submodel would be ignored.
Since there is currently no alternate implementation of a dmi-to-pci-bridge available in qemu, and it will likely be awhile until that happens, I think if we start filling out the XML now as:
<controller type='pci' model='dmi-to-pci-bridge' submodel='i82801b11-bridge'/>
by the time we get to the point that we do have an alternate 'xyz' controller, any version of libvirt that doesn't understand "submodel" will be so far in the past that we wouldn't want to be able to migrate back to it anyway.
=== Idea 4: Unlike other uses of "model" in libvirt, for pci controllers, continue to use "model" to denote the subtype/class/whatever of controller, and create a new attribute to denote the different specific implementations of each one. So for example:
[4] <controller type='pci' model='dmi-to-pci-bridge'/>
would become:
[5] <controller type='pci' model='dmi-to-pci-bridge' implementation='i82801b11-bridge'/>
(or some other name in place of "implementation" - ideas? I'm horrible at thinkin up names)
Pros: wouldn't create compatibility problems when downgrading or migrating cross version.
Cons: Is inconsistent with every other use of "model" attribute in libvirt, and each new addition of a PCI controller further propagates this misuse.
mkletzan preferred this one, and danpb was amenable to it in IRC. I think I now agree, but Alex's comments about needing to keep track of what we put in the qemu commandline for port and chassis have me thinking this isn't enough.
The problem is that the "ioh3420" device needs its "port" and "chassis" options set; Alex recommends setting port to:
(slot << 3)+function
Likewise, he recommends setting "port" for the xio3130-downstream device to the same value as slot. So, for the following:
<controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> </controller>
we would end up with the following commandline:
-device ioh3420,chassis=3,port=0x21,id='pcie.3',bus='pcie.0', addr=0x4.0x1
(chassis is the same as "index" in the xml (again per Alex's recommendation), and port is (4 << 3) + 1 = 0x21)
*However* he also says that we may change our mind on these in the future, so chassis and port may end up being something different. Since those values are visible to the guest, we can't allow them to change on an existing machine, as it breaks the guest ABI. So we need to store them in the XML. They aren't part of the PCI address though, they are options. So I need to figure out the best way to represent that in the XML, and fill it in when it is auto-generated when the controller is defined. A few possible ways to solve both problems at once:
[1] <controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <target type='ioh3420' chassis='3' port='0x21'/> </controller>
Precedent: <target> is used to store the port number for serial/console devices, specify guest-side bus and device name for disks, specify guest-side mount location for filesystems, specify the *host*-side name of tap devices for network interfaces, memory size for <memory>. So it seems kind of proper. (slight variation - <target model='ioh3420' .../> :-/ )
[2] <controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <model type='ioh3420' chassis='3' port='0x21'/> </controller>
(How's *that* for confusing?!? Both a model attribute *and* a model subelement.)
That's no worse than model & submodel really. If anything it is better as having a child element gives more room for expansion than an attribute, and <model> is still a fairly standard approach we've used inpast. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Wed, Jun 24, 2015 at 12:04:33PM -0400, Laine Stump wrote:
I think the only votes were for option 1 and 4 (interesting how the only ones that were chosen were those that I *didn't* pick personally :-). See comments below. In the meantime the other issue Alex pointed out may cause this to take a slightly different direction.
On 06/22/2015 02:44 PM, Laine Stump wrote:
=== Idea 1: multiplex the meaning of the "model" attribute, so we currently have:
<controller type='pci' model='dmi-to-pci-bridge'/>
which means "add an i82801b11-bridge device" and when we add a generic version of this type of controller, we would do it with something like:
<controller type='pci' model='generic-dmi-to-pci-bridge'/>
and for another vendor's mythical controller:
<controller type='pci' model='xyz-dmi-to-pci-bridge'/>
Cons: This will make for ugliness in switch statements where a new case will have to be added whenever a different controller with similar behavior/usage is supported. And it's generally not a good idea to have a single attribute be used for two different functions.
jtomko advocated this one because it would make it easier for an older libvirt to notice an unsupported class+model of controller during a migration attempt from a newer libvirt. An example would be if a newer libvirt had a guest with
<controller type='pci' model='xyz'/>
(where xyz is some qemu device that implements a dmi-to-pci-bridge)
That would fail on an attempt to migrate to older libvirt, but
<controller type='pci' model='dmi-to-pci-bridge'/>
would succeed. On the other hand, if we did this:
<controller type='pci' model='dmi-to-pci-bridge submodel='xyz'/>
that would succeed (and create the wrong device) because submodel would be ignored.
Since there is currently no alternate implementation of a dmi-to-pci-bridge available in qemu, and it will likely be awhile until that happens, I think if we start filling out the XML now as:
<controller type='pci' model='dmi-to-pci-bridge' submodel='i82801b11-bridge'/>
by the time we get to the point that we do have an alternate 'xyz' controller, any version of libvirt that doesn't understand "submodel" will be so far in the past that we wouldn't want to be able to migrate back to it anyway.
=== Idea 4: Unlike other uses of "model" in libvirt, for pci controllers, continue to use "model" to denote the subtype/class/whatever of controller, and create a new attribute to denote the different specific implementations of each one. So for example:
[4] <controller type='pci' model='dmi-to-pci-bridge'/>
would become:
[5] <controller type='pci' model='dmi-to-pci-bridge' implementation='i82801b11-bridge'/>
(or some other name in place of "implementation" - ideas? I'm horrible at thinkin up names)
Pros: wouldn't create compatibility problems when downgrading or migrating cross version.
Cons: Is inconsistent with every other use of "model" attribute in libvirt, and each new addition of a PCI controller further propagates this misuse.
mkletzan preferred this one, and danpb was amenable to it in IRC. I think I now agree, but Alex's comments about needing to keep track of what we put in the qemu commandline for port and chassis have me thinking this isn't enough.
The problem is that the "ioh3420" device needs its "port" and "chassis" options set; Alex recommends setting port to:
(slot << 3)+function
Likewise, he recommends setting "port" for the xio3130-downstream device to the same value as slot. So, for the following:
<controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> </controller>
we would end up with the following commandline:
-device ioh3420,chassis=3,port=0x21,id='pcie.3',bus='pcie.0', addr=0x4.0x1
(chassis is the same as "index" in the xml (again per Alex's recommendation), and port is (4 << 3) + 1 = 0x21)
*However* he also says that we may change our mind on these in the future, so chassis and port may end up being something different. Since those values are visible to the guest, we can't allow them to change on an existing machine, as it breaks the guest ABI. So we need to store them in the XML. They aren't part of the PCI address though, they are options. So I need to figure out the best way to represent that in the XML, and fill it in when it is auto-generated when the controller is defined. A few possible ways to solve both problems at once:
[1] <controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <target type='ioh3420' chassis='3' port='0x21'/> </controller>
Precedent: <target> is used to store the port number for serial/console devices, specify guest-side bus and device name for disks, specify guest-side mount location for filesystems, specify the *host*-side name of tap devices for network interfaces, memory size for <memory>. So it seems kind of proper. (slight variation - <target model='ioh3420' .../> :-/ )
[2] <controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <model type='ioh3420' chassis='3' port='0x21'/> </controller>
(How's *that* for confusing?!? Both a model attribute *and* a model subelement.)
Precedent: This is exactly patterned after the use of <model> in <video> devices though, where the specific card being emulated along with any memory/etc options are put in <model>, so at least there is precedence)
[3] <controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <device type='ioh3420' chassis='3' port='0x21'/> </controller>
(This one is completely new, just for a fresh start in case people think neither of the others are exactly right)
At this point you can see that it all comes down to what name we want to give the subelement; within that, we give the exact name of the qemu device, and the exact name/value of any qemu options that we set to non-default values.
Somebody *please* have an opinion on the name for this, because none of these strikes me as better or worse (well, I think I dislike <driver>
To be honest, I kinds dislike all of them. Not that they would be chosen poorly, no, it's simply because the good sensible choice is unavailable due to another poor decision in the past (this may be another point for Michal's talk on KVM Forum). Thinking about it I must say I don't like how target (which is supposed to match a place where the device appears for the guest) is used for the model specification, on the other hand (ab)using 'model' element for the specification of an "address" in guest (that's what I understand chassis and port are) doesn't feel any better. What if we go with two of those elements? Would that be too much pain? E.g.: <controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <model type='ioh3420'/> <target chassis='3' port='0x21'/> </controller> I understand this might look like an overkill, but I think it's better safe then sorry, I guess I just see us not so far in the future regretting any decision made now. Another idea I briefly thought about, which would also deal with the migration to older libvirt versions was inventing a new address type ... and then I realized how stupid^Wugly was that. Anyway, I must say that I don't fully get the idea of how some of these controllers are supposed to be specified and what particular HW they should describe, so I might not had the best idea in town. <rant> Mainly I don't understand why there's no pci switch. Or is <controller type='pci'/> actually a pci switch (I usderstand it like that)? Then why is there a downstream and upstream device when {down,up}stream should just indicate whether the port aims towards the root complex or the endpoints, respectively. </rant>
(P.S. There are other "automagic" qemu options being specified by libvirt that maybe could use the same treatment as these. Two that come to mind are chassis_nr for pci-bridge controllers (set to the controller's index), and "vectors" for virtio-net multiqueue support (set to (queues*2)+2).
I guess 'chassis_nr' is something exposed to guest, similarly to 'chassis' in the pci-root-port case, so that might be worth saving in the XML. Maybe we can use 'chassis' in the target/model specification for storing the chassis_nr as a level of abstraction on libvirt level. But the number of vectors (to my knowledge) is not something exposed to guest and is only advised to being set according to the formula you wrote, above.

On Thu, Jun 25, 2015 at 08:44:17AM +0200, Martin Kletzander wrote:
To be honest, I kinds dislike all of them. Not that they would be chosen poorly, no, it's simply because the good sensible choice is unavailable due to another poor decision in the past (this may be another point for Michal's talk on KVM Forum). Thinking about it I must say I don't like how target (which is supposed to match a place where the device appears for the guest) is used for the model specification, on the other hand (ab)using 'model' element for the specification of an "address" in guest (that's what I understand chassis and port are) doesn't feel any better. What if we go with two of those elements? Would that be too much pain? E.g.:
<controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <model type='ioh3420'/> <target chassis='3' port='0x21'/> </controller>
I understand this might look like an overkill, but I think it's better safe then sorry, I guess I just see us not so far in the future regretting any decision made now.
I'd be fine with this proposal too. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Thu, Jun 25, 2015 at 08:44:17AM +0200, Martin Kletzander wrote:
On Wed, Jun 24, 2015 at 12:04:33PM -0400, Laine Stump wrote:
At this point you can see that it all comes down to what name we want to give the subelement; within that, we give the exact name of the qemu device, and the exact name/value of any qemu options that we set to non-default values.
Somebody *please* have an opinion on the name for this, because none of these strikes me as better or worse (well, I think I dislike <driver>
To be honest, I kinds dislike all of them. Not that they would be chosen poorly, no, it's simply because the good sensible choice is unavailable due to another poor decision in the past (this may be another point for Michal's talk on KVM Forum). Thinking about it I must say I don't like how target (which is supposed to match a place where the device appears for the guest) is used for the model specification, on the other hand (ab)using 'model' element for the specification of an "address" in guest (that's what I understand chassis and port are) doesn't feel any better. What if we go with two of those elements? Would that be too much pain? E.g.:
<controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <model type='ioh3420'/> <target chassis='3' port='0x21'/> </controller>
I like this, essentially a subModel without the camelCase. One small nit: <model name='ioh4320'/> would look nicer to me, but we're using <model type=/> in at least one other place, so I'm fine with both. Jan
I understand this might look like an overkill, but I think it's better safe then sorry, I guess I just see us not so far in the future regretting any decision made now.

On Thu, Jun 25, 2015 at 11:46:41AM +0200, Ján Tomko wrote:
On Thu, Jun 25, 2015 at 08:44:17AM +0200, Martin Kletzander wrote:
On Wed, Jun 24, 2015 at 12:04:33PM -0400, Laine Stump wrote:
At this point you can see that it all comes down to what name we want to give the subelement; within that, we give the exact name of the qemu device, and the exact name/value of any qemu options that we set to non-default values.
Somebody *please* have an opinion on the name for this, because none of these strikes me as better or worse (well, I think I dislike <driver>
To be honest, I kinds dislike all of them. Not that they would be chosen poorly, no, it's simply because the good sensible choice is unavailable due to another poor decision in the past (this may be another point for Michal's talk on KVM Forum). Thinking about it I must say I don't like how target (which is supposed to match a place where the device appears for the guest) is used for the model specification, on the other hand (ab)using 'model' element for the specification of an "address" in guest (that's what I understand chassis and port are) doesn't feel any better. What if we go with two of those elements? Would that be too much pain? E.g.:
<controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <model type='ioh3420'/> <target chassis='3' port='0x21'/> </controller>
I like this, essentially a subModel without the camelCase. One small nit: <model name='ioh4320'/> would look nicer to me, but we're using <model type=/> in at least one other place, so I'm fine with both.
I didn't think about the 'type', I probably just typed it in there because each line had a 'type' already in :) I like 'name' better too, but diving deeply into it, I have no strong opinions on that.
Jan
I understand this might look like an overkill, but I think it's better safe then sorry, I guess I just see us not so far in the future regretting any decision made now.

On 06/25/2015 02:44 AM, Martin Kletzander wrote:
On Wed, Jun 24, 2015 at 12:04:33PM -0400, Laine Stump wrote:
I think the only votes were for option 1 and 4 (interesting how the only ones that were chosen were those that I *didn't* pick personally :-). See comments below. In the meantime the other issue Alex pointed out may cause this to take a slightly different direction.
On 06/22/2015 02:44 PM, Laine Stump wrote:
=== Idea 1: multiplex the meaning of the "model" attribute, so we currently have:
<controller type='pci' model='dmi-to-pci-bridge'/>
which means "add an i82801b11-bridge device" and when we add a generic version of this type of controller, we would do it with something like:
<controller type='pci' model='generic-dmi-to-pci-bridge'/>
and for another vendor's mythical controller:
<controller type='pci' model='xyz-dmi-to-pci-bridge'/>
Cons: This will make for ugliness in switch statements where a new case will have to be added whenever a different controller with similar behavior/usage is supported. And it's generally not a good idea to have a single attribute be used for two different functions.
jtomko advocated this one because it would make it easier for an older libvirt to notice an unsupported class+model of controller during a migration attempt from a newer libvirt. An example would be if a newer libvirt had a guest with
<controller type='pci' model='xyz'/>
(where xyz is some qemu device that implements a dmi-to-pci-bridge)
That would fail on an attempt to migrate to older libvirt, but
<controller type='pci' model='dmi-to-pci-bridge'/>
would succeed. On the other hand, if we did this:
<controller type='pci' model='dmi-to-pci-bridge submodel='xyz'/>
that would succeed (and create the wrong device) because submodel would be ignored.
Since there is currently no alternate implementation of a dmi-to-pci-bridge available in qemu, and it will likely be awhile until that happens, I think if we start filling out the XML now as:
<controller type='pci' model='dmi-to-pci-bridge' submodel='i82801b11-bridge'/>
by the time we get to the point that we do have an alternate 'xyz' controller, any version of libvirt that doesn't understand "submodel" will be so far in the past that we wouldn't want to be able to migrate back to it anyway.
=== Idea 4: Unlike other uses of "model" in libvirt, for pci controllers, continue to use "model" to denote the subtype/class/whatever of controller, and create a new attribute to denote the different specific implementations of each one. So for example:
[4] <controller type='pci' model='dmi-to-pci-bridge'/>
would become:
[5] <controller type='pci' model='dmi-to-pci-bridge' implementation='i82801b11-bridge'/>
(or some other name in place of "implementation" - ideas? I'm horrible at thinkin up names)
Pros: wouldn't create compatibility problems when downgrading or migrating cross version.
Cons: Is inconsistent with every other use of "model" attribute in libvirt, and each new addition of a PCI controller further propagates this misuse.
mkletzan preferred this one, and danpb was amenable to it in IRC. I think I now agree, but Alex's comments about needing to keep track of what we put in the qemu commandline for port and chassis have me thinking this isn't enough.
The problem is that the "ioh3420" device needs its "port" and "chassis" options set; Alex recommends setting port to:
(slot << 3)+function
Likewise, he recommends setting "port" for the xio3130-downstream device to the same value as slot. So, for the following:
<controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> </controller>
we would end up with the following commandline:
-device ioh3420,chassis=3,port=0x21,id='pcie.3',bus='pcie.0', addr=0x4.0x1
(chassis is the same as "index" in the xml (again per Alex's recommendation), and port is (4 << 3) + 1 = 0x21)
*However* he also says that we may change our mind on these in the future, so chassis and port may end up being something different. Since those values are visible to the guest, we can't allow them to change on an existing machine, as it breaks the guest ABI. So we need to store them in the XML. They aren't part of the PCI address though, they are options. So I need to figure out the best way to represent that in the XML, and fill it in when it is auto-generated when the controller is defined. A few possible ways to solve both problems at once:
[1] <controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <target type='ioh3420' chassis='3' port='0x21'/> </controller>
Precedent: <target> is used to store the port number for serial/console devices, specify guest-side bus and device name for disks, specify guest-side mount location for filesystems, specify the *host*-side name of tap devices for network interfaces, memory size for <memory>. So it seems kind of proper. (slight variation - <target model='ioh3420' .../> :-/ )
[2] <controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <model type='ioh3420' chassis='3' port='0x21'/> </controller>
(How's *that* for confusing?!? Both a model attribute *and* a model subelement.)
Precedent: This is exactly patterned after the use of <model> in <video> devices though, where the specific card being emulated along with any memory/etc options are put in <model>, so at least there is precedence)
[3] <controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <device type='ioh3420' chassis='3' port='0x21'/> </controller>
(This one is completely new, just for a fresh start in case people think neither of the others are exactly right)
At this point you can see that it all comes down to what name we want to give the subelement; within that, we give the exact name of the qemu device, and the exact name/value of any qemu options that we set to non-default values.
Somebody *please* have an opinion on the name for this, because none of these strikes me as better or worse (well, I think I dislike <driver>
To be honest, I kinds dislike all of them. Not that they would be chosen poorly, no, it's simply because the good sensible choice is unavailable due to another poor decision in the past (this may be another point for Michal's talk on KVM Forum). Thinking about it I must say I don't like how target (which is supposed to match a place where the device appears for the guest) is used for the model specification, on the other hand (ab)using 'model' element for the specification of an "address" in guest (that's what I understand chassis and port are)
Actually they aren't an address. The address where the controller is connected is completely specified in its <address> element, and the address where other devices can connect to this controller is specified by 1) the index of the controller and 2) implied (from the class of controller) information about how many "slots" are available on this controller and the starting slot#. As it was described to me (by Alex in IRC and email) chassis, chassis_nr, and port are just numbers that are stored in a register that the guest can examine, and on real hardware they are used to learn, e.g. the physical location of a connector. From libvirt's point of view, you can end the previous sentence after the word "register". This is part of the reason the proper solution isn't obvious/clear.
doesn't feel any better. What if we go with two of those elements? Would that be too much pain? E.g.:
<controller type='pci model='pci-root-port' index='3'> <address type='pci' bus='0' slot='4' function='1'> <model type='ioh3420'/> <target chassis='3' port='0x21'/> </controller>
I understand this might look like an overkill, but I think it's better safe then sorry, I guess I just see us not so far in the future regretting any decision made now.
Well, another item that we will want to specify in the future is the numa node of a pxb bridge (which I now understand as another kind of "pci-bridge", *not* another kind of "pci-root" as I previously believed). I just looked for where numa node info is set for other devices, and see that for the <memory> device, it is set with the numa *subelement* of the target subelement of <memory>. So: <memory ....> <target> <numa>1</numa> </target> <memory> On the other hand, guest-side nume node for CPUs is set like this: <cpu> <numa> <cell id='0' cpus='0-3' memory='512000' unit='KiB'/> <cell id='1' cpus='4-7' memory='512000' unit='KiB' memAccess='shared'/> </numa> </cpu> so no use of <target> at all in that case. I've tried to find a "smoking gun" item that would force me to use your suggestion, but haven't yet. So it comes down to this: Do we want to be more consistent with: <video> <model type='vga' vram='16384' heads='1'> <acceleration accel3d='yes' accel2d='yes'/> </model> </video> (i.e. everything in <model>) or: <memory ...> <target> <size>blah</size> <node>1</node> </target> ... </memory> and <disk ...> <target dev='hdb' bus='ide'/> </disk ???
Another idea I briefly thought about, which would also deal with the migration to older libvirt versions was inventing a new address type ... and then I realized how stupid^Wugly was that.
Anyway, I must say that I don't fully get the idea of how some of these controllers are supposed to be specified and what particular HW they should describe, so I might not had the best idea in town.
It's like a hybrid set of legos and duplos (and also some "leglos" or "dupos" or something) mixed together - you can't just plug "anything" into "anything". Each controller has limitations on what it can be plugged into, what (and how many) can be plugged into it, and whether or not each of the connections (both up and down) supports hotplug/unplug. It's taken me quite awhile to get a decent model in my mind of how they all fit together, and I still learn something almost daily that invalidates some belief I've had.
<rant> Mainly I don't understand why there's no pci switch. Or is <controller type='pci'/> actually a pci switch (I usderstand it like that)? Then why is there a downstream and upstream device when {down,up}stream should just indicate whether the port aims towards the root complex or the endpoints, respectively. </rant>
Yeah, that one bothers me too. The picture I have in my mind is that the x3130-upstream is a switch with one upstream (pcie) port and 32 downstream (switch) ports, and that the xio3130-downstream device is a "converter dongle" that has one upstream (switch) port and one downstream (pcie) port. So the upstream port gives you a switch and a bunch of downstream ports, but you can only plug one kind of thing into them. I wanted to call the upstream one "pcie-switch" and the downstream one "pcie-switch-port", but Alex (who is my go-to authority on all things PCIe) said that seemed wrong to him, the "switch" is the collection of one upstream and one or more downstream ports, and that he really wanted the "upstream" and "downstream" in the names. That's why I now have pcie-switch-upstream-port and pcie-switch-downstream-port. So the way I think about it now is that the "switch" is this magical nonexistent thing that sits between an upstream and a bunch of downstreams. I still like the other way better, but am going with what seems right to Alex because he's been dealing with these devices for much longer than me so he almost certainly knows some nuances that I'm unaware of.
(P.S. There are other "automagic" qemu options being specified by libvirt that maybe could use the same treatment as these. Two that come to mind are chassis_nr for pci-bridge controllers (set to the controller's index), and "vectors" for virtio-net multiqueue support (set to (queues*2)+2).
I guess 'chassis_nr' is something exposed to guest, similarly to 'chassis' in the pci-root-port case, so that might be worth saving in the XML. Maybe we can use 'chassis' in the target/model specification for storing the chassis_nr as a level of abstraction on libvirt level.
That thought crossed my mind, but then I thought of the possibility that there may some day be a device that uses both chassis and chassis_nr at the same time. I even have a vague memory of something like this happening in the past (although it may have been somewhere totally unrelated to libvirt). For that reason,
But the number of vectors (to my knowledge) is not something exposed to guest and is only advised to being set according to the formula you wrote, above.
Ah, good point. We're safe on that one then.
participants (5)
-
Alex Williamson
-
Daniel P. Berrange
-
Ján Tomko
-
Laine Stump
-
Martin Kletzander