[libvirt] [PATCH v5 0/4] qemu: Allow PCI virtio on ARM "virt" machine

Since v2.3.0, the "virt" machine in qemu has a generic PCI host controller and can use PCI devices. This provides a performance improvement, as well as vhost-net with irqfd support for virtio-net. However, libvirt currently does not allow the ARM virt machine to have PCI devices. This patchset adds the necessary support.

Changes since v4:
- Rebased onto current master
- Added the possibility to plug a virtio-net-pci adapter directly into the PCIe bus. This is necessary for irqfds to work in qemu.

Changes since v3:
- The capability is based not on the qemu version but on qemu's support for the "gpex-pcihost" device
- Added a workaround allowing "make check" to pass. The problem is that the test suite does not build a capabilities cache. Unfortunately this means that a correct unit test for the new functionality cannot currently be written; the test suite framework needs to be improved.

Changes since v2: complete rework, using a different approach
- Correctly model the PCI Express bus on the machine. It is now possible to explicitly specify <address type='pci'> with attributes. This allows attaching not only virtio but any other PCI device to the model.
- The default is unchanged and still mmio, for backwards compatibility with existing installations. The PCI bus has to be explicitly specified.
- Check for the capability in the correct place; in v2 it actually did not work

Changes since v1:
- Added a capability based on the qemu version number
- Recognize also the "virt-" prefix

Pavel Fedin (4):
  qemu: Introduce QEMU_CAPS_OBJECT_GPEX
  Add PCI-Express root to ARM virt machine
  Build correct command line for PCI NICs on ARM
  Allow to plug virtio-net-pci into PCIe slot

 src/qemu/qemu_capabilities.c | 10 ++++++++++
 src/qemu/qemu_capabilities.h |  1 +
 src/qemu/qemu_command.c      | 11 ++++++++++-
 src/qemu/qemu_domain.c       | 17 +++++++++++++----
 4 files changed, 34 insertions(+), 5 deletions(-)

--
1.9.5.msysgit.0
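For illustration, the kind of domain XML fragment this series is meant to enable on the "virt" machine might look like the following (a sketch only; the network name and PCI address values are made up, and mmio remains the default unless a PCI address is given explicitly):

  <os>
    <type arch='aarch64' machine='virt'>hvm</type>
  </os>
  ...
  <interface type='network'>
    <source network='default'/>
    <model type='virtio'/>
    <!-- explicit PCI address; without it the device stays on mmio -->
    <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
  </interface>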

This capability specifies that qemu can implement generic PCI host controller. It is often used for virtual environments, including ARM.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
---
 src/qemu/qemu_capabilities.c | 2 ++
 src/qemu/qemu_capabilities.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index d8cb32d..d570fdd 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -288,6 +288,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST,
               "vhost-user-multiqueue", /* 190 */
               "migration-event",
+              "gpex-pcihost",
     );


@@ -1568,6 +1569,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = {
     { "ivshmem", QEMU_CAPS_DEVICE_IVSHMEM },
     { "pc-dimm", QEMU_CAPS_DEVICE_PC_DIMM },
     { "pci-serial", QEMU_CAPS_DEVICE_PCI_SERIAL },
+    { "gpex-pcihost", QEMU_CAPS_OBJECT_GPEX},
 };

 static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsVirtioBlk[] = {
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index f77bd06..a50c3ca 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -231,6 +231,7 @@ typedef enum {
     QEMU_CAPS_CPU_AARCH64_OFF = 189, /* -cpu ...,aarch64=off */
     QEMU_CAPS_VHOSTUSER_MULTIQUEUE = 190, /* vhost-user with -netdev queues= */
     QEMU_CAPS_MIGRATION_EVENT = 191, /* MIGRATION event */
+    QEMU_CAPS_OBJECT_GPEX = 192, /* have generic PCI host controller */

     QEMU_CAPS_LAST, /* this must always be the last item */
 } virQEMUCapsFlags;

--
1.9.5.msysgit.0
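A sketch of how such a flag is typically consumed once probed (virQEMUCapsGet is the real accessor declared in qemu_capabilities.h; the surrounding variables are assumed to come from the caller, as in patch 2 of this series):

    /* qemuCaps was looked up from the capabilities cache beforehand;
     * the flag is set iff qemu reported the "gpex-pcihost" QOM type */
    if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_GPEX))
        addPCIeRoot = true;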

On Fri, Jul 17, 2015 at 02:27:44PM +0300, Pavel Fedin wrote:
This capability specifies that qemu can implement generic PCI host controller. It is often used for virtual environments, including ARM.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
---
 src/qemu/qemu_capabilities.c | 2 ++
 src/qemu/qemu_capabilities.h | 1 +
 2 files changed, 3 insertions(+)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index d8cb32d..d570fdd 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -288,6 +288,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST,
"vhost-user-multiqueue", /* 190 */ "migration-event", + "gpex-pcihost", );
@@ -1568,6 +1569,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = {
     { "ivshmem", QEMU_CAPS_DEVICE_IVSHMEM },
     { "pc-dimm", QEMU_CAPS_DEVICE_PC_DIMM },
     { "pci-serial", QEMU_CAPS_DEVICE_PCI_SERIAL },
+    { "gpex-pcihost", QEMU_CAPS_OBJECT_GPEX},
Missing space between GPEX and '}'. Otherwise OK.
};
 static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsVirtioBlk[] = {
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index f77bd06..a50c3ca 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -231,6 +231,7 @@ typedef enum {
     QEMU_CAPS_CPU_AARCH64_OFF = 189, /* -cpu ...,aarch64=off */
     QEMU_CAPS_VHOSTUSER_MULTIQUEUE = 190, /* vhost-user with -netdev queues= */
     QEMU_CAPS_MIGRATION_EVENT = 191, /* MIGRATION event */
+    QEMU_CAPS_OBJECT_GPEX = 192, /* have generic PCI host controller */
     QEMU_CAPS_LAST, /* this must always be the last item */
 } virQEMUCapsFlags;

--
1.9.5.msysgit.0

Here we assume that if qemu supports generic PCI host controller, it is a part of virt machine and can be used for adding PCI devices.

In qemu this is actually a PCIe bus, so we also declare multibus capability so that 0'th bus is specified to qemu correctly as 'pcie.0'

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
---
 src/qemu/qemu_capabilities.c |  8 ++++++++
 src/qemu/qemu_domain.c       | 17 +++++++++++++----
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index d570fdd..f3486c7 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -2138,6 +2138,14 @@ bool virQEMUCapsHasPCIMultiBus(virQEMUCapsPtr qemuCaps,
         return false;
     }

+    if (ARCH_IS_ARM(def->os.arch)) {
+        /* If 'virt' supports PCI, it supports multibus.
+         * No extra conditions here for simplicity.
+         */
+        if (STREQ(def->os.machine, "virt"))
+            return true;
+    }
+
     return false;
 }

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 8b050a0..c7d14e4 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -981,7 +981,7 @@ virDomainXMLNamespace virQEMUDriverDomainXMLNamespace = {
 static int
 qemuDomainDefPostParse(virDomainDefPtr def,
                        virCapsPtr caps,
-                       void *opaque ATTRIBUTE_UNUSED)
+                       void *opaque)
 {
     bool addDefaultUSB = true;
     bool addImplicitSATA = false;
@@ -1030,12 +1030,21 @@ qemuDomainDefPostParse(virDomainDefPtr def,
         break;

     case VIR_ARCH_ARMV7L:
-        addDefaultUSB = false;
-        addDefaultMemballoon = false;
-        break;
     case VIR_ARCH_AARCH64:
         addDefaultUSB = false;
         addDefaultMemballoon = false;
+        if (STREQ(def->os.machine, "virt") ||
+            STRPREFIX(def->os.machine, "virt-")) {
+            virQEMUDriverPtr driver = opaque;
+
+            /* This condition is actually a (temporary) hack for test suite which
+             * does not create capabilities cache */
+            if (driver->qemuCapsCache) {
+                virQEMUCapsPtr qemuCaps =
+                    virQEMUCapsCacheLookup(driver->qemuCapsCache, def->emulator);
+                addPCIeRoot = virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_GPEX);
+            }
+        }
+        break;

     case VIR_ARCH_PPC64:

--
1.9.5.msysgit.0
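To illustrate the multibus point: with the multibus capability declared, libvirt qualifies PCI addresses with an explicit bus name, so the generated qemu command line would carry a fragment along these lines (a sketch; the id=/addr= values are made up):

    -device virtio-net-pci,netdev=hostnet0,id=net0,bus=pcie.0,addr=0x2

Without the capability, the bus would be referred to by its legacy bare name 'pci', which does not exist on this machine; declaring multibus is what gets 'pcie.0' emitted instead, as the commit message above explains.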

On 17.07.2015 13:27, Pavel Fedin wrote:
Here we assume that if qemu supports generic PCI host controller, it is a part of virt machine and can be used for adding PCI devices.
In qemu this is actually a PCIe bus, so we also declare multibus capability so that 0'th bus is specified to qemu correctly as 'pcie.0'
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
---
 src/qemu/qemu_capabilities.c |  8 ++++++++
 src/qemu/qemu_domain.c       | 17 +++++++++++++----
 2 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index d570fdd..f3486c7 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -2138,6 +2138,14 @@ bool virQEMUCapsHasPCIMultiBus(virQEMUCapsPtr qemuCaps,
         return false;
     }
+    if (ARCH_IS_ARM(def->os.arch)) {
+        /* If 'virt' supports PCI, it supports multibus.
+         * No extra conditions here for simplicity.
+         */
+        if (STREQ(def->os.machine, "virt"))
+            return true;
+    }
+
     return false;
 }
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 8b050a0..c7d14e4 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -981,7 +981,7 @@ virDomainXMLNamespace virQEMUDriverDomainXMLNamespace = {
 static int
 qemuDomainDefPostParse(virDomainDefPtr def,
                        virCapsPtr caps,
-                       void *opaque ATTRIBUTE_UNUSED)
+                       void *opaque)
 {
     bool addDefaultUSB = true;
     bool addImplicitSATA = false;
@@ -1030,12 +1030,21 @@ qemuDomainDefPostParse(virDomainDefPtr def,
         break;
     case VIR_ARCH_ARMV7L:
-        addDefaultUSB = false;
-        addDefaultMemballoon = false;
-        break;
     case VIR_ARCH_AARCH64:
         addDefaultUSB = false;
         addDefaultMemballoon = false;
+        if (STREQ(def->os.machine, "virt") ||
+            STRPREFIX(def->os.machine, "virt-")) {
+            virQEMUDriverPtr driver = opaque;
I believe this will be reused later. I'd like to make this variable function-wide.
+
+            /* This condition is actually a (temporary) hack for test suite which
+             * does not create capabilities cache */
+            if (driver->qemuCapsCache) {
+                virQEMUCapsPtr qemuCaps =
+                    virQEMUCapsCacheLookup(driver->qemuCapsCache, def->emulator);
+                addPCIeRoot = virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_GPEX);
You actually need to unref the @qemuCaps when no longer needed.
+            }
+        }
+        break;
case VIR_ARCH_PPC64:
ACK with this squashed in:

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 42c9753..02906d3 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -983,6 +983,8 @@ qemuDomainDefPostParse(virDomainDefPtr def,
                        virCapsPtr caps,
                        void *opaque)
 {
+    virQEMUDriverPtr driver = opaque;
+    virQEMUCapsPtr qemuCaps = NULL;
     bool addDefaultUSB = true;
     bool addImplicitSATA = false;
     bool addPCIRoot = false;
@@ -991,17 +993,24 @@ qemuDomainDefPostParse(virDomainDefPtr def,
     bool addDefaultUSBKBD = false;
     bool addDefaultUSBMouse = false;
     bool addPanicDevice = false;
+    int ret = -1;

     if (def->os.bootloader || def->os.bootloaderArgs) {
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                        _("bootloader is not supported by QEMU"));
-        return -1;
+        return ret;
     }

     /* check for emulator and create a default one if needed */
     if (!def->emulator &&
         !(def->emulator = virDomainDefGetDefaultEmulator(def, caps)))
-        return -1;
+        return ret;
+
+
+    /* This condition is actually a (temporary) hack for test suite which
+     * does not create capabilities cache */
+    if (driver->qemuCapsCache)
+        qemuCaps = virQEMUCapsCacheLookup(driver->qemuCapsCache, def->emulator);

     /* Add implicit PCI root controller if the machine has one */
     switch (def->os.arch) {
@@ -1031,21 +1040,13 @@ qemuDomainDefPostParse(virDomainDefPtr def,

     case VIR_ARCH_ARMV7L:
     case VIR_ARCH_AARCH64:
-        addDefaultUSB = false;
-        addDefaultMemballoon = false;
-        if (STREQ(def->os.machine, "virt") ||
-            STRPREFIX(def->os.machine, "virt-")) {
-            virQEMUDriverPtr driver = opaque;
-
-            /* This condition is actually a (temporary) hack for test suite which
-             * does not create capabilities cache */
-            if (driver->qemuCapsCache) {
-                virQEMUCapsPtr qemuCaps =
-                    virQEMUCapsCacheLookup(driver->qemuCapsCache, def->emulator);
-                addPCIeRoot = virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_GPEX);
-            }
-        }
-        break;
+        addDefaultUSB = false;
+        addDefaultMemballoon = false;
+        if (STREQ(def->os.machine, "virt") ||
+            STRPREFIX(def->os.machine, "virt-")) {
+            addPCIeRoot = virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_GPEX);
+        }
+        break;

     case VIR_ARCH_PPC64:
     case VIR_ARCH_PPC64LE:
@@ -1085,18 +1086,18 @@ qemuDomainDefPostParse(virDomainDefPtr def,
     if (addDefaultUSB &&
         virDomainDefMaybeAddController(
             def, VIR_DOMAIN_CONTROLLER_TYPE_USB, 0, -1) < 0)
-        return -1;
+        goto cleanup;

     if (addImplicitSATA &&
         virDomainDefMaybeAddController(
             def, VIR_DOMAIN_CONTROLLER_TYPE_SATA, 0, -1) < 0)
-        return -1;
+        goto cleanup;

     if (addPCIRoot &&
         virDomainDefMaybeAddController(
             def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0,
             VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) < 0)
-        return -1;
+        goto cleanup;

     /* When a machine has a pcie-root, make sure that there is always
      * a dmi-to-pci-bridge controller added as bus 1, and a pci-bridge
@@ -1112,14 +1113,14 @@ qemuDomainDefPostParse(virDomainDefPtr def,
             virDomainDefMaybeAddController(
                 def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2,
                 VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE) < 0) {
-            return -1;
+            goto cleanup;
         }
     }

     if (addDefaultMemballoon && !def->memballoon) {
         virDomainMemballoonDefPtr memballoon;
         if (VIR_ALLOC(memballoon) < 0)
-            return -1;
+            goto cleanup;

         memballoon->model = VIR_DOMAIN_MEMBALLOON_MODEL_VIRTIO;
         def->memballoon = memballoon;
@@ -1130,24 +1131,27 @@ qemuDomainDefPostParse(virDomainDefPtr def,
         virDomainDefMaybeAddInput(def,
                                   VIR_DOMAIN_INPUT_TYPE_KBD,
                                   VIR_DOMAIN_INPUT_BUS_USB) < 0)
-        return -1;
+        goto cleanup;

     if (addDefaultUSBMouse &&
         def->ngraphics > 0 &&
         virDomainDefMaybeAddInput(def,
                                   VIR_DOMAIN_INPUT_TYPE_MOUSE,
                                   VIR_DOMAIN_INPUT_BUS_USB) < 0)
-        return -1;
+        goto cleanup;

     if (addPanicDevice && !def->panic) {
         virDomainPanicDefPtr panic;
         if (VIR_ALLOC(panic) < 0)
-            return -1;
+            goto cleanup;

         def->panic = panic;
     }

-    return 0;
+    ret = 0;
+ cleanup:
+    virObjectUnref(qemuCaps);
+    return ret;
 }

 static const char *

Michal

On Fri, Jul 17, 2015 at 02:27:45PM +0300, Pavel Fedin wrote:
Here we assume that if qemu supports generic PCI host controller, it is a part of virt machine and can be used for adding PCI devices.
In qemu this is actually a PCIe bus, so we also declare multibus capability so that 0'th bus is specified to qemu correctly as 'pcie.0'
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
---
 src/qemu/qemu_capabilities.c |  8 ++++++++
 src/qemu/qemu_domain.c       | 17 +++++++++++++----
 2 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index d570fdd..f3486c7 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -2138,6 +2138,14 @@ bool virQEMUCapsHasPCIMultiBus(virQEMUCapsPtr qemuCaps,
         return false;
     }
+    if (ARCH_IS_ARM(def->os.arch)) {
+        /* If 'virt' supports PCI, it supports multibus.
+         * No extra conditions here for simplicity.
+         */
So every ARM qemu with the "virt" machine type supports both PCI and multiqueue? How about those "virt-*" for which you check below. That might not be related, I'm just curious.
+        if (STREQ(def->os.machine, "virt"))
+            return true;
+    }
+
     return false;
 }
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 8b050a0..c7d14e4 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -981,7 +981,7 @@ virDomainXMLNamespace virQEMUDriverDomainXMLNamespace = {
 static int
 qemuDomainDefPostParse(virDomainDefPtr def,
                        virCapsPtr caps,
-                       void *opaque ATTRIBUTE_UNUSED)
+                       void *opaque)
 {
     bool addDefaultUSB = true;
     bool addImplicitSATA = false;
@@ -1030,12 +1030,21 @@ qemuDomainDefPostParse(virDomainDefPtr def,
         break;
     case VIR_ARCH_ARMV7L:
-        addDefaultUSB = false;
-        addDefaultMemballoon = false;
-        break;
     case VIR_ARCH_AARCH64:
         addDefaultUSB = false;
         addDefaultMemballoon = false;
+        if (STREQ(def->os.machine, "virt") ||
+            STRPREFIX(def->os.machine, "virt-")) {
+            virQEMUDriverPtr driver = opaque;
+
+            /* This condition is actually a (temporary) hack for test suite which
+             * does not create capabilities cache */
Few questions here. a) how "temporary" is this since you're not removing it in this series? b) for what tests you need this hack and what part of the below is the hack?

Moreover, you cannot use capabilities when defining an XML. The emulator can change between the domain is defined and started, so you cannot know with what emulator this will be started.

I see Michal (Cc'd) just pushed this, I probably just missed the mail from him. Anyway, I don't think this is how this should be handled. I know this is bit of a chicken-and-egg problem due to the fact that we generate addresses during definitions. But I think we could figure something out instead of this hack.
+            if (driver->qemuCapsCache) {
+                virQEMUCapsPtr qemuCaps =
+                    virQEMUCapsCacheLookup(driver->qemuCapsCache, def->emulator);
+                addPCIeRoot = virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_GPEX);
+            }
+        }
+        break;
     case VIR_ARCH_PPC64:

--
1.9.5.msysgit.0

On Thu, Aug 06, 2015 at 06:37:41PM +0200, Martin Kletzander wrote:
On Fri, Jul 17, 2015 at 02:27:45PM +0300, Pavel Fedin wrote:
Here we assume that if qemu supports generic PCI host controller, it is a part of virt machine and can be used for adding PCI devices.
In qemu this is actually a PCIe bus, so we also declare multibus capability so that 0'th bus is specified to qemu correctly as 'pcie.0'
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
---
 src/qemu/qemu_capabilities.c |  8 ++++++++
 src/qemu/qemu_domain.c       | 17 +++++++++++++----
 2 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index d570fdd..f3486c7 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -2138,6 +2138,14 @@ bool virQEMUCapsHasPCIMultiBus(virQEMUCapsPtr qemuCaps,
         return false;
     }
+    if (ARCH_IS_ARM(def->os.arch)) {
+        /* If 'virt' supports PCI, it supports multibus.
+         * No extra conditions here for simplicity.
+         */
So every ARM qemu with the "virt" machine type supports both PCI and multiqueue? How about those "virt-*" for which you check below. That might not be related, I'm just curious.
+        if (STREQ(def->os.machine, "virt"))
+            return true;
+    }
+
     return false;
 }
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 8b050a0..c7d14e4 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -981,7 +981,7 @@ virDomainXMLNamespace virQEMUDriverDomainXMLNamespace = {
 static int
 qemuDomainDefPostParse(virDomainDefPtr def,
                        virCapsPtr caps,
-                       void *opaque ATTRIBUTE_UNUSED)
+                       void *opaque)
 {
     bool addDefaultUSB = true;
     bool addImplicitSATA = false;
@@ -1030,12 +1030,21 @@ qemuDomainDefPostParse(virDomainDefPtr def,
         break;
     case VIR_ARCH_ARMV7L:
-        addDefaultUSB = false;
-        addDefaultMemballoon = false;
-        break;
     case VIR_ARCH_AARCH64:
         addDefaultUSB = false;
         addDefaultMemballoon = false;
+        if (STREQ(def->os.machine, "virt") ||
+            STRPREFIX(def->os.machine, "virt-")) {
+            virQEMUDriverPtr driver = opaque;
+
+            /* This condition is actually a (temporary) hack for test suite which
+             * does not create capabilities cache */
Few questions here. a) how "temporary" is this since you're not removing it in this series? b) for what tests you need this hack and what part of the below is the hack?
Moreover, you cannot use capabilities when defining an XML. The emulator can change between the domain is defined and started, so you cannot know with what emulator this will be started.
I see Michal (Cc'd) just pushed this, I probably just missed the mail
Of course I forgot, Cc'ing now.
from him. Anyway, I don't think this is how this should be handled. I know this is bit of a chicken-and-egg problem due to the fact that we generate addresses during definitions. But I think we could figure something out instead of this hack.
+            if (driver->qemuCapsCache) {
+                virQEMUCapsPtr qemuCaps =
+                    virQEMUCapsCacheLookup(driver->qemuCapsCache, def->emulator);
+                addPCIeRoot = virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_GPEX);
+            }
+        }
+        break;
     case VIR_ARCH_PPC64:

--
1.9.5.msysgit.0

Hello!
So every ARM qemu with the "virt" machine type supports both PCI and multiqueue?
Every ARM qemu with "virt" machine AND pci-generic supports them.
How about those "virt-*" for which you check below. That might not be related, I'm just curious.
Actually mainstream qemu doesn't have "virt-" at all. The commit message that introduced it says that it is reserved for distro maintainers. And everywhere "virt-" has been added as an alias for "virt", so I just followed this policy. What these "virt-" machines really do depends on their authors; I simply don't know.
Few questions here. a) how "temporary" is this since you're not removing it in this series?
Well, we could discuss it. Personally, I don't have much time to work on the test suite, to tell the truth... But, I know, no developer likes documenting and/or testing :)
b) for what tests you need this hack
For all qemu-related tests. If you want to see it, you can omit the condition and run "make check"; the first qemu test will crash upon a NULL dereference in virQEMUCapsCacheLookup().
and what part of the below is the hack?
The hack is "if (driver->qemuCapsCache)" check. Our test suite does not create the cache, instead it builds "caps" object directly and passes around the pointer to all functions which take it. It is actually not really difficult to fix this. The test suite should mock up qemuCapsCache too. Kind regards, Pavel Fedin Expert Engineer Samsung Electronics Research center Russia

On Fri, Aug 07, 2015 at 09:32:56AM +0300, Pavel Fedin wrote:
Hello!
So every ARM qemu with the "virt" machine type supports both PCI and multiqueue?
Every ARM qemu with "virt" machine AND pci-generic supports them.
How about those "virt-*" for which you check below. That might not be related, I'm just curious.
Actually mainstream qemu doesn't have "virt-" at all. The commit message that introduced it says that it is reserved for distro maintainers. And everywhere "virt-" has been added as an alias for "virt", so I just followed this policy. What these "virt-" machines really do depends on their authors; I simply don't know.
OK, that's very reasonable then. If the maintainer adds some 'virt-' type, they can also add the proper checks for it.
Few questions here. a) how "temporary" is this since you're not removing it in this series?
Well, we could discuss it. Personally, I don't have much time to work on the test suite, to tell the truth... But, I know, no developer likes documenting and/or testing :)
b) for what tests you need this hack
For all qemu-related tests. If you want to see it, you can omit the condition and run "make check"; the first qemu test will crash upon a NULL dereference in virQEMUCapsCacheLookup().
and what part of the below is the hack?
The hack is "if (driver->qemuCapsCache)" check. Our test suite does not create the cache, instead it builds "caps" object directly and passes around the pointer to all functions which take it. It is actually not really difficult to fix this. The test suite should mock up qemuCapsCache too.
The qemuxml2argvtest mocks the capability cache, so it must be qemuxml2xmltest. And that one should do *nothing* with any capabilities. Merely because whatever it does must work without any qemu installed in the system. Moreover the test might produce different results for different QEMU binaries installed in the system, which is even worse.
Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

Hello!
The qemuxml2argvtest mocks the capability cache, so it must be qemuxml2xmltest. And that one should do *nothing* with any capabilities. Merely because whatever it does must work without any qemu installed in the system.
This is what happens if you remove the check:

--- cut ---
../build-aux/test-driver: line 107: 21055 Segmentation fault "$@" > $log_file 2>&1
FAIL: qemuxml2argvtest
../build-aux/test-driver: line 107: 21081 Segmentation fault "$@" > $log_file 2>&1
FAIL: qemuxml2xmltest
../build-aux/test-driver: line 107: 21107 Segmentation fault "$@" > $log_file 2>&1
FAIL: qemuxmlnstest
../build-aux/test-driver: line 107: 21133 Segmentation fault "$@" > $log_file 2>&1
FAIL: qemuargv2xmltest
PASS: qemuhelptest
../build-aux/test-driver: line 107: 21185 Segmentation fault "$@" > $log_file 2>&1
FAIL: domainsnapshotxml2xmltest
--- cut ---
Moreover the test might produce different results for different QEMU binaries installed in the system, which is even worse.
Really? As far as I understand the code, capabilities are not retrieved from the actual qemu. They are hardcoded as a set in tests. But I may be missing something because I haven't studied the code well.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

On Fri, Aug 07, 2015 at 03:27:29PM +0300, Pavel Fedin wrote:
Hello!
The qemuxml2argvtest mocks the capability cache, so it must be qemuxml2xmltest. And that one should do *nothing* with any capabilities. Merely because whatever it does must work without any qemu installed in the system.
This is what happens if you remove the check:

--- cut ---
../build-aux/test-driver: line 107: 21055 Segmentation fault "$@" > $log_file 2>&1
FAIL: qemuxml2argvtest
../build-aux/test-driver: line 107: 21081 Segmentation fault "$@" > $log_file 2>&1
FAIL: qemuxml2xmltest
../build-aux/test-driver: line 107: 21107 Segmentation fault "$@" > $log_file 2>&1
FAIL: qemuxmlnstest
../build-aux/test-driver: line 107: 21133 Segmentation fault "$@" > $log_file 2>&1
FAIL: qemuargv2xmltest
PASS: qemuhelptest
../build-aux/test-driver: line 107: 21185 Segmentation fault "$@" > $log_file 2>&1
FAIL: domainsnapshotxml2xmltest
--- cut ---
Of course, because there are no capabilities when defining XML, capabilities are per emulator binary and that is not known until you are starting the machine (it can change between define and start just by updating qemu for example).
Moreover the test might produce different results for different QEMU binaries installed in the system, which is even worse.
Really? As far as I understand the code, capabilities are not retrieved from the actual qemu. They are hardcoded as a set in tests. But I may be missing something because I haven't studied the code well.
They are, but only in qemuxml2argvtest, and apparently only when creating the command line, because parsing the XML should be unambiguous and indifferent to the qemu binary that's in the system.

Martin

Hello!
Of course, because there are no capabilities when defining XML,
And now we need them, because depending on them we may now want to add a PCIe root complex description. Therefore I think we have to add this to the qemu test suite. I will take a look at it after some time; it actually shouldn't be that difficult.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

On Mon, Aug 10, 2015 at 11:24:42AM +0300, Pavel Fedin wrote:
Hello!
Of course, because there are no capabilities when defining XML,
And now we need them, because depending on them we may now want to add a PCIe root complex description. Therefore I think we have to add this to the qemu test suite. I will take a look at it after some time; it actually shouldn't be that difficult.
And that's the exact problem. I'm not saying "capabilities are not made available when defining", but rather "there are no capabilities when defining". I know this is just a big misunderstanding, but I don't know how else I can explain what's wrong with it. Can anyone else chime in?

Hello!
And that's the exact problem. I'm not saying "capabilities are not made available when defining", but rather "there are no capabilities when defining".
Stop stop stop... Let's not mess up the terminology. The test case contains some hardcoded capabilities, needed by the test, right? Ok, this means that actually we have this information, and we need to make up capabilities cache from these caps, and there will be no problem. With this change the test will exactly mimic real life behavior, where we have the cache, and this cache was obtained from the qemu binary.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

On Mon, Aug 10, 2015 at 05:12:26PM +0300, Pavel Fedin wrote:
Hello!
And that's the exact problem. I'm not saying "capabilities are not made available when defining", but rather "there are no capabilities when defining".
Stop stop stop... Let's not mess up the terminology.
OK, sorry for that, I'll be clear from now on.
The test case contains some hardcoded capabilities, needed by the test, right? Ok, this means that
Let's say we're speaking about qemuxml2xmltest, just to make it as easy as possible. That test ultimately checks parsing and formatting, and parsing is what we're talking about here. That test has *no* hardcoded capabilities. And the reason why it has no hardcoded capabilities is what I'm trying to express here.
actually we have this information, and we need to make up capabilities cache from these caps, and there will be no problem. With this change the test will exactly mimic real life behavior, where we have the cache, and this cache was obtained from the qemu binary.
I don't know how to explain what's wrong here without us getting confused again. But I must try. In libvirt, we are not using capabilities when parsing/formatting XML because it [parsing/formatting] must a) work without any qemu binary, and b) produce exactly the same results no matter what is installed in the system. Is that any clearer now, by any chance?
Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

On Thu, Aug 06, 2015 at 07:46:58PM +0200, Martin Kletzander wrote:
On Thu, Aug 06, 2015 at 06:37:41PM +0200, Martin Kletzander wrote:
On Fri, Jul 17, 2015 at 02:27:45PM +0300, Pavel Fedin wrote:
Here we assume that if qemu supports generic PCI host controller, it is a part of virt machine and can be used for adding PCI devices.
In qemu this is actually a PCIe bus, so we also declare multibus capability so that 0'th bus is specified to qemu correctly as 'pcie.0'
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
---
 src/qemu/qemu_capabilities.c |  8 ++++++++
 src/qemu/qemu_domain.c       | 17 +++++++++++++----
 2 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index d570fdd..f3486c7 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -2138,6 +2138,14 @@ bool virQEMUCapsHasPCIMultiBus(virQEMUCapsPtr qemuCaps,
         return false;
     }
+    if (ARCH_IS_ARM(def->os.arch)) {
+        /* If 'virt' supports PCI, it supports multibus.
+         * No extra conditions here for simplicity.
+         */
So every ARM qemu with the "virt" machine type supports both PCI and multiqueue? How about those "virt-*" for which you check below. That might not be related, I'm just curious.
+        if (STREQ(def->os.machine, "virt"))
+            return true;
+    }
+
     return false;
 }
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 8b050a0..c7d14e4 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -981,7 +981,7 @@ virDomainXMLNamespace virQEMUDriverDomainXMLNamespace = {
 static int
 qemuDomainDefPostParse(virDomainDefPtr def,
                        virCapsPtr caps,
-                       void *opaque ATTRIBUTE_UNUSED)
+                       void *opaque)
 {
     bool addDefaultUSB = true;
     bool addImplicitSATA = false;
@@ -1030,12 +1030,21 @@ qemuDomainDefPostParse(virDomainDefPtr def,
         break;
     case VIR_ARCH_ARMV7L:
-        addDefaultUSB = false;
-        addDefaultMemballoon = false;
-        break;
     case VIR_ARCH_AARCH64:
         addDefaultUSB = false;
         addDefaultMemballoon = false;
+        if (STREQ(def->os.machine, "virt") ||
+            STRPREFIX(def->os.machine, "virt-")) {
+            virQEMUDriverPtr driver = opaque;
+
+            /* This condition is actually a (temporary) hack for test suite which
+             * does not create capabilities cache */
Few questions here. a) how "temporary" is this since you're not removing it in this series? b) for what tests you need this hack and what part of the below is the hack?
Moreover, you cannot use capabilities when defining an XML. The emulator can change between the domain is defined and started, so you cannot know with what emulator this will be started.
I see Michal (Cc'd) just pushed this, I probably just missed the mail
Of course I forgot, Cc'ing now.
I agree with your core statement that we should not be using the QEMU capabilities when defining the XML. With all existing scenarios we have been able to determine whether to add the implicit PCI controller based on the machine type name only, because with every other QEMU arch when doing such a major change as adding a PCI bus, they have created a new machine type. The problem is that arm 'virt' machine type is not stable, it is being changed arbitrarily in new QEMU releases :-(

So AFAIK, that leaves us with 3 choices

- Never add PCI controller at time the XML is defined, on the basis that we have to be conservative in what we add to cope with old QEMU

- Always add PCI controller at time the XML is defined, on the basis that most people will have new enough QEMU because ARM 'virt' machine type is very much still in development, so no one will seriously stick with the older QEMU versions which lack PCI.

- Use the capabilities in XML post-parse to conditionally add the PCI controller. This is what was currently merged

I don't think option 1 makes much sense as it'll harm ARM arch forever more, to cope with QEMU versions that will almost never be used in practice.

I'd be inclined to go with option 2, and then if any PCI devices are actually used with the guest, check the capability at start time when we are doing auto-address assignment.

Regards,
Daniel

--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 08/10/2015 11:09 AM, Daniel P. Berrange wrote:
On Thu, Aug 06, 2015 at 07:46:58PM +0200, Martin Kletzander wrote:
On Thu, Aug 06, 2015 at 06:37:41PM +0200, Martin Kletzander wrote:
On Fri, Jul 17, 2015 at 02:27:45PM +0300, Pavel Fedin wrote:
Here we assume that if qemu supports generic PCI host controller, it is a part of virt machine and can be used for adding PCI devices.
In qemu this is actually a PCIe bus, so we also declare multibus capability so that 0'th bus is specified to qemu correctly as 'pcie.0'
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
---
 src/qemu/qemu_capabilities.c |  8 ++++++++
 src/qemu/qemu_domain.c       | 17 +++++++++++++----
 2 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index d570fdd..f3486c7 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -2138,6 +2138,14 @@ bool virQEMUCapsHasPCIMultiBus(virQEMUCapsPtr qemuCaps,
         return false;
     }
+    if (ARCH_IS_ARM(def->os.arch)) {
+        /* If 'virt' supports PCI, it supports multibus.
+         * No extra conditions here for simplicity.
+         */

So every ARM qemu with the "virt" machine type supports both PCI and multiqueue? How about those "virt-*" for which you check below. That might not be related, I'm just curious.
+        if (STREQ(def->os.machine, "virt"))
+            return true;
+    }
+
     return false;
 }
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 8b050a0..c7d14e4 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -981,7 +981,7 @@ virDomainXMLNamespace virQEMUDriverDomainXMLNamespace = {
 static int
 qemuDomainDefPostParse(virDomainDefPtr def,
                        virCapsPtr caps,
-                       void *opaque ATTRIBUTE_UNUSED)
+                       void *opaque)
 {
     bool addDefaultUSB = true;
     bool addImplicitSATA = false;
@@ -1030,12 +1030,21 @@ qemuDomainDefPostParse(virDomainDefPtr def,
         break;
     case VIR_ARCH_ARMV7L:
-        addDefaultUSB = false;
-        addDefaultMemballoon = false;
-        break;
     case VIR_ARCH_AARCH64:
         addDefaultUSB = false;
         addDefaultMemballoon = false;
+        if (STREQ(def->os.machine, "virt") ||
+            STRPREFIX(def->os.machine, "virt-")) {
+            virQEMUDriverPtr driver = opaque;
+
+            /* This condition is actually a (temporary) hack for test suite which
+             * does not create capabilities cache */

Few questions here. a) how "temporary" is this since you're not removing it in this series? b) for what tests you need this hack and what part of the below is the hack?
Moreover, you cannot use capabilities when defining an XML. The emulator can change between the domain is defined and started, so you cannot know with what emulator this will be started.
I see Michal (Cc'd) just pushed this, I probably just missed the mail

Of course I forgot, Cc'ing now.

I agree with your core statement that we should not be using the QEMU capabilities when defining the XML. With all existing scenarios we have been able to determine whether to add the implicit PCI controller based on the machine type name only, because with every other QEMU arch when doing such a major change as adding a PCI bus, they have created a new machine type. The problem is that arm 'virt' machine type is not stable, it is being changed arbitrarily in new QEMU releases :-(
So AFAIK, that leaves us with 3 choices
- Never add PCI controller at time the XML is defined, on the basis that we have to be conservative in what we add to cope with old QEMU
- Always add PCI controller at time the XML is defined, on the basis that most people will have new enough QEMU because ARM 'virt' machine type is very much still in development, so no one will seriously stick with the older QEMU versions which lack PCI.
- Use the capabilities in XML post-parse to conditionally add the PCI controller. This is what was currently merged
I don't think option 1 makes much sense as it'll harm ARM arch forever more, to cope with QEMU versions that will almost never be used in practice.
I'd be inclined to go with option 2, and then if any PCI devices are actually used with the guest, check the capability at start time when we are doing auto-address assignment.
Except that auto-address assignment happens after the parse, but before we write the config to disk (i.e. we don't wait until the domain is started. I'm guessing that was originally done because we didn't want to be potentially re-writing the config to disk every time the domain was started.) Also, we can't assume that "no PCI devices used at start" means "no PCI devices will ever be desired" because hotplug.

At any rate, a machinetype that sometimes has an implied controller device and sometimes doesn't is just a completely broken idea - different instances of the same machinetype are assumed to have the same ABI; in particular, if you start up a virtual machine of that machinetype with --nodefaults, it should *always* give you exactly the same set of implied devices. I don't think we should break our internal boundaries in order to cater to this broken machinetype (i.e. I agree that option 2 is the correct way to go), especially if it's a transient thing.

On the other hand, if the virt-* machinetypes are for dev only, should we be trying to cater to them at all? Will they *always* be for dev only, and when something useful is settled on, a "permanent" machinetype will be defined? If that's the case, then maybe the proper thing is to *not* automatically add a pcie-root for this machinetype, but to properly support having one manually added.

(Of course I've found all this discussion less than 24 hours after pushing a long series of patches that added 3 new capabilities, and the new OBJECT_GPEX capability caused merge conflicts that had to be resolved; now if we remove the need for QEMU_CAPS_OBJECT_GPEX, there will be more gratuitous merge conflicts. I wish I'd waited until today to push. sigh.)

BTW, a related topic - when addPCIeRoot is true, we not only add an implied pcie-root, but we also add a dmi-to-pci-bridge and a pci-bridge. pci-bridge is a generic device that I assume could work okay on something non-X86, but the dmi-to-pci-bridge is implemented with an i82801b11-bridge device, which is very much specific to an Intel chipset. Is that device really functional on this ARM virt-* machinetype?
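For reference, the implicit controller trio described in the last paragraph appears in the domain XML roughly as follows (a sketch of libvirt's generic pcie-root expansion, not output captured from an ARM guest):

    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='dmi-to-pci-bridge'/>
    <controller type='pci' index='2' model='pci-bridge'/>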

On Mon, Aug 10, 2015 at 12:08:02PM -0400, Laine Stump wrote:
On 08/10/2015 11:09 AM, Daniel P. Berrange wrote:
On Thu, Aug 06, 2015 at 07:46:58PM +0200, Martin Kletzander wrote:
On Thu, Aug 06, 2015 at 06:37:41PM +0200, Martin Kletzander wrote:
On Fri, Jul 17, 2015 at 02:27:45PM +0300, Pavel Fedin wrote:
Here we assume that if qemu supports generic PCI host controller, it is a part of virt machine and can be used for adding PCI devices.
In qemu this is actually a PCIe bus, so we also declare multibus capability so that 0'th bus is specified to qemu correctly as 'pcie.0'
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
---
 src/qemu/qemu_capabilities.c |  8 ++++++++
 src/qemu/qemu_domain.c       | 17 +++++++++++++----
 2 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index d570fdd..f3486c7 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -2138,6 +2138,14 @@ bool virQEMUCapsHasPCIMultiBus(virQEMUCapsPtr qemuCaps,
         return false;
     }
+    if (ARCH_IS_ARM(def->os.arch)) {
+        /* If 'virt' supports PCI, it supports multibus.
+         * No extra conditions here for simplicity.
+         */

So every ARM qemu with the "virt" machine type supports both PCI and multiqueue? How about those "virt-*" for which you check below. That might not be related, I'm just curious.
+        if (STREQ(def->os.machine, "virt"))
+            return true;
+    }
+
     return false;
 }
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 8b050a0..c7d14e4 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -981,7 +981,7 @@ virDomainXMLNamespace virQEMUDriverDomainXMLNamespace = {
 static int
 qemuDomainDefPostParse(virDomainDefPtr def,
                        virCapsPtr caps,
-                       void *opaque ATTRIBUTE_UNUSED)
+                       void *opaque)
 {
     bool addDefaultUSB = true;
     bool addImplicitSATA = false;
@@ -1030,12 +1030,21 @@ qemuDomainDefPostParse(virDomainDefPtr def,
         break;
     case VIR_ARCH_ARMV7L:
-        addDefaultUSB = false;
-        addDefaultMemballoon = false;
-        break;
     case VIR_ARCH_AARCH64:
         addDefaultUSB = false;
         addDefaultMemballoon = false;
+        if (STREQ(def->os.machine, "virt") ||
+            STRPREFIX(def->os.machine, "virt-")) {
+            virQEMUDriverPtr driver = opaque;
+
+            /* This condition is actually a (temporary) hack for test suite which
+             * does not create capabilities cache */

Few questions here. a) how "temporary" is this since you're not removing it in this series? b) for what tests you need this hack and what part of the below is the hack?
Moreover, you cannot use capabilities when defining an XML. The emulator can change between the domain is defined and started, so you cannot know with what emulator this will be started.
I see Michal (Cc'd) just pushed this, I probably just missed the mail

Of course I forgot, Cc'ing now.

I agree with your core statement that we should not be using the QEMU capabilities when defining the XML. With all existing scenarios we have been able to determine whether to add the implicit PCI controller based on the machine type name only, because with every other QEMU arch when doing such a major change as adding a PCI bus, they have created a new machine type. The problem is that arm 'virt' machine type is not stable, it is being changed arbitrarily in new QEMU releases :-(
So AFAIK, that leaves us with 3 choices
- Never add PCI controller at time the XML is defined, on the basis that we have to be conservative in what we add to cope with old QEMU
- Always add PCI controller at time the XML is defined, on the basis that most people will have new enough QEMU because ARM 'virt' machine type is very much still in development, so no one will seriously stick with the older QEMU versions which lack PCI.
- Use the capabilities in XML post-parse to conditionally add the PCI controller. This is what was currently merged
I don't think option 1 makes much sense as it'll harm ARM arch forever more, to cope with QEMU versions that will almost never be used in practice.
I'd be inclined to go with option 2, and then if any PCI devices are actually used with the guest, check the capability at start time when we are doing auto-address assignment.
Except that auto-address assignment happens after the parse, but before we write the config to disk (i.e. we don't wait until the domain is started. I'm guessing that was originally done because we didn't want to be potentially re-writing the config to disk every time the domain was started.) Also, we can't assume that "no PCI devices used at start" means "no PCI devices will ever be desired" because hotplug.
Ok, so it sounds to me like we are pretty much forced into a position where we have to probe QEMU capabilities at XML define time. If we aim to minimize the usage of capabilities at define time, this won't be the end of the world. The only negative impact it would have is that if someone has old QEMU installed at define time and later upgrades to a QEMU with PCI support before starting the guest, they'll lack the PCI controller. That's a niche enough scenario that I think we can live with it.

So from that POV, maybe we should just clean up the currently merged patch so that it is properly exercised in the test suite by providing capabilities.

Regards,
Daniel

--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Hi!
The only negative impact it would have is that if someone has old QEMU installed at define time and later upgrades to a QEMU with PCI support before starting the guest, they'll lack the PCI controller.
This won't actually happen, because together with the missing controller he/she will not have PCI devices either. An attempt to add a device without a bus immediately produces a "No PCI bus" report, and the XML gets rejected. And right after the upgrade, as soon as he/she adds some PCI devices to use, the XML will be re-parsed and the PCI bus description will be added, because reparsing happens after any change.
So from that POV, maybe we should just clean up the currently merged patch so that it is properly exercised in the test suite by providing capabilities
I will try to do it soon.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

On Tue, Aug 11, 2015 at 09:28:21AM +0100, Daniel P. Berrange wrote:
On Mon, Aug 10, 2015 at 12:08:02PM -0400, Laine Stump wrote:
On 08/10/2015 11:09 AM, Daniel P. Berrange wrote:
On Thu, Aug 06, 2015 at 07:46:58PM +0200, Martin Kletzander wrote:
On Thu, Aug 06, 2015 at 06:37:41PM +0200, Martin Kletzander wrote:
On Fri, Jul 17, 2015 at 02:27:45PM +0300, Pavel Fedin wrote:
Here we assume that if qemu supports generic PCI host controller, it is a part of virt machine and can be used for adding PCI devices.
In qemu this is actually a PCIe bus, so we also declare multibus capability so that 0'th bus is specified to qemu correctly as 'pcie.0'
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
---
 src/qemu/qemu_capabilities.c |  8 ++++++++
 src/qemu/qemu_domain.c       | 17 +++++++++++++----
 2 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index d570fdd..f3486c7 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -2138,6 +2138,14 @@ bool virQEMUCapsHasPCIMultiBus(virQEMUCapsPtr qemuCaps,
         return false;
     }
+    if (ARCH_IS_ARM(def->os.arch)) {
+        /* If 'virt' supports PCI, it supports multibus.
+         * No extra conditions here for simplicity.
+         */

So every ARM qemu with the "virt" machine type supports both PCI and multiqueue? How about those "virt-*" for which you check below. That might not be related, I'm just curious.
+        if (STREQ(def->os.machine, "virt"))
+            return true;
+    }
+
     return false;
 }
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 8b050a0..c7d14e4 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -981,7 +981,7 @@ virDomainXMLNamespace virQEMUDriverDomainXMLNamespace = {
 static int
 qemuDomainDefPostParse(virDomainDefPtr def,
                        virCapsPtr caps,
-                       void *opaque ATTRIBUTE_UNUSED)
+                       void *opaque)
 {
     bool addDefaultUSB = true;
     bool addImplicitSATA = false;
@@ -1030,12 +1030,21 @@ qemuDomainDefPostParse(virDomainDefPtr def,
         break;
     case VIR_ARCH_ARMV7L:
-        addDefaultUSB = false;
-        addDefaultMemballoon = false;
-        break;
     case VIR_ARCH_AARCH64:
         addDefaultUSB = false;
         addDefaultMemballoon = false;
+        if (STREQ(def->os.machine, "virt") ||
+            STRPREFIX(def->os.machine, "virt-")) {
+            virQEMUDriverPtr driver = opaque;
+
+            /* This condition is actually a (temporary) hack for test suite which
+             * does not create capabilities cache */

Few questions here. a) how "temporary" is this since you're not removing it in this series? b) for what tests you need this hack and what part of the below is the hack?
Moreover, you cannot use capabilities when defining an XML. The emulator can change between the domain is defined and started, so you cannot know with what emulator this will be started.
I see Michal (Cc'd) just pushed this, I probably just missed the mail

Of course I forgot, Cc'ing now.

I agree with your core statement that we should not be using the QEMU capabilities when defining the XML. With all existing scenarios we have been able to determine whether to add the implicit PCI controller based on the machine type name only, because with every other QEMU arch when doing such a major change as adding a PCI bus, they have created a new machine type. The problem is that arm 'virt' machine type is not stable, it is being changed arbitrarily in new QEMU releases :-(
So AFAIK, that leaves us with 3 choices
- Never add PCI controller at time the XML is defined, on the basis that we have to be conservative in what we add to cope with old QEMU
- Always add PCI controller at time the XML is defined, on the basis that most people will have new enough QEMU because ARM 'virt' machine type is very much still in development, so no one will seriously stick with the older QEMU versions which lack PCI.
- Use the capabilities in XML post-parse to conditionally add the PCI controller. This is what was currently merged
I don't think option 1 makes much sense as it'll harm ARM arch forever more, to cope with QEMU versions that will almost never be used in practice.
I'd be inclined to go with option 2, and then if any PCI devices are actually used with the guest, check the capability at start time when we are doing auto-address assignment.
Except that auto-address assignment happens after the parse, but before we write the config to disk (i.e. we don't wait until the domain is started. I'm guessing that was originally done because we didn't want to be potentially re-writing the config to disk every time the domain was started.) Also, we can't assume that "no PCI devices used at start" means "no PCI devices will ever be desired" because hotplug.
Ok, so it sounds to me like we are pretty much forced into a position where we have to probe QEMU capabilities at XML define time. If we aim to minimize the usage of capabilities at define time, this won't be the end of the world. The only negative impact it would have is that if someone has old QEMU installed at define time and later upgrades to a QEMU with PCI support before starting the guest, they'll lack the PCI controller. That's a niche enough scenario that I think we can live with it.
So from that POV, maybe we should just clean up the currently merged patch so that it is properly exercised in the test suite by providing capabilities
I still don't like that. It seems to me like e.g. adding a spice display during parsing if qemu supports it. There would be no way of removing it, and in case we want to remove it we would have to add some nonsense like model='none' again.

I liked the idea of adding all this if the user supplies e.g. <controller type='pci' model='pci-root'/>; the only thing I'm afraid of is that I would be the only one who'd vote for this version if we were to vote on this. So you have not convinced me it's a good idea, but I will have to live with it. At least if the test code is cleaned up.

Thanks for the input from all of you, at least we reached a decision.

Martin

Hello!
I liked the idea of adding all this if the user supplies e.g. <controller type='pci' model='pci-root'/>
IMHO there would be a usability problem with this:

a) Too much excessive typing.

b) It is not intuitively obvious that you have to define the controller explicitly, especially because if you run qemu by hand this is not true. For example, when I first started to cope with PCI devices, I quickly found out about <address type='pci'> in google. But having to add something else is not that obvious and easy to discover.

c) AFAIK virt-manager cannot even add a PCI controller explicitly; it relies on libvirt's automagic in question.

I think it would be a perfect consensus if we add "PCI devices are actually present" to the condition. In this case, if there is at least one PCI device in the input XML, the controller description is automatically added. If there is none, it's OK to remove it. IMHO it's not difficult to implement, because it's the post-parse stage and everything that the user entered has already been parsed.

But we still want the capability check, because old qemu versions simply cannot accommodate this. Therefore it doesn't remove the need to fix up tests.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
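For comparison, the explicit variant being discussed would require the user to spell out at least the root controller and then address devices onto it, along these lines (a sketch; 'pcie-root' mirrors the machine's PCIe root as described earlier in the thread, and the address values are made up):

    <controller type='pci' model='pcie-root'/>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </interface>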

On 08/10/2015 11:09 AM, Daniel P. Berrange wrote:
On Thu, Aug 06, 2015 at 07:46:58PM +0200, Martin Kletzander wrote:
On Thu, Aug 06, 2015 at 06:37:41PM +0200, Martin Kletzander wrote:
On Fri, Jul 17, 2015 at 02:27:45PM +0300, Pavel Fedin wrote:
Here we assume that if qemu supports generic PCI host controller, it is a part of virt machine and can be used for adding PCI devices.
In qemu this is actually a PCIe bus, so we also declare multibus capability so that 0'th bus is specified to qemu correctly as 'pcie.0'
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
---
 src/qemu/qemu_capabilities.c |  8 ++++++++
 src/qemu/qemu_domain.c       | 17 +++++++++++++----
 2 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index d570fdd..f3486c7 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -2138,6 +2138,14 @@ bool virQEMUCapsHasPCIMultiBus(virQEMUCapsPtr qemuCaps,
         return false;
     }
+    if (ARCH_IS_ARM(def->os.arch)) {
+        /* If 'virt' supports PCI, it supports multibus.
+         * No extra conditions here for simplicity.
+         */
So every ARM qemu with the "virt" machine type supports both PCI and multiqueue? How about those "virt-*" for which you check below. That might not be related, I'm just curious.
+        if (STREQ(def->os.machine, "virt"))
+            return true;
+    }
+
     return false;
 }
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 8b050a0..c7d14e4 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -981,7 +981,7 @@ virDomainXMLNamespace virQEMUDriverDomainXMLNamespace = {
 static int
 qemuDomainDefPostParse(virDomainDefPtr def,
                        virCapsPtr caps,
-                       void *opaque ATTRIBUTE_UNUSED)
+                       void *opaque)
 {
     bool addDefaultUSB = true;
     bool addImplicitSATA = false;
@@ -1030,12 +1030,21 @@ qemuDomainDefPostParse(virDomainDefPtr def,
         break;
     case VIR_ARCH_ARMV7L:
-        addDefaultUSB = false;
-        addDefaultMemballoon = false;
-        break;
     case VIR_ARCH_AARCH64:
         addDefaultUSB = false;
         addDefaultMemballoon = false;
+        if (STREQ(def->os.machine, "virt") ||
+            STRPREFIX(def->os.machine, "virt-")) {
+            virQEMUDriverPtr driver = opaque;
+
+            /* This condition is actually a (temporary) hack for test suite which
+             * does not create capabilities cache */
Few questions here. a) how "temporary" is this since you're not removing it in this series? b) for what tests you need this hack and what part of the below is the hack?
Moreover, you cannot use capabilities when defining an XML. The emulator can change between the domain is defined and started, so you cannot know with what emulator this will be started.
I see Michal (Cc'd) just pushed this, I probably just missed the mail
Of course I forgot, Cc'ing now.
I agree with your core statement that we should not be using the QEMU capabilities when defining the XML. With all existing scenarios we have been able to determine whether to add the implicit PCI controller based on the machine type name only, because with every other QEMU arch when doing such a major change as adding a PCI bus, they have created a new machine type. The problem is that arm 'virt' machine type is not stable, it is being changed arbitrarily in new QEMU releases :-(
So AFAIK, that leaves us with 3 choices
- Never add PCI controller at time the XML is defined, on the basis that we have to be conservative in what we add to cope with old QEMU
- Always add PCI controller at time the XML is defined, on the basis that most people will have new enough QEMU because ARM 'virt' machine type is very much still in development, so no one will seriously stick with the older QEMU versions which lack PCI.
- Use the capabilities in XML post-parse to conditionally add the PCI controller. This is what was currently merged
I don't think option 1 makes much sense as it'll harm ARM arch forever more, to cope with QEMU versions that will almost never be used in practice.
I'd be inclined to go with option 2, and then if any PCI devices are actually used with the guest, check the capability at start time when we are doing auto-address assignment.
Another option: add versioned 'virt' machine types to the next qemu release (like virt-2.5 etc.), and key libvirt enabling pci off of that. _Eventually_ we are going to need versioned 'virt' machine types for migration compatibility like we already do with x86 -M pc-*. Might as well make the change early so libvirt can actually use it. - Cole

On Mon, Aug 10, 2015 at 07:23:07PM -0400, Cole Robinson wrote:
[...]
Another option: add versioned 'virt' machine types to the next qemu release (like virt-2.5 etc.), and key libvirt enabling pci off of that.
_Eventually_ we are going to need versioned 'virt' machine types for migration compatibility like we already do with x86 -M pc-*. Might as well make the change early so libvirt can actually use it.
I was also thinking about a middle ground between choices 1 and 2 from Dan, in case the machine types are not versioned any time soon. We would, by default, add no PCI controller when defining unless we think it's needed. That would be determined by any of the following:

a) there is a device that we know needs a PCI controller
b) there is a device with a PCI address
c) <controller type='pci'/> is spotted in the user-supplied XML

In case any of the above is true (notice that users themselves can force the addition via the third option), we add all those controllers, and when starting the machine we just check that this is supported (using the capability).

Martin
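A minimal sketch of the decision Martin proposes (the helper name is made up and this is not merged code; it assumes libvirt's usual domain-definition types):

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: add an implicit PCI controller at define time only
 * when the user-supplied XML itself asks for one. */
static bool
defWantsPCI(virDomainDefPtr def)
{
    size_t i;

    /* c) an explicit <controller type='pci'/> was supplied */
    for (i = 0; i < def->ncontrollers; i++) {
        if (def->controllers[i]->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI)
            return true;
    }

    /* b) some device carries an explicit PCI address (networks shown
     * here for brevity; a real patch would walk every device list) */
    for (i = 0; i < def->nnets; i++) {
        if (def->nets[i]->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI)
            return true;
    }

    /* a) a device model known to require PCI would be checked here */
    return false;
}

The start-time check would then verify a capability such as QEMU_CAPS_OBJECT_GPEX before doing auto-address assignment.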

On 08/11/2015 01:23 AM, Martin Kletzander wrote:
[...]
I was also thinking about a middle ground between choices 1 and 2 from Dan in case the machine types are not versioned any time soon. We would, by default add no pci controller when defining unless we think it's needed. That would be determined by any of the following:
a) there is a device that we know needs PCI controller
b) there is a device with PCI address
c) <controller type='pci'/> is spotted in the user-supplied XML
In case any of the above is true (notice that users themselves can override the addition with third option),
Yeah, that's an expansion of what I was talking about in the "On the other hand" paragraph in my reply to the grandparent of your message (I had said we could allow adding <controller type='pci' model='pcie-root'/> manually); your idea has that plus a useful extra (also looking for devices that need a PCI controller). This sounds like the safest thing to do for the virt-* machinetypes, where we don't know whether or not pcie-root exists. But then how many of these will there be? I think we should lobby fairly strongly for qemu to version the virt machinetypes right away.
we add all those controllers and when starting the machine, we just check that it's supported (using the capability).
Lacking cooperation from qemu on the versioning + stability front, +1 from me on this idea. If they're nice about it and start versioning right away, then it may be a lot of extra ugly code that will only be useful for a very short while, and after that will just burden us with maintenance forever.

Hello!
I'd be inclined to go with option 2, and then if any PCI devices are actually used with the guest, check the capability at start time when we are doing auto-address assignment.
I am following the discussion, sorry for being a bit silent... I just didn't have much to say. But what is actually wrong with using capabilities in post-parse?

Anyway, devices do not have PCI addresses by default. They have them only if the user explicitly says so, to stay backwards-compatible with old guests. So, in a practical scenario, the user would first upgrade qemu, then start adding PCI devices. And, while adding them, the user will get the PCI controller in the description automatically.

The only hypothetical scenario I could imagine is downgrading qemu. But even then, the user would have to manually remove PCI devices from the domain XML. Can anybody suggest a scenario where adding PCI in post-parse actually harms? The only argument I heard was: "we should not be using the QEMU capabilities when defining the XML". But why?
Another option: add versioned 'virt' machine types to the next qemu release (like virt-2.5 etc.), and key libvirt enabling pci off of that.
1. The virt machine (the unversioned one) already has PCI.
2. The qemu people don't want a versioned "virt". I already proposed this for GICv3 and they flatly refused.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

On Mon, Aug 10, 2015 at 07:23:07PM -0400, Cole Robinson wrote:
[...]
Another option: add versioned 'virt' machine types to the next qemu release (like virt-2.5 etc.), and key libvirt enabling pci off of that.
_Eventually_ we are going to need versioned 'virt' machine types for migration compatibility like we already do with x86 -M pc-*. Might as well make the change early so libvirt can actually use it.
I've raised the idea of a versioned 'virt' machine type several times on qemu-devel and it's been clearly rejected, the reasoning being that they're not willing to guarantee migration compatibility between releases on arm yet.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

The legacy -net option works correctly only with embedded device models, which do not require any bus specification. Therefore, we should use -device for PCI hardware.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
---
 src/qemu/qemu_command.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 42906a8..5569be6 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -457,7 +457,8 @@ qemuDomainSupportsNicdev(virDomainDefPtr def,
     /* non-virtio ARM nics require legacy -net nic */
     if (((def->os.arch == VIR_ARCH_ARMV7L) ||
          (def->os.arch == VIR_ARCH_AARCH64)) &&
-        net->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO)
+        net->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO &&
+        net->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI)
         return false;

     return true;
--
1.9.5.msysgit.0
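Seen from the qemu command line, the difference looks roughly like this (a sketch with invented names and addresses, shown as C argv fragments; not the output of any particular libvirt version):

/* Sketch: the two argv styles at stake.  Values are illustrative only. */
static const char *legacyNicArgs[] = {  /* embedded model: legacy -net, no bus */
    "-net", "nic,model=smc91c111,macaddr=52:54:00:aa:bb:cc",
    "-net", "tap,ifname=vnet0",
};

static const char *pciNicArgs[] = {     /* PCI-addressed model: -device names a bus */
    "-netdev", "tap,id=hostnet0,ifname=vnet0",
    "-device", "virtio-net-pci,netdev=hostnet0,bus=pcie.0,addr=0x2",
};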

On 07/17/2015 07:27 AM, Pavel Fedin wrote:
Legacy -net option works correctly only with embedded device models, which do not require any bus specification. Therefore, we should use -device for PCI hardware
This function is starting to look a bit like a hack on top of a hack on top of a hack. Are the original reasons for using -net instead of -device on certain arches still applicable?

On 08/10/2015 12:29 PM, Laine Stump wrote:
On 07/17/2015 07:27 AM, Pavel Fedin wrote:
Legacy -net option works correctly only with embedded device models, which do not require any bus specification. Therefore, we should use -device for PCI hardware
This function is starting to look a bit like a hack on top of a hack on top of a hack. Are the original reasons for using -net instead of -device on certain arches still applicable?
Yes, the old logic is still needed for all non-virtio uses of arm and aarch64, and potentially a ton of other qemu architectures if we ever wanted to properly support them. - Cole

The virtio-net-pci adapter is capable of using irqfd with vhost-net only in MSI-X mode, which appears to be available only on a PCIe bus, at least on ARM.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
---
 src/qemu/qemu_command.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 5569be6..a2594b0 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -1662,6 +1662,14 @@ qemuCollectPCIAddress(virDomainDefPtr def ATTRIBUTE_UNUSED,
          */
         flags = VIR_PCI_CONNECT_TYPE_PCI | VIR_PCI_CONNECT_TYPE_PCIE;
         break;
+
+    case VIR_DOMAIN_DEVICE_NET:
+        if (STREQ(device->data.net->model, "virtio")) {
+            /* virtio-net-pci adapter in qemu has to be plugged into a PCIe slot
+             * in order to be able to use irqfds with vhost-net
+             */
+            flags = VIR_PCI_CONNECT_TYPE_PCI | VIR_PCI_CONNECT_TYPE_PCIE;
+        }
     }

     /* Ignore implicit controllers on slot 0:0:1.0:
--
1.9.5.msysgit.0
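The underlying mechanism is a compatibility bitmask between device and slot; a rough sketch with assumed bit values (libvirt's real flag definitions live in its PCI address code):

#include <stdbool.h>

typedef enum {
    VIR_PCI_CONNECT_TYPE_PCI  = 1 << 0,   /* bit values assumed for the sketch */
    VIR_PCI_CONNECT_TYPE_PCIE = 1 << 1,
} virPCIConnectFlags;

/* A device may occupy a slot only if its flags and the slot's flags
 * share at least one connection type; flagging virtio-net with
 * PCI|PCIE lets it land on pcie.0 as well as on a plain PCI bus. */
static bool
pciDeviceFitsSlot(unsigned int devFlags, unsigned int busFlags)
{
    return (devFlags & busFlags) != 0;
}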

On 17.07.2015 13:27, Pavel Fedin wrote:
[...]
ACKed all the patches, squashed in the 2/4 amendment and pushed. Sorry for the delay.

Michal

Hello!
ACKed all the patches, squashed in the 2/4 amendment and pushed. Sorry for the delay.
Thank you very much.

By the way, just curious: why is there a restriction that most devices can be plugged only into PCI, not PCIe? On machines with a PCIe host this seems to do nothing good; it only prevents them from using MSI-X. For example, virtio-scsi still suffers from this problem. I ignore it for now because we currently don't have vhost-scsi support in qemu. Comments say that "PCIe is not hotpluggable". But why? And why do we mandate hotplug?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

On 06.08.2015 15:06, Pavel Fedin wrote:
Hello!
ACKed all the patches, squashed in the 2/4 amendment and pushed. Sorry for the delay.
Thank you very much.
By the way, just curious, why is there a restriction that most devices can be plugged only into PCI, not PCIe? On machines with a PCIe host this seems to do nothing good, only prevents them from using MSI-X. For example virtio-scsi still suffers from this problem. I ignore it for now because we currently don't have vhost-scsi support in qemu. Comments say that "PCIe is not hotpluggable". But why? And why do we mandate hotplug?
Yeah, that's a good question. In the real world, PCIe is perfectly hotpluggable. Maybe the limitation comes from qemu?

Michal

On 08/06/2015 09:38 AM, Michal Privoznik wrote:
On 06.08.2015 15:06, Pavel Fedin wrote:
Hello!
ACKed all the patches, squashed in the 2/4 amendment and pushed. Sorry for the delay. Thank you very much.
By the way, just curious, why is there a restriction that most devices can be plugged only into PCI, not PCIe? On machines with a PCIe host this seems to do nothing good, only prevents them from using MSI-X.
The original patches to support pcie-root severely restricted what could plug into what because in real hardware you can't plug a PCI device into a PCIe slot (physically it doesn't work), and although it did happen to work with qemu, I had no information to tell me that, e.g. plugging a PCI device into a PCIe slot, or PCIe device into a PCI slot would continue to work in the future. I've since received assurances from qemu developers that this will continue to work, so the restriction has been loosened (see below).
For example virtio-scsi still suffers from this problem. I ignore it for now because we currently don't have vhost-scsi support in qemu. Comments say that "PCIe is not hotpluggable".
You cannot hotplug directly into pcie-root. This is true in the real world, and either is or should be true in qemu. Hotplug with PCIe is supported by other PCIe controllers that are connected to pcie-root at guest startup time.
But why?
Because that's the way it was originally implemented in hardware?
And why do we mandate hotplug?
We don't (any longer) mandate it. But in the absence of any information to the contrary, we assume that the user might want it, so we try not to create a setup that will surprise and disappoint them.

In the past, the rules about insisting on a hotpluggable port (and only plugging PCI devices into true PCI slots rather than PCIe) were strictly enforced by libvirt all the time, but there was an open BZ about that which I resolved a couple of months ago:

https://bugzilla.redhat.com/show_bug.cgi?id=1238338

The behavior now is that if libvirt is auto-assigning a slot for a device, it will put it into a hotpluggable true PCI slot, but if you manually assign it to a slot that is non-hotpluggable and/or PCIe, it will be allowed.

Beyond that, just last night I pushed a series of patches supporting three new PCIe controller types: pcie-root-port, pcie-switch-upstream-port, and pcie-switch-downstream-port. With those implemented, we can now model pretty much any standard PCIe bus topology, so you can now hotplug into a true PCIe port. (Note that there are a few restrictions about which types of controllers can be plugged into which other controllers, and these *are* strictly enforced, because not following those rules leads to non-working configurations; for example, a pcie-switch-upstream-port cannot be plugged directly into pcie-root; it can *only* be plugged into a pcie-root-port or a pcie-switch-downstream-port.)
Yeah, that's a good question. In the real world, PCIe is perfectly hotpluggable. Maybe the limitation comes from qemu?
That is a very broad statement, and isn't true in all cases. For example, in the "real world" you can't hotplug/unplug directly to the pcie-root bus (nor can you hotplug/unplug to the dmi-to-pci-bridge, aka i82801b11-bridge). But you *can* (at guest startup time) attach a pcie-root-port to pcie-root, or a pcie-switch-upstream-port plus pcie-switch-downstream-ports to pcie-root or to a dmi-to-pci-bridge, and then hotplug/unplug to a pcie-root-port or a pcie-switch-downstream-port (do a "git pull" of master to get the new controller types I explained above and you'll see what I'm talking about).

BTW, I'm still wondering if the arm machinetype really does support the supposedly Intel-x86-specific i82801b11-bridge device (and the new controller devices: ioh3420 (pcie-root-port), x3130-upstream (pcie-switch-upstream-port), and xio3130-downstream (pcie-switch-downstream-port)). In particular, the i82801b11-bridge is important, since it is automatically added by libvirt when pcie-root is present.

Hello!
The original patches to support pcie-root severely restricted what could plug into what because in real hardware you can't plug a PCI device into a PCIe slot (physically it doesn't work)
But how do you know whether the device is PCI or PCIe? I don't see anything like this in the code; I see that, for example, "all network cards are PCI", which is, BTW, not true in the real world.
The behavior now is that if libvirt is auto-assigning a slot for a device, it will put it into a hotpluggable true PCI slot, but if you manually assign it to a slot that is non-hotpluggable and/or PCIe, it will be allowed.
But when I tried to manually assign virtio-pci to PCIe I simply got "Device requires standard PCI slot" and that was all. I had to make patch 4 in order to overcome this.
BTW, I'm still wondering if the arm machinetype really does support the supposedly Intel-x86-specific i82801b11-bridge device
Yes, it works fine. Just devices behind it cannot get MSI-X enabled. By the way, virtio-pci has been usable with PC guests for a while now; does it also suffer from this restriction there?
(and the new controller devices - ioh3420 (pcie-root-port), x3130-upstream (pcie-switch-upstream-port), and xio3130-downstream (pcie-switch-downstream-port).
Didn't try those, but I don't see why they would not work. PCI is just PCI after all; everything behind the controller is pretty much standard and arch-independent.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

(Alex - I cc'ed you because I addressed a question or two your way down towards the bottom). On 08/11/2015 02:52 AM, Pavel Fedin wrote:
Hello!
The original patches to support pcie-root severely restricted what could plug into what, because in real hardware you can't plug a PCI device into a PCIe slot (physically it doesn't work). But how do you know whether the device is PCI or PCIe? I don't see anything like this in the code; I see that, for example, "all network cards are PCI", which is, BTW, not true in the real world.
Two years ago, when I first added support for q35-based machinetypes and the first PCIe controllers, I had less information than I do now. When I looked in the output of "qemu-kvm -device ?" I saw that each device listed the type of bus it connected to (PCI or ISA), and assumed that even though at the time qemu didn't differentiate between PCI and PCIe there, since the two things *are* different in the real world, eventually it likely would. I wanted the libvirt code to be prepared for that eventuality. (Of course every example device, except the PCIe controllers themselves, ends up with the flag saying that it can connect to a PCI bus, not PCIe.)

Later I was told that, unlike the real world where, if nothing else, the physical slots themselves limit you, any normal PCI device in qemu can be plugged into a PCI or PCIe slot. There still are several restrictions, though, which showed themselves to be more complicated than the naive PCI vs. PCIe split I originally imagined - just look at the restrictions on the different PCIe controllers:

("pcie-sw-up-port" == "pcie-switch-upstream-port", "pcie-sw-dn-port" == "pcie-switch-downstream-port")

 name                upstream            downstream
 -----------------   -----------------   -------------------
 pcie-root           none                any endpoint
                                         pcie-root-port
                                         dmi-to-pci-bridge
                                         pci-bridge
                                         31 ports
                                         NO hotplug

 dmi-to-pci-bridge   pcie-root           any endpoint device
                     pcie-root-port      pcie-sw-up-port
                     pcie-sw-dn-port     32 ports
                     NO hotplug          NO hotplug

 pcie-root-port      pcie-root only      any endpoint
                     NO hotplug          dmi-to-pci-bridge
                                         pcie-sw-up-port
                                         1 port
                                         hotpluggable

 pcie-sw-up-port     pcie-root-port      pcie-sw-dn-port
                     pcie-sw-dn-port     32 ports
                     "kind of"           "kind of"
                     hotpluggable        hotpluggable

 pcie-sw-dn-port     pcie-sw-up-port     any endpoint
                     "kind of" hotplug   pcie-sw-up-port
                                         1 port
                                         hotpluggable

 pci-bridge          pci-root            any endpoint
                     pcie-root           pci-bridge
                     dmi-to-pci-bridge   32 ports
                     pcie-root-port      hotpluggable
                     pcie-sw-dn-port
                     NO hotplug (now)

So the original restrictions I placed on what could plug in where were *too* restrictive for endpoint devices, but other restrictions were useful, and the framework came in handy as I learned the restrictions of each new PCI controller model.
The behavior now is that if libvirt is auto-assigning a slot for a device, it will put it into a hotpluggable true PCI slot, but if you manually assign it to a slot that is non-hotpluggable and/or PCIe, it will be allowed. But when I tried to manually assign virtio-pci to PCIe I simply got "Device requires standard PCI slot" and that was all. I had to make patch 4 in order to overcome this.
I'm glad you pointed out this patch (I had skipped over it), because:

1) that patch is completely unnecessary ever since commits 1e15be1 and 9a12b6c were pushed upstream, and

2) that patch will cause problems with auto-assignment of addresses for virtio-net devices on Q35 machines (they will be put on pcie-root instead of the pci-bridge).

I have verified that (1) is true - I removed your patch, built and installed a new libvirt, and tried adding a new virtio-net device to Cole's aarch64 example domain with a manually set PCI address on both bus 0 (pcie-root) and bus 1 (dmi-to-pci-bridge), and both were successful.

I just sent a patch that reverts your patch 4. Please test it to verify my claims and let me know so I can push it:

https://www.redhat.com/archives/libvir-list/2015-August/msg00488.html
BTW, I'm still wondering if the arm machinetype really does support the supposedly Intel-x86-specific i82801b11-bridge device Yes, it works fine. Just devices behind it cannot get MSI-X enabled.
I'm not expert enough to know for sure, but that sounds like a bug. Alex?

I do recall that a very long time ago, when I first tried out dmi-to-pci-bridge and Q35, I found a qemu bug that made it impossible to use network devices (and they were probably virtio-net, as that's what I usually use) attached to a pci-bridge on a Q35 machine. Maybe the same bug? I don't remember what the exact problem was (but again, I think Alex will remember the problem).

I can say that currently (and for almost the last two years) there is no problem using a virtio-net adapter that is connected to a q35 machine in this way:

pcie-root --> dmi-to-pci-bridge --> pci-bridge --> virtio-net
By the way, virtio-pci has been usable with PC guests for a while now; does it also suffer from this restriction there?
See above :-)
(and the new controller devices - ioh3420 (pcie-root-port), x3130-upstream (pcie-switch-upstream-port), and xio3130-downstream (pcie-switch-downstream-port). Didn't try those, but I don't see why they would not work. PCI is just PCI after all; everything behind the controller is pretty much standard and arch-independent.
Alex (yeah, I know, I need to stop being such an Alex-worshipper on PCI topics :-P) has actually expressed concern that we are using all of these Intel-chipset-specific PCI controllers, and thinks that we should instead create some generic devices that have different PCI device IDs than those. Among other problems, those Intel-specific devices only exist in real hardware at specific addresses, and he's concerned that us putting them at a different address might confuse some OS; better to have a device that functions similarly, but IDs itself differently so the OS won't make any potentially incorrect assumptions (that type of problem is likely a non-issue for your use of these devices though, since you don't really have any "legacy OS" you have to deal with).

On Tue, 2015-08-11 at 19:26 -0400, Laine Stump wrote:
(Alex - I cc'ed you because I addressed a question or two your way down towards the bottom).
On 08/11/2015 02:52 AM, Pavel Fedin wrote:
Hello!
[...]
("pcie-sw-up-port" == "pcie-switch-upstream-port", "pcie-sw-dn-port" == "pcie-switch-downstream-port")
 name                upstream            downstream
 -----------------   -----------------   -------------------
 pcie-root           none                any endpoint
                                         pcie-root-port
                                         dmi-to-pci-bridge
                                         pci-bridge
                                         31 ports
                                         NO hotplug

 dmi-to-pci-bridge   pcie-root           any endpoint device
                     pcie-root-port      pcie-sw-up-port
                     pcie-sw-dn-port     32 ports
                     NO hotplug          NO hotplug
Hmm, pcie-sw-up-port on the downstream is a stretch here. pci-bridge should be allowed downstream though.
 pcie-root-port      pcie-root only      any endpoint
                     NO hotplug          dmi-to-pci-bridge
                                         pcie-sw-up-port
                                         1 port
                                         hotpluggable

 pcie-sw-up-port     pcie-root-port      pcie-sw-dn-port
                     pcie-sw-dn-port     32 ports
                     "kind of"           "kind of"
                     hotpluggable        hotpluggable

 pcie-sw-dn-port     pcie-sw-up-port     any endpoint
                     "kind of" hotplug   pcie-sw-up-port
                                         1 port
                                         hotpluggable

 pci-bridge          pci-root            any endpoint
                     pcie-root           pci-bridge
                     dmi-to-pci-bridge   32 ports
                     pcie-root-port      hotpluggable
                     pcie-sw-dn-port
                     NO hotplug (now)
So the original restrictions I placed on what could plugin where were *too* restrictive for endpoint devices, but other restrictions were useful, and the framework came in handy as I learned the restrictions of each new pci controller model.
System software ends up being pretty amenable as well, since PCIe is software-compatible with conventional PCI. If we have a guest-based IOMMU, though, things could start to get interesting, because the difference isn't so transparent. The kernel starts to care about whether a device is express and expects certain compatible upstream devices as it walks the topology. Thankfully, real hardware gets plenty wrong too, so we only have to be not substantially worse than real hardware ;)
[...]
BTW, I'm still wondering if the arm machinetype really does support the supposedly Intel-x86-specific i82801b11-bridge device Yes, it works fine. Just devices behind it cannot get MSI-X enabled.
I'm not expert enough to know for sure, but that sounds like a bug. Alex?
i82801b11-bridge is the "DMI-to-PCI" bridge, and ARM certainly doesn't support DMI, but the device is really just a PCIe-to-PCI bridge, and I don't know why MSI wouldn't work behind it. I can't say I've done much with it, though.
I do recall that a very long time ago when I first tried out dmi-to-pci-bridge and Q35, I found a qemu bug that made it impossible to use network devices (and they were probably virtio-net, as that's what I usually use) attached to a pci-bridge on a Q35 machine. Maybe the same bug? I don't remember what the exact problem was (but again, I think Alex will remember the problem).
Drawing a blank...
[...]
Alex (yeah, I know, I need to stop being such an Alex-worshipper on PCI topics :-P) has actually expressed concern that we are using all of these Intel-chipset-specific PCI controllers, and thinks that we should instead create some generic devices that have different PCI device IDs than those. Among other problems, those Intel-specific devices only exist in real hardware at specific addresses, and he's concerned that us putting them at a different address might confuse some OS; better to have a device that functions similarly, but IDs itself differently so the OS won't make any potentially incorrect assumptions (that type of problem is likely a non-issue for your use of these devices though, since you don't really have any "legacy OS" you have to deal with).
Yep, I'd love for us to replace all these specific devices with generic ones: pci-pci-bridge, pci-pcie-bridge, pcie-pci-bridge, pcie-root-port, pcie-upstream-switch-port, pcie-downstream-switch-port, etc. Unless there's some specific need for emulating a specific device, I expect we're only making things more complicated.

Thanks,
Alex

On 08/11/2015 10:13 PM, Alex Williamson wrote:
On Tue, 2015-08-11 at 19:26 -0400, Laine Stump wrote:
(Alex - I cc'ed you because I addressed a question or two your way down towards the bottom).
[...]
 dmi-to-pci-bridge   pcie-root           any endpoint device
                     pcie-root-port      pcie-sw-up-port
                     pcie-sw-dn-port     32 ports
                     NO hotplug          NO hotplug

Hmm, pcie-sw-up-port on the downstream is a stretch here. pci-bridge should be allowed downstream though.
You're right, I messed up the chart. pcie-sw-up-port can only plug into pcie-root-port or pcie-sw-dn-port. And I forgot to add in pci-bridge. Of course my main objective was to graphically point out that you can't just plug "anything" into "anything" :-)

On 07/17/2015 07:27 AM, Pavel Fedin wrote:
[...]
I was a bit confused about the patches that landed; I see now that they only add a PCI controller for modern -M virt, but don't change the virtio defaults to use it. This is good, and in my brief testing doesn't cause any regressions with current virt-manager. We can figure out later if/how to change the bus=virtio or model=virtio default to use PCI instead of virtio-mmio.

But yeah, we need to figure out that test case issue. There's already a regression with git head:

$ sudo virsh create test-aarch64.xml
error: Failed to create domain from test-aarch64.xml
error: internal error: autogenerated dmi-to-pci-bridge options not set

XML attached. I'm guessing this is one of Laine's patches, CCd

- Cole

On 08/11/2015 01:51 PM, Cole Robinson wrote:
On 07/17/2015 07:27 AM, Pavel Fedin wrote:
[...]
But yeah, we need to figure out that test case issue. There's already a regression with git head:
$ sudo virsh create test-aarch64.xml
error: Failed to create domain from test-aarch64.xml
error: internal error: autogenerated dmi-to-pci-bridge options not set
Hah! So you see Martin and John, *that* is why I wanted all that "extra" checking! :-)
XML attached. I'm guessing this is one of Laine's patches, CCd
Actually the problem is that the original hacks had yet another function added in, called qemuDomainSupportsPCI(), which is called during PCI address assignment (also when the PCI controllers are set up), and these new ARM PCI patches hadn't added the new sauce to that function. I just added the same bit of code checking for QEMU_CAPS_OBJECT_GPEX, and your example now starts up with no problems. I'm sending the patch now.
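The fix would take roughly this shape (a sketch; only the QEMU_CAPS_OBJECT_GPEX check is confirmed by the message above, the surrounding function body is assumed):

/* Sketch: teach qemuDomainSupportsPCI() the same ARM "virt" rule the
 * post-parse code uses, so PCI address assignment agrees with it. */
bool
qemuDomainSupportsPCI(virDomainDefPtr def, virQEMUCapsPtr qemuCaps)
{
    if (def->os.arch != VIR_ARCH_ARMV7L &&
        def->os.arch != VIR_ARCH_AARCH64)
        return true;                 /* non-ARM arches keep current behavior */

    if (STREQ(def->os.machine, "virt") ||
        STRPREFIX(def->os.machine, "virt-"))
        return virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_GPEX);

    return false;
}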

Hello!
I was a bit confused about the patches that landed; I see now that they only add a PCI controller for modern -M virt, but don't change the virtio defaults to use it.
Yes, because in the first versions I changed the default and was criticized for it: it broke backwards compatibility with existing distros, which don't even enable PCI-Generic for both ARM architectures (and indeed ARM64 support for PCI-Generic is going to be enabled only in upstream kernel v4.3). To tell the truth, I'm too lazy now to look up the relevant threads.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
participants (7)
- Alex Williamson
- Cole Robinson
- Daniel P. Berrange
- Laine Stump
- Martin Kletzander
- Michal Privoznik
- Pavel Fedin