
On 08/16/2016 01:29 PM, Andrea Bolognani wrote:
On Mon, 2016-08-15 at 01:50 -0400, Laine Stump wrote:
libvirt previously assigned nearly all devices to a hotpluggable legacy PCI slot even on machines with a PCIe root complex. Doing this means that the domain will need a dmi-to-pci-bridge (to convert from PCIe to legacy PCI) and a pci-bridge (to provide hotpluggable legacy PCI slots.
To help reduce the need for these legacy controllers, this patch checks for the QEMU_CAPS_VIRTIO_PCI_DISABLE_LEGACY capability (if that capability is present, then all virtio devices will automatically present as PCIe when attached to a PCIe controller, or PCI when attached to a legacy PCI controller), and assigns virtio devices to a hotpluggable PCIe slot instead.
NB: since the slot must be hotpluggable, and pcie-root (the PCIe root complex) does *not* support hotplug, this means that suitable controllers must also be in the config (i.e. either pcie-root-port, or pcie-downstream-port). For now, libvirt doesn't add those automatically, so if you put virtio devices in a config for a qemu that has PCIe-capable virtio devices, you'll need to add extra pcie-root-ports yourself. That requirement will be eliminated in a future patch, but for now, it's simple to do this:
<controller type='pci' model='pcie-root-port'/> <controller type='pci' model='pcie-root-port'/> <controller type='pci' model='pcie-root-port'/> ...
Partially Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1330024 --- [...]
@@ -1021,17 +1028,22 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def, } }
- /* all other devices that plug into a PCI slot are treated as a - * PCI endpoint devices that require a hotplug-capable slot - * (except for some special cases which have specific handling - * below) + pciFlags = VIR_PCI_CONNECT_HOTPLUGGABLE | VIR_PCI_CONNECT_TYPE_PCI_DEVICE; + pcieFlags = VIR_PCI_CONNECT_HOTPLUGGABLE | VIR_PCI_CONNECT_TYPE_PCIE_DEVICE; Blank line here.
+ /* if qemu has the disable-legacy option for + * virtio-net, then its virtio devices will present + * themselves as PCIe devices when plugged into a PCIe + * slot, so we can safely assign them to a PCIe slot. */ - flags = VIR_PCI_CONNECT_HOTPLUGGABLE | VIR_PCI_CONNECT_TYPE_PCI_DEVICE; + virtioFlags = havePCIeRoot && + virQEMUCapsGet(qemuCaps, QEMU_CAPS_VIRTIO_PCI_DISABLE_LEGACY) ? + pcieFlags : pciFlags; How about unpacking that? I feel like
if (havePCIeRoot && virQEMUCapsGet(qemuCaps, QEMU_CAPS_VIRTIO_PCI_DISABLE_LEGACY)) { virtioFlags = pcieFlags; } else { virtioFlags = pciFlags; }
would be more readable and maintainable.
Sure.
for (i = 0; i < def->nfss; i++) { if (!virDeviceInfoPCIAddressWanted(&def->fss[i]->info)) continue;
+ flags = virtioFlags;
Blank line here.
/* Only support VirtIO-9p-pci so far. If that changes, * we might need to skip devices here */ if (virDomainPCIAddressReserveNextSlot(addrs, &def->fss[i]->info, @@ -1045,12 +1057,18 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def, * in hostdevs list anyway, so handle them with other hostdevs * instead of here. */ - if ((def->nets[i]->type == VIR_DOMAIN_NET_TYPE_HOSTDEV) || - !virDeviceInfoPCIAddressWanted(&def->nets[i]->info)) { + virDomainNetDefPtr net = def->nets[i]; + + if ((net->type == VIR_DOMAIN_NET_TYPE_HOSTDEV) || + !virDeviceInfoPCIAddressWanted(&net->info)) { continue; } - if (virDomainPCIAddressReserveNextSlot(addrs, &def->nets[i]->info, - flags) < 0)
Blank line here.
+ if (STREQ(net->model, "virtio")) + flags = virtioFlags; + else + flags = pciFlags; It's kind of insane that we don't have a virDomainNetModel enum for this sort of check, isn't it?
Yes. It's just always been that way and nobody's done anything about it.
+ + if (virDomainPCIAddressReserveNextSlot(addrs, &net->info, flags) < 0) goto error; }
@@ -1064,6 +1082,7 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def, def->sounds[i]->model == VIR_DOMAIN_SOUND_MODEL_USB) continue;
+ flags = pciFlags; if (virDomainPCIAddressReserveNextSlot(addrs, &def->sounds[i]->info, flags) < 0) goto error; @@ -1094,6 +1113,7 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def, if (!virDeviceInfoPCIAddressWanted(&def->controllers[i]->info)) continue;
+ flags = pciFlags; Blank line here.
/* USB2 needs special handling to put all companions in the same slot */ if (IS_USB2_CONTROLLER(def->controllers[i])) { virPCIDeviceAddress addr = { 0, 0, 0, 0, false }; @@ -1150,6 +1170,12 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def, def->controllers[i]->info.type = VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI; def->controllers[i]->info.addr.pci = addr; } else { + if ((def->controllers[i]->type == VIR_DOMAIN_CONTROLLER_TYPE_SCSI && + def->controllers[i]->model == + VIR_DOMAIN_CONTROLLER_MODEL_SCSI_VIRTIO_SCSI) || + (def->controllers[i]->type == + VIR_DOMAIN_CONTROLLER_TYPE_VIRTIO_SERIAL)) + flags = virtioFlags;
This is one of those times I really wish our coding style guidelines allowed us to go past 80 columns :)
Yep. Sometimes it's impossible to follow anyway (if a single identifier by itself on the line goes past column 80).
[...]
@@ -1284,6 +1323,7 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def, !virDeviceInfoPCIAddressWanted(&chr->info)) continue;
+ flags = pciFlags; if (virDomainPCIAddressReserveNextSlot(addrs, &chr->info, flags) < 0) goto error; } Having gotten this far, I wonder if the pattern used here couldn't be semplified a bit... Try squashing in the attached patch, see if you like it.
Yeah, I actually thought of doing something like that fairly late in the process after I saw what it had turned into, but decided not to mostly because I needed to settle on "something". I'll squash in your patch.
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-q35-virtio-pci.xml b/tests/qemuxml2argvdata/qemuxml2argv-q35-virtio-pci.xml new file mode 100644 index 0000000..7bed08c --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-q35-virtio-pci.xml As you mention in the test programs below, qemuxml2argv-q35-virtio-pci.xml and qemuxml2argv-q35-pcie.xml have the exact same contents.
Please make one of them a symbolic link to the other one, to save space and ensure they will remain in sync.
Sure. I didn't realize that was "a thing", but I see now there are two other examples of symlinking test files already.
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-q35-virtio-pcie.args b/tests/qemuxml2argvdata/qemuxml2argv-q35-virtio-pcie.args new file mode 100644 index 0000000..c43c537 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-q35-virtio-pcie.args This looks like a leftover from a previous attempt, and in fact 'make check' still passes after deleting it.
You are correct. I decided to rename the test case so that it could be used to test *all* pcie devices, not just virtio pcie, but blindly added all the new files to the commit.
diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index ad0693f..46b602f 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -607,7 +607,6 @@ mymain(void) unsetenv("QEMU_AUDIO_DRV"); unsetenv("SDL_AUDIODRIVER");
- DO_TEST("minimal", NONE); No reason to delete this AFAICT.
Yep. That was accidental.
DO_TEST_PARSE_ERROR("minimal-no-memory", NONE); DO_TEST("minimal-msg-timestamp", QEMU_CAPS_MSG_TIMESTAMP); DO_TEST("machine-aliases1", NONE); @@ -1670,6 +1669,48 @@ mymain(void) QEMU_CAPS_PCI_MULTIFUNCTION, QEMU_CAPS_ICH9_USB_EHCI1, QEMU_CAPS_DEVICE_VIDEO_PRIMARY, QEMU_CAPS_VGA_QXL, QEMU_CAPS_DEVICE_QXL); + DO_TEST("q35-pcie", + QEMU_CAPS_VIRTIO_PCI_DISABLE_LEGACY, + QEMU_CAPS_DEVICE_VIRTIO_RNG, + QEMU_CAPS_OBJECT_RNG_RANDOM, + QEMU_CAPS_NETDEV, + QEMU_CAPS_DEVICE_VIRTIO_NET, + QEMU_CAPS_DEVICE_VIRTIO_GPU, + QEMU_CAPS_DEVICE_VIRTIO_GPU_VIRGL, + QEMU_CAPS_VIRTIO_KEYBOARD, + QEMU_CAPS_VIRTIO_MOUSE, + QEMU_CAPS_VIRTIO_TABLET, + QEMU_CAPS_VIRTIO_INPUT_HOST, + QEMU_CAPS_VIRTIO_SCSI, + QEMU_CAPS_FSDEV, + QEMU_CAPS_FSDEV_WRITEOUT, + QEMU_CAPS_DEVICE_PCI_BRIDGE, + QEMU_CAPS_DEVICE_DMI_TO_PCI_BRIDGE, + QEMU_CAPS_DEVICE_IOH3420, + QEMU_CAPS_ICH9_AHCI, + QEMU_CAPS_PCI_MULTIFUNCTION, QEMU_CAPS_ICH9_USB_EHCI1, + QEMU_CAPS_DEVICE_VIDEO_PRIMARY); + /* same XML as q35-pcie, but don't set QEMU_CAPS_VIRTIO_PCI_LEGACY */
This is really nice. We have a bunch of tests where the name is not that helpful in conveying exactly what they're supposed to be testing... Actually, adding a few words for q35-pcie would be even nicer! :)
+ DO_TEST("q35-virtio-pci", + QEMU_CAPS_DEVICE_VIRTIO_RNG, + QEMU_CAPS_OBJECT_RNG_RANDOM, + QEMU_CAPS_NETDEV, + QEMU_CAPS_DEVICE_VIRTIO_NET, + QEMU_CAPS_DEVICE_VIRTIO_GPU, + QEMU_CAPS_DEVICE_VIRTIO_GPU_VIRGL, + QEMU_CAPS_VIRTIO_KEYBOARD, + QEMU_CAPS_VIRTIO_MOUSE, + QEMU_CAPS_VIRTIO_TABLET, + QEMU_CAPS_VIRTIO_INPUT_HOST, + QEMU_CAPS_VIRTIO_SCSI, + QEMU_CAPS_FSDEV, + QEMU_CAPS_FSDEV_WRITEOUT, + QEMU_CAPS_DEVICE_PCI_BRIDGE, + QEMU_CAPS_DEVICE_DMI_TO_PCI_BRIDGE, + QEMU_CAPS_DEVICE_IOH3420, + QEMU_CAPS_ICH9_AHCI, + QEMU_CAPS_PCI_MULTIFUNCTION, QEMU_CAPS_ICH9_USB_EHCI1, + QEMU_CAPS_DEVICE_VIDEO_PRIMARY); DO_TEST("pcie-root-port", QEMU_CAPS_DEVICE_PCI_BRIDGE, QEMU_CAPS_DEVICE_DMI_TO_PCI_BRIDGE, I didn't go through the test suite additions in detail, because when I tried I got cross-eyed very quickly. Plus, I assume you already verified the output of the tests make sense. I'll give it a closer look in the respin.
-- Andrea Bolognani / Red Hat / Virtualization