[libvirt] [PATCH 0/4] Enable spapr-pci-vfio-host-bridge controllers for VFIO passthrough support

The following patch series enables spapr-pci-vfio-host-bridge controllers on PPC64 pseries machines, which is required to support host device passthrough using VFIO. Some initial enablement work for this was posted at http://www.redhat.com/archives/libvir-list/2013-September/msg00838.html. Testing revealed contrasting observations, and the current patch series addresses them.

Unlike other architectures, on pseries (ppc64) the vfio host devices (referred to as hostdevs from here on) cannot be assigned to the default emulated pci-host-bus controller (such as the default pci.0). On pseries, the hostdevs go to a spapr-pci-vfio-host-bridge. Hostdevs belonging to the same iommu group share the same spapr-pci-vfio-host-bridge. Hence, a new spapr-pci-vfio-host-bridge needs to be added for every hostdev belonging to a new iommu group. The hostdevs appear in a new pci domain inside the guest.

The spapr-pci-vfio-host-bridge is a generic bridge and can host both pci and pci-e host devices. The model pci-root is chosen for this controller just for convenience.

The patch series takes care of adding the new controller. Each new controller creates a new pci domain in the guest, and every hostdev gets its pci address in the relevant domain. The series handles device addressing for passthrough hostdevs, SR-IOV interfaces and network interfaces from SR-IOV virtual function pools.
---

Shivaprasad G Bhat (4):
      qemu: Add SPAPR_VFIO_HOST_BRIDGE capability for PPC platform
      qemu: parse and add spapr-vfio-pci controller into domain
      qemu: assign addresses for spapr vfio hostdevices and generate cli
      qemu: add test case for spapr-pci-vfio-host-bridge

 docs/schemas/domaincommon.rng            |  28 ++
 src/bhyve/bhyve_domain.c                 |   2
 src/conf/domain_addr.c                   |   8 -
 src/conf/domain_addr.h                   |   1
 src/conf/domain_conf.c                   | 149 ++++++++++-
 src/conf/domain_conf.h                   |  19 +
 src/libvirt_private.syms                 |   2
 src/qemu/qemu_capabilities.c             |   2
 src/qemu/qemu_capabilities.h             |   1
 src/qemu/qemu_command.c                  | 279 +++++++++++++++++++-
 src/qemu/qemu_command.h                  |  17 +
 src/qemu/qemu_domain.c                   |  12 -
 src/qemu/qemu_driver.c                   |   6
 tests/qemuhotplugtest.c                  |   2
 .../qemuxml2argv-hostdev-spapr-vfio.args |  16 +
 .../qemuxml2argv-hostdev-spapr-vfio.xml  |  76 +++++
 tests/qemuxml2argvtest.c                 |   8 +
 17 files changed, 591 insertions(+), 37 deletions(-)
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-spapr-vfio.args
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-spapr-vfio.xml

--
Signature
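The grouping rule described in the cover letter (hostdevs in the same iommu group share one host bridge, and each new group gets a new guest pci domain) can be sketched roughly as below. This is only an illustration under stated assumptions: `assign_vfio_domains` and its array-based interface are hypothetical and are not the code in this series, which works on `virDomainDef` controllers and hostdevs instead.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the series: gives every passthrough
 * device a guest PCI domain based on its host iommu group. Devices that
 * share a group share a domain. Each new group gets the next free domain,
 * starting at 1 because domain 0 is the default emulated host bridge.
 * Returns the number of host bridges (distinct iommu groups) needed. */
static int
assign_vfio_domains(const int *iommu_groups, int *domains, size_t ndevs)
{
    int next_domain = 1;
    size_t i, j;

    for (i = 0; i < ndevs; i++) {
        domains[i] = 0;

        /* Reuse the domain of an earlier device in the same iommu group. */
        for (j = 0; j < i; j++) {
            if (iommu_groups[j] == iommu_groups[i]) {
                domains[i] = domains[j];
                break;
            }
        }

        /* First device seen in this group: open a new domain. */
        if (domains[i] == 0)
            domains[i] = next_domain++;
    }

    return next_domain - 1;
}
```

With groups 3, 13, 3 and 7, the first and third devices would land in the same domain, and three host bridges would be needed in total.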

To support VFIO on PPC, QEMU needs the spapr-pci-vfio-host-bridge device. This patch adds a capability for it.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
---
 src/qemu/qemu_capabilities.c | 2 ++
 src/qemu/qemu_capabilities.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 6fcb5c7..55c0217 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -271,6 +271,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST,
               "iothread",
               "migrate-rdma",
               "ivshmem",
+              "spapr-pci-vfio-host-bridge",
     );
@@ -1502,6 +1503,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = {
     { "usb-audio", QEMU_CAPS_OBJECT_USB_AUDIO },
     { "iothread", QEMU_CAPS_OBJECT_IOTHREAD},
     { "ivshmem", QEMU_CAPS_DEVICE_IVSHMEM },
+    { "spapr-pci-vfio-host-bridge", QEMU_CAPS_SPAPR_VFIO_HOST_BRIDGE },
 };
 static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsVirtioBlk[] = {
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index 0214f30..6aa7e37 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -218,6 +218,7 @@ typedef enum {
     QEMU_CAPS_OBJECT_IOTHREAD = 176, /* -object iothread */
     QEMU_CAPS_MIGRATE_RDMA = 177, /* have rdma migration */
     QEMU_CAPS_DEVICE_IVSHMEM = 178, /* -device ivshmem */
+    QEMU_CAPS_SPAPR_VFIO_HOST_BRIDGE = 179, /* -device spapr-pci-vfio-host-bridge */
     QEMU_CAPS_LAST, /* this must always be the last item */
 } virQEMUCapsFlags;
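The table entry added above pairs a QEMU device type name with a capability flag. A minimal sketch of that kind of name-to-flag lookup follows, assuming the table is matched against the device type names the QEMU binary reports. All `sketch_*` names are hypothetical stand-ins, not libvirt's real types or functions; only the two flag values are taken from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for virQEMUCapsFlags and virQEMUCapsStringFlags.
 * Only the two values quoted in the patch are reproduced here. */
enum sketch_caps {
    SKETCH_CAPS_DEVICE_IVSHMEM = 178,
    SKETCH_CAPS_SPAPR_VFIO_HOST_BRIDGE = 179,
};

struct sketch_string_flag {
    const char *value;  /* device type name as reported by QEMU */
    int flag;           /* capability bit to set when it is present */
};

static const struct sketch_string_flag sketch_object_types[] = {
    { "ivshmem", SKETCH_CAPS_DEVICE_IVSHMEM },
    { "spapr-pci-vfio-host-bridge", SKETCH_CAPS_SPAPR_VFIO_HOST_BRIDGE },
};

/* Returns the capability flag for a device type name, or -1 if the
 * name is not in the table. */
static int
sketch_caps_from_device_name(const char *name)
{
    size_t n = sizeof(sketch_object_types) / sizeof(sketch_object_types[0]);
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(sketch_object_types[i].value, name) == 0)
            return sketch_object_types[i].flag;
    }
    return -1;
}
```

A QEMU binary that does not list the device would simply never get the flag set, so later code can test the capability before emitting the host bridge on the command line.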

This patch gets the iommu group for host devices either from the XML configuration of the vfio host bridge controller or from sysfs. As on other architectures, the devices in the iommu group need to be detached manually before the guest is created. On pseries, an iommu group can be shared by multiple host devices, and they all share the same spapr-pci-vfio-host-bridge controller. A new controller is added for every new iommu group. Every spapr-pci-vfio-host-bridge on the command line creates a new pci domain in the guest. For example,

  -device spapr-pci-vfio-host-bridge,iommu=1,id=SOMEDOMAIN,index=1

"SOMEDOMAIN" is the id of the new pci domain inside the guest. spapr-pci-vfio-host-bridge is actually a PCI host bridge with VFIO support. It can host pci-bridges, and its features and behaviour are similar to those of the default emulated pci host bridge. The controllers are assigned a new domain number for every new iommu group. Sample controller tags look like this:

  <controller type='spapr-vfio-pci' index='0' model='pci-root' iommuGroupNum='3' domain='1'/>
  <controller type='spapr-vfio-pci' index='1' model='pci-bridge' iommuGroupNum='3' domain='1'>
    <address type='pci' domain='0x0001' bus='0x00' slot='0x02' function='0x0'/>
  </controller>
  <controller type='spapr-vfio-pci' index='0' model='pci-root' iommuGroupNum='13' domain='2'/>

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Pradipta Kumar Banerjee <bpradip@in.ibm.com>
---
 src/bhyve/bhyve_domain.c |   2 -
 src/conf/domain_conf.c   | 149 ++++++++++++++++++++++++++++++++++++++++++++--
 src/conf/domain_conf.h   |  19 ++++++
 src/libvirt_private.syms |   1
 src/qemu/qemu_command.c  |   4 +
 src/qemu/qemu_domain.c   |  12 ++--
 src/qemu/qemu_driver.c   |   6 ++
 7 files changed, 178 insertions(+), 15 deletions(-)

diff --git a/src/bhyve/bhyve_domain.c b/src/bhyve/bhyve_domain.c
index ecb1758..96d30ab 100644
--- a/src/bhyve/bhyve_domain.c
+++ b/src/bhyve/bhyve_domain.c
@@ -63,7 +63,7 @@ bhyveDomainDefPostParse(virDomainDefPtr def, void *opaque
ATTRIBUTE_UNUSED) { /* Add an implicit PCI root controller */ - if (virDomainDefMaybeAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0, + if (virDomainDefMaybeAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0, 0, VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) < 0) return -1; diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 42c0223..66f7809 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -34,6 +34,7 @@ #include "datatypes.h" #include "domain_conf.h" #include "snapshot_conf.h" +#include "virpci.h" #include "viralloc.h" #include "verify.h" #include "virxml.h" @@ -327,7 +328,8 @@ VIR_ENUM_IMPL(virDomainController, VIR_DOMAIN_CONTROLLER_TYPE_LAST, "virtio-serial", "ccid", "usb", - "pci") + "pci", + "spapr-vfio-pci") VIR_ENUM_IMPL(virDomainControllerModelPCI, VIR_DOMAIN_CONTROLLER_MODEL_PCI_LAST, "pci-root", @@ -2926,6 +2928,8 @@ virDomainDefRejectDuplicateControllers(virDomainDefPtr def) /* multiple USB controllers with the same index are allowed */ max_idx[VIR_DOMAIN_CONTROLLER_TYPE_USB] = -1; + /* The idx can be same across different pci domains */ + max_idx[VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO] = -1; for (i = 0; i < VIR_DOMAIN_CONTROLLER_TYPE_LAST; i++) { if (max_idx[i] >= 0 && !(bitmaps[i] = virBitmapNew(max_idx[i] + 1))) @@ -6412,6 +6416,8 @@ virDomainControllerModelTypeFromString(const virDomainControllerDef *def, return virDomainControllerModelUSBTypeFromString(model); else if (def->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI) return virDomainControllerModelPCITypeFromString(model); + else if (def->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO) + return virDomainControllerModelPCITypeFromString(model); return -1; } @@ -6584,7 +6590,48 @@ virDomainControllerDefParseXML(xmlNodePtr node, def->opts.pciopts.pcihole64size = VIR_DIV_UP(bytes, 1024); } } + break; + case VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO: { + char *iommuStr = NULL; + char *domainStr = NULL; + + def->domain = -1; + def->opts.spaprvfio.iommuGroupNum = -1; + if 
(def->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) { + if (def->idx != 0) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("pci-root and pcie-root controllers " + "should have index 0")); + goto error; + } + } + domainStr = virXMLPropString(node, "domain"); + if (domainStr) { + int r = virStrToLong_i(domainStr, NULL, 10, + &def->domain); + if (r != 0 || def->domain <= 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid domain number: %s"), domainStr); + VIR_FREE(domainStr); + goto error; + } + } + VIR_FREE(domainStr); + iommuStr = virXMLPropString(node, "iommuGroupNum"); + if (iommuStr) { + int r = virStrToLong_i(iommuStr, NULL, 10, + &def->opts.spaprvfio.iommuGroupNum); + if (r != 0 || def->opts.spaprvfio.iommuGroupNum < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid iommu group number: %s"), iommuStr); + VIR_FREE(iommuStr); + goto error; + } + } + VIR_FREE(iommuStr); + break; + } default: break; } @@ -11885,6 +11932,7 @@ virDomainVcpuPinDefParseXML(xmlNodePtr node, int virDomainDefMaybeAddController(virDomainDefPtr def, int type, + int domain, int idx, int model) { @@ -11893,6 +11941,7 @@ virDomainDefMaybeAddController(virDomainDefPtr def, for (i = 0; i < def->ncontrollers; i++) { if (def->controllers[i]->type == type && + def->controllers[i]->domain == domain && def->controllers[i]->idx == idx) return 0; } @@ -11901,6 +11950,7 @@ virDomainDefMaybeAddController(virDomainDefPtr def, return -1; cont->type = type; + cont->domain = domain; cont->idx = idx; cont->model = model; @@ -11908,6 +11958,8 @@ virDomainDefMaybeAddController(virDomainDefPtr def, cont->opts.vioserial.ports = -1; cont->opts.vioserial.vectors = -1; } + if (cont->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO) + cont->opts.spaprvfio.iommuGroupNum = -1; if (VIR_APPEND_ELEMENT(def->controllers, def->ncontrollers, cont) < 0) { VIR_FREE(cont); @@ -12026,6 +12078,79 @@ virDomainResourceDefParse(xmlNodePtr node, return NULL; } +int 
+virDomainDefMaybeAddHostdevSpaprPCIVfiocontrollers(virDomainDefPtr def) +{ + size_t i, j; + virDomainHostdevDefPtr hostdev; + virDomainControllerDefPtr controller; + int ret = -1; + int maxDomainId = 0; + int skip; + + if ((def->os.arch != VIR_ARCH_PPC64) || + !(def->os.machine && STRPREFIX(def->os.machine, "pseries"))) + return 0; + + for (i = 0; i < def->nhostdevs; i++) { + hostdev = def->hostdevs[i]; + if (IS_PCI_VFIO_HOSTDEV(hostdev)) + hostdev->source.subsys.u.pci.iommu = -1; + } + /* The hostdevs belonging to same iommu are + * all part of same domain. + */ + for (i = 0; i < def->ncontrollers; i++) { + controller = def->controllers[i]; + if (controller->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO && + controller->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) + for (j = 0; j < def->nhostdevs; j++) { + hostdev = def->hostdevs[j]; + if (IS_PCI_VFIO_HOSTDEV(hostdev)) + if (hostdev->info->addr.pci.domain == controller->domain) + hostdev->source.subsys.u.pci.iommu = controller->opts.spaprvfio.iommuGroupNum; + } + if (controller->domain > maxDomainId) + maxDomainId = controller->domain; + } + /* If the spapr-vfio controller doesnt exist for the hostdev + * add a controller for that iommu group. 
+ */ + for (i = 0; i < def->nhostdevs; i++) { + skip = 0; + hostdev = def->hostdevs[i]; + if (IS_PCI_VFIO_HOSTDEV(hostdev)) { + virPCIDeviceAddressPtr addr; + int iommu = -1; + if (hostdev->source.subsys.u.pci.iommu == -1) { + addr = (virPCIDeviceAddressPtr)&hostdev->source.subsys.u.pci.addr; + if ((iommu = virPCIDeviceAddressGetIOMMUGroupNum(addr)) < 0) + goto error; + hostdev->source.subsys.u.pci.iommu = iommu; + + for (j = 0; j < def->ncontrollers; j++) { + controller = def->controllers[j]; + if (controller->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO && + controller->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) { + if (iommu == controller->opts.spaprvfio.iommuGroupNum) + skip = 1; + } + } + if (skip) + continue; + if (virDomainDefMaybeAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO, + ++maxDomainId, 0, VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) < 0) + goto error; + def->controllers[def->ncontrollers-1]->opts.spaprvfio.iommuGroupNum = iommu; + } + } + } + + ret = 0; + error: + return ret; +} + static int virDomainDefMaybeAddHostdevSCSIcontroller(virDomainDefPtr def) { @@ -12047,7 +12172,7 @@ virDomainDefMaybeAddHostdevSCSIcontroller(virDomainDefPtr def) return 0; for (i = 0; i <= maxController; i++) { - if (virDomainDefMaybeAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_SCSI, i, -1) < 0) + if (virDomainDefMaybeAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_SCSI, 0, i, -1) < 0) return -1; } @@ -15525,7 +15650,7 @@ virDomainDefAddDiskControllersForType(virDomainDefPtr def, return 0; for (i = 0; i <= maxController; i++) { - if (virDomainDefMaybeAddController(def, controllerType, i, -1) < 0) + if (virDomainDefMaybeAddController(def, controllerType, 0, i, -1) < 0) return -1; } @@ -15548,7 +15673,7 @@ virDomainDefMaybeAddVirtioSerialController(virDomainDefPtr def) idx = channel->info.addr.vioserial.controller; if (virDomainDefMaybeAddController(def, - VIR_DOMAIN_CONTROLLER_TYPE_VIRTIO_SERIAL, idx, -1) < 0) + VIR_DOMAIN_CONTROLLER_TYPE_VIRTIO_SERIAL, 0, 
idx, -1) < 0) return -1; } } @@ -15563,7 +15688,7 @@ virDomainDefMaybeAddVirtioSerialController(virDomainDefPtr def) idx = console->info.addr.vioserial.controller; if (virDomainDefMaybeAddController(def, - VIR_DOMAIN_CONTROLLER_TYPE_VIRTIO_SERIAL, idx, -1) < 0) + VIR_DOMAIN_CONTROLLER_TYPE_VIRTIO_SERIAL, 0, idx, -1) < 0) return -1; } } @@ -15603,7 +15728,7 @@ virDomainDefMaybeAddSmartcardController(virDomainDefPtr def) if (virDomainDefMaybeAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_CCID, - idx, -1) < 0) + 0, idx, -1) < 0) return -1; } @@ -15648,6 +15773,9 @@ virDomainDefAddImplicitControllers(virDomainDefPtr def) if (virDomainDefMaybeAddHostdevSCSIcontroller(def) < 0) return -1; + if (virDomainDefMaybeAddHostdevSpaprPCIVfiocontrollers(def) < 0) + return -1; + return 0; } @@ -16412,6 +16540,8 @@ virDomainControllerModelTypeToString(virDomainControllerDefPtr def, return virDomainControllerModelUSBTypeToString(model); else if (def->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI) return virDomainControllerModelPCITypeToString(model); + else if (def->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO) + return virDomainControllerModelPCITypeToString(model); return NULL; } @@ -16465,7 +16595,12 @@ virDomainControllerDefFormat(virBufferPtr buf, if (def->opts.pciopts.pcihole64) pcihole64 = true; break; - + case VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO: + virBufferAsprintf(buf, " iommuGroupNum='%d'", + def->opts.spaprvfio.iommuGroupNum); + virBufferAsprintf(buf, " domain='%d'", + def->domain); + break; default: break; } diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 9908d88..8671fcd 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -419,6 +419,7 @@ typedef virDomainHostdevSubsysPCI *virDomainHostdevSubsysPCIPtr; struct _virDomainHostdevSubsysPCI { virDevicePCIAddress addr; /* host address */ int backend; /* enum virDomainHostdevSubsysPCIBackendType */ + int iommu; }; typedef struct _virDomainHostdevSubsysSCSIHost 
virDomainHostdevSubsysSCSIHost; @@ -685,6 +686,7 @@ typedef enum { VIR_DOMAIN_CONTROLLER_TYPE_CCID, VIR_DOMAIN_CONTROLLER_TYPE_USB, VIR_DOMAIN_CONTROLLER_TYPE_PCI, + VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO, VIR_DOMAIN_CONTROLLER_TYPE_LAST } virDomainControllerType; @@ -742,6 +744,12 @@ struct _virDomainPCIControllerOpts { unsigned long pcihole64size; }; +typedef struct _virDomainSPAPRVfioControllerOpts virDomainSPAPRVfioControllerOpts; +typedef virDomainSPAPRVfioControllerOpts *virDomainiSPAPRVfioControllerOptsPtr; +struct _virDomainSPAPRVfioControllerOpts { + int iommuGroupNum; +}; + /* Stores the virtual disk controller configuration */ struct _virDomainControllerDef { int type; @@ -750,9 +758,11 @@ struct _virDomainControllerDef { unsigned int queues; unsigned int cmd_per_lun; unsigned int max_sectors; + int domain; union { virDomainVirtioSerialOpts vioserial; virDomainPCIControllerOpts pciopts; + virDomainSPAPRVfioControllerOpts spaprvfio; } opts; virDomainDeviceInfo info; }; @@ -2808,12 +2818,15 @@ void virDomainListFree(virDomainPtr *list); int virDomainDefMaybeAddController(virDomainDefPtr def, int type, + int domain, int idx, int model); int virDomainDefMaybeAddInput(virDomainDefPtr def, int type, int bus); +int +virDomainDefMaybeAddHostdevSpaprPCIVfiocontrollers(virDomainDefPtr def); char *virDomainDefGetDefaultEmulator(virDomainDefPtr def, virCapsPtr caps); @@ -2845,6 +2858,12 @@ int virDomainObjSetMetadata(virDomainObjPtr vm, const char *configDir, unsigned int flags); +# define IS_PCI_VFIO_HOSTDEV(dvc) \ + (((dvc)->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) && \ + ((dvc)->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) && \ + (((dvc)->source.subsys.u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_VFIO) || \ + ((dvc)->source.subsys.u.pci.backend == VIR_DOMAIN_HOSTDEV_PCI_BACKEND_DEFAULT))) + bool virDomainDefNeedsPlacementAdvice(virDomainDefPtr def) ATTRIBUTE_NONNULL(1); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 
9f749b7..f61fccd 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -197,6 +197,7 @@ virDomainDefFree; virDomainDefGetDefaultEmulator; virDomainDefGetSecurityLabelDef; virDomainDefMaybeAddController; +virDomainDefMaybeAddHostdevSpaprPCIVfiocontrollers; virDomainDefMaybeAddInput; virDomainDefNeedsPlacementAdvice; virDomainDefNew; diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 8cb0865..3cec764 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -559,6 +559,8 @@ qemuNetworkPrepareDevices(virDomainDefPtr def) } if (virDomainHostdevInsert(def, hostdev) < 0) goto cleanup; + if (virDomainDefMaybeAddHostdevSpaprPCIVfiocontrollers(def) < 0) + goto cleanup; } } ret = 0; @@ -1498,7 +1500,7 @@ qemuDomainAssignPCIAddresses(virDomainDefPtr def, virDomainPCIAddressBusPtr bus = &addrs->buses[i]; if ((rv = virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, + def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0, i, bus->model)) < 0) goto cleanup; /* If we added a new bridge, we will need one more address */ diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 76fccce..8c3f92a 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -998,17 +998,17 @@ qemuDomainDefPostParse(virDomainDefPtr def, if (addDefaultUSB && virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_USB, 0, -1) < 0) + def, VIR_DOMAIN_CONTROLLER_TYPE_USB, 0, 0, -1) < 0) return -1; if (addImplicitSATA && virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_SATA, 0, -1) < 0) + def, VIR_DOMAIN_CONTROLLER_TYPE_SATA, 0, 0, -1) < 0) return -1; if (addPCIRoot && virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0, + def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0, 0, VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) < 0) return -1; @@ -1018,13 +1018,13 @@ qemuDomainDefPostParse(virDomainDefPtr def, */ if (addPCIeRoot) { if (virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0, + def, 
VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0, 0, VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT) < 0 || virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1, + def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0, 1, VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE) < 0 || virDomainDefMaybeAddController( - def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2, + def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0, 2, VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE) < 0) { return -1; } diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 7377320..508b748 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -7099,6 +7099,9 @@ qemuDomainAttachDeviceConfig(virQEMUCapsPtr qemuCaps, net = dev->data.net; if (virDomainNetInsert(vmdef, net)) return -1; + if (dev->data.net->type == VIR_DOMAIN_NET_TYPE_HOSTDEV) + if (virDomainDefMaybeAddHostdevSpaprPCIVfiocontrollers(vmdef) < 0) + return -1; dev->data.net = NULL; if (qemuDomainAssignAddresses(vmdef, qemuCaps, NULL) < 0) return -1; @@ -7113,6 +7116,9 @@ qemuDomainAttachDeviceConfig(virQEMUCapsPtr qemuCaps, } if (virDomainHostdevInsert(vmdef, hostdev)) return -1; + if (IS_PCI_VFIO_HOSTDEV(hostdev)) + if (virDomainDefMaybeAddHostdevSpaprPCIVfiocontrollers(vmdef) < 0) + return -1; dev->data.hostdev = NULL; if (qemuDomainAssignAddresses(vmdef, qemuCaps, NULL) < 0) return -1;
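The attribute checks that this patch adds to virDomainControllerDefParseXML (both attributes optional and defaulting to -1, `domain` required to be greater than 0 when given, `iommuGroupNum` required to be non-negative) can be sketched in isolation as below. This is only an illustration: `parse_int_strict` and `parse_spapr_vfio_attrs` are hypothetical names, and plain `strtol` stands in for libvirt's `virStrToLong_i`.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses a base-10 int and rejects empty input,
 * trailing characters and values outside the range of int. */
static int
parse_int_strict(const char *str, int *out)
{
    char *end = NULL;
    long val;

    if (!str || *str == '\0')
        return -1;

    errno = 0;
    val = strtol(str, &end, 10);
    if (errno != 0 || end == str || *end != '\0' ||
        val < INT_MIN || val > INT_MAX)
        return -1;

    *out = (int)val;
    return 0;
}

/* Hypothetical helper mirroring the checks in the patch: a missing
 * attribute (NULL) leaves the value at -1, a present "domain" must be
 * greater than 0, and a present "iommuGroupNum" must not be negative. */
static int
parse_spapr_vfio_attrs(const char *domain_str, const char *iommu_str,
                       int *domain, int *iommu_group)
{
    *domain = -1;
    *iommu_group = -1;

    if (domain_str &&
        (parse_int_strict(domain_str, domain) < 0 || *domain <= 0))
        return -1;

    if (iommu_str &&
        (parse_int_strict(iommu_str, iommu_group) < 0 || *iommu_group < 0))
        return -1;

    return 0;
}
```

Domain 0 is rejected here for the same reason as in the patch: it belongs to the default emulated pci host bridge, so a spapr-vfio-pci controller cannot claim it.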

On pseries, the vfio host devices attach to the spapr-pci-vfio domain instead of the default emulated domain. So a host device belonging to iommu group 3, say, needs the host bridge below:

  -device spapr-pci-vfio-host-bridge,iommu=3,id=vfiohostbridge3,index=1

The vfio device then needs to be assigned to the bus "vfiohostbridge3" as below:

  -device vfio-pci,host=0003:05:00.1,id=hostdev0,bus=vfiohostbridge3.0,addr=0x1

Since each host bridge controller adds a new domain, device addressing needs to start from bus0:slot1:function0 in each new domain. The controller tag for spapr-pci-vfio-host-bridge in the XML has its domain and iommu numbers allocated during parsing, based on the hostdevs in the XML. Assign the pci addresses for the hostdevs from their respective domains. The domain id "vfiohostbridge<iommu>" is used for uniqueness in the controller alias.

For example, the live XML and the lspci output of a running guest can be seen at http://fpaste.org/142911/

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Pradipta Kumar Banerjee <bpradip@in.ibm.com>
---
 src/conf/domain_addr.c   |   8 +
 src/conf/domain_addr.h   |   1
 src/libvirt_private.syms |   1
 src/qemu/qemu_command.c  | 277 +++++++++++++++++++++++++++++++++++++++++++---
 src/qemu/qemu_command.h  |  17 ++-
 tests/qemuhotplugtest.c  |   2
 6 files changed, 283 insertions(+), 23 deletions(-)

diff --git a/src/conf/domain_addr.c b/src/conf/domain_addr.c
index fb4a76f..db448f2 100644
--- a/src/conf/domain_addr.c
+++ b/src/conf/domain_addr.c
@@ -110,11 +110,11 @@ virDomainPCIAddressValidate(virDomainPCIAddressSetPtr addrs, virReportError(errType, "%s", _("No PCI buses available")); return false; } - if (addr->domain != 0) { + if (addr->domain != addrs->domainId) { virReportError(errType, _("Invalid PCI address %s. 
" - "Only PCI domain 0 is available"), - addrStr); + "Only PCI domain %d is available"), + addrStr, addrs->domainId); return false; } if (addr->bus >= addrs->nbuses) { @@ -463,7 +463,7 @@ virDomainPCIAddressGetNextSlot(virDomainPCIAddressSetPtr addrs, /* default to starting the search for a free slot from * 0000:00:00.0 */ - virDevicePCIAddress a = { 0, 0, 0, 0, false }; + virDevicePCIAddress a = { addrs->domainId, 0, 0, 0, false }; char *addrStr = NULL; /* except if this search is for the exact same type of device as diff --git a/src/conf/domain_addr.h b/src/conf/domain_addr.h index 2c3468e..2510bfd 100644 --- a/src/conf/domain_addr.h +++ b/src/conf/domain_addr.h @@ -64,6 +64,7 @@ struct _virDomainPCIAddressSet { size_t nbuses; virDevicePCIAddress lastaddr; virDomainPCIConnectFlags lastFlags; + int domainId; bool dryRun; /* on a dry run, new buses are auto-added and addresses aren't saved in device infos */ }; diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index f61fccd..dfebcee 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -213,6 +213,7 @@ virDomainDeviceDefFree; virDomainDeviceDefParse; virDomainDeviceFindControllerModel; virDomainDeviceGetInfo; +virDomainDeviceInfoClear; virDomainDeviceInfoCopy; virDomainDeviceInfoIterate; virDomainDeviceTypeToString; diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 3cec764..e0727b7 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -69,6 +69,8 @@ VIR_LOG_INIT("qemu.qemu_command"); #define VIO_ADDR_SERIAL 0x30000000ul #define VIO_ADDR_NVRAM 0x3000ul +#define DEFAULT_PCI 0 + VIR_ENUM_DECL(virDomainDiskQEMUBus) VIR_ENUM_IMPL(virDomainDiskQEMUBus, VIR_DOMAIN_DISK_BUS_LAST, "ide", @@ -561,6 +563,12 @@ qemuNetworkPrepareDevices(virDomainDefPtr def) goto cleanup; if (virDomainDefMaybeAddHostdevSpaprPCIVfiocontrollers(def) < 0) goto cleanup; + /* The net device is originally assigned address in generic domain. 
+ * Clear the original address for the new address to take effect. + */ + if ((def->os.arch == VIR_ARCH_PPC64) && def->os.machine && + STRPREFIX(def->os.machine, "pseries")) + virDomainDeviceInfoClear(&net->info); } } ret = 0; @@ -885,6 +893,9 @@ qemuAssignDeviceControllerAlias(virDomainControllerDefPtr controller) return virAsprintf(&controller->info.alias, "pcie.%d", controller->idx); else return virAsprintf(&controller->info.alias, "pci.%d", controller->idx); + } else if (controller->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO) { + return virAsprintf(&controller->info.alias, "vfiohostbridge%d.%d", + controller->opts.spaprvfio.iommuGroupNum, controller->idx); } return virAsprintf(&controller->info.alias, "%s%d", prefix, controller->idx); @@ -1291,6 +1302,49 @@ int qemuDomainAssignSpaprVIOAddresses(virDomainDefPtr def, return ret; } +static int +qemuIsValidCurrentDomainDevice(virDomainDefPtr def, + virDomainDeviceDefPtr device, + int domain) +{ + size_t j; + int currentDomainHostdev = 0; + int actualType = -1; + virDomainHostdevDefPtr hostdev = NULL; + + if (device->type == VIR_DOMAIN_DEVICE_CONTROLLER && + domain != device->data.controller->domain) + return -1; /* Dont reserve controllers not belonging to the current domain */ + + if (device->type == VIR_DOMAIN_DEVICE_NET) { + actualType = virDomainNetGetActualType(device->data.net); + if (actualType == VIR_DOMAIN_NET_TYPE_HOSTDEV) + hostdev = virDomainNetGetActualHostdev(device->data.net); + } else if (device->type == VIR_DOMAIN_DEVICE_HOSTDEV) { + hostdev = device->data.hostdev; + } + + if (domain == 0 && hostdev && (IS_PCI_VFIO_HOSTDEV(hostdev))) + return -1; /* Don't reserve vfio devices in emulated domain. 
*/ + + if (domain != 0) { + if (device->type != VIR_DOMAIN_DEVICE_CONTROLLER && !hostdev) + return -1; /* Don't reserve non vfio devices and controllers in spapr-vfio domain */ + + if (hostdev && IS_PCI_VFIO_HOSTDEV(hostdev)) { + for (j = 0; j < def->ncontrollers; j++) { + if ((def->controllers[j]->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO) && + (domain == def->controllers[j]->domain) && + (def->controllers[j]->opts.spaprvfio.iommuGroupNum == hostdev->source.subsys.u.pci.iommu)) + currentDomainHostdev = 1; + } + /* Dont reserve hostdevs which dont belong to current domain */ + if (!currentDomainHostdev) + return -1; + } + } + return 0; +} static int qemuCollectPCIAddress(virDomainDefPtr def ATTRIBUTE_UNUSED, @@ -1316,6 +1370,16 @@ qemuCollectPCIAddress(virDomainDefPtr def ATTRIBUTE_UNUSED, return 0; } + /* For PPC64 the passthrough devices are assigned to non-emulated + * pci domain and wont be part of domain zero. + */ + if (def->os.arch == VIR_ARCH_PPC64 && def->os.machine && + STRPREFIX(def->os.machine, "pseries")) + { + if (qemuIsValidCurrentDomainDevice(def, device, addrs->domainId) < 0) + return 0; + } + /* Change flags according to differing requirements of different * devices. 
*/ @@ -1323,6 +1387,7 @@ qemuCollectPCIAddress(virDomainDefPtr def ATTRIBUTE_UNUSED, case VIR_DOMAIN_DEVICE_CONTROLLER: switch (device->data.controller->type) { case VIR_DOMAIN_CONTROLLER_TYPE_PCI: + case VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO: switch (device->data.controller->model) { case VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE: /* pci-bridge needs a PCI slot, but it isn't @@ -1460,27 +1525,40 @@ qemuDomainSupportsPCI(virDomainDefPtr def) int qemuDomainAssignPCIAddresses(virDomainDefPtr def, - virQEMUCapsPtr qemuCaps, - virDomainObjPtr obj) + virQEMUCapsPtr qemuCaps, + virDomainObjPtr obj, + int domain) { int ret = -1; + int iommu = -1; virDomainPCIAddressSetPtr addrs = NULL; qemuDomainObjPrivatePtr priv = NULL; + int controllerType = -1; if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE)) { int max_idx = -1; int nbuses = 0; + int ncontrollers = def->ncontrollers; size_t i; int rv; virDomainPCIConnectFlags flags = VIR_PCI_CONNECT_TYPE_PCI; for (i = 0; i < def->ncontrollers; i++) { - if (def->controllers[i]->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI) { - if ((int) def->controllers[i]->idx > max_idx) - max_idx = def->controllers[i]->idx; - } + virDomainControllerDefPtr cont = def->controllers[i]; + if ((cont->type != VIR_DOMAIN_CONTROLLER_TYPE_PCI) && + (cont->type != VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO)) + continue; + if (cont->domain != domain) + continue; + if ((int) cont->idx > max_idx) + max_idx = cont->idx; + if (cont->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO) + iommu = cont->opts.spaprvfio.iommuGroupNum; } + controllerType = (domain == DEFAULT_PCI)?VIR_DOMAIN_CONTROLLER_TYPE_PCI: + VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO; + nbuses = max_idx + 1; if (nbuses > 0 && @@ -1488,7 +1566,7 @@ qemuDomainAssignPCIAddresses(virDomainDefPtr def, virDomainDeviceInfo info; /* 1st pass to figure out how many PCI bridges we need */ - if (!(addrs = qemuDomainPCIAddressSetCreate(def, nbuses, true))) + if (!(addrs = qemuDomainPCIAddressSetCreate(def, nbuses, 
domain, true)))
             goto cleanup;
 
         if (qemuAssignDevicePCISlots(def, qemuCaps, addrs) < 0)
             goto cleanup;
@@ -1500,9 +1578,18 @@ qemuDomainAssignPCIAddresses(virDomainDefPtr def,
             virDomainPCIAddressBusPtr bus = &addrs->buses[i];
 
             if ((rv = virDomainDefMaybeAddController(
-                     def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 0,
+                     def, controllerType, domain,
                      i, bus->model)) < 0)
                 goto cleanup;
+
+            if (controllerType == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO) {
+                if (ncontrollers < def->ncontrollers) {
+                    /* If a bridge was actually added, set the IOMMU */
+                    def->controllers[def->ncontrollers-1]->opts.spaprvfio.iommuGroupNum = iommu;
+                    ncontrollers = def->ncontrollers;
+                }
+            }
+
             /* If we added a new bridge, we will need one more address */
             if (rv > 0 && virDomainPCIAddressReserveNextSlot(addrs, &info,
                                                              flags) < 0)
@@ -1519,7 +1606,7 @@ qemuDomainAssignPCIAddresses(virDomainDefPtr def,
             goto cleanup;
         }
 
-        if (!(addrs = qemuDomainPCIAddressSetCreate(def, nbuses, false)))
+        if (!(addrs = qemuDomainPCIAddressSetCreate(def, nbuses, domain, false)))
             goto cleanup;
 
         if (qemuDomainSupportsPCI(def)) {
@@ -1528,7 +1615,7 @@ qemuDomainAssignPCIAddresses(virDomainDefPtr def,
         }
     }
 
-    if (obj && obj->privateData) {
+    if (obj && obj->privateData && domain == DEFAULT_PCI) {
         priv = obj->privateData;
         if (addrs) {
             /* if this is the live domain object, we persist the PCI addresses*/
@@ -1549,6 +1636,32 @@ qemuDomainAssignPCIAddresses(virDomainDefPtr def,
 
     return ret;
 }
+int
+qemuDomainAssignSpaprVFIOHostdevPCIAddresses(virDomainDefPtr def,
+                                             virQEMUCapsPtr qemuCaps,
+                                             virDomainObjPtr obj)
+{
+    int ret = -1;
+    if ((def->os.arch != VIR_ARCH_PPC64) ||
+        !STRPREFIX(def->os.machine, "pseries"))
+        return 0;
+
+    if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE)) {
+        size_t i;
+        for (i = 0; i < def->ncontrollers; i++) {
+            if (def->controllers[i]->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT &&
+                def->controllers[i]->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO) {
+                ret = qemuDomainAssignPCIAddresses(def, qemuCaps, obj,
+                                                   def->controllers[i]->domain);
+                if (ret)
+                    return ret;
+            }
+        }
+    }
+
+    return 0;
+}
+
 int qemuDomainAssignAddresses(virDomainDefPtr def,
                               virQEMUCapsPtr qemuCaps,
                               virDomainObjPtr obj)
@@ -1567,13 +1680,19 @@ int qemuDomainAssignAddresses(virDomainDefPtr def,
     if (rc)
         return rc;
 
-    return qemuDomainAssignPCIAddresses(def, qemuCaps, obj);
+    rc = qemuDomainAssignSpaprVFIOHostdevPCIAddresses(def, qemuCaps, obj);
+    if (rc)
+        return rc;
+
+    return qemuDomainAssignPCIAddresses(def, qemuCaps, obj, DEFAULT_PCI);
+
 }
 
 
 virDomainPCIAddressSetPtr
 qemuDomainPCIAddressSetCreate(virDomainDefPtr def,
                               unsigned int nbuses,
+                              int domainId,
                               bool dryRun)
 {
     virDomainPCIAddressSetPtr addrs;
@@ -1584,6 +1703,7 @@ qemuDomainPCIAddressSetCreate(virDomainDefPtr def,
 
     addrs->nbuses = nbuses;
     addrs->dryRun = dryRun;
+    addrs->domainId = domainId;
 
     /* As a safety measure, set default model='pci-root' for first pci
      * controller and 'pci-bridge' for all subsequent. After setting
@@ -1603,7 +1723,11 @@ qemuDomainPCIAddressSetCreate(virDomainDefPtr def,
     for (i = 0; i < def->ncontrollers; i++) {
         size_t idx = def->controllers[i]->idx;
 
-        if (def->controllers[i]->type != VIR_DOMAIN_CONTROLLER_TYPE_PCI)
+        if ((def->controllers[i]->type != VIR_DOMAIN_CONTROLLER_TYPE_PCI) &&
+            (def->controllers[i]->type != VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO))
+            continue;
+
+        if (def->controllers[i]->domain != domainId)
             continue;
 
         if (idx >= addrs->nbuses) {
@@ -1964,6 +2088,85 @@ qemuDomainValidateDevicePCISlotsQ35(virDomainDefPtr def,
 
     return ret;
 }
+static int
+qemuAssignSPAPRHostDevAddresses(virDomainDefPtr def,
+                                virDomainPCIAddressSetPtr addrs)
+{
+    size_t i;
+    int iommu = -1;
+    int actualType = -1;
+    virDomainHostdevDefPtr hostdev = NULL;
+    virDomainPCIConnectFlags flags = VIR_PCI_CONNECT_HOTPLUGGABLE | VIR_PCI_CONNECT_TYPE_PCI;
+    /* PCI controllers */
+    for (i = 0; i < def->ncontrollers; i++) {
+        if (def->controllers[i]->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO) {
+            if (def->controllers[i]->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE)
+                continue;
+            if (def->controllers[i]->domain != addrs->domainId)
+                continue;
+            switch (def->controllers[i]->model) {
+            case VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT:
+                /* pci-root is implicit in the machine,
+                 * and needs no address */
+                iommu = def->controllers[i]->opts.spaprvfio.iommuGroupNum;
+                continue;
+            case VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE:
+                /* pci-bridge doesn't require hot-plug
+                 * (although it does provide hot-plug in its slots)
+                 */
+                flags = VIR_PCI_CONNECT_TYPE_PCI;
+                break;
+            default:
+                flags = VIR_PCI_CONNECT_HOTPLUGGABLE | VIR_PCI_CONNECT_TYPE_PCI;
+                break;
+            }
+            iommu = def->controllers[i]->opts.spaprvfio.iommuGroupNum;
+            if (virDomainPCIAddressReserveNextSlot(addrs,
+                                                   &def->controllers[i]->info,
+                                                   flags) < 0)
+                goto error;
+        }
+    }
+
+    flags = VIR_PCI_CONNECT_HOTPLUGGABLE | VIR_PCI_CONNECT_TYPE_PCI;
+
+    for (i = 0; i < def->nhostdevs; i++) {
+        if (def->hostdevs[i]->info->type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE)
+            continue;
+
+        if (!IS_PCI_VFIO_HOSTDEV(def->hostdevs[i]) ||
+            def->hostdevs[i]->source.subsys.u.pci.iommu != iommu)
+            continue;
+
+        if (virDomainPCIAddressReserveNextSlot(addrs, def->hostdevs[i]->info, flags) < 0)
+            goto error;
+    }
+    /* Network interfaces */
+    for (i = 0; i < def->nnets; i++) {
+        /* type='network' network devices that are SR-IOV VFs use VFIO.
+         * handle them here.
+         */
+        if ((def->nets[i]->type != VIR_DOMAIN_NET_TYPE_NETWORK) ||
+            (def->nets[i]->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE)) {
+            continue;
+        }
+
+        actualType = virDomainNetGetActualType(def->nets[i]);
+        if (actualType != VIR_DOMAIN_NET_TYPE_HOSTDEV)
+            continue;
+
+        hostdev = virDomainNetGetActualHostdev(def->nets[i]);
+        if (!hostdev || hostdev->source.subsys.u.pci.iommu != iommu)
+            continue;
+
+        if (virDomainPCIAddressReserveNextSlot(addrs, &def->nets[i]->info,
+                                               flags) < 0)
+            goto error;
+    }
+    return 0;
+ error:
+    return -1;
+}
 
 /*
  * This assigns static PCI slots to all configured devices.
@@ -2020,6 +2223,12 @@ qemuAssignDevicePCISlots(virDomainDefPtr def,
             goto error;
     }
 
+    if (STRPREFIX(def->os.machine, "pseries") && (addrs->domainId > 0)) {
+        if (qemuAssignSPAPRHostDevAddresses(def, addrs))
+            goto error;
+        return 0;
+    }
+
     /* PCI controllers */
     for (i = 0; i < def->ncontrollers; i++) {
         if (def->controllers[i]->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI) {
@@ -2077,6 +2286,13 @@ qemuAssignDevicePCISlots(virDomainDefPtr def,
             (def->nets[i]->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE)) {
             continue;
         }
+
+        if (addrs->domainId == 0 && (STRPREFIX(def->os.machine, "pseries"))) {
+            int actualType = virDomainNetGetActualType(def->nets[i]);
+            if (actualType == VIR_DOMAIN_NET_TYPE_HOSTDEV)
+                continue;
+        }
+
         if (virDomainPCIAddressReserveNextSlot(addrs, &def->nets[i]->info,
                                                flags) < 0)
             goto error;
@@ -2124,6 +2340,10 @@ qemuAssignDevicePCISlots(virDomainDefPtr def,
         if (def->controllers[i]->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE)
             continue;
 
+        if (def->controllers[i]->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO) {
+            continue;
+        }
+
         /* USB2 needs special handling to put all companions in the same slot */
         if (IS_USB2_CONTROLLER(def->controllers[i])) {
             virDevicePCIAddress addr = { 0, 0, 0, 0, false };
@@ -2213,6 +2433,10 @@ qemuAssignDevicePCISlots(virDomainDefPtr def,
             def->hostdevs[i]->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI)
             continue;
 
+        if (addrs->domainId == 0 && (STRPREFIX(def->os.machine, "pseries")) &&
+            IS_PCI_VFIO_HOSTDEV(def->hostdevs[i]))
+            continue;
+
         if (virDomainPCIAddressReserveNextSlot(addrs,
                                                def->hostdevs[i]->info,
                                                flags) < 0)
@@ -2318,9 +2542,12 @@ qemuBuildDeviceAddressStr(virBufferPtr buf,
             goto cleanup;
         for (i = 0; i < domainDef->ncontrollers; i++) {
             virDomainControllerDefPtr cont = domainDef->controllers[i];
+            if (info->addr.pci.domain != cont->domain)
+                continue;
 
-            if (cont->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI &&
-                cont->idx == info->addr.pci.bus) {
+            if ((cont->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI ||
+                 cont->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO) &&
+                cont->idx == info->addr.pci.bus) {
                 contAlias = cont->info.alias;
                 if (!contAlias) {
                     virReportError(VIR_ERR_INTERNAL_ERROR,
@@ -4335,6 +4562,7 @@ qemuBuildControllerDevStr(virDomainDefPtr domainDef,
         break;
 
     case VIR_DOMAIN_CONTROLLER_TYPE_PCI:
+    case VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO:
         switch (def->model) {
         case VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE:
             if (def->idx == 0) {
@@ -4342,8 +4570,12 @@ qemuBuildControllerDevStr(virDomainDefPtr domainDef,
                                _("PCI bridge index should be > 0"));
                 goto error;
             }
-            virBufferAsprintf(&buf, "pci-bridge,chassis_nr=%d,id=pci.%d",
-                              def->idx, def->idx);
+            if (def->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI)
+                virBufferAsprintf(&buf, "pci-bridge,chassis_nr=%d,id=pci.%d",
+                                  def->idx, def->idx);
+            else
+                virBufferAsprintf(&buf, "pci-bridge,chassis_nr=%d,id=vfiohostbridge%d.%d",
+                                  def->opts.spaprvfio.iommuGroupNum, def->opts.spaprvfio.iommuGroupNum, def->idx);
             break;
         case VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE:
             if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_DMI_TO_PCI_BRIDGE)) {
@@ -4360,6 +4592,18 @@ qemuBuildControllerDevStr(virDomainDefPtr domainDef,
             virBufferAsprintf(&buf, "i82801b11-bridge,id=pci.%d", def->idx);
             break;
         case VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT:
+            if (def->type == VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO) {
+                if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_SPAPR_VFIO_HOST_BRIDGE)) {
+                    virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                                   _("The spapr-pci-vfio-host-bridge "
+                                     "controller is not supported in this QEMU binary"));
+                    goto error;
+                }
+                virBufferAsprintf(&buf,
+                                  "spapr-pci-vfio-host-bridge,iommu=%d,id=vfiohostbridge%d,index=%d",
+                                  def->opts.spaprvfio.iommuGroupNum, def->opts.spaprvfio.iommuGroupNum, def->domain);
+                break;
+            }
         case VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT:
             virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                            _("wrong function called for pci-root/pcie-root"));
@@ -7799,6 +8043,7 @@ qemuBuildCommandLine(virConnectPtr conn,
         VIR_DOMAIN_CONTROLLER_TYPE_SATA,
         VIR_DOMAIN_CONTROLLER_TYPE_VIRTIO_SERIAL,
         VIR_DOMAIN_CONTROLLER_TYPE_CCID,
+        VIR_DOMAIN_CONTROLLER_TYPE_SPAPR_PCI_VFIO,
     };
     virArch hostarch = virArchFromHost();
     virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h
index aa40c9e..6f997f3 100644
--- a/src/qemu/qemu_command.h
+++ b/src/qemu/qemu_command.h
@@ -161,6 +161,13 @@ char *qemuBuildPCIHostdevDevStr(virDomainDefPtr def,
                                  virDomainHostdevDefPtr dev,
                                  const char *configfd,
                                  virQEMUCapsPtr qemuCaps);
+char *
+qemuGetSPAPRVFIOHostDevContAliasString(virDomainDefPtr def,
+                                       virDomainDeviceInfoPtr info);
+
+int qemuBuildSPAPRVFIODeviceCommandLine(virCommandPtr cmd,
+                                        virDomainDefPtr def,
+                                        virQEMUCapsPtr qemuCaps);
 
 int qemuOpenPCIConfig(virDomainHostdevDefPtr dev);
 
@@ -232,22 +239,28 @@ int qemuDomainAssignAddresses(virDomainDefPtr def,
     ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2);
 int qemuDomainAssignSpaprVIOAddresses(virDomainDefPtr def,
                                       virQEMUCapsPtr qemuCaps);
+int
+qemuDomainAssignSpaprVFIOHostdevPCIAddresses(virDomainDefPtr def,
+                                             virQEMUCapsPtr qemuCaps,
+                                             virDomainObjPtr obj);
 
 void qemuDomainReleaseDeviceAddress(virDomainObjPtr vm,
                                     virDomainDeviceInfoPtr info,
                                     const char *devstr);
 
-
 int qemuDomainAssignPCIAddresses(virDomainDefPtr def,
                                  virQEMUCapsPtr qemuCaps,
-                                 virDomainObjPtr obj);
+                                 virDomainObjPtr obj, int domain);
 virDomainPCIAddressSetPtr qemuDomainPCIAddressSetCreate(virDomainDefPtr def,
                                                         unsigned int nbuses,
+                                                        int domainId,
                                                         bool dryRun);
 int qemuAssignDevicePCISlots(virDomainDefPtr def,
                              virQEMUCapsPtr qemuCaps,
                              virDomainPCIAddressSetPtr addrs);
+int qemuAssignHostDevAddresses(virDomainDefPtr def,
+                               virDomainPCIAddressSetPtr addrs);
 
 int qemuAssignDeviceAliases(virDomainDefPtr def, virQEMUCapsPtr qemuCaps);
 int qemuDomainNetVLAN(virDomainNetDefPtr def);
diff --git a/tests/qemuhotplugtest.c b/tests/qemuhotplugtest.c
index 9d39968..dea2d77 100644
--- a/tests/qemuhotplugtest.c
+++ b/tests/qemuhotplugtest.c
@@ -84,7 +84,7 @@ qemuHotplugCreateObjects(virDomainXMLOptionPtr xmlopt,
     if (event)
         virQEMUCapsSet(priv->qemuCaps, QEMU_CAPS_DEVICE_DEL_EVENT);
 
-    if (qemuDomainAssignPCIAddresses((*vm)->def, priv->qemuCaps, *vm) < 0)
+    if (qemuDomainAssignPCIAddresses((*vm)->def, priv->qemuCaps, *vm, 0) < 0)
         goto cleanup;
 
     if (qemuAssignDeviceAliases((*vm)->def, priv->qemuCaps) < 0)
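
For readers following the qemuBuildControllerDevStr() hunks above, here is a minimal standalone sketch of the two -device strings the patch emits for spapr-vfio-pci controllers. It only mirrors the format strings from the diff. The function name sketch_format_spapr_vfio_controller and its parameters are hypothetical and are not libvirt API, and plain snprintf() stands in for virBufferAsprintf().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, NOT libvirt API: builds the -device string for a
 * spapr-vfio-pci controller using the format strings shown in the patch.
 * model is the XML model attribute ("pci-root" or "pci-bridge").
 * Returns 0 on success, -1 on an unsupported model, a bridge index of 0,
 * or a buffer that is too small. */
int
sketch_format_spapr_vfio_controller(char *buf, size_t buflen,
                                    const char *model,
                                    unsigned int iommu_group,
                                    unsigned int guest_domain,
                                    unsigned int idx)
{
    int n;

    assert(buf != NULL && model != NULL);

    if (strcmp(model, "pci-root") == 0) {
        /* One host bridge per IOMMU group; index= carries the guest PCI domain */
        n = snprintf(buf, buflen,
                     "spapr-pci-vfio-host-bridge,iommu=%u,id=vfiohostbridge%u,index=%u",
                     iommu_group, iommu_group, guest_domain);
    } else if (strcmp(model, "pci-bridge") == 0) {
        /* The patch rejects a PCI bridge with index 0 */
        if (idx == 0)
            return -1;
        /* As in the patch, chassis_nr is taken from the IOMMU group number */
        n = snprintf(buf, buflen,
                     "pci-bridge,chassis_nr=%u,id=vfiohostbridge%u.%u",
                     iommu_group, iommu_group, idx);
    } else {
        return -1;
    }

    /* snprintf reports the length it needed; treat truncation as failure */
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return 0;
}
```

Called with model "pci-root", IOMMU group 3 and guest domain 1, the sketch yields the same host-bridge string that appears in the expected .args file of the test case in patch 4/4.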

The test case adds passthrough host devices and an interface of type hostdev.
It tests address assignment for multifunction, multibus, and PCI bridges
within the spapr-vfio domain.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Pradipta Kumar Banerjee <bpradip@in.ibm.com>
---
 docs/schemas/domaincommon.rng                      |   28 +++++++
 .../qemuxml2argv-hostdev-spapr-vfio.args           |   16 ++++
 .../qemuxml2argv-hostdev-spapr-vfio.xml            |   76 ++++++++++++++++++++
 tests/qemuxml2argvtest.c                           |    8 ++
 4 files changed, 128 insertions(+)
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-spapr-vfio.args
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-spapr-vfio.xml

diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 20d81ae..d04d044 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -1778,6 +1778,34 @@
               </group>
             </choice>
           </group>
+          <!-- spapr-vfio-pci has a default "model" -->
+          <group>
+            <attribute name="type">
+              <value>spapr-vfio-pci</value>
+            </attribute>
+            <attribute name="iommuGroupNum">
+              <ref name="unsignedInt"/>
+            </attribute>
+            <attribute name="domain">
+              <ref name="unsignedInt"/>
+            </attribute>
+            <choice>
+              <group>
+                <attribute name="model">
+                  <choice>
+                    <value>pci-root</value>
+                  </choice>
+                </attribute>
+              </group>
+              <group>
+                <attribute name="model">
+                  <choice>
+                    <value>pci-bridge</value>
+                  </choice>
+                </attribute>
+              </group>
+            </choice>
+          </group>
           <!-- virtio-serial has optional "ports" and "vectors" -->
           <group>
             <attribute name="type">
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-spapr-vfio.args b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-spapr-vfio.args
new file mode 100644
index 0000000..88db0c1
--- /dev/null
+++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-spapr-vfio.args
@@ -0,0 +1,16 @@
+LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \
+/usr/bin/qemu-system-ppc64 -S -M pseries -m 512 -realtime mlock=off -smp 1 \
+-nographic -nodefaults -monitor unix:/tmp/test-monitor,server,nowait -no-acpi \
+-boot c -device spapr-vscsi,id=scsi0,reg=0x2000 \
+-device spapr-pci-vfio-host-bridge,iommu=3,id=vfiohostbridge3,index=1 \
+-device spapr-pci-vfio-host-bridge,iommu=156,id=vfiohostbridge156,index=2 \
+-device pci-bridge,chassis_nr=3,id=vfiohostbridge3.1,bus=vfiohostbridge3.0,addr=0x2 \
+-device pci-bridge,chassis_nr=3,id=vfiohostbridge3.2,bus=vfiohostbridge3.0,addr=0x3 \
+-usb -drive file=/dev/HostVG/QEMUGuest1,if=none,id=drive-scsi0-0-0-0 \
+-device scsi-disk,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
+-drive file=/root/boot.iso,if=none,media=cdrom,id=drive-scsi0-0-0-2 \
+-device scsi-disk,bus=scsi0.0,channel=0,scsi-id=0,lun=2,drive=drive-scsi0-0-0-2,id=scsi0-0-0-2 \
+-serial pty -device vfio-pci,host=0003:05:00.1,id=hostdev0,bus=vfiohostbridge156.0,addr=0x1 \
+-device vfio-pci,host=0003:20:01.0,id=hostdev1,bus=vfiohostbridge3.0,multifunction=on,addr=0x1 \
+-device vfio-pci,host=0003:20:01.1,id=hostdev2,bus=vfiohostbridge3.0,addr=0x1.0x1 \
+-device vfio-pci,host=0003:20:01.2,id=hostdev3,bus=vfiohostbridge3.2,addr=0x1
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-spapr-vfio.xml b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-spapr-vfio.xml
new file mode 100644
index 0000000..2d379c8
--- /dev/null
+++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-spapr-vfio.xml
@@ -0,0 +1,76 @@
+<domain type='qemu'>
+  <name>QEMUGuest1</name>
+  <uuid>87eedafe-eedc-4336-8130-ed9fe5dc90c8</uuid>
+  <memory unit='KiB'>524288</memory>
+  <currentMemory unit='KiB'>524288</currentMemory>
+  <vcpu placement='static'>1</vcpu>
+  <os>
+    <type arch='ppc64' machine='pseries'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>destroy</on_crash>
+  <devices>
+    <emulator>/usr/bin/qemu-system-ppc64</emulator>
+    <disk type='block' device='disk'>
+      <driver name='qemu' type='raw'/>
+      <source dev='/dev/HostVG/QEMUGuest1'/>
+      <target dev='hda' bus='scsi'/>
+      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
+    </disk>
+    <disk type='file' device='cdrom'>
+      <driver name='qemu' type='raw'/>
+      <source file='/root/boot.iso'/>
+      <target dev='hdc' bus='scsi'/>
+      <readonly/>
+      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
+    </disk>
+    <controller type='scsi' index='0'/>
+    <controller type='pci' index='0' model='pci-root'/>
+    <console type='pty'>
+      <target type='serial' port='0'/>
+      <address type='spapr-vio' reg='0x30000000'/>
+    </console>
+    <controller type='spapr-vfio-pci' index='0' model='pci-root' iommuGroupNum='3' domain='1'/>
+    <controller type='spapr-vfio-pci' index='0' model='pci-root' iommuGroupNum='156' domain='2'/>
+    <controller type='spapr-vfio-pci' index='1' model='pci-bridge' iommuGroupNum='3' domain='1'>
+      <address type='pci' domain='0x0001' bus='0x00' slot='0x02' function='0x0'/>
+    </controller>
+    <controller type='spapr-vfio-pci' index='2' model='pci-bridge' iommuGroupNum='3' domain='1'>
+      <address type='pci' domain='0x0001' bus='0x00' slot='0x03' function='0x0'/>
+    </controller>
+    <hostdev mode='subsystem' type='pci' managed='yes'>
+      <driver name='vfio'/>
+      <source>
+        <address domain='0x0003' bus='0x20' slot='0x01' function='0x0'/>
+      </source>
+      <address type='pci' domain='0x0001' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
+    </hostdev>
+    <hostdev mode='subsystem' type='pci' managed='yes'>
+      <driver name='vfio'/>
+      <source>
+        <address domain='0x0003' bus='0x20' slot='0x01' function='0x1'/>
+      </source>
+      <address type='pci' domain='0x0001' bus='0x00' slot='0x01' function='0x1'/>
+    </hostdev>
+    <hostdev mode='subsystem' type='pci' managed='yes'>
+      <driver name='vfio'/>
+      <source>
+        <address domain='0x0003' bus='0x20' slot='0x01' function='0x2'/>
+      </source>
+      <address type='pci' domain='0x0001' bus='0x02' slot='0x01' function='0x0'/>
+    </hostdev>
+    <interface type='hostdev' managed='yes'>
+      <driver name='vfio'/>
+      <mac address='52:54:00:6d:90:02'/>
+      <source>
+        <address type='pci' domain='0x0003' bus='0x05' slot='0x00' function='0x1'/>
+      </source>
+      <address type='pci' domain='0x0002' bus='0x00' slot='0x01' function='0x0'/>
+    </interface>
+    <memballoon model='none'/>
+  </devices>
+</domain>
+
diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c
index c13aa99..8c2a7f3 100644
--- a/tests/qemuxml2argvtest.c
+++ b/tests/qemuxml2argvtest.c
@@ -1405,6 +1405,14 @@ mymain(void)
             QEMU_CAPS_DEVICE, QEMU_CAPS_DRIVE,
             QEMU_CAPS_VIRTIO_SCSI, QEMU_CAPS_VIRTIO_SCSI,
             QEMU_CAPS_DEVICE_SCSI_GENERIC);
+    DO_TEST("hostdev-spapr-vfio", QEMU_CAPS_DEVICE,
+            QEMU_CAPS_DEVICE_PCI_BRIDGE, QEMU_CAPS_DRIVE,
+            QEMU_CAPS_DRIVE, QEMU_CAPS_VIRTIO_SCSI,
+            QEMU_CAPS_SPAPR_VFIO_HOST_BRIDGE,
+            QEMU_CAPS_DEVICE_SCSI_GENERIC, QEMU_CAPS_MLOCK,
+            QEMU_CAPS_DEVICE_VFIO_PCI, QEMU_CAPS_HOST_PCI_MULTIDOMAIN,
+            QEMU_CAPS_PCI_MULTIFUNCTION,
+            QEMU_CAPS_PCI_MULTIBUS);
 
     DO_TEST("mlock-on", QEMU_CAPS_MLOCK);
     DO_TEST_FAILURE("mlock-on", NONE);
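
To make the XML-to-args mapping in this test case easier to follow, the sketch below shows how a guest <address type='pci' .../> under a spapr-vfio-pci domain appears to map to the bus= and addr= options in the expected .args file. The mapping is inferred from the test data above only. The helper name sketch_format_vfio_hostdev_addr and its parameters are hypothetical and are not libvirt functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, NOT libvirt API: formats the bus/addr options of a
 * vfio-pci hostdev that sits behind a spapr-pci-vfio-host-bridge.
 * The bus name is "vfiohostbridge<iommu group>.<guest bus>", as seen in the
 * expected .args file; the function number is appended only when non-zero.
 * Returns 0 on success, -1 if the buffer is too small. */
int
sketch_format_vfio_hostdev_addr(char *buf, size_t buflen,
                                unsigned int iommu_group,
                                unsigned int bus,
                                unsigned int slot,
                                unsigned int function,
                                int multifunction)
{
    int n;
    size_t used;

    assert(buf != NULL);

    n = snprintf(buf, buflen, "bus=vfiohostbridge%u.%u%s,addr=0x%x",
                 iommu_group, bus,
                 multifunction ? ",multifunction=on" : "",
                 slot);
    if (n < 0 || (size_t)n >= buflen)
        return -1;

    if (function != 0) {
        /* Append ".0x<function>" after the slot, e.g. addr=0x1.0x1 */
        used = strlen(buf);
        n = snprintf(buf + used, buflen - used, ".0x%x", function);
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;
    }
    return 0;
}
```

For example, the second hostdev in the XML (guest bus 0x00, slot 0x01, function 0x1, IOMMU group 3) corresponds to bus=vfiohostbridge3.0,addr=0x1.0x1 in the .args file.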