[PATCH v3 0/3] qemu: Implement support for multiple device-pluggable SMMUv3s
Hi, This is a follow up to the second patchset [0] for supporting multiple device-pluggable vSMMU instances. (re-sending this series for review) This patchset implements support for specifying multiple <iommu> devices within the VM definition when smmuv3 IOMMU model is specified with a pciBus driver attribute, and is tested with Shameer's recently accepted qemu series for user-creatable vSMMU devices [1] For instance, specifying the associated emulated device in a VM definition with multiple IOMMUs, configured to be routed to pcie-expander-bus or pcie-root controllers: <devices> <controller type='pci' index='0' model='pcie-root'/> <controller type='pci' index='1' model='pcie-expander-bus'> <model name='pxb-pcie'/> <target busNr='252'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/> </controller> <controller type='pci' index='2' model='pcie-root-port'> <model name='pcie-root-port'/> <target chassis='2' port='0x0'/> <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/> </controller> ... <rng model='virtio'> <backend model='random'>/dev/urandom</backend> <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/> </rng> <rng model='virtio'> <backend model='random'>/dev/urandom</backend> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </rng> <iommu model='smmuv3'> <driver pciBus='0'/> </iommu> <iommu model='smmuv3'> <driver pciBus='1'/> </iommu> </devices> This would get translated to a qemu command line with the arguments below. -device '{"driver":"pxb-pcie","bus_nr":252,"id":"pci.1","bus":"pcie.0","addr":"0x1"}' \ -device '{"driver":"pcie-root-port","port":0,"chassis":2,"id":"pci.2","bus":"pci.1","addr":"0x0"}' \ -device '{"driver":"arm-smmuv3","primary-bus":"pcie.0","id":"smmuv3.0"}' \ -device '{"driver":"arm-smmuv3","primary-bus":"pci.1","id":"smmuv3.1"}' \ ... -object '{"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}' \ -device '{"driver":"virtio-rng-pci","rng":"objrng0","id":"rng0","bus":"pci.2","addr":"0x0"}' \ -object '{"qom-type":"rng-random","id":"objrng1","filename":"/dev/urandom"}' \ -device '{"driver":"virtio-rng-pci","rng":"objrng1","id":"rng1","bus":"pcie.0","addr":"0x4"}' \ Changes from v2: - Make multi-iommu adoption the first commit - Consolidate smmuv3Dev into smmuv3 model - Change 'parentIdx' attribute name to 'pciBus' - Change naming of virDomainDef 'iommu' attribute to 'iommus' - Move multi-iommu check to validation code - Move qemu dev props function to device-pluggable smmuv3 commit - Fix conditional statements per feedback Changes from v1: - Moved parentIdx attribute under <driver> - Revised qemuxmlconfdata test cases to attach emulated devices This series is on Github: https://github.com/NathanChenNVIDIA/libvirt/tree/nestedsmmuv3-no-accel-10-21... Thanks, Nathan [0] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/IF6OS... [1] https://lore.kernel.org/qemu-devel/20250829082543.7680-1-skolothumtho@nvidia... Nathan Chen (3): conf: Support multiple device-pluggable smmuv3 IOMMUs qemu: Implement pluggable-device smmuv3 tests: qemuxmlconfdata: provide device-pluggable smmuv3 sample XML and CLI args docs/formatdomain.rst | 4 + src/conf/domain_conf.c | 100 +++++++-- src/conf/domain_conf.h | 10 +- src/conf/domain_validate.c | 61 ++++-- src/conf/schemas/domaincommon.rng | 9 +- src/libvirt_private.syms | 2 + src/qemu/qemu_alias.c | 15 +- src/qemu/qemu_command.c | 197 ++++++++++++------ src/qemu/qemu_domain.c | 2 +- src/qemu/qemu_domain_address.c | 5 +- src/qemu/qemu_driver.c | 8 +- src/qemu/qemu_postparse.c | 11 +- src/qemu/qemu_validate.c | 2 +- .../iommu-smmuv3-pci-bus.aarch64-latest.args | 41 ++++ .../iommu-smmuv3-pci-bus.aarch64-latest.xml | 62 ++++++ .../qemuxmlconfdata/iommu-smmuv3-pci-bus.xml | 49 +++++ tests/qemuxmlconftest.c | 1 + 17 files changed, 455 insertions(+), 124 deletions(-) create mode 100644 tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.aarch64-latest.args create mode 100644 tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.aarch64-latest.xml create mode 100644 tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.xml -- 2.43.0
Add support for parsing multiple IOMMU devices from the VM definition when "smmuv3" is the IOMMU model. Signed-off-by: Nathan Chen <nathanc@nvidia.com> --- src/conf/domain_conf.c | 84 +++++++++++++++----- src/conf/domain_conf.h | 9 ++- src/conf/domain_validate.c | 33 +++++--- src/conf/schemas/domaincommon.rng | 4 +- src/libvirt_private.syms | 2 + src/qemu/qemu_alias.c | 15 ++-- src/qemu/qemu_command.c | 127 +++++++++++++++--------------- src/qemu/qemu_domain.c | 2 +- src/qemu/qemu_domain_address.c | 5 +- src/qemu/qemu_driver.c | 8 +- src/qemu/qemu_postparse.c | 11 +-- src/qemu/qemu_validate.c | 2 +- 12 files changed, 184 insertions(+), 118 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 396cd1c0db..a587dbf3e1 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4141,7 +4141,9 @@ void virDomainDefFree(virDomainDef *def) virDomainCryptoDefFree(def->cryptos[i]); g_free(def->cryptos); - virDomainIOMMUDefFree(def->iommu); + for (i = 0; i < def->niommus; i++) + virDomainIOMMUDefFree(def->iommus[i]); + g_free(def->iommus); virDomainPstoreDefFree(def->pstore); @@ -5013,9 +5015,9 @@ virDomainDeviceInfoIterateFlags(virDomainDef *def, } device.type = VIR_DOMAIN_DEVICE_IOMMU; - if (def->iommu) { - device.data.iommu = def->iommu; - if ((rc = cb(def, &device, &def->iommu->info, opaque)) != 0) + for (i = 0; i < def->niommus; i++) { + device.data.iommu = def->iommus[i]; + if ((rc = cb(def, &device, &def->iommus[i]->info, opaque)) != 0) return rc; } @@ -16536,6 +16538,42 @@ virDomainInputDefFind(const virDomainDef *def, } +bool +virDomainIOMMUDefEquals(const virDomainIOMMUDef *a, + const virDomainIOMMUDef *b) +{ + if (a->model != b->model || + a->intremap != b->intremap || + a->caching_mode != b->caching_mode || + a->eim != b->eim || + a->iotlb != b->iotlb || + a->aw_bits != b->aw_bits || + a->dma_translation != b->dma_translation) + return false; + + if (a->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && + !virDomainDeviceInfoAddressIsEqual(&a->info, &b->info)) + return false; + + return true; +} + + +ssize_t +virDomainIOMMUDefFind(const virDomainDef *def, + const virDomainIOMMUDef *iommu) +{ + size_t i; + + for (i = 0; i < def->niommus; i++) { + if (virDomainIOMMUDefEquals(iommu, def->iommus[i])) + return i; + } + + return -1; +} + + bool virDomainVsockDefEquals(const virDomainVsockDef *a, const virDomainVsockDef *b) @@ -20205,19 +20243,22 @@ virDomainDefParseXML(xmlXPathContextPtr ctxt, } VIR_FREE(nodes); + /* Parsing iommu device definitions */ if ((n = virXPathNodeSet("./devices/iommu", ctxt, &nodes)) < 0) return NULL; - if (n > 1) { - virReportError(VIR_ERR_XML_ERROR, "%s", - _("only a single IOMMU device is supported")); - return NULL; - } + if (n > 0) + def->iommus = g_new0(virDomainIOMMUDef *, n); - if (n > 0) { - if (!(def->iommu = virDomainIOMMUDefParseXML(xmlopt, nodes[0], - ctxt, flags))) + for (i = 0; i < n; i++) { + virDomainIOMMUDef *iommu; + + iommu = virDomainIOMMUDefParseXML(xmlopt, nodes[i], ctxt, flags); + + if (!iommu) return NULL; + + def->iommus[def->niommus++] = iommu; } VIR_FREE(nodes); @@ -22667,15 +22708,17 @@ virDomainDefCheckABIStabilityFlags(virDomainDef *src, goto error; } - if (!!src->iommu != !!dst->iommu) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("Target domain IOMMU device count does not match source")); + if (src->niommus != dst->niommus) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target domain IOMMU device count %1$zu does not match source %2$zu"), + dst->niommus, src->niommus); goto error; } - if (src->iommu && - !virDomainIOMMUDefCheckABIStability(src->iommu, dst->iommu)) - goto error; + for (i = 0; i < src->niommus; i++) { + if (!virDomainIOMMUDefCheckABIStability(src->iommus[i], dst->iommus[i])) + goto error; + } if (!!src->vsock != !!dst->vsock) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", @@ -29531,8 +29574,9 @@ virDomainDefFormatInternalSetRootName(virDomainDef *def, for (n = 0; n < def->ncryptos; n++) { virDomainCryptoDefFormat(buf, def->cryptos[n], flags); } - if (def->iommu) - virDomainIOMMUDefFormat(buf, def->iommu); + + for (n = 0; n < def->niommus; n++) + virDomainIOMMUDefFormat(buf, def->iommus[n]); if (def->vsock) virDomainVsockDefFormat(buf, def->vsock); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 81e735993d..f44e275270 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -3303,6 +3303,9 @@ struct _virDomainDef { size_t nwatchdogs; virDomainWatchdogDef **watchdogs; + size_t niommus; + virDomainIOMMUDef **iommus; + /* At maximum 2 TPMs on the domain if a TPM Proxy is present. */ size_t ntpms; virDomainTPMDef **tpms; @@ -3312,7 +3315,6 @@ struct _virDomainDef { virDomainNVRAMDef *nvram; virCPUDef *cpu; virDomainRedirFilterDef *redirfilter; - virDomainIOMMUDef *iommu; virDomainVsockDef *vsock; virDomainPstoreDef *pstore; @@ -4317,6 +4319,11 @@ virDomainShmemDef *virDomainShmemDefRemove(virDomainDef *def, size_t idx) ssize_t virDomainInputDefFind(const virDomainDef *def, const virDomainInputDef *input) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT; +bool virDomainIOMMUDefEquals(const virDomainIOMMUDef *a, + const virDomainIOMMUDef *b); +ssize_t virDomainIOMMUDefFind(const virDomainDef *def, + const virDomainIOMMUDef *iommu) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT; bool virDomainVsockDefEquals(const virDomainVsockDef *a, const virDomainVsockDef *b) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT; diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c index 17955decc0..9fae071975 100644 --- a/src/conf/domain_validate.c +++ b/src/conf/domain_validate.c @@ -1856,21 +1856,28 @@ virDomainDefCputuneValidate(const virDomainDef *def) static int virDomainDefIOMMUValidate(const virDomainDef *def) { - if (!def->iommu) - return 0; + size_t i; - if (def->iommu->intremap == VIR_TRISTATE_SWITCH_ON && - def->features[VIR_DOMAIN_FEATURE_IOAPIC] != VIR_DOMAIN_IOAPIC_QEMU) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("IOMMU interrupt remapping requires split I/O APIC (ioapic driver='qemu')")); - return -1; - } + for (i = 0; i < def->niommus; i++) { + virDomainIOMMUDef *iommu = def->iommus[i]; + if (def->niommus > 1 && iommu->model != VIR_DOMAIN_IOMMU_MODEL_SMMUV3) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("IOMMU model smmuv3 must be specified for multiple IOMMU definitions")); + } - if (def->iommu->eim == VIR_TRISTATE_SWITCH_ON && - def->iommu->intremap != VIR_TRISTATE_SWITCH_ON) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("IOMMU eim requires interrupt remapping to be enabled")); - return -1; + if (iommu->intremap == VIR_TRISTATE_SWITCH_ON && + def->features[VIR_DOMAIN_FEATURE_IOAPIC] != VIR_DOMAIN_IOAPIC_QEMU) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("IOMMU interrupt remapping requires split I/O APIC (ioapic driver='qemu')")); + return -1; + } + + if (iommu->eim == VIR_TRISTATE_SWITCH_ON && + iommu->intremap != VIR_TRISTATE_SWITCH_ON) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("IOMMU eim requires interrupt remapping to be enabled")); + return -1; + } } return 0; diff --git a/src/conf/schemas/domaincommon.rng b/src/conf/schemas/domaincommon.rng index 75b5124c33..ae3fa95904 100644 --- a/src/conf/schemas/domaincommon.rng +++ b/src/conf/schemas/domaincommon.rng @@ -6964,9 +6964,9 @@ <zeroOrMore> <ref name="panic"/> </zeroOrMore> - <optional> + <zeroOrMore> <ref name="iommu"/> - </optional> + </zeroOrMore> <optional> <ref name="vsock"/> </optional> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 7269dd3786..f62ffa45ab 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -494,6 +494,8 @@ virDomainInputSourceGrabToggleTypeToString; virDomainInputSourceGrabTypeFromString; virDomainInputSourceGrabTypeToString; virDomainInputTypeToString; +virDomainIOMMUDefEquals; +virDomainIOMMUDefFind; virDomainIOMMUDefFree; virDomainIOMMUDefNew; virDomainIOMMUModelTypeFromString; diff --git a/src/qemu/qemu_alias.c b/src/qemu/qemu_alias.c index b0bc057bd1..400ce73283 100644 --- a/src/qemu/qemu_alias.c +++ b/src/qemu/qemu_alias.c @@ -650,10 +650,14 @@ qemuAssignDeviceVsockAlias(virDomainVsockDef *vsock) static void -qemuAssignDeviceIOMMUAlias(virDomainIOMMUDef *iommu) +qemuAssignDeviceIOMMUAlias(virDomainDef *def, + virDomainIOMMUDef **iommu) { - if (!iommu->info.alias) - iommu->info.alias = g_strdup("iommu0"); + size_t i; + for (i = 0; i < def->niommus; i++) { + if (!iommu[i]->info.alias) + iommu[i]->info.alias = g_strdup_printf("iommu%zu", i); + } } @@ -769,8 +773,9 @@ qemuAssignDeviceAliases(virDomainDef *def) if (def->vsock) { qemuAssignDeviceVsockAlias(def->vsock); } - if (def->iommu) - qemuAssignDeviceIOMMUAlias(def->iommu); + if (def->niommus > 0) { + qemuAssignDeviceIOMMUAlias(def, def->iommus); + } for (i = 0; i < def->ncryptos; i++) { qemuAssignDeviceCryptoAlias(def, def->cryptos[i]); } diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index c56c321a6e..97fe4267ec 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -6244,84 +6244,85 @@ qemuBuildIOMMUCommandLine(virCommand *cmd, const virDomainDef *def, virQEMUCaps *qemuCaps) { + size_t i; g_autoptr(virJSONValue) props = NULL; g_autoptr(virJSONValue) wrapperProps = NULL; - const virDomainIOMMUDef *iommu = def->iommu; - if (!iommu) + if (def->niommus == 0) return 0; - switch (iommu->model) { - case VIR_DOMAIN_IOMMU_MODEL_INTEL: - if (virJSONValueObjectAdd(&props, - "s:driver", "intel-iommu", - "s:id", iommu->info.alias, - "S:intremap", qemuOnOffAuto(iommu->intremap), - "T:caching-mode", iommu->caching_mode, - "S:eim", qemuOnOffAuto(iommu->eim), - "T:device-iotlb", iommu->iotlb, - "z:aw-bits", iommu->aw_bits, - "T:dma-translation", iommu->dma_translation, - NULL) < 0) - return -1; + for (i = 0; i < def->niommus; i++) { + virDomainIOMMUDef *iommu = def->iommus[i]; + switch (iommu->model) { + case VIR_DOMAIN_IOMMU_MODEL_INTEL: + if (virJSONValueObjectAdd(&props, + "s:driver", "intel-iommu", + "s:id", iommu->info.alias, + "S:intremap", qemuOnOffAuto(iommu->intremap), + "T:caching-mode", iommu->caching_mode, + "S:eim", qemuOnOffAuto(iommu->eim), + "T:device-iotlb", iommu->iotlb, + "z:aw-bits", iommu->aw_bits, + "T:dma-translation", iommu->dma_translation, + NULL) < 0) + return -1; - if (qemuBuildDeviceCommandlineFromJSON(cmd, props, def, qemuCaps) < 0) - return -1; + if (qemuBuildDeviceCommandlineFromJSON(cmd, props, def, qemuCaps) < 0) + return -1; - return 0; + return 0; + case VIR_DOMAIN_IOMMU_MODEL_VIRTIO: + if (virJSONValueObjectAdd(&props, + "s:driver", "virtio-iommu", + "s:id", iommu->info.alias, + NULL) < 0) { + return -1; + } - case VIR_DOMAIN_IOMMU_MODEL_VIRTIO: - if (virJSONValueObjectAdd(&props, - "s:driver", "virtio-iommu", - "s:id", iommu->info.alias, - NULL) < 0) { - return -1; - } + if (qemuBuildDeviceAddressProps(props, def, &iommu->info) < 0) + return -1; - if (qemuBuildDeviceAddressProps(props, def, &iommu->info) < 0) - return -1; + if (qemuBuildDeviceCommandlineFromJSON(cmd, props, def, qemuCaps) < 0) + return -1; - if (qemuBuildDeviceCommandlineFromJSON(cmd, props, def, qemuCaps) < 0) - return -1; + return 0; + case VIR_DOMAIN_IOMMU_MODEL_SMMUV3: + /* There is no -device for SMMUv3, so nothing to be done here */ + return 0; - return 0; + case VIR_DOMAIN_IOMMU_MODEL_AMD: + if (virJSONValueObjectAdd(&wrapperProps, + "s:driver", "AMDVI-PCI", + "s:id", iommu->info.alias, + NULL) < 0) + return -1; - case VIR_DOMAIN_IOMMU_MODEL_SMMUV3: - /* There is no -device for SMMUv3, so nothing to be done here */ - return 0; + if (qemuBuildDeviceAddressProps(wrapperProps, def, &iommu->info) < 0) + return -1; - case VIR_DOMAIN_IOMMU_MODEL_AMD: - if (virJSONValueObjectAdd(&wrapperProps, - "s:driver", "AMDVI-PCI", - "s:id", iommu->info.alias, - NULL) < 0) - return -1; + if (qemuBuildDeviceCommandlineFromJSON(cmd, wrapperProps, def, qemuCaps) < 0) + return -1; - if (qemuBuildDeviceAddressProps(wrapperProps, def, &iommu->info) < 0) - return -1; + if (virJSONValueObjectAdd(&props, + "s:driver", "amd-iommu", + "s:pci-id", iommu->info.alias, + "S:intremap", qemuOnOffAuto(iommu->intremap), + "T:pt", iommu->pt, + "T:xtsup", iommu->xtsup, + "T:device-iotlb", iommu->iotlb, + NULL) < 0) + return -1; - if (qemuBuildDeviceCommandlineFromJSON(cmd, wrapperProps, def, qemuCaps) < 0) - return -1; + if (qemuBuildDeviceCommandlineFromJSON(cmd, props, def, qemuCaps) < 0) + return -1; - if (virJSONValueObjectAdd(&props, - "s:driver", "amd-iommu", - "s:pci-id", iommu->info.alias, - "S:intremap", qemuOnOffAuto(iommu->intremap), - "T:pt", iommu->pt, - "T:xtsup", iommu->xtsup, - "T:device-iotlb", iommu->iotlb, - NULL) < 0) - return -1; + return 0; - if (qemuBuildDeviceCommandlineFromJSON(cmd, props, def, qemuCaps) < 0) + case VIR_DOMAIN_IOMMU_MODEL_LAST: + default: + virReportEnumRangeError(virDomainIOMMUModel, iommu->model); return -1; - - return 0; - - case VIR_DOMAIN_IOMMU_MODEL_LAST: - default: - virReportEnumRangeError(virDomainIOMMUModel, iommu->model); - return -1; + } } return 0; @@ -7158,8 +7159,8 @@ qemuBuildMachineCommandLine(virCommand *cmd, if (qemuAppendDomainFeaturesMachineParam(&buf, def, qemuCaps) < 0) return -1; - if (def->iommu) { - switch (def->iommu->model) { + if (def->niommus == 1) { + switch (def->iommus[0]->model) { case VIR_DOMAIN_IOMMU_MODEL_SMMUV3: virBufferAddLit(&buf, ",iommu=smmuv3"); break; @@ -7172,7 +7173,7 @@ qemuBuildMachineCommandLine(virCommand *cmd, case VIR_DOMAIN_IOMMU_MODEL_LAST: default: - virReportEnumRangeError(virDomainIOMMUModel, def->iommu->model); + virReportEnumRangeError(virDomainIOMMUModel, def->iommus[0]->model); return -1; } } diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index a42721efad..889427f289 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -8371,7 +8371,7 @@ qemuDomainGetMemLockLimitBytes(virDomainDef *def) int factor = nvdpa + nnvme; if (nvfio) { - if (def->iommu) + if (def->niommus > 0) factor += nvfio; else factor += 1; diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index 6d5c4785e8..7233df888c 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -2396,9 +2396,8 @@ qemuDomainAssignDevicePCISlots(virDomainDef *def, /* Nada - none are PCI based (yet) */ } - if (def->iommu) { - virDomainIOMMUDef *iommu = def->iommu; - + for (i = 0; i < def->niommus; i++) { + virDomainIOMMUDef *iommu = def->iommus[i]; switch (iommu->model) { case VIR_DOMAIN_IOMMU_MODEL_VIRTIO: case VIR_DOMAIN_IOMMU_MODEL_AMD: diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index a1b1edcbbf..54926064da 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -6894,12 +6894,12 @@ qemuDomainAttachDeviceConfig(virDomainDef *vmdef, break; case VIR_DOMAIN_DEVICE_IOMMU: - if (vmdef->iommu) { + if (vmdef->niommus > 0) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("domain already has an iommu device")); return -1; } - vmdef->iommu = g_steal_pointer(&dev->data.iommu); + VIR_APPEND_ELEMENT(vmdef->iommus, vmdef->niommus, dev->data.iommu); break; case VIR_DOMAIN_DEVICE_VIDEO: @@ -7113,12 +7113,12 @@ qemuDomainDetachDeviceConfig(virDomainDef *vmdef, break; case VIR_DOMAIN_DEVICE_IOMMU: - if (!vmdef->iommu) { + if ((idx = virDomainIOMMUDefFind(vmdef, dev->data.iommu)) < 0) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", _("matching iommu device not found")); return -1; } - g_clear_pointer(&vmdef->iommu, virDomainIOMMUDefFree); + VIR_DELETE_ELEMENT(vmdef->iommus, idx, vmdef->niommus); break; case VIR_DOMAIN_DEVICE_VIDEO: diff --git a/src/qemu/qemu_postparse.c b/src/qemu/qemu_postparse.c index fd27f8be27..3b417ea5d0 100644 --- a/src/qemu/qemu_postparse.c +++ b/src/qemu/qemu_postparse.c @@ -1470,7 +1470,7 @@ qemuDomainDefAddDefaultDevices(virQEMUDriver *driver, } } - if (addIOMMU && !def->iommu && + if (addIOMMU && !def->iommus && def->niommus == 0 && virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_INTEL_IOMMU) && virQEMUCapsGet(qemuCaps, QEMU_CAPS_INTEL_IOMMU_INTREMAP) && virQEMUCapsGet(qemuCaps, QEMU_CAPS_INTEL_IOMMU_EIM)) { @@ -1482,7 +1482,8 @@ qemuDomainDefAddDefaultDevices(virQEMUDriver *driver, iommu->intremap = VIR_TRISTATE_SWITCH_ON; iommu->eim = VIR_TRISTATE_SWITCH_ON; - def->iommu = g_steal_pointer(&iommu); + def->iommus = g_new0(virDomainIOMMUDef *, 1); + def->iommus[def->niommus++] = g_steal_pointer(&iommu); } if (qemuDomainDefAddDefaultAudioBackend(driver, def) < 0) @@ -1558,9 +1559,9 @@ qemuDomainDefEnableDefaultFeatures(virDomainDef *def, * domain already has IOMMU without inremap. This will be fixed in * qemuDomainIOMMUDefPostParse() but there domain definition can't be * modified so change it now. */ - if (def->iommu && - (def->iommu->intremap == VIR_TRISTATE_SWITCH_ON || - qemuDomainNeedsIOMMUWithEIM(def)) && + if (def->iommus && def->niommus == 1 && + (def->iommus[0]->intremap == VIR_TRISTATE_SWITCH_ON || + qemuDomainNeedsIOMMUWithEIM(def)) && def->features[VIR_DOMAIN_FEATURE_IOAPIC] == VIR_DOMAIN_IOAPIC_NONE) { def->features[VIR_DOMAIN_FEATURE_IOAPIC] = VIR_DOMAIN_IOAPIC_QEMU; } diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c index 747e54bf44..357978fc2c 100644 --- a/src/qemu/qemu_validate.c +++ b/src/qemu/qemu_validate.c @@ -907,7 +907,7 @@ qemuValidateDomainVCpuTopology(const virDomainDef *def, virQEMUCaps *qemuCaps) QEMU_MAX_VCPUS_WITHOUT_EIM); return -1; } - if (!def->iommu || def->iommu->eim != VIR_TRISTATE_SWITCH_ON) { + if (!def->iommus || def->iommus[0]->eim != VIR_TRISTATE_SWITCH_ON) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("more than %1$d vCPUs require extended interrupt mode enabled on the iommu device"), QEMU_MAX_VCPUS_WITHOUT_EIM); -- 2.43.0
On a Wednesday in 2025, Nathan Chen via Devel wrote:
Add support for parsing multiple IOMMU devices from the VM definition when "smmuv3" is the IOMMU model.
Signed-off-by: Nathan Chen <nathanc@nvidia.com> --- src/conf/domain_conf.c | 84 +++++++++++++++----- src/conf/domain_conf.h | 9 ++- src/conf/domain_validate.c | 33 +++++--- src/conf/schemas/domaincommon.rng | 4 +- src/libvirt_private.syms | 2 + src/qemu/qemu_alias.c | 15 ++-- src/qemu/qemu_command.c | 127 +++++++++++++++--------------- src/qemu/qemu_domain.c | 2 +- src/qemu/qemu_domain_address.c | 5 +- src/qemu/qemu_driver.c | 8 +- src/qemu/qemu_postparse.c | 11 +-- src/qemu/qemu_validate.c | 2 +- 12 files changed, 184 insertions(+), 118 deletions(-)
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index c56c321a6e..97fe4267ec 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -6244,84 +6244,85 @@ qemuBuildIOMMUCommandLine(virCommand *cmd, const virDomainDef *def, virQEMUCaps *qemuCaps) { + size_t i; g_autoptr(virJSONValue) props = NULL; g_autoptr(virJSONValue) wrapperProps = NULL; - const virDomainIOMMUDef *iommu = def->iommu;
- if (!iommu) + if (def->niommus == 0) return 0;
This condition is no longer needed - the for cycle already checks def->niommus.
- switch (iommu->model) { - case VIR_DOMAIN_IOMMU_MODEL_INTEL: - if (virJSONValueObjectAdd(&props, - "s:driver", "intel-iommu", - "s:id", iommu->info.alias, - "S:intremap", qemuOnOffAuto(iommu->intremap), - "T:caching-mode", iommu->caching_mode, - "S:eim", qemuOnOffAuto(iommu->eim), - "T:device-iotlb", iommu->iotlb, - "z:aw-bits", iommu->aw_bits, - "T:dma-translation", iommu->dma_translation, - NULL) < 0) - return -1; + for (i = 0; i < def->niommus; i++) { + virDomainIOMMUDef *iommu = def->iommus[i]; + switch (iommu->model) { + case VIR_DOMAIN_IOMMU_MODEL_INTEL: + if (virJSONValueObjectAdd(&props, + "s:driver", "intel-iommu", + "s:id", iommu->info.alias, + "S:intremap", qemuOnOffAuto(iommu->intremap), + "T:caching-mode", iommu->caching_mode, + "S:eim", qemuOnOffAuto(iommu->eim), + "T:device-iotlb", iommu->iotlb, + "z:aw-bits", iommu->aw_bits, + "T:dma-translation", iommu->dma_translation, + NULL) < 0) + return -1;
- if (qemuBuildDeviceCommandlineFromJSON(cmd, props, def, qemuCaps) < 0) - return -1; + if (qemuBuildDeviceCommandlineFromJSON(cmd, props, def, qemuCaps) < 0) + return -1;
- return 0; + return 0;
break; instead of return; - even though there can practially be only one device, the function iterates through all of them and should not exit early
+ case VIR_DOMAIN_IOMMU_MODEL_VIRTIO: + if (virJSONValueObjectAdd(&props, + "s:driver", "virtio-iommu", + "s:id", iommu->info.alias, + NULL) < 0) { + return -1; + }
- case VIR_DOMAIN_IOMMU_MODEL_VIRTIO: - if (virJSONValueObjectAdd(&props, - "s:driver", "virtio-iommu", - "s:id", iommu->info.alias, - NULL) < 0) { - return -1; - } + if (qemuBuildDeviceAddressProps(props, def, &iommu->info) < 0) + return -1;
- if (qemuBuildDeviceAddressProps(props, def, &iommu->info) < 0) - return -1; + if (qemuBuildDeviceCommandlineFromJSON(cmd, props, def, qemuCaps) < 0) + return -1;
- if (qemuBuildDeviceCommandlineFromJSON(cmd, props, def, qemuCaps) < 0) - return -1; + return 0;
Same here
+ case VIR_DOMAIN_IOMMU_MODEL_SMMUV3: + /* There is no -device for SMMUv3, so nothing to be done here */ + return 0;
And here.
- return 0; + case VIR_DOMAIN_IOMMU_MODEL_AMD: + if (virJSONValueObjectAdd(&wrapperProps, + "s:driver", "AMDVI-PCI", + "s:id", iommu->info.alias, + NULL) < 0) + return -1;
- case VIR_DOMAIN_IOMMU_MODEL_SMMUV3: - /* There is no -device for SMMUv3, so nothing to be done here */ - return 0; + if (qemuBuildDeviceAddressProps(wrapperProps, def, &iommu->info) < 0) + return -1;
- case VIR_DOMAIN_IOMMU_MODEL_AMD: - if (virJSONValueObjectAdd(&wrapperProps, - "s:driver", "AMDVI-PCI", - "s:id", iommu->info.alias, - NULL) < 0) - return -1; + if (qemuBuildDeviceCommandlineFromJSON(cmd, wrapperProps, def, qemuCaps) < 0) + return -1;
- if (qemuBuildDeviceAddressProps(wrapperProps, def, &iommu->info) < 0) - return -1; + if (virJSONValueObjectAdd(&props, + "s:driver", "amd-iommu", + "s:pci-id", iommu->info.alias, + "S:intremap", qemuOnOffAuto(iommu->intremap), + "T:pt", iommu->pt, + "T:xtsup", iommu->xtsup, + "T:device-iotlb", iommu->iotlb, + NULL) < 0) + return -1;
- if (qemuBuildDeviceCommandlineFromJSON(cmd, wrapperProps, def, qemuCaps) < 0) - return -1; + if (qemuBuildDeviceCommandlineFromJSON(cmd, props, def, qemuCaps) < 0) + return -1;
- if (virJSONValueObjectAdd(&props, - "s:driver", "amd-iommu", - "s:pci-id", iommu->info.alias, - "S:intremap", qemuOnOffAuto(iommu->intremap), - "T:pt", iommu->pt, - "T:xtsup", iommu->xtsup, - "T:device-iotlb", iommu->iotlb, - NULL) < 0) - return -1; + return 0;
Here too.
- if (qemuBuildDeviceCommandlineFromJSON(cmd, props, def, qemuCaps) < 0) + case VIR_DOMAIN_IOMMU_MODEL_LAST: + default: + virReportEnumRangeError(virDomainIOMMUModel, iommu->model); return -1; - - return 0; - - case VIR_DOMAIN_IOMMU_MODEL_LAST: - default: - virReportEnumRangeError(virDomainIOMMUModel, iommu->model); - return -1; + } }
return 0; diff --git a/src/qemu/qemu_postparse.c b/src/qemu/qemu_postparse.c index fd27f8be27..3b417ea5d0 100644 --- a/src/qemu/qemu_postparse.c +++ b/src/qemu/qemu_postparse.c @@ -1470,7 +1470,7 @@ qemuDomainDefAddDefaultDevices(virQEMUDriver *driver, } }
- if (addIOMMU && !def->iommu && + if (addIOMMU && !def->iommus && def->niommus == 0 && virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_INTEL_IOMMU) && virQEMUCapsGet(qemuCaps, QEMU_CAPS_INTEL_IOMMU_INTREMAP) && virQEMUCapsGet(qemuCaps, QEMU_CAPS_INTEL_IOMMU_EIM)) { @@ -1482,7 +1482,8 @@ qemuDomainDefAddDefaultDevices(virQEMUDriver *driver, iommu->intremap = VIR_TRISTATE_SWITCH_ON; iommu->eim = VIR_TRISTATE_SWITCH_ON;
- def->iommu = g_steal_pointer(&iommu); + def->iommus = g_new0(virDomainIOMMUDef *, 1); + def->iommus[def->niommus++] = g_steal_pointer(&iommu); }
if (qemuDomainDefAddDefaultAudioBackend(driver, def) < 0) @@ -1558,9 +1559,9 @@ qemuDomainDefEnableDefaultFeatures(virDomainDef *def, * domain already has IOMMU without inremap. This will be fixed in * qemuDomainIOMMUDefPostParse() but there domain definition can't be * modified so change it now. */ - if (def->iommu && - (def->iommu->intremap == VIR_TRISTATE_SWITCH_ON || - qemuDomainNeedsIOMMUWithEIM(def)) && + if (def->iommus && def->niommus == 1 && + (def->iommus[0]->intremap == VIR_TRISTATE_SWITCH_ON || + qemuDomainNeedsIOMMUWithEIM(def)) &&
qemuDomainNeedsIOMMUWithEIM was aligned properly originally.
def->features[VIR_DOMAIN_FEATURE_IOAPIC] == VIR_DOMAIN_IOAPIC_NONE) { def->features[VIR_DOMAIN_FEATURE_IOAPIC] = VIR_DOMAIN_IOAPIC_QEMU; }
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano
Introduce support for "pciBus" driver attribute for "smmuv3" IOMMU model. The "pciBus" attribute indicates the index of the controller that a smmuv3 IOMMU device is attached to, and differentiates the device-pluggable arm-smmuv3 model from the virt-machine-associated smmuv3 model. Signed-off-by: Nathan Chen <nathanc@nvidia.com> --- docs/formatdomain.rst | 4 ++ src/conf/domain_conf.c | 16 +++++++ src/conf/domain_conf.h | 1 + src/conf/domain_validate.c | 28 ++++++++++-- src/conf/schemas/domaincommon.rng | 5 ++ src/qemu/qemu_command.c | 76 ++++++++++++++++++++++++++++--- 6 files changed, 121 insertions(+), 9 deletions(-) diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 4c245f41a9..a181462d13 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -9238,6 +9238,10 @@ Example: Enable x2APIC mode. Useful for higher number of guest CPUs. :since:`Since 11.5.0` (QEMU/KVM and ``amd`` model only) + ``pciBus`` + The ``pciBus`` attribute notes the index of the controller that an + IOMMU device is attached to. (QEMU/KVM and ``smmuv3`` model only) + The ``virtio`` IOMMU devices can further have ``address`` element as described in `Device addresses`_ (address has to by type of ``pci``). diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a587dbf3e1..f3c1b3996e 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2815,6 +2815,8 @@ virDomainIOMMUDefNew(void) iommu = g_new0(virDomainIOMMUDef, 1); + iommu->pci_bus = -1; + return g_steal_pointer(&iommu); } @@ -14497,6 +14499,10 @@ virDomainIOMMUDefParseXML(virDomainXMLOption *xmlopt, if (virXMLPropTristateSwitch(driver, "passthrough", VIR_XML_PROP_NONE, &iommu->pt) < 0) return NULL; + + if (virXMLPropInt(driver, "pciBus", 10, VIR_XML_PROP_NONE, + &iommu->pci_bus, -1) < 0) + return NULL; } if (virDomainDeviceInfoParseXML(xmlopt, node, ctxt, @@ -22194,6 +22200,12 @@ virDomainIOMMUDefCheckABIStability(virDomainIOMMUDef *src, dst->aw_bits, src->aw_bits); return false; } + if (src->pci_bus != dst->pci_bus) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target domain IOMMU device pci_bus value '%1$d' does not match source '%2$d'"), + dst->pci_bus, src->pci_bus); + return false; + } if (src->dma_translation != dst->dma_translation) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("Target domain IOMMU device dma translation '%1$s' does not match source '%2$s'"), @@ -28525,6 +28537,10 @@ virDomainIOMMUDefFormat(virBuffer *buf, virBufferAsprintf(&driverAttrBuf, " xtsup='%s'", virTristateSwitchTypeToString(iommu->xtsup)); } + if (iommu->pci_bus >= 0) { + virBufferAsprintf(&driverAttrBuf, " pciBus='%d'", + iommu->pci_bus); + } virXMLFormatElement(&childBuf, "driver", &driverAttrBuf, NULL); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index f44e275270..4cf05d54e8 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -3055,6 +3055,7 @@ struct _virDomainIOMMUDef { virTristateSwitch eim; virTristateSwitch iotlb; unsigned int aw_bits; + int pci_bus; virDomainDeviceInfo info; virTristateSwitch dma_translation; virTristateSwitch xtsup; diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c index 9fae071975..7f9d2c2fb6 100644 --- a/src/conf/domain_validate.c +++ b/src/conf/domain_validate.c @@ -1865,6 +1865,11 @@ virDomainDefIOMMUValidate(const virDomainDef *def) _("IOMMU model smmuv3 must be specified for multiple IOMMU definitions")); } + if (def->niommus > 1 && iommu->pci_bus < 0) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("device-pluggable IOMMU with pciBus attribute must be specified for multiple IOMMU definitions")); + } + if (iommu->intremap == VIR_TRISTATE_SWITCH_ON && def->features[VIR_DOMAIN_FEATURE_IOAPIC] != VIR_DOMAIN_IOAPIC_QEMU) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", @@ -3119,13 +3124,28 @@ virDomainIOMMUDefValidate(const virDomainIOMMUDef *iommu) { switch (iommu->model) { case VIR_DOMAIN_IOMMU_MODEL_SMMUV3: + if (iommu->intremap != VIR_TRISTATE_SWITCH_ABSENT || + iommu->caching_mode != VIR_TRISTATE_SWITCH_ABSENT || + iommu->eim != VIR_TRISTATE_SWITCH_ABSENT || + iommu->iotlb != VIR_TRISTATE_SWITCH_ABSENT || + iommu->aw_bits != 0 || + iommu->dma_translation != VIR_TRISTATE_SWITCH_ABSENT || + iommu->xtsup != VIR_TRISTATE_SWITCH_ABSENT || + iommu->pt != VIR_TRISTATE_SWITCH_ABSENT) { + virReportError(VIR_ERR_XML_ERROR, + _("iommu model '%1$s' doesn't support some additional attributes"), + virDomainIOMMUModelTypeToString(iommu->model)); + return -1; + } + break; case VIR_DOMAIN_IOMMU_MODEL_VIRTIO: if (iommu->intremap != VIR_TRISTATE_SWITCH_ABSENT || iommu->caching_mode != VIR_TRISTATE_SWITCH_ABSENT || iommu->eim != VIR_TRISTATE_SWITCH_ABSENT || iommu->iotlb != VIR_TRISTATE_SWITCH_ABSENT || iommu->aw_bits != 0 || - iommu->dma_translation != VIR_TRISTATE_SWITCH_ABSENT) { + iommu->dma_translation != VIR_TRISTATE_SWITCH_ABSENT || + iommu->pci_bus >= 0) { virReportError(VIR_ERR_XML_ERROR, _("iommu model '%1$s' doesn't support additional attributes"), virDomainIOMMUModelTypeToString(iommu->model)); @@ -3137,7 +3157,8 @@ virDomainIOMMUDefValidate(const virDomainIOMMUDef *iommu) if (iommu->caching_mode != VIR_TRISTATE_SWITCH_ABSENT || iommu->eim != VIR_TRISTATE_SWITCH_ABSENT || iommu->aw_bits != 0 || - iommu->dma_translation != VIR_TRISTATE_SWITCH_ABSENT) { + iommu->dma_translation != VIR_TRISTATE_SWITCH_ABSENT || + iommu->pci_bus >= 0) { virReportError(VIR_ERR_XML_ERROR, _("iommu model '%1$s' doesn't support some additional attributes"), virDomainIOMMUModelTypeToString(iommu->model)); @@ -3147,7 +3168,8 @@ virDomainIOMMUDefValidate(const virDomainIOMMUDef *iommu) case VIR_DOMAIN_IOMMU_MODEL_INTEL: if (iommu->pt != VIR_TRISTATE_SWITCH_ABSENT || - iommu->xtsup != VIR_TRISTATE_SWITCH_ABSENT) { + iommu->xtsup != VIR_TRISTATE_SWITCH_ABSENT || + iommu->pci_bus >= 0) { virReportError(VIR_ERR_XML_ERROR, _("iommu model '%1$s' doesn't support some additional attributes"), virDomainIOMMUModelTypeToString(iommu->model)); diff --git a/src/conf/schemas/domaincommon.rng b/src/conf/schemas/domaincommon.rng index ae3fa95904..9afc679258 100644 --- a/src/conf/schemas/domaincommon.rng +++ b/src/conf/schemas/domaincommon.rng @@ -6322,6 +6322,11 @@ <ref name="virOnOff"/> </attribute> </optional> + <optional> + <attribute name="pciBus"> + <data type="unsignedInt"/> + </attribute> + </optional> </element> </optional> <optional> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 97fe4267ec..ad749afc41 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -6239,6 +6239,62 @@ qemuBuildBootCommandLine(virCommand *cmd, } +static virJSONValue * +qemuBuildPCINestedSmmuv3DevProps(const virDomainDef *def, + const virDomainIOMMUDef *iommu, + size_t id) +{ + g_autoptr(virJSONValue) props = NULL; + g_autofree char *bus = NULL; + g_autofree char *smmuv3_id = NULL; + size_t i; + + for (i = 0; i < def->ncontrollers; i++) { + virDomainControllerDef *cont = def->controllers[i]; + if (cont->idx == iommu->pci_bus) { + if (cont->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI) { + const char *alias = cont->info.alias; + + if (!alias) + return NULL; + + if (virDomainDeviceAliasIsUserAlias(alias)) { + if (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT && + iommu->pci_bus == 0) { + if (qemuDomainSupportsPCIMultibus(def)) + bus = g_strdup("pci.0"); + else + bus = g_strdup("pci"); + } else if (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT) { + bus = g_strdup("pcie.0"); + } + } else { + bus = g_strdup(alias); + } + break; + } + } + } + + if (!bus) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not find a suitable controller for smmuv3.")); + return NULL; + } + + smmuv3_id = g_strdup_printf("smmuv3.%zu", id); + + if (virJSONValueObjectAdd(&props, + "s:driver", "arm-smmuv3", + "s:primary-bus", bus, + "s:id", smmuv3_id, + NULL) < 0) + return NULL; + + return g_steal_pointer(&props); +} + + static int qemuBuildIOMMUCommandLine(virCommand *cmd, const virDomainDef *def, @@ -6286,9 +6342,6 @@ qemuBuildIOMMUCommandLine(virCommand *cmd, return -1; return 0; - case VIR_DOMAIN_IOMMU_MODEL_SMMUV3: - /* There is no -device for SMMUv3, so nothing to be done here */ - return 0; case VIR_DOMAIN_IOMMU_MODEL_AMD: if (virJSONValueObjectAdd(&wrapperProps, @@ -6318,6 +6371,17 @@ qemuBuildIOMMUCommandLine(virCommand *cmd, return 0; + case VIR_DOMAIN_IOMMU_MODEL_SMMUV3: + if (iommu->pci_bus >= 0) { + if (!(props = qemuBuildPCINestedSmmuv3DevProps(def, iommu, i))) + return -1; + if (qemuBuildDeviceCommandlineFromJSON(cmd, props, def, qemuCaps) < 0) + return -1; + break; + } else { + return 0; + } + case VIR_DOMAIN_IOMMU_MODEL_LAST: default: virReportEnumRangeError(virDomainIOMMUModel, iommu->model); @@ -10822,15 +10886,15 @@ qemuBuildCommandLine(virDomainObj *vm, if (qemuBuildBootCommandLine(cmd, def) < 0) return NULL; - if (qemuBuildIOMMUCommandLine(cmd, def, qemuCaps) < 0) - return NULL; - if (qemuBuildGlobalControllerCommandLine(cmd, def) < 0) return NULL; if (qemuBuildControllersCommandLine(cmd, def, qemuCaps) < 0) return NULL; + if (qemuBuildIOMMUCommandLine(cmd, def, qemuCaps) < 0) + return NULL; + if (qemuBuildMemoryDeviceCommandLine(cmd, cfg, def, priv) < 0) return NULL; -- 2.43.0
On a Wednesday in 2025, Nathan Chen via Devel wrote:
Introduce support for "pciBus" driver attribute for "smmuv3" IOMMU model. The "pciBus" attribute indicates the index of the controller that a smmuv3 IOMMU device is attached to, and differentiates the device-pluggable arm-smmuv3 model from the virt-machine-associated smmuv3 model.
Signed-off-by: Nathan Chen <nathanc@nvidia.com> --- docs/formatdomain.rst | 4 ++ src/conf/domain_conf.c | 16 +++++++ src/conf/domain_conf.h | 1 + src/conf/domain_validate.c | 28 ++++++++++-- src/conf/schemas/domaincommon.rng | 5 ++ src/qemu/qemu_command.c | 76 ++++++++++++++++++++++++++++--- 6 files changed, 121 insertions(+), 9 deletions(-)
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 4c245f41a9..a181462d13 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -9238,6 +9238,10 @@ Example: Enable x2APIC mode. Useful for higher number of guest CPUs. :since:`Since 11.5.0` (QEMU/KVM and ``amd`` model only)
+ ``pciBus`` + The ``pciBus`` attribute notes the index of the controller that an + IOMMU device is attached to. (QEMU/KVM and ``smmuv3`` model only) + The ``virtio`` IOMMU devices can further have ``address`` element as described in `Device addresses`_ (address has to by type of ``pci``).
diff --git a/src/conf/schemas/domaincommon.rng b/src/conf/schemas/domaincommon.rng index ae3fa95904..9afc679258 100644 --- a/src/conf/schemas/domaincommon.rng +++ b/src/conf/schemas/domaincommon.rng @@ -6322,6 +6322,11 @@ <ref name="virOnOff"/> </attribute> </optional> + <optional> + <attribute name="pciBus"> + <data type="unsignedInt"/> + </attribute> + </optional> </element> </optional> <optional> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 97fe4267ec..ad749afc41 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -6239,6 +6239,62 @@ qemuBuildBootCommandLine(virCommand *cmd, }
+static virJSONValue * +qemuBuildPCINestedSmmuv3DevProps(const virDomainDef *def, + const virDomainIOMMUDef *iommu, + size_t id) +{ + g_autoptr(virJSONValue) props = NULL; + g_autofree char *bus = NULL; + g_autofree char *smmuv3_id = NULL; + size_t i; +
+ for (i = 0; i < def->ncontrollers; i++) { + virDomainControllerDef *cont = def->controllers[i]; + if (cont->idx == iommu->pci_bus) { + if (cont->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI) { + const char *alias = cont->info.alias; + + if (!alias) + return NULL; + + if (virDomainDeviceAliasIsUserAlias(alias)) { + if (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT && + iommu->pci_bus == 0) { + if (qemuDomainSupportsPCIMultibus(def)) + bus = g_strdup("pci.0"); + else + bus = g_strdup("pci"); + } else if (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT) { + bus = g_strdup("pcie.0"); + } + } else { + bus = g_strdup(alias); + } + break; + } + } + }
This duplicates parts of 'qemuBuildDeviceAddressPCIGetBus'
+ + if (!bus) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not find a suitable controller for smmuv3.")); + return NULL; + } + + smmuv3_id = g_strdup_printf("smmuv3.%zu", id); + + if (virJSONValueObjectAdd(&props, + "s:driver", "arm-smmuv3", + "s:primary-bus", bus, + "s:id", smmuv3_id, + NULL) < 0) + return NULL; + + return g_steal_pointer(&props); +} + + static int qemuBuildIOMMUCommandLine(virCommand *cmd, const virDomainDef *def, @@ -6286,9 +6342,6 @@ qemuBuildIOMMUCommandLine(virCommand *cmd, return -1;
return 0; - case VIR_DOMAIN_IOMMU_MODEL_SMMUV3: - /* There is no -device for SMMUv3, so nothing to be done here */ - return 0;
case VIR_DOMAIN_IOMMU_MODEL_AMD: if (virJSONValueObjectAdd(&wrapperProps, @@ -6318,6 +6371,17 @@ qemuBuildIOMMUCommandLine(virCommand *cmd,
return 0;
+ case VIR_DOMAIN_IOMMU_MODEL_SMMUV3: + if (iommu->pci_bus >= 0) { + if (!(props = qemuBuildPCINestedSmmuv3DevProps(def, iommu, i))) + return -1; + if (qemuBuildDeviceCommandlineFromJSON(cmd, props, def, qemuCaps) < 0) + return -1; + break; + } else { + return 0; + }
No need for the else branch.
+ case VIR_DOMAIN_IOMMU_MODEL_LAST: default: virReportEnumRangeError(virDomainIOMMUModel, iommu->model); @@ -10822,15 +10886,15 @@ qemuBuildCommandLine(virDomainObj *vm, if (qemuBuildBootCommandLine(cmd, def) < 0) return NULL;
- if (qemuBuildIOMMUCommandLine(cmd, def, qemuCaps) < 0) - return NULL; - if (qemuBuildGlobalControllerCommandLine(cmd, def) < 0) return NULL;
if (qemuBuildControllersCommandLine(cmd, def, qemuCaps) < 0) return NULL;
+ if (qemuBuildIOMMUCommandLine(cmd, def, qemuCaps) < 0) + return NULL; +
Formatting the IOMMU devices after the controllers makes sense, but should be in a separate patch.
if (qemuBuildMemoryDeviceCommandLine(cmd, cfg, def, priv) < 0) return NULL;
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano
On 11/18/2025 10:25 AM, Ján Tomko wrote:
On a Wednesday in 2025, Nathan Chen via Devel wrote:
Introduce support for "pciBus" driver attribute for "smmuv3" IOMMU model. The "pciBus" attribute indicates the index of the controller that a smmuv3 IOMMU device is attached to, and differentiates the device-pluggable arm-smmuv3 model from the virt-machine-associated smmuv3 model.
Signed-off-by: Nathan Chen <nathanc@nvidia.com> --- docs/formatdomain.rst | 4 ++ src/conf/domain_conf.c | 16 +++++++ src/conf/domain_conf.h | 1 + src/conf/domain_validate.c | 28 ++++++++++-- src/conf/schemas/domaincommon.rng | 5 ++ src/qemu/qemu_command.c | 76 ++++++++++++++++++++++++++++--- 6 files changed, 121 insertions(+), 9 deletions(-)
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 4c245f41a9..a181462d13 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -9238,6 +9238,10 @@ Example: Enable x2APIC mode. Useful for higher number of guest CPUs. :since:`Since 11.5.0` (QEMU/KVM and ``amd`` model only)
+ ``pciBus`` + The ``pciBus`` attribute notes the index of the controller that an + IOMMU device is attached to. (QEMU/KVM and ``smmuv3`` model only) + The ``virtio`` IOMMU devices can further have ``address`` element as described in `Device addresses`_ (address has to by type of ``pci``).
diff --git a/src/conf/schemas/domaincommon.rng b/src/conf/schemas/ domaincommon.rng index ae3fa95904..9afc679258 100644 --- a/src/conf/schemas/domaincommon.rng +++ b/src/conf/schemas/domaincommon.rng @@ -6322,6 +6322,11 @@ <ref name="virOnOff"/> </attribute> </optional> + <optional> + <attribute name="pciBus"> + <data type="unsignedInt"/> + </attribute> + </optional> </element> </optional> <optional> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 97fe4267ec..ad749afc41 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -6239,6 +6239,62 @@ qemuBuildBootCommandLine(virCommand *cmd, }
+static virJSONValue * +qemuBuildPCINestedSmmuv3DevProps(const virDomainDef *def, + const virDomainIOMMUDef *iommu, + size_t id) +{ + g_autoptr(virJSONValue) props = NULL; + g_autofree char *bus = NULL; + g_autofree char *smmuv3_id = NULL; + size_t i; +
+ for (i = 0; i < def->ncontrollers; i++) { + virDomainControllerDef *cont = def->controllers[i]; + if (cont->idx == iommu->pci_bus) { + if (cont->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI) { + const char *alias = cont->info.alias; + + if (!alias) + return NULL; + + if (virDomainDeviceAliasIsUserAlias(alias)) { + if (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT && + iommu->pci_bus == 0) { + if (qemuDomainSupportsPCIMultibus(def)) + bus = g_strdup("pci.0"); + else + bus = g_strdup("pci"); + } else if (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT) { + bus = g_strdup("pcie.0"); + } + } else { + bus = g_strdup(alias); + } + break; + } + } + }
This duplicates parts of 'qemuBuildDeviceAddressPCIGetBus'
+ + if (!bus) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not find a suitable controller for smmuv3.")); + return NULL; + } + + smmuv3_id = g_strdup_printf("smmuv3.%zu", id); + + if (virJSONValueObjectAdd(&props, + "s:driver", "arm-smmuv3", + "s:primary-bus", bus, + "s:id", smmuv3_id, + NULL) < 0) + return NULL; + + return g_steal_pointer(&props); +} + + static int qemuBuildIOMMUCommandLine(virCommand *cmd, const virDomainDef *def, @@ -6286,9 +6342,6 @@ qemuBuildIOMMUCommandLine(virCommand *cmd, return -1;
return 0; - case VIR_DOMAIN_IOMMU_MODEL_SMMUV3: - /* There is no -device for SMMUv3, so nothing to be done here */ - return 0;
case VIR_DOMAIN_IOMMU_MODEL_AMD: if (virJSONValueObjectAdd(&wrapperProps, @@ -6318,6 +6371,17 @@ qemuBuildIOMMUCommandLine(virCommand *cmd,
return 0;
+ case VIR_DOMAIN_IOMMU_MODEL_SMMUV3: + if (iommu->pci_bus >= 0) { + if (!(props = qemuBuildPCINestedSmmuv3DevProps(def, iommu, i))) + return -1; + if (qemuBuildDeviceCommandlineFromJSON(cmd, props, def, qemuCaps) < 0) + return -1; + break; + } else { + return 0; + }
No need for the else branch.
+ case VIR_DOMAIN_IOMMU_MODEL_LAST: default: virReportEnumRangeError(virDomainIOMMUModel, iommu->model); @@ -10822,15 +10886,15 @@ qemuBuildCommandLine(virDomainObj *vm, if (qemuBuildBootCommandLine(cmd, def) < 0) return NULL;
- if (qemuBuildIOMMUCommandLine(cmd, def, qemuCaps) < 0) - return NULL; - if (qemuBuildGlobalControllerCommandLine(cmd, def) < 0) return NULL;
if (qemuBuildControllersCommandLine(cmd, def, qemuCaps) < 0) return NULL;
+ if (qemuBuildIOMMUCommandLine(cmd, def, qemuCaps) < 0) + return NULL; +
Formatting the IOMMU devices after the controllers makes sense, but should be in a separate patch.
if (qemuBuildMemoryDeviceCommandLine(cmd, cfg, def, priv) < 0) return NULL;
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Hi Jano, Thank you for reviewing and providing feedback on these patches. I have sent out the next revision. -Nathan
Provide sample XML and CLI args for the device-pluggable smmuv3 XML schema for virt machine type. Signed-off-by: Nathan Chen <nathanc@nvidia.com> --- .../iommu-smmuv3-pci-bus.aarch64-latest.args | 41 ++++++++++++ .../iommu-smmuv3-pci-bus.aarch64-latest.xml | 62 +++++++++++++++++++ .../qemuxmlconfdata/iommu-smmuv3-pci-bus.xml | 49 +++++++++++++++ tests/qemuxmlconftest.c | 1 + 4 files changed, 153 insertions(+) create mode 100644 tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.aarch64-latest.args create mode 100644 tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.aarch64-latest.xml create mode 100644 tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.xml diff --git a/tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.aarch64-latest.args b/tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.aarch64-latest.args new file mode 100644 index 0000000000..092fcf4623 --- /dev/null +++ b/tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.aarch64-latest.args @@ -0,0 +1,41 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/var/lib/libvirt/qemu/domain--1-guest \ +USER=test \ +LOGNAME=test \ +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain--1-guest/.local/share \ +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain--1-guest/.cache \ +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain--1-guest/.config \ +/usr/bin/qemu-system-aarch64 \ +-name guest=guest,debug-threads=on \ +-S \ +-object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain--1-guest/master-key.aes"}' \ +-machine virt,usb=off,gic-version=2,dump-guest-core=off,memory-backend=mach-virt.ram,acpi=off \ +-accel tcg \ +-cpu cortex-a15 \ +-m size=1048576k \ +-object '{"qom-type":"memory-backend-ram","id":"mach-virt.ram","size":1073741824}' \ +-overcommit mem-lock=off \ +-smp 1,sockets=1,cores=1,threads=1 \ +-uuid 1ccfd97d-5eb4-478a-bbe6-88d254c16db7 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,fd=1729,server=on,wait=off \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-boot strict=on \ +-device '{"driver":"pxb-pcie","bus_nr":252,"id":"pci.1","bus":"pcie.0","addr":"0x1"}' \ +-device '{"driver":"pxb-pcie","bus_nr":248,"id":"pci.2","bus":"pcie.0","addr":"0x2"}' \ +-device '{"driver":"pcie-root-port","port":0,"chassis":21,"id":"pci.3","bus":"pci.1","addr":"0x0"}' \ +-device '{"driver":"pcie-root-port","port":168,"chassis":22,"id":"pci.4","bus":"pci.2","addr":"0x0"}' \ +-device '{"driver":"arm-smmuv3","primary-bus":"pci.1","id":"smmuv3.0"}' \ +-device '{"driver":"arm-smmuv3","primary-bus":"pci.2","id":"smmuv3.1"}' \ +-audiodev '{"id":"audio1","driver":"none"}' \ +-object '{"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}' \ +-device '{"driver":"virtio-rng-pci","rng":"objrng0","id":"rng0","bus":"pci.3","addr":"0x0"}' \ +-object '{"qom-type":"rng-random","id":"objrng1","filename":"/dev/urandom"}' \ +-device '{"driver":"virtio-rng-pci","rng":"objrng1","id":"rng1","bus":"pci.4","addr":"0x0"}' \ +-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \ +-msg timestamp=on diff --git a/tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.aarch64-latest.xml b/tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.aarch64-latest.xml new file mode 100644 index 0000000000..c0618c02ab --- /dev/null +++ b/tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.aarch64-latest.xml @@ -0,0 +1,62 @@ +<domain type='qemu'> + <name>guest</name> + <uuid>1ccfd97d-5eb4-478a-bbe6-88d254c16db7</uuid> + <memory unit='KiB'>1048576</memory> + <currentMemory unit='KiB'>1048576</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='aarch64' machine='virt'>hvm</type> + <boot dev='hd'/> + </os> + <features> + <gic version='2'/> + </features> + <cpu mode='custom' match='exact' check='none'> + <model fallback='forbid'>cortex-a15</model> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-aarch64</emulator> + <controller type='usb' index='0' model='none'/> + <controller type='pci' index='0' model='pcie-root'/> + <controller type='pci' index='1' model='pcie-expander-bus'> + <model name='pxb-pcie'/> + <target busNr='252'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/> + </controller> + <controller type='pci' index='2' model='pcie-expander-bus'> + <model name='pxb-pcie'/> + <target busNr='248'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> + </controller> + <controller type='pci' index='3' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='21' port='0x0'/> + <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/> + </controller> + <controller type='pci' index='4' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='22' port='0xa8'/> + <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/> + </controller> + <audio id='1' type='none'/> + <memballoon model='none'/> + <rng model='virtio'> + <backend model='random'>/dev/urandom</backend> + <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/> + </rng> + <rng model='virtio'> + <backend model='random'>/dev/urandom</backend> + <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/> + </rng> + <iommu model='smmuv3'> + <driver pciBus='1'/> + </iommu> + <iommu model='smmuv3'> + <driver pciBus='2'/> + </iommu> + </devices> +</domain> diff --git a/tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.xml b/tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.xml new file mode 100644 index 0000000000..0af73f2674 --- /dev/null +++ b/tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.xml @@ -0,0 +1,49 @@ +<domain type='qemu'> + <name>guest</name> + <uuid>1ccfd97d-5eb4-478a-bbe6-88d254c16db7</uuid> + <memory unit='KiB'>1048576</memory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='aarch64' machine='virt'>hvm</type> + </os> + <devices> + <emulator>/usr/bin/qemu-system-aarch64</emulator> + <controller type='usb' model='none'/> + <memballoon model='none'/> + <controller type='pci' index='0' model='pcie-root'/> + <controller type='pci' index='1' model='pcie-expander-bus'> + <model name='pxb-pcie'/> + <target busNr='252'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/> + </controller> + <controller type='pci' index='2' model='pcie-expander-bus'> + <model name='pxb-pcie'/> + <target busNr='248'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> + </controller> + <controller type='pci' index='3' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='21' port='0x0'/> + <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/> + </controller> + <controller type='pci' index='4' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='22' port='0xa8'/> + <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/> + </controller> + <rng model='virtio'> + <backend model='random'>/dev/urandom</backend> + <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/> + </rng> + <rng model='virtio'> + <backend model='random'>/dev/urandom</backend> + <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/> + </rng> + <iommu model='smmuv3'> + <driver pciBus='1'/> + </iommu> + <iommu model='smmuv3'> + <driver pciBus='2'/> + </iommu> + </devices> +</domain> diff --git a/tests/qemuxmlconftest.c b/tests/qemuxmlconftest.c index 6a2802353b..36a752ad2b 100644 --- a/tests/qemuxmlconftest.c +++ b/tests/qemuxmlconftest.c @@ -3037,6 +3037,7 @@ mymain(void) DO_TEST_CAPS_LATEST_ABI_UPDATE("intel-iommu-eim-autoadd"); DO_TEST_CAPS_LATEST_ABI_UPDATE("intel-iommu-eim-autoadd-v2"); DO_TEST_CAPS_ARCH_LATEST("iommu-smmuv3", "aarch64"); + DO_TEST_CAPS_ARCH_LATEST("iommu-smmuv3-pci-bus", "aarch64"); DO_TEST_CAPS_LATEST("virtio-iommu-x86_64"); DO_TEST_CAPS_ARCH_LATEST("virtio-iommu-aarch64", "aarch64"); DO_TEST_CAPS_LATEST_PARSE_ERROR("virtio-iommu-wrong-machine"); -- 2.43.0
On a Wednesday in 2025, Nathan Chen via Devel wrote:
Provide sample XML and CLI args for the device-pluggable smmuv3 XML schema for virt machine type.
Signed-off-by: Nathan Chen <nathanc@nvidia.com> --- .../iommu-smmuv3-pci-bus.aarch64-latest.args | 41 ++++++++++++ .../iommu-smmuv3-pci-bus.aarch64-latest.xml | 62 +++++++++++++++++++ .../qemuxmlconfdata/iommu-smmuv3-pci-bus.xml | 49 +++++++++++++++ tests/qemuxmlconftest.c | 1 + 4 files changed, 153 insertions(+) create mode 100644 tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.aarch64-latest.args create mode 100644 tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.aarch64-latest.xml create mode 100644 tests/qemuxmlconfdata/iommu-smmuv3-pci-bus.xml
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano
participants (2)
-
Ján Tomko -
Nathan Chen