[PATCH 0/4] libvirt: AMD IOMMU fixes
This series refines libvirt's logic for large vCPU configurations (>255)
on x86/q35 machine types. Historically, the threshold was tied to Intel's
EIM capability, but the real requirement is x2APIC support for APIC IDs
beyond 255. AMD platforms can satisfy this via xtsup on amd-iommu.

Summary of changes:

1. Fix virDomainIOMMUDefEquals to include pt and xtsup attributes for
   proper checks.

2. Correct error messages for passthrough (pt) and xtsup attributes to
   reflect their actual names instead of "dma translation".

3. Rename QEMU_MAX_VCPUS_WITHOUT_EIM to QEMU_MAX_VCPUS_WITHOUT_X2APIC and
   update validation logic to accept either:
   - intel-iommu with eim='on', or
   - amd-iommu with xtsup='on'
   for guests with more than 255 vCPUs. The error message now asks for
   EIM or XTSup on the iommu device instead of only extended interrupt
   mode.

4. Add QEMU_CAPS_AMD_IOMMU_XTSUP capability and enable xtsup by default
   for AMD IOMMU when a Q35 domain has >255 vCPUs, similar to Intel EIM
   auto-enable logic. Also ensure intremap is turned on when required.

This makes the behavior vendor-neutral and improves usability for AMD EPYC
guests with large vCPU counts. No ABI changes beyond stricter equality
checks; this is a clarification and extension of existing logic.

Xiaotian Feng (4):
  conf: fix virDomainIOMMUDefEquals for amd_iommu
  conf: fix error log for passthrough and xtsup attributes
  conf: support >255 vcpu w/ amd-iommu xtsup
  qemu: Enable AMD IOMMU XTSUP by default

 src/conf/domain_conf.c       |  6 ++++--
 src/qemu/qemu_capabilities.c |  2 ++
 src/qemu/qemu_capabilities.h |  1 +
 src/qemu/qemu_postparse.c    | 42 +++++++++++++++++++++++-------------
 src/qemu/qemu_validate.c     | 11 +++++-----
 src/qemu/qemu_validate.h     |  2 +-
 6 files changed, 41 insertions(+), 23 deletions(-)

-- 
2.34.1
[PATCH 1/4] conf: fix virDomainIOMMUDefEquals for amd_iommu

iommu->pt and iommu->xtsup are missing in virDomainIOMMUDefEquals.

Signed-off-by: Xiaotian Feng <Xiaotian.Feng@amd.com>
Reviewed-by: Ankit Soni <Ankit.Soni@amd.com>
Tested-by: Ankit Soni <Ankit.Soni@amd.com>
---
 src/conf/domain_conf.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 998b333c74..a64a1fd59d 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -16805,6 +16805,8 @@ virDomainIOMMUDefEquals(const virDomainIOMMUDef *a,
         a->iotlb != b->iotlb ||
         a->aw_bits != b->aw_bits ||
         a->dma_translation != b->dma_translation ||
+        a->xtsup != b->xtsup ||
+        a->pt != b->pt ||
         a->granule != b->granule)
         return false;

-- 
2.34.1
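
For illustration only, a minimal self-contained C sketch of why the two missing comparisons matter. The types and function names (SwitchSt, IommuCfg, iommu_equals_old, iommu_equals_fixed) are hypothetical stand-ins rather than libvirt's definitions, and only three attributes are modeled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for virTristateSwitch. */
typedef enum { ST_ABSENT = 0, ST_ON, ST_OFF } SwitchSt;

/* Hypothetical, heavily reduced stand-in for virDomainIOMMUDef. */
typedef struct {
    SwitchSt intremap;
    SwitchSt pt;
    SwitchSt xtsup;
} IommuCfg;

/* Comparison that forgets pt and xtsup, as before the patch. */
static bool
iommu_equals_old(const IommuCfg *a, const IommuCfg *b)
{
    return a->intremap == b->intremap;
}

/* Comparison that also covers pt and xtsup, as after the patch. */
static bool
iommu_equals_fixed(const IommuCfg *a, const IommuCfg *b)
{
    return a->intremap == b->intremap &&
           a->pt == b->pt &&
           a->xtsup == b->xtsup;
}
```

With two definitions that differ only in xtsup, the old comparison reports them as equal while the fixed one does not.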
[PATCH 2/4] conf: fix error log for passthrough and xtsup attributes

Correct error messages for passthrough (pt) and xtsup attributes to
reflect their actual names instead of "dma translation".

Signed-off-by: Xiaotian Feng <xiaotian.feng@amd.com>
Reviewed-by: Ankit Soni <Ankit.Soni@amd.com>
Tested-by: Ankit Soni <Ankit.Soni@amd.com>
---
 src/conf/domain_conf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index a64a1fd59d..7f04eeb8b3 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -22596,14 +22596,14 @@ virDomainIOMMUDefCheckABIStability(virDomainIOMMUDef *src,
     }
     if (src->pt != dst->pt) {
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
-                       _("Target domain IOMMU device dma translation '%1$s' does not match source '%2$s'"),
+                       _("Target domain IOMMU device passthrough '%1$s' does not match source '%2$s'"),
                        virTristateSwitchTypeToString(dst->pt),
                        virTristateSwitchTypeToString(src->pt));
         return false;
     }
     if (src->xtsup != dst->xtsup) {
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
-                       _("Target domain IOMMU device dma translation '%1$s' does not match source '%2$s'"),
+                       _("Target domain IOMMU device xtsup '%1$s' does not match source '%2$s'"),
                        virTristateSwitchTypeToString(dst->xtsup),
                        virTristateSwitchTypeToString(src->xtsup));
         return false;
-- 
2.34.1
[PATCH 3/4] conf: support >255 vcpu w/ amd-iommu xtsup

Rename QEMU_MAX_VCPUS_WITHOUT_EIM to QEMU_MAX_VCPUS_WITHOUT_X2APIC to
clarify the limit is tied to APIC ID width. Validation now accepts either:
- intel-iommu with eim='on', or
- amd-iommu with xtsup='on'
for guests with more than 255 vCPUs on x86/q35.

Update the error message to ask for EIM or XTSup on the iommu device
instead of only extended interrupt mode. This reflects that AMD platforms
can satisfy the same requirement via the xtsup property on amd-iommu.

Signed-off-by: Xiaotian Feng <xiaotian.feng@amd.com>
Reviewed-by: Ankit Soni <Ankit.Soni@amd.com>
Tested-by: Ankit Soni <Ankit.Soni@amd.com>
---
 src/qemu/qemu_postparse.c |  4 ++--
 src/qemu/qemu_validate.c  | 11 ++++++-----
 src/qemu/qemu_validate.h  |  2 +-
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/src/qemu/qemu_postparse.c b/src/qemu/qemu_postparse.c
index 8940cb09b3..58bd70c741 100644
--- a/src/qemu/qemu_postparse.c
+++ b/src/qemu/qemu_postparse.c
@@ -797,7 +797,7 @@ static bool
 qemuDomainNeedsIOMMUWithEIM(const virDomainDef *def)
 {
     return ARCH_IS_X86(def->os.arch) &&
-           virDomainDefGetVcpusMax(def) > QEMU_MAX_VCPUS_WITHOUT_EIM &&
+           virDomainDefGetVcpusMax(def) > QEMU_MAX_VCPUS_WITHOUT_X2APIC &&
            qemuDomainIsQ35(def);
 }

@@ -1204,7 +1204,7 @@ qemuDomainDefAddDefaultDevices(virQEMUDriver *driver,
             addImplicitSATA = true;
             addITCOWatchdog = true;

-            if (virDomainDefGetVcpusMax(def) > QEMU_MAX_VCPUS_WITHOUT_EIM) {
+            if (virDomainDefGetVcpusMax(def) > QEMU_MAX_VCPUS_WITHOUT_X2APIC) {
                 addIOMMU = true;
             }
         }
diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c
index 0eb5d5ea3b..fb7b2a4882 100644
--- a/src/qemu/qemu_validate.c
+++ b/src/qemu/qemu_validate.c
@@ -920,17 +920,18 @@ qemuValidateDomainVCpuTopology(const virDomainDef *def, virQEMUCaps *qemuCaps)
     }

     if (ARCH_IS_X86(def->os.arch) &&
-        virDomainDefGetVcpusMax(def) > QEMU_MAX_VCPUS_WITHOUT_EIM) {
+        virDomainDefGetVcpusMax(def) > QEMU_MAX_VCPUS_WITHOUT_X2APIC) {
         if (!qemuDomainIsQ35(def)) {
             virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                            _("more than %1$d vCPUs are only supported on q35-based machine types"),
-                           QEMU_MAX_VCPUS_WITHOUT_EIM);
+                           QEMU_MAX_VCPUS_WITHOUT_X2APIC);
             return -1;
         }
-        if (!def->iommus || def->iommus[0]->eim != VIR_TRISTATE_SWITCH_ON) {
+        if (!def->iommus || (def->iommus[0]->eim != VIR_TRISTATE_SWITCH_ON &&
+                             def->iommus[0]->xtsup != VIR_TRISTATE_SWITCH_ON)) {
             virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
-                           _("more than %1$d vCPUs require extended interrupt mode enabled on the iommu device"),
-                           QEMU_MAX_VCPUS_WITHOUT_EIM);
+                           _("more than %1$d vCPUs require EIM or XTSup mode enabled on the iommu device"),
+                           QEMU_MAX_VCPUS_WITHOUT_X2APIC);
             return -1;
         }
     }
diff --git a/src/qemu/qemu_validate.h b/src/qemu/qemu_validate.h
index 9315be73f5..27dc120c3a 100644
--- a/src/qemu/qemu_validate.h
+++ b/src/qemu/qemu_validate.h
@@ -22,7 +22,7 @@

 #include "qemu_capabilities.h"

-#define QEMU_MAX_VCPUS_WITHOUT_EIM 255
+#define QEMU_MAX_VCPUS_WITHOUT_X2APIC 255

 int qemuValidateDomainDef(const virDomainDef *def,
-- 
2.34.1
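
The validation rule in this patch can be sketched as a small standalone C function. This is a simplified illustration, not libvirt code: SwitchState, IommuSketch, MAX_VCPUS_WITHOUT_X2APIC and validate_vcpu_count are hypothetical names, the guest is assumed to be x86, and error reporting is reduced to a return value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for virTristateSwitch. */
typedef enum { SW_ABSENT = 0, SW_ON, SW_OFF } SwitchState;

/* Hypothetical reduced IOMMU definition: only the two relevant attributes. */
typedef struct {
    SwitchState eim;    /* intel-iommu extended interrupt mode */
    SwitchState xtsup;  /* amd-iommu xtsup attribute */
} IommuSketch;

/* Mirrors the value of QEMU_MAX_VCPUS_WITHOUT_X2APIC in the patch. */
#define MAX_VCPUS_WITHOUT_X2APIC 255

/* Returns 0 if the vCPU count is acceptable, -1 otherwise.
 * Assumes an x86 guest; iommu may be NULL when no IOMMU is defined. */
static int
validate_vcpu_count(unsigned int max_vcpus, bool is_q35,
                    const IommuSketch *iommu)
{
    if (max_vcpus <= MAX_VCPUS_WITHOUT_X2APIC)
        return 0;

    /* Large guests are only accepted on q35. */
    if (!is_q35)
        return -1;

    /* Either eim='on' or xtsup='on' satisfies the requirement. */
    if (!iommu || (iommu->eim != SW_ON && iommu->xtsup != SW_ON))
        return -1;

    return 0;
}
```

Under these assumptions, a 288-vCPU q35 guest with xtsup='on' passes, while the same guest with neither attribute set (or with no IOMMU at all) is rejected.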
[PATCH 4/4] qemu: Enable AMD IOMMU XTSUP by default

Add QEMU_CAPS_AMD_IOMMU_XTSUP capability and enable xtsup by default for
AMD IOMMU when a Q35 domain has >255 vCPUs, similar to Intel EIM
auto-enable logic. Also ensure intremap is turned on when required.

Signed-off-by: Xiaotian Feng <xiaotian.feng@amd.com>
Reviewed-by: Ankit Soni <Ankit.Soni@amd.com>
Tested-by: Ankit Soni <Ankit.Soni@amd.com>
---
 src/qemu/qemu_capabilities.c |  2 ++
 src/qemu/qemu_capabilities.h |  1 +
 src/qemu/qemu_postparse.c    | 38 ++++++++++++++++++++++++------------
 3 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 5d75c23072..c8667fd77c 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -762,6 +762,7 @@ VIR_ENUM_IMPL(virQEMUCaps,
               "scsi-block.migrate-pr", /* QEMU_CAPS_DEVICE_SCSI_BLOCK_MIGRATE_PR */
               "iommufd", /* QEMU_CAPS_OBJECT_IOMMUFD */
               "uefi-vars", /* QEMU_CAPS_DEVICE_UEFI_VARS */
+              "amd-iommu.xtsup", /* QEMU_CAPS_AMD_IOMMU_XTSUP */
     );

@@ -1632,6 +1633,7 @@ static struct virQEMUCapsDevicePropsFlags virQEMUCapsDevicePropsVirtioBlkCCW[] =

 static struct virQEMUCapsDevicePropsFlags virQEMUCapsDevicePropsAMDIOMMU[] = {
     { "pci-id", QEMU_CAPS_AMD_IOMMU_PCI_ID, NULL },
+    { "xtsup", QEMU_CAPS_AMD_IOMMU_XTSUP, NULL },
 };

 /* see documentation for virQEMUQAPISchemaPathGet for the query format */
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index a48e1d0367..5662c81e71 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -736,6 +736,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */
     QEMU_CAPS_DEVICE_SCSI_BLOCK_MIGRATE_PR, /* persistent reservation migration support */
     QEMU_CAPS_OBJECT_IOMMUFD, /* -object iommufd */
     QEMU_CAPS_DEVICE_UEFI_VARS, /* -device uefi-vars-{x64,sysbus} */
+    QEMU_CAPS_AMD_IOMMU_XTSUP, /* amd-iommu.xtsup */

     QEMU_CAPS_LAST /* this must always be the last item */
 } virQEMUCapsFlags;
diff --git a/src/qemu/qemu_postparse.c b/src/qemu/qemu_postparse.c
index 58bd70c741..79e02e34ac 100644
--- a/src/qemu/qemu_postparse.c
+++ b/src/qemu/qemu_postparse.c
@@ -794,7 +794,7 @@ qemuDomainPstoreDefPostParse(virDomainPstoreDef *pstore,


 static bool
-qemuDomainNeedsIOMMUWithEIM(const virDomainDef *def)
+qemuDomainNeedsIOMMUWithx2APIC(const virDomainDef *def)
 {
     return ARCH_IS_X86(def->os.arch) &&
            virDomainDefGetVcpusMax(def) > QEMU_MAX_VCPUS_WITHOUT_X2APIC &&
@@ -808,22 +808,34 @@ qemuDomainIOMMUDefPostParse(virDomainIOMMUDef *iommu,
                             virQEMUCaps *qemuCaps,
                             unsigned int parseFlags)
 {
-    /* In case domain has huge number of vCPUS and Extended Interrupt Mode
-     * (EIM) is not explicitly turned off, let's enable it. If we didn't then
+    /* In case domain has huge number of vCPUS and x2APIC (intel EIM or AMD
+     * XTSUP) is not explicitly turned off, let's enable it. If we didn't then
      * guest will have troubles with interrupts. */
     if (parseFlags & VIR_DOMAIN_DEF_PARSE_ABI_UPDATE &&
-        qemuDomainNeedsIOMMUWithEIM(def) &&
-        iommu && iommu->model == VIR_DOMAIN_IOMMU_MODEL_INTEL) {
+        qemuDomainNeedsIOMMUWithx2APIC(def) && iommu) {
+        if (iommu->model == VIR_DOMAIN_IOMMU_MODEL_INTEL) {
+            /* eim requires intremap. */
+            if (iommu->intremap == VIR_TRISTATE_SWITCH_ABSENT &&
+                virQEMUCapsGet(qemuCaps, QEMU_CAPS_INTEL_IOMMU_INTREMAP)) {
+                iommu->intremap = VIR_TRISTATE_SWITCH_ON;
+            }

-        /* eim requires intremap. */
-        if (iommu->intremap == VIR_TRISTATE_SWITCH_ABSENT &&
-            virQEMUCapsGet(qemuCaps, QEMU_CAPS_INTEL_IOMMU_INTREMAP)) {
-            iommu->intremap = VIR_TRISTATE_SWITCH_ON;
+            if (iommu->eim == VIR_TRISTATE_SWITCH_ABSENT &&
+                virQEMUCapsGet(qemuCaps, QEMU_CAPS_INTEL_IOMMU_EIM)) {
+                iommu->eim = VIR_TRISTATE_SWITCH_ON;
+            }
         }

-        if (iommu->eim == VIR_TRISTATE_SWITCH_ABSENT &&
-            virQEMUCapsGet(qemuCaps, QEMU_CAPS_INTEL_IOMMU_EIM)) {
-            iommu->eim = VIR_TRISTATE_SWITCH_ON;
+        if (iommu->model == VIR_DOMAIN_IOMMU_MODEL_AMD) {
+            if (iommu->intremap == VIR_TRISTATE_SWITCH_ABSENT &&
+                virQEMUCapsGet(qemuCaps, QEMU_CAPS_INTEL_IOMMU_INTREMAP)) {
+                iommu->intremap = VIR_TRISTATE_SWITCH_ON;
+            }
+
+            if (iommu->xtsup == VIR_TRISTATE_SWITCH_ABSENT &&
+                virQEMUCapsGet(qemuCaps, QEMU_CAPS_AMD_IOMMU_XTSUP)) {
+                iommu->xtsup = VIR_TRISTATE_SWITCH_ON;
+            }
         }
     }

@@ -1544,7 +1556,7 @@ qemuDomainDefEnableDefaultFeatures(virDomainDef *def,
      * modified so change it now. */
     if (def->iommus && def->iommus[0]->pci_bus < 0 &&
         (def->iommus[0]->intremap == VIR_TRISTATE_SWITCH_ON ||
-         qemuDomainNeedsIOMMUWithEIM(def)) &&
+         qemuDomainNeedsIOMMUWithx2APIC(def)) &&
         def->features[VIR_DOMAIN_FEATURE_IOAPIC] == VIR_DOMAIN_IOAPIC_NONE) {
         def->features[VIR_DOMAIN_FEATURE_IOAPIC] = VIR_DOMAIN_IOAPIC_QEMU;
     }
-- 
2.34.1
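
The defaulting behavior in the post-parse hunk can be sketched in standalone C as follows. All names here (Tristate, IommuKind, IommuDefaults, CapsSketch, enable_if_absent, iommu_apply_x2apic_defaults) are hypothetical and not libvirt API. The sketch assumes the caller has already checked the ABI-update parse flag and that the domain needs x2APIC, and it uses a single intremap capability flag for both models, mirroring the patch, which checks QEMU_CAPS_INTEL_IOMMU_INTREMAP in both branches.

```c
#include <stdbool.h>

/* Hypothetical stand-in for virTristateSwitch. */
typedef enum { TRI_ABSENT = 0, TRI_ON, TRI_OFF } Tristate;

/* Hypothetical stand-in for the IOMMU model enum. */
typedef enum { IOMMU_KIND_INTEL, IOMMU_KIND_AMD } IommuKind;

typedef struct {
    IommuKind model;
    Tristate intremap;
    Tristate eim;
    Tristate xtsup;
} IommuDefaults;

/* Hypothetical stand-in for the QEMU capability lookups. */
typedef struct {
    bool intremap;
    bool eim;
    bool xtsup;
} CapsSketch;

/* Turn an attribute on only if the user left it unset and the
 * emulator supports it; an explicit 'on' or 'off' is preserved. */
static void
enable_if_absent(Tristate *attr, bool supported)
{
    if (*attr == TRI_ABSENT && supported)
        *attr = TRI_ON;
}

/* Apply the defaults for a guest that needs x2APIC.
 * The caller is assumed to have verified that precondition. */
static void
iommu_apply_x2apic_defaults(IommuDefaults *iommu, const CapsSketch *caps)
{
    /* Both models enable interrupt remapping first. */
    enable_if_absent(&iommu->intremap, caps->intremap);

    if (iommu->model == IOMMU_KIND_INTEL)
        enable_if_absent(&iommu->eim, caps->eim);
    else if (iommu->model == IOMMU_KIND_AMD)
        enable_if_absent(&iommu->xtsup, caps->xtsup);
}
```

The point of the helper is that an explicit xtsup='off' from the user is never overridden; such a configuration would then be rejected by the vCPU count validation instead of being silently changed.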