[RFC PATCH v3 0/4] qemu: Implement support for EGM
The Grace SOC introduces Extended GPU Memory (EGM) [3], a feature that
enables GPUs to efficiently access system memory within and across
nodes. This patch series adds support for virtualizing EGM (vEGM) in
libvirt, allowing VMs to utilize dedicated EGM memory regions through
ACPI.

This patch series is a follow-up RFC to the second EGM RFC series [0],
to gather feedback from the libvirt community on the overall approach
and implementation details. While kernel EGM driver support and QEMU
acpi-egm-memory device support are not yet upstream, reference
implementations are available [1][2] to enable testing and validation
of the libvirt integration. Any community feedback is appreciated.

Background and Use Cases
=========================

EGM allows host memory to be partitioned into two regions:
1. Standard memory for Host OS usage
2. EGM region assigned to VMs as their system memory

This technology enables various high-performance computing
scenarios [3]:
- Large memory pools for AI/ML workloads
- High-performance computing applications
- Memory extension for systems with limited main memory
- GPU-accelerated workloads requiring large addressable memory

Implementation Overview
=======================

This series adds a new memory device model
VIR_DOMAIN_MEMORY_MODEL_EGM with 'path' source attribute and 'pciDev'
target attribute to denote host EGM device backing path and PCI device
alias to associate the vEGM with, respectively.

For instance, given the XML stanzas below:

  <memory model='egm' access='shared'>
    <source>
      <path>/dev/egm4</path>
    </source>
    <target>
      <size unit='KiB'>8388608</size>
      <node>0</node>
      <pciDev>ua-hostdev0</pciDev>
    </target>
  </memory>
  <memory model='egm' access='shared'>
    <source>
      <path>/dev/egm5</path>
    </source>
    <target>
      <size unit='KiB'>8388608</size>
      <node>1</node>
      <pciDev>ua-hostdev1</pciDev>
    </target>
  </memory>

The corresponding qemu command line will include the following
arguments:

  -object '{"qom-type":"memory-backend-file","id":"memegm0","mem-path":"/dev/egm4","share":true,"prealloc":true,"size":8589934592}' \
  -object acpi-egm-memory,id=egm0,pci-dev=ua-hostdev0,node=0 \
  -object '{"qom-type":"memory-backend-file","id":"memegm1","mem-path":"/dev/egm5","share":true,"prealloc":true,"size":8589934592}' \
  -object acpi-egm-memory,id=egm1,pci-dev=ua-hostdev1,node=1 \
  -numa node,nodeid=0,cpus=0-1,memdev=memegm0 \
  -numa node,nodeid=1,cpus=2-3,memdev=memegm1 \

For a system where multiple GPUs are associated with a single host
socket/NUMA node/EGM chardev, we consolidate the memory backing into a
single memory-backend-file object per host EGM chardev.

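The consolidation rule can be sketched in plain C. This is a minimal, self-contained illustration rather than the code in this series: `egm_dev`, `egm_backend`, and `egm_consolidate()` are hypothetical names, and a linear search stands in for the hash table the series uses. Sizes are taken in KiB, as in the domain XML, and converted to bytes for the backend's `size` property.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified view of one <memory model='egm'> element. */
struct egm_dev {
    const char *path;            /* host EGM chardev, e.g. /dev/egm4 */
    unsigned long long size_kib; /* <size unit='KiB'> */
};

/* Hypothetical description of one memory-backend-file object. */
struct egm_backend {
    const char *path;
    char id[32];                   /* memegm0, memegm1, ... */
    unsigned long long size_bytes; /* summed over all devices on 'path' */
};

/*
 * Group devices by host path in first-seen order and sum their sizes.
 * Returns the number of backends written to 'out' (at most 'max').
 */
size_t
egm_consolidate(const struct egm_dev *devs, size_t ndevs,
                struct egm_backend *out, size_t max)
{
    size_t nout = 0;

    for (size_t i = 0; i < ndevs; i++) {
        size_t j;

        /* Look for an existing backend that uses the same chardev. */
        for (j = 0; j < nout; j++) {
            if (strcmp(out[j].path, devs[i].path) == 0)
                break;
        }

        if (j == nout) {
            if (nout == max)
                return nout; /* 'out' is full; stop early */
            out[j].path = devs[i].path;
            snprintf(out[j].id, sizeof(out[j].id), "memegm%zu", nout);
            out[j].size_bytes = 0;
            nout++;
        }

        out[j].size_bytes += devs[i].size_kib * 1024ULL;
    }

    return nout;
}
```

Under these assumptions, two 8388608 KiB devices that both name /dev/egm4 collapse into one backend, memegm0, of 17179869184 bytes, while /dev/egm4 and /dev/egm5 produce memegm0 and memegm1 of 8589934592 bytes each.
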
For instance, given the XML stanzas below:

  <memory model='egm' access='shared'>
    <source>
      <path>/dev/egm4</path>
    </source>
    <target>
      <size unit='KiB'>8388608</size>
      <node>0</node>
      <pciDev>ua-hostdev0</pciDev>
    </target>
  </memory>
  <memory model='egm' access='shared'>
    <source>
      <path>/dev/egm4</path>
    </source>
    <target>
      <size unit='KiB'>8388608</size>
      <node>0</node>
      <pciDev>ua-hostdev1</pciDev>
    </target>
  </memory>

The corresponding qemu command line will include the following
arguments:

  -object '{"qom-type":"memory-backend-file","id":"memegm0","mem-path":"/dev/egm4","share":true,"prealloc":true,"size":17179869184}' \
  -object acpi-egm-memory,id=egm0,pci-dev=ua-hostdev0,node=0 \
  -object acpi-egm-memory,id=egm1,pci-dev=ua-hostdev1,node=0 \
  -numa node,nodeid=0,cpus=0-4,memdev=memegm0 \

Changes from RFCv2:
- Decouple host EGM chardev name from guest EGM ID
- Consolidate all acpi-egm-memory objects' memory into a single
  memory-backend-file per EGM chardev specified.

Changes from RFCv1:
- Use existing memory device infrastructure to represent EGM
  configuration
- Added support for multiple EGM devices

This series is on Github:
https://github.com/NathanChenNVIDIA/libvirt/tree/egm-11-24-25

Thanks,
Nathan

[0] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/6RU7R...
[1] https://github.com/ianm-nv/qemu/tree/6.8_ghvirt_egm_may2025
[2] https://github.com/NVIDIA/QEMU/commit/32db1b74fb99c0571724c7e69485e89098c148...
[3] https://developer.nvidia.com/blog/nvidia-grace-hopper-superchip-architecture...

Ian May (1):
  tests: Add qemuxmlconftest for ACPI EGM memory device

Nathan Chen (3):
  conf: Support EGM memory device model
  qemu: Add cgroup, namespace, and seclabel setup for EGM memory device
    model
  qemu: Add qemu CLI support for EGM

 docs/formatdomain.rst                      |  18 +-
 src/conf/domain_conf.c                     |  34 +++-
 src/conf/domain_conf.h                     |   7 +
 src/conf/domain_postparse.c                |   6 +-
 src/conf/domain_validate.c                 |  15 ++
 src/conf/schemas/domaincommon.rng          |   6 +
 src/qemu/qemu_alias.c                      |   7 +-
 src/qemu/qemu_capabilities.c               |   2 +
 src/qemu/qemu_capabilities.h               |   1 +
 src/qemu/qemu_cgroup.c                     |  10 ++
 src/qemu/qemu_command.c                    | 158 ++++++++++++++++--
 src/qemu/qemu_domain.c                     |  15 +-
 src/qemu/qemu_domain_address.c             |   3 +
 src/qemu/qemu_driver.c                     |   1 +
 src/qemu/qemu_hotplug.c                    |   1 +
 src/qemu/qemu_monitor_json.c               |   1 +
 src/qemu/qemu_namespace.c                  |   3 +
 src/qemu/qemu_postparse.c                  |   1 +
 src/qemu/qemu_process.c                    |   2 +
 src/qemu/qemu_validate.c                   |   6 +
 src/security/apparmor/usr.sbin.libvirtd.in |   3 +
 src/security/security_apparmor.c           |   2 +
 src/security/security_dac.c                |   8 +
 src/security/security_selinux.c            |   6 +
 src/security/virt-aa-helper.c              |   4 +
 src/util/virfile.h                         |   2 +-
 tests/meson.build                          |   1 +
 tests/qemuegmmock.c                        |  67 ++++++++
 .../acpi-egm-memory.aarch64-latest.args    |  47 ++++++
 .../acpi-egm-memory.aarch64-latest.xml     | 124 ++++++++++++++
 tests/qemuxmlconfdata/acpi-egm-memory.xml  | 124 ++++++++++++++
 tests/qemuxmlconftest.c                    |   8 +-
 32 files changed, 672 insertions(+), 21 deletions(-)
 create mode 100644 tests/qemuegmmock.c
 create mode 100644 tests/qemuxmlconfdata/acpi-egm-memory.aarch64-latest.args
 create mode 100644 tests/qemuxmlconfdata/acpi-egm-memory.aarch64-latest.xml
 create mode 100644 tests/qemuxmlconfdata/acpi-egm-memory.xml

--
2.43.0

Add support for EGM memory device model with 'path' source attribute
and 'pciDev' target attribute to denote host EGM device backing path
and PCI device alias to associate the vEGM with, respectively.

Signed-off-by: Nathan Chen <nathanc@nvidia.com>
---
 docs/formatdomain.rst             | 18 +++++++++++++++++-
 src/conf/domain_conf.c            | 29 +++++++++++++++++++++++++++++
 src/conf/domain_conf.h            |  7 +++++++
 src/conf/domain_postparse.c       |  1 +
 src/conf/domain_validate.c        | 15 +++++++++++++++
 src/conf/schemas/domaincommon.rng |  6 ++++++
 6 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
index 1467fc7e10..56fa931747 100644
--- a/docs/formatdomain.rst
+++ b/docs/formatdomain.rst
@@ -9029,6 +9029,16 @@ Example: usage of the memory devices
          <size unit='KiB'>16384</size>
        </target>
      </memory>
+     <memory model='egm' access='shared'>
+       <source>
+         <path>/dev/egm0</path>
+       </source>
+       <target>
+         <size unit='KiB'>524288</size>
+         <node>0</node>
+         <pciDev>ua-hostdev0</pciDev>
+       </target>
+     </memory>
    </devices>
    ...

@@ -9040,7 +9050,8 @@ Example: usage of the memory devices
    persistent memory device. :since:`Since 7.1.0` Provide ``virtio-mem`` model
    to add paravirtualized memory device. :since:`Since 7.9.0` Provide
    ``sgx-epc`` model to add a SGX enclave page cache (EPC) memory to the guest.
-   :since:`Since 8.10.0 and QEMU 7.0.0`
+   :since:`Since 8.10.0 and QEMU 7.0.0` Provide ``egm`` model to add an EGM
+   (Extended GPU Memory) device.

 ``access``
    An optional attribute ``access`` ( :since:`since 3.2.0` ) that provides
@@ -9175,6 +9186,11 @@ Example: usage of the memory devices
       The physical address in memory, where device is mapped. :since:`Since
       9.4.0`

+   ``pciDev``
+      For ``egm`` only.
+      The PCI device that is enabled to access the system memory via
+      association with the EGM device.
+
 IOMMU devices
 ~~~~~~~~~~~~~

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 541dad5bdc..2cda32fa6e 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -1522,6 +1522,7 @@ VIR_ENUM_IMPL(virDomainMemoryModel,
               "virtio-pmem",
               "virtio-mem",
               "sgx-epc",
+              "egm",
 );

 VIR_ENUM_IMPL(virDomainShmemModel,
@@ -3611,6 +3612,9 @@ void virDomainMemoryDefFree(virDomainMemoryDef *def)
     case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
         virBitmapFree(def->source.sgx_epc.nodes);
         break;
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        g_free(def->source.egm.path);
+        g_free(def->target.egm.pciDev);
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
         break;
@@ -14084,6 +14088,10 @@ virDomainMemorySourceDefParseXML(xmlNodePtr node,
         }
         break;

+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        def->source.egm.path = virXPathString("string(./path)", ctxt);
+        break;
+
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
         break;
@@ -14160,6 +14168,10 @@ virDomainMemoryTargetDefParseXML(xmlNodePtr node,
         addr = &def->target.virtio_pmem.address;
         break;

+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        def->target.egm.pciDev = virXPathString("string(./pciDev)", ctxt);
+        break;
+
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_DIMM:
     case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
@@ -14381,6 +14393,7 @@ virDomainMemoryIsVirtioModel(const virDomainMemoryDef *def)
     case VIR_DOMAIN_MEMORY_MODEL_DIMM:
     case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
     case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
         break;
     }
@@ -16247,6 +16260,12 @@ virDomainMemoryFindByDefInternal(virDomainDef *def,
                 continue;
             break;

+        case VIR_DOMAIN_MEMORY_MODEL_EGM:
+            if (STRNEQ(tmp->source.egm.path, mem->source.egm.path))
+                continue;
+            if (STRNEQ(tmp->target.egm.pciDev, mem->target.egm.pciDev))
+                continue;
+
         case VIR_DOMAIN_MEMORY_MODEL_NONE:
         case VIR_DOMAIN_MEMORY_MODEL_LAST:
             break;
@@ -22196,6 +22215,7 @@ virDomainMemoryDefCheckABIStability(virDomainMemoryDef *src,

     case VIR_DOMAIN_MEMORY_MODEL_DIMM:
     case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
         break;
@@ -26700,6 +26720,10 @@ virDomainMemorySourceDefFormat(virBuffer *buf,
         }
         break;

+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        virBufferEscapeString(&childBuf, "<path>%s</path>\n", def->source.egm.path);
+        break;
+
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
         break;
@@ -26762,6 +26786,11 @@ virDomainMemoryTargetDefFormat(virBuffer *buf,
         }
         break;

+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        if (def->target.egm.pciDev)
+            virBufferAsprintf(&childBuf, "<pciDev>%s</pciDev>\n", def->target.egm.pciDev);
+        break;
+
     case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
     case VIR_DOMAIN_MEMORY_MODEL_DIMM:
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index cb35ff06bd..e38f133bcc 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -2745,6 +2745,7 @@ typedef enum {
     VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM, /* virtio-pmem memory device */
     VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM, /* virtio-mem memory device */
     VIR_DOMAIN_MEMORY_MODEL_SGX_EPC, /* SGX enclave page cache */
+    VIR_DOMAIN_MEMORY_MODEL_EGM, /* Extended GPU memory */

     VIR_DOMAIN_MEMORY_MODEL_LAST
 } virDomainMemoryModel;
@@ -2777,6 +2778,9 @@ struct _virDomainMemoryDef {
         struct {
             virBitmap *nodes; /* source NUMA nodes */
         } sgx_epc;
+        struct {
+            char *path;
+        } egm;
     } source;

     union {
@@ -2802,6 +2806,9 @@ struct _virDomainMemoryDef {
         } virtio_mem;
         struct {
         } sgx_epc;
+        struct {
+            char *pciDev;
+        } egm;
     } target;

     virDomainDeviceInfo info;
diff --git a/src/conf/domain_postparse.c b/src/conf/domain_postparse.c
index 38e731348d..0181d21f0e 100644
--- a/src/conf/domain_postparse.c
+++ b/src/conf/domain_postparse.c
@@ -632,6 +632,7 @@ virDomainMemoryDefPostParse(virDomainMemoryDef *mem,
     case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
     case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
     case VIR_DOMAIN_MEMORY_MODEL_DIMM:
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
         break;
diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c
index 4558e7b210..4d4febfbfc 100644
--- a/src/conf/domain_validate.c
+++ b/src/conf/domain_validate.c
@@ -2494,6 +2494,7 @@ virDomainMemoryDefCheckConflict(const virDomainMemoryDef *mem,
         }
         break;
     case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
         break;
@@ -2540,6 +2541,7 @@ virDomainMemoryDefCheckConflict(const virDomainMemoryDef *mem,
         switch (other->model) {
         case VIR_DOMAIN_MEMORY_MODEL_NONE:
         case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+        case VIR_DOMAIN_MEMORY_MODEL_EGM:
         case VIR_DOMAIN_MEMORY_MODEL_LAST:
             continue;
             break;
@@ -2720,6 +2722,19 @@ virDomainMemoryDefValidate(const virDomainMemoryDef *mem,
         }
         break;

+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        if (!mem->source.egm.path) {
+            virReportError(VIR_ERR_XML_DETAIL, "%s",
+                           _("path is required for model 'egm'"));
+            return -1;
+        }
+        if (!mem->target.egm.pciDev) {
+            virReportError(VIR_ERR_XML_DETAIL, "%s",
+                           _("pciDev is required for model 'egm'"));
+            return -1;
+        }
+        break;
+
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
     default:
diff --git a/src/conf/schemas/domaincommon.rng b/src/conf/schemas/domaincommon.rng
index 1f9ac102a0..4ccc594659 100644
--- a/src/conf/schemas/domaincommon.rng
+++ b/src/conf/schemas/domaincommon.rng
@@ -7486,6 +7486,7 @@
           <value>virtio-pmem</value>
           <value>virtio-mem</value>
           <value>sgx-epc</value>
+          <value>egm</value>
         </choice>
       </attribute>
       <optional>
@@ -7617,6 +7618,11 @@
             </attribute>
           </element>
         </optional>
+        <optional>
+          <element name="pciDev">
+            <data type="string"/>
+          </element>
+        </optional>
       </interleave>
     </element>
   </define>
--
2.43.0

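The validation rule for the 'egm' model can be exercised in isolation. The sketch below is a hypothetical reduction, not libvirt code: `egm_def` stands in for the egm members of virDomainMemoryDef, and `egm_validate()` returns a message string instead of calling virReportError().

```c
#include <stddef.h>

/* Hypothetical stand-in for the egm members of virDomainMemoryDef. */
struct egm_def {
    const char *path;   /* <source><path>, NULL when absent */
    const char *pciDev; /* <target><pciDev>, NULL when absent */
};

/*
 * Returns NULL when the definition is acceptable, otherwise a message
 * naming the first missing element, checked in the same order as the
 * patch (path first, then pciDev).
 */
const char *
egm_validate(const struct egm_def *def)
{
    if (!def->path)
        return "path is required for model 'egm'";
    if (!def->pciDev)
        return "pciDev is required for model 'egm'";
    return NULL;
}
```

Note that the schema change marks <pciDev> as optional, so under this reading a missing element is rejected at validation time rather than by the RNG schema.
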
Implement proper isolation and access control for EGM memory devices:
- Add device to cgroup for access control
- Set up namespace mappings for device access
- Ensure proper permissions in containerized environments
- Grant QEMU access to the EGM device path under SELinux, AppArmor,
  and DAC

Signed-off-by: Nathan Chen <nathanc@nvidia.com>
---
 src/qemu/qemu_cgroup.c                     | 10 ++++++++++
 src/qemu/qemu_namespace.c                  |  3 +++
 src/security/apparmor/usr.sbin.libvirtd.in |  3 +++
 src/security/security_apparmor.c           |  2 ++
 src/security/security_dac.c                |  8 ++++++++
 src/security/security_selinux.c            |  6 ++++++
 src/security/virt-aa-helper.c              |  4 ++++
 7 files changed, 36 insertions(+)

diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 7dadef0739..8b70740121 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -577,6 +577,11 @@ qemuSetupMemoryDevicesCgroup(virDomainObj *vm,
                                       VIR_CGROUP_DEVICE_RW, false) < 0)
             return -1;
         break;
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        if (qemuCgroupAllowDevicePath(vm, mem->source.egm.path,
+                                      VIR_CGROUP_DEVICE_RW, false) < 0)
+            return -1;
+        break;
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_DIMM:
     case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
@@ -615,6 +620,11 @@ qemuTeardownMemoryDevicesCgroup(virDomainObj *vm,
                                      VIR_CGROUP_DEVICE_RW, false) < 0)
             return -1;
         break;
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        if (qemuCgroupDenyDevicePath(vm, mem->source.egm.path,
+                                     VIR_CGROUP_DEVICE_RWM, false) < 0)
+            return -1;
+        break;
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_DIMM:
     case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
diff --git a/src/qemu/qemu_namespace.c b/src/qemu/qemu_namespace.c
index c689cc3e40..f6404cb280 100644
--- a/src/qemu/qemu_namespace.c
+++ b/src/qemu/qemu_namespace.c
@@ -394,6 +394,9 @@ qemuDomainSetupMemory(virDomainMemoryDef *mem,
         *paths = g_slist_prepend(*paths, g_strdup(QEMU_DEV_SGX_VEPVC));
         *paths = g_slist_prepend(*paths, g_strdup(QEMU_DEV_SGX_PROVISION));
         break;
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        *paths = g_slist_prepend(*paths, g_strdup(mem->source.egm.path));
+        break;

     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_DIMM:
diff --git a/src/security/apparmor/usr.sbin.libvirtd.in b/src/security/apparmor/usr.sbin.libvirtd.in
index 6267e4f737..2a6a4b979c 100644
--- a/src/security/apparmor/usr.sbin.libvirtd.in
+++ b/src/security/apparmor/usr.sbin.libvirtd.in
@@ -47,6 +47,9 @@ profile libvirtd @sbindir@/libvirtd flags=(attach_disconnected) {
   mount options=(rw, move) /{,var/}run/libvirt/qemu/*{,/} -> /dev/**,
   umount /{,var/}run/libvirt/qemu/*{,/},

+  # Allow bind mounting EGM devices into qemu namespaces
+  mount options=(rw, bind) /dev/egm* -> /{,var/}run/libvirt/qemu/**,
+
   network inet stream,
   network inet dgram,
   network inet6 stream,
diff --git a/src/security/security_apparmor.c b/src/security/security_apparmor.c
index 68ac39611f..ea04e756d6 100644
--- a/src/security/security_apparmor.c
+++ b/src/security/security_apparmor.c
@@ -631,6 +631,8 @@ AppArmorSetMemoryLabel(virSecurityManager *mgr,
     case VIR_DOMAIN_MEMORY_MODEL_DIMM:
     case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
     case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        path = mem->source.egm.path;
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
         break;
     }
diff --git a/src/security/security_dac.c b/src/security/security_dac.c
index 2f788b872a..2d79009ee9 100644
--- a/src/security/security_dac.c
+++ b/src/security/security_dac.c
@@ -1890,6 +1890,9 @@ virSecurityDACRestoreMemoryLabel(virSecurityManager *mgr,
          * don't need to restore anything. */
         break;

+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        return virSecurityDACRestoreFileLabel(mgr, mem->source.egm.path);
+
     case VIR_DOMAIN_MEMORY_MODEL_DIMM:
     case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
@@ -2121,6 +2124,11 @@ virSecurityDACSetMemoryLabel(virSecurityManager *mgr,
             return -1;
         break;

+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        return virSecurityDACSetOwnership(mgr, NULL,
+                                          mem->source.egm.path,
+                                          user, group, true);
+
     case VIR_DOMAIN_MEMORY_MODEL_DIMM:
     case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c
index 2f3cc274a5..b288778634 100644
--- a/src/security/security_selinux.c
+++ b/src/security/security_selinux.c
@@ -1666,6 +1666,9 @@ virSecuritySELinuxSetMemoryLabel(virSecurityManager *mgr,
                                          seclabel->imagelabel, true) < 0)
             return -1;
         break;
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        path = mem->source.egm.path;
+        break;

     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_DIMM:
@@ -1709,6 +1712,9 @@ virSecuritySELinuxRestoreMemoryLabel(virSecurityManager *mgr,
         if (virSecuritySELinuxRestoreFileLabel(mgr, DEV_SGX_PROVISION, true, false) < 0)
             ret = -1;
         return ret;
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        path = mem->source.egm.path;
+        break;

     case VIR_DOMAIN_MEMORY_MODEL_DIMM:
     case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
diff --git a/src/security/virt-aa-helper.c b/src/security/virt-aa-helper.c
index de0a826063..0e387dd4be 100644
--- a/src/security/virt-aa-helper.c
+++ b/src/security/virt-aa-helper.c
@@ -1194,6 +1194,10 @@ get_files(vahControl * ctl)
                     return -1;
                 }
                 break;
+            case VIR_DOMAIN_MEMORY_MODEL_EGM:
+                if (vah_add_file(&buf, mem->source.egm.path, "rw") != 0)
+                    return -1;
+                break;
             case VIR_DOMAIN_MEMORY_MODEL_DIMM:
             case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
--
2.43.0

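Each hunk in this patch answers the same question: which host path, if any, has to be made accessible for a given memory model. A minimal sketch of that selection follows, assuming simplified hypothetical types (`mem_model`, `mem_def`, `mem_label_path()`); libvirt keeps the per-model paths in a union inside virDomainMemoryDef, whereas separate fields are used here to keep the example short.

```c
#include <stddef.h>

/* Hypothetical subset of the memory models handled by the drivers. */
enum mem_model {
    MEM_MODEL_NONE,
    MEM_MODEL_DIMM,
    MEM_MODEL_NVDIMM,
    MEM_MODEL_EGM,
    MEM_MODEL_LAST
};

/* Hypothetical, flattened memory device definition. */
struct mem_def {
    enum mem_model model;
    const char *nvdimm_path; /* only meaningful for MEM_MODEL_NVDIMM */
    const char *egm_path;    /* only meaningful for MEM_MODEL_EGM */
};

/*
 * Return the host path that needs a cgroup rule, namespace entry, or
 * security label for this device, or NULL when the model has no
 * host-side device node. Every model gets its own explicit case so
 * that no model inherits another model's path.
 */
const char *
mem_label_path(const struct mem_def *mem)
{
    switch (mem->model) {
    case MEM_MODEL_NVDIMM:
        return mem->nvdimm_path;
    case MEM_MODEL_EGM:
        return mem->egm_path;
    case MEM_MODEL_NONE:
    case MEM_MODEL_DIMM:
    case MEM_MODEL_LAST:
        break;
    }
    return NULL;
}
```
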
Add qemu CLI support for EGM memory device model:
- Specify EGM device path to memory-backend-file object
- Support acpi-egm-memory object with id, pci-dev, and node attributes
- Consolidate all acpi-egm-memory objects' memory into a single
  memory-backend-file per EGM chardev specified.

Signed-off-by: Ian May <ianm@nvidia.com>
Signed-off-by: Nathan Chen <nathanc@nvidia.com>
---
 src/qemu/qemu_alias.c          |   7 +-
 src/qemu/qemu_capabilities.c   |   2 +
 src/qemu/qemu_capabilities.h   |   1 +
 src/qemu/qemu_command.c        | 158 ++++++++++++++++++++++++++++++---
 src/qemu/qemu_domain.c         |  13 ++-
 src/qemu/qemu_domain_address.c |   3 +
 src/qemu/qemu_driver.c         |   1 +
 src/qemu/qemu_hotplug.c        |   1 +
 src/qemu/qemu_monitor_json.c   |   1 +
 src/qemu/qemu_postparse.c      |   1 +
 src/qemu/qemu_process.c        |   2 +
 src/qemu/qemu_validate.c       |   6 ++
 12 files changed, 180 insertions(+), 16 deletions(-)

diff --git a/src/qemu/qemu_alias.c b/src/qemu/qemu_alias.c
index 400ce73283..719224e1ba 100644
--- a/src/qemu/qemu_alias.c
+++ b/src/qemu/qemu_alias.c
@@ -504,7 +504,8 @@ qemuDeviceMemoryGetAliasID(virDomainDef *def,
      * valid */
     if (mem->model != VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM &&
         mem->model != VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM &&
-        mem->model != VIR_DOMAIN_MEMORY_MODEL_SGX_EPC)
+        mem->model != VIR_DOMAIN_MEMORY_MODEL_SGX_EPC &&
+        mem->model != VIR_DOMAIN_MEMORY_MODEL_EGM)
         return mem->info.addr.dimm.slot;

     for (i = 0; i < def->nmems; i++) {
@@ -553,6 +554,10 @@ qemuAssignDeviceMemoryAlias(virDomainDef *def,
     case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
         prefix = "epc";
         break;
+    case VIR_DOMAIN_MEMORY_MODEL_EGM: {
+        prefix = "egm";
+        break;
+    }
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
     default:
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 92b863a826..3fc8fee4b3 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -755,6 +755,7 @@ VIR_ENUM_IMPL(virQEMUCaps,
               "disk-timed-stats", /* QEMU_CAPS_DISK_TIMED_STATS */
               "query-accelerators", /* QEMU_CAPS_QUERY_ACCELERATORS */
               "mshv", /* QEMU_CAPS_MSHV */
+              "acpi-egm-memory", /* QEMU_CAPS_DEVICE_ACPI_EGM_MEMORY */
     );


@@ -1462,6 +1463,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = {
     { "tpm-emulator", QEMU_CAPS_DEVICE_TPM_EMULATOR },
     { "tpm-passthrough", QEMU_CAPS_DEVICE_TPM_PASSTHROUGH },
     { "acpi-generic-initiator", QEMU_CAPS_ACPI_GENERIC_INITIATOR },
+    { "acpi-egm-memory", QEMU_CAPS_DEVICE_ACPI_EGM_MEMORY },
 };


diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index f180844e66..3eb12235f4 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -730,6 +730,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */
     QEMU_CAPS_DISK_TIMED_STATS, /* timed stats support ('stats-intervals' property of disk frontends) */
     QEMU_CAPS_QUERY_ACCELERATORS, /* query-accelerators command */
     QEMU_CAPS_MSHV, /* -accel mshv */
+    QEMU_CAPS_DEVICE_ACPI_EGM_MEMORY, /* For using extended GPU memory */

     QEMU_CAPS_LAST /* this must always be the last item */
 } virQEMUCapsFlags;
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index b69fe23236..33848aa781 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -135,6 +135,21 @@ VIR_ENUM_IMPL(qemuACPITableSIG,
               "SLIC",
               "MSDM");

+typedef struct _qemuEGMBackendInfo {
+    char *alias;
+    unsigned long long totalSize;
+    bool created;
+    virDomainMemoryDef *firstMem; /* Pointer to first device for this path */
+} qemuEGMBackendInfo;
+
+static void
+qemuEGMBackendInfoFree(qemuEGMBackendInfo *info)
+{
+    if (!info)
+        return;
+    g_free(info->alias);
+    g_free(info);
+}

 const char *
 qemuAudioDriverTypeToString(virDomainAudioType type)
@@ -992,6 +1007,7 @@ qemuBuildVirtioDevGetConfigDev(const virDomainDeviceDef *device,
         case VIR_DOMAIN_MEMORY_MODEL_DIMM:
         case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
         case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+        case VIR_DOMAIN_MEMORY_MODEL_EGM:
         case VIR_DOMAIN_MEMORY_MODEL_NONE:
         case VIR_DOMAIN_MEMORY_MODEL_LAST:
             break;
@@ -3162,6 +3178,7 @@ qemuBuildMemoryGetPagesize(virQEMUDriverConfig *cfg,
         nvdimmPath = mem->source.virtio_pmem.path;
         break;
     case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
         break;
@@ -3362,6 +3379,9 @@ qemuBuildMemoryBackendProps(virJSONValue **backendProps,
     case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM:
         nvdimmPath = mem->source.virtio_pmem.path;
         break;
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        nvdimmPath = mem->source.egm.path;
+        break;
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
         break;
@@ -3573,12 +3593,17 @@ qemuBuildMemoryDimmBackendStr(virCommand *cmd,
                               virDomainMemoryDef *mem,
                               virDomainDef *def,
                               virQEMUDriverConfig *cfg,
-                              qemuDomainObjPrivate *priv)
+                              qemuDomainObjPrivate *priv,
+                              GHashTable *egmBackends)
 {
     g_autoptr(virJSONValue) props = NULL;
     g_autoptr(virJSONValue) tcProps = NULL;
     virBitmap *nodemask = NULL;
     g_autofree char *alias = NULL;
+    unsigned long long originalSize = 0;
+    bool isEGM = (mem->model == VIR_DOMAIN_MEMORY_MODEL_EGM);
+    bool shouldCreateBackend = true;
+    qemuEGMBackendInfo *egmInfo = NULL;

     if (!mem->info.alias) {
         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
@@ -3588,19 +3613,65 @@ qemuBuildMemoryDimmBackendStr(virCommand *cmd,

     alias = g_strdup_printf("mem%s", mem->info.alias);

-    if (qemuBuildMemoryBackendProps(&props, alias, cfg, priv,
-                                    def, mem, true, false, &nodemask) < 0)
-        return -1;
+    /* Handle EGM shared backend logic */
+    if (isEGM && egmBackends) {
+        const char *egmPath = mem->source.egm.path;
+        egmInfo = g_hash_table_lookup(egmBackends, egmPath);

-    if (qemuBuildThreadContextProps(&tcProps, &props, def, priv, nodemask) < 0)
-        return -1;
+        if (egmInfo) {
+            alias = g_strdup(egmInfo->alias);
+            if (egmInfo->created) {
+                /* Backend already created, skip backend creation */
+                shouldCreateBackend = false;
+            } else {
+                /* First device for this path - temporarily use accumulated size */
+                originalSize = mem->size;
+                mem->size = egmInfo->totalSize;
+                egmInfo->created = true;
+            }
+        }
+    }

-    if (tcProps &&
-        qemuBuildObjectCommandlineFromJSON(cmd, tcProps) < 0)
-        return -1;
+    if (shouldCreateBackend) {
+        /* Use existing function unchanged */
+        if (qemuBuildMemoryBackendProps(&props, alias, cfg, priv,
+                                        def, mem, true, false, &nodemask) < 0) {
+            if (originalSize > 0)
+                mem->size = originalSize; /* Restore on error */
+            return -1;
+        }

-    if (qemuBuildObjectCommandlineFromJSON(cmd, props) < 0)
-        return -1;
+        /* Restore original size after backend props are built */
+        if (originalSize > 0)
+            mem->size = originalSize;
+
+        if (qemuBuildThreadContextProps(&tcProps, &props, def, priv, nodemask) < 0)
+            return -1;
+
+        if (tcProps &&
+            qemuBuildObjectCommandlineFromJSON(cmd, tcProps) < 0)
+            return -1;
+
+        if (qemuBuildObjectCommandlineFromJSON(cmd, props) < 0)
+            return -1;
+    }
+
+    if (isEGM) {
+        g_autofree char *egmObjStr = NULL;
+        g_auto(virBuffer) buf = VIR_BUFFER_INITIALIZER;
+
+        virBufferAsprintf(&buf, "acpi-egm-memory,id=%s", mem->info.alias);
+
+        if (mem->target.egm.pciDev)
+            virBufferAsprintf(&buf, ",pci-dev=%s", mem->target.egm.pciDev);
+
+        if (mem->targetNode >= 0)
+            virBufferAsprintf(&buf, ",node=%d", mem->targetNode);
+
+        egmObjStr = virBufferContentAndReset(&buf);
+
+        virCommandAddArgList(cmd, "-object", egmObjStr, NULL);
+    }

     return 0;
 }
@@ -3671,6 +3742,7 @@ qemuBuildMemoryDeviceProps(virQEMUDriverConfig *cfg,
         dynamicMemslots = mem->target.virtio_mem.dynamicMemslots;
         break;

+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
     case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
@@ -7104,6 +7176,7 @@ qemuAppendDomainMemoryMachineParams(virBuffer *buf,
         case VIR_DOMAIN_MEMORY_MODEL_DIMM:
         case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM:
         case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
+        case VIR_DOMAIN_MEMORY_MODEL_EGM:
         case VIR_DOMAIN_MEMORY_MODEL_NONE:
         case VIR_DOMAIN_MEMORY_MODEL_LAST:
             break;
@@ -7821,6 +7894,8 @@ qemuBuildNumaCommandLine(virQEMUDriverConfig *cfg,
     size_t ncells = virDomainNumaGetNodeCount(def->numa);
     ssize_t masterInitiator = -1;
     int rc;
+    g_autoptr(GHashTable) egmBackends = NULL;
+    size_t egmBackendCount = 0;

     if (!virDomainNumatuneNodesetIsAvailable(def->numa, priv->autoNodeset))
         goto cleanup;
@@ -7835,6 +7910,37 @@ qemuBuildNumaCommandLine(virQEMUDriverConfig *cfg,
         hmat = true;
     }

+    /* Pre-scan EGM devices to group by path and calculate total sizes */
+    egmBackends = g_hash_table_new_full(g_str_hash, g_str_equal, g_free,
+                                        (GDestroyNotify)qemuEGMBackendInfoFree);
+
+    for (i = 0; i < def->nmems; i++) {
+        if (def->mems[i]->model == VIR_DOMAIN_MEMORY_MODEL_EGM) {
+            const char *egmPath = def->mems[i]->source.egm.path;
+            qemuEGMBackendInfo *info = g_hash_table_lookup(egmBackends, egmPath);
+
+            if (!info) {
+                info = g_new0(qemuEGMBackendInfo, 1);
+                info->alias = g_strdup_printf("memegm%zu", egmBackendCount);
+                egmBackendCount++;
+                info->totalSize = def->mems[i]->size;
+                info->created = false;
+                info->firstMem = def->mems[i];
+                g_hash_table_insert(egmBackends, g_strdup(egmPath), info);
+            } else {
+                info->totalSize += def->mems[i]->size;
+            }
+        }
+    }
+
+    /* Build the actual backend and device objects */
+    for (i = 0; i < def->nmems; i++) {
+        if (def->mems[i]->model == VIR_DOMAIN_MEMORY_MODEL_EGM) {
+            if (qemuBuildMemoryDimmBackendStr(cmd, def->mems[i], def, cfg, priv, egmBackends) < 0)
+                goto cleanup;
+        }
+    }
+
     nodeBackends = g_new0(virJSONValue *, ncells);
     nodemask = g_new0(virBitmap *, ncells);

@@ -7870,8 +7976,18 @@ qemuBuildNumaCommandLine(virQEMUDriverConfig *cfg,
     for (i = 0; i < ncells; i++) {
         ssize_t initiator = virDomainNumaGetNodeInitiator(def->numa, i);
         unsigned long long memSize = virDomainNumaGetNodeMemorySize(def->numa, i);
+        bool egmBacked = false;
+        size_t k;
+
+        for (k = 0; k < def->nmems; k++) {
+            if (def->mems[k]->model == VIR_DOMAIN_MEMORY_MODEL_EGM &&
+                def->mems[k]->targetNode == (int)i) {
+                egmBacked = true;
+                break;
+            }
+        }

-        if (needBackend && memSize > 0) {
+        if (needBackend && memSize > 0 && !egmBacked) {
             g_autoptr(virJSONValue) tcProps = NULL;

             if (qemuBuildThreadContextProps(&tcProps, &nodeBackends[i],
@@ -7901,7 +8017,15 @@ qemuBuildNumaCommandLine(virQEMUDriverConfig *cfg,

         if (memSize > 0) {
             if (needBackend) {
-                virBufferAsprintf(&buf, ",memdev=ram-node%zu", i);
+                if (egmBacked) {
+                    /* Look up the actual backend alias for EGM */
+                    const char *egmPath = def->mems[k]->source.egm.path;
+                    qemuEGMBackendInfo *egmInfo = g_hash_table_lookup(egmBackends, egmPath);
+                    const char *backendAlias = egmInfo ? egmInfo->alias : def->mems[k]->info.alias;
+                    virBufferAsprintf(&buf, ",memdev=%s", backendAlias);
+                } else {
+                    virBufferAsprintf(&buf, ",memdev=ram-node%zu", i);
+                }
             } else {
                 virBufferAsprintf(&buf, ",mem=%llu", memSize / 1024);
             }
@@ -7965,7 +8089,10 @@ qemuBuildMemoryDeviceCommandLine(virCommand *cmd,
     for (i = 0; i < def->nmems; i++) {
         g_autoptr(virJSONValue) props = NULL;

-        if (qemuBuildMemoryDimmBackendStr(cmd, def->mems[i], def, cfg, priv) < 0)
+        if (def->mems[i]->model == VIR_DOMAIN_MEMORY_MODEL_EGM)
+            continue;
+
+        if (qemuBuildMemoryDimmBackendStr(cmd, def->mems[i], def, cfg, priv, NULL) < 0)
             return -1;

         switch (def->mems[i]->model) {
@@ -7985,6 +8112,9 @@ qemuBuildMemoryDeviceCommandLine(virCommand *cmd,
         case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
             break;

+        /* EGM memory backing is via memory-backend-file object */
+        case VIR_DOMAIN_MEMORY_MODEL_EGM:
+            break;
         case VIR_DOMAIN_MEMORY_MODEL_NONE:
         case VIR_DOMAIN_MEMORY_MODEL_LAST:
             break;
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index ac56fc7cb4..14f2b3ec5d 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -7219,6 +7219,7 @@ qemuDomainUpdateMemoryDeviceInfo(virDomainObj *vm,
             break;

         case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+        case VIR_DOMAIN_MEMORY_MODEL_EGM:
         case VIR_DOMAIN_MEMORY_MODEL_NONE:
         case VIR_DOMAIN_MEMORY_MODEL_LAST:
             break;
@@ -7453,7 +7454,8 @@ qemuDomainAlignMemorySizes(virDomainDef *def)
             def->mems[i]->size = VIR_ROUND_UP(def->mems[i]->size, align);
         }

-        hotplugmem += def->mems[i]->size;
+        if (def->mems[i]->model != VIR_DOMAIN_MEMORY_MODEL_EGM)
+            hotplugmem += def->mems[i]->size;

         if (def->mems[i]->size > maxmemkb) {
             virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
@@ -7941,6 +7943,12 @@ qemuDomainDefValidateMemoryHotplugDevice(const virDomainMemoryDef *mem,
                        virDomainMemoryModelTypeToString(mem->model));
         return -1;

+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                       _("hotplug is not supported for the %1$s device"),
+                       virDomainMemoryModelTypeToString(mem->model));
+        return -1;
+
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
         return -1;
@@ -7999,6 +8007,7 @@ qemuDomainDefValidateMemoryHotplug(const virDomainDef *def,
         case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM:
         case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
         case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+        case VIR_DOMAIN_MEMORY_MODEL_EGM:
         case VIR_DOMAIN_MEMORY_MODEL_LAST:
         case VIR_DOMAIN_MEMORY_MODEL_NONE:
             break;
@@ -8046,6 +8055,8 @@ qemuDomainDefValidateMemoryHotplug(const virDomainDef *def,

         case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
             /* sgx epc memory does not support hotplug, skip this check */
+        case VIR_DOMAIN_MEMORY_MODEL_EGM:
+            /* egm memory does not support hotplug, skip this check */
         case VIR_DOMAIN_MEMORY_MODEL_LAST:
         case VIR_DOMAIN_MEMORY_MODEL_NONE:
             break;
diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c
index 7233df888c..97e533bf9a 100644
--- a/src/qemu/qemu_domain_address.c
+++ b/src/qemu/qemu_domain_address.c
@@ -3124,6 +3124,7 @@ qemuDomainAssignMemoryDeviceSlot(virDomainObj *vm,
         return qemuDomainEnsureVirtioAddress(&releaseaddr, vm, &dev);

     case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
         break;
@@ -3151,6 +3152,7 @@ qemuDomainReleaseMemoryDeviceSlot(virDomainObj *vm,
         break;

     case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
         break;
@@ -3185,6 +3187,7 @@ qemuDomainAssignMemorySlots(virDomainDef *def)
             /* handled in qemuDomainAssignPCIAddresses() */
             break;
         case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+        case VIR_DOMAIN_MEMORY_MODEL_EGM:
         case VIR_DOMAIN_MEMORY_MODEL_NONE:
         case VIR_DOMAIN_MEMORY_MODEL_LAST:
             break;
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index f2e024dae3..e0ee056b92 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -6712,6 +6712,7 @@ qemuDomainAttachMemoryConfig(virDomainDef *vmdef,
     case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM:
     case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
     case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
         break;
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index fb426deb1a..890bb052b6 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -7348,6 +7348,7 @@ qemuDomainChangeMemoryLiveValidateChange(const virDomainMemoryDef *oldDef,
     case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
     case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM:
     case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                        _("cannot modify memory of model '%1$s'"),
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
index 494d7ef515..081f0cedfd 100644
--- a/src/qemu/qemu_monitor_json.c
+++ b/src/qemu/qemu_monitor_json.c
@@ -7213,6 +7213,7 @@ qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitor *mon,
         switch ((virDomainMemoryModel) model) {
         case VIR_DOMAIN_MEMORY_MODEL_DIMM:
         case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
+        case VIR_DOMAIN_MEMORY_MODEL_EGM:
         case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
         case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM:
             /* While 'id' attribute is marked as optional in QEMU's QAPI
diff --git a/src/qemu/qemu_postparse.c b/src/qemu/qemu_postparse.c
index dc5ade829a..3c2867edb3 100644
--- a/src/qemu/qemu_postparse.c
+++ b/src/qemu/qemu_postparse.c
@@ -1839,6 +1839,7 @@ qemuDomainDefNumaAutoAdd(virDomainDef *def,
         case VIR_DOMAIN_MEMORY_MODEL_NONE:
         case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM:
         case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+        case VIR_DOMAIN_MEMORY_MODEL_EGM:
         case VIR_DOMAIN_MEMORY_MODEL_LAST:
             break;
         }
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 0e50cd1ccc..d451c12dd0 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -4156,6 +4156,7 @@ qemuProcessDomainMemoryDefNeedHugepagesPath(const virDomainMemoryDef *mem,
     case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
         pagesize = mem->source.virtio_mem.pagesize;
         break;
+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
     case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM:
@@ -4245,6 +4246,7 @@ qemuProcessNeedMemoryBackingPath(virDomainDef *def,
         case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
         case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM:
         case VIR_DOMAIN_MEMORY_MODEL_SGX_EPC:
+        case VIR_DOMAIN_MEMORY_MODEL_EGM:
         case VIR_DOMAIN_MEMORY_MODEL_LAST:
             /* Backed by user provided path. Not stored in memory
              * backing dir anyway. */
diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c
index da08fd17cd..e026e2fdd0 100644
--- a/src/qemu/qemu_validate.c
+++ b/src/qemu/qemu_validate.c
@@ -5861,6 +5861,12 @@ qemuValidateDomainDeviceDefMemory(const virDomainMemoryDef *mem,

         break;

+    case VIR_DOMAIN_MEMORY_MODEL_EGM:
+        if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_ACPI_EGM_MEMORY)) {
+            virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                           _("ACPI EGM memory device is not supported with this QEMU binary"));
+            return -1;
+        }
     case VIR_DOMAIN_MEMORY_MODEL_NONE:
     case VIR_DOMAIN_MEMORY_MODEL_LAST:
         break;
--
2.43.0
On Tue, Nov 25, 2025 at 11:17:03 -0800, Nathan Chen via Devel wrote:
Add qemu CLI support for EGM memory device model: - Specify EGM device path to memory-backend-file object - Support acpi-egm-memory object with id, pci-dev, and node attributes - Consolidate all acpi-egm-memory objects' memory into a single memory-backend-file per EGM chardev specified.
Signed-off-by: Ian May <ianm@nvidia.com> Signed-off-by: Nathan Chen <nathanc@nvidia.com> --- src/qemu/qemu_alias.c | 7 +- src/qemu/qemu_capabilities.c | 2 + src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 158 ++++++++++++++++++++++++++++++--- src/qemu/qemu_domain.c | 13 ++- src/qemu/qemu_domain_address.c | 3 + src/qemu/qemu_driver.c | 1 + src/qemu/qemu_hotplug.c | 1 + src/qemu/qemu_monitor_json.c | 1 + src/qemu/qemu_postparse.c | 1 + src/qemu/qemu_process.c | 2 + src/qemu/qemu_validate.c | 6 ++ 12 files changed, 180 insertions(+), 16 deletions(-)
Note that I'm replying to this patch just due to the issue with qemu capabilities. This is not a full review. [...]
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index f180844e66..3eb12235f4 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -730,6 +730,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */ QEMU_CAPS_DISK_TIMED_STATS, /* timed stats support ('stats-intervals' property of disk frontends) */ QEMU_CAPS_QUERY_ACCELERATORS, /* query-accelerators command */ QEMU_CAPS_MSHV, /* -accel mshv */ + QEMU_CAPS_DEVICE_ACPI_EGM_MEMORY, /* For using extended GPU memory */
QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags;
For any further posting, please separate the addition of the capability, along with the detection and any change to the detected capabilities, into a separate patch.
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index b69fe23236..33848aa781 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3573,12 +3593,17 @@ qemuBuildMemoryDimmBackendStr(virCommand *cmd, virDomainMemoryDef *mem, virDomainDef *def, virQEMUDriverConfig *cfg, - qemuDomainObjPrivate *priv) + qemuDomainObjPrivate *priv, + GHashTable *egmBackends) { g_autoptr(virJSONValue) props = NULL; g_autoptr(virJSONValue) tcProps = NULL; virBitmap *nodemask = NULL; g_autofree char *alias = NULL; + unsigned long long originalSize = 0; + bool isEGM = (mem->model == VIR_DOMAIN_MEMORY_MODEL_EGM); + bool shouldCreateBackend = true; + qemuEGMBackendInfo *egmInfo = NULL;
if (!mem->info.alias) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", @@ -3588,19 +3613,65 @@ qemuBuildMemoryDimmBackendStr(virCommand *cmd,
alias = g_strdup_printf("mem%s", mem->info.alias);
- if (qemuBuildMemoryBackendProps(&props, alias, cfg, priv, - def, mem, true, false, &nodemask) < 0) - return -1; + /* Handle EGM shared backend logic */ + if (isEGM && egmBackends) { + const char *egmPath = mem->source.egm.path; + egmInfo = g_hash_table_lookup(egmBackends, egmPath);
- if (qemuBuildThreadContextProps(&tcProps, &props, def, priv, nodemask) < 0) - return -1; + if (egmInfo) { + alias = g_strdup(egmInfo->alias); + if (egmInfo->created) { + /* Backend already created, skip backend creation */ + shouldCreateBackend = false; + } else { + /* First device for this path - temporarily use accumulated size */ + originalSize = mem->size; + mem->size = egmInfo->totalSize; + egmInfo->created = true; + } + } + }
- if (tcProps && - qemuBuildObjectCommandlineFromJSON(cmd, tcProps) < 0) - return -1; + if (shouldCreateBackend) { + /* Use existing function unchanged */ + if (qemuBuildMemoryBackendProps(&props, alias, cfg, priv, + def, mem, true, false, &nodemask) < 0) { + if (originalSize > 0) + mem->size = originalSize; /* Restore on error */ + return -1; + }
- if (qemuBuildObjectCommandlineFromJSON(cmd, props) < 0) - return -1; + /* Restore original size after backend props are built */ + if (originalSize > 0) + mem->size = originalSize; + + if (qemuBuildThreadContextProps(&tcProps, &props, def, priv, nodemask) < 0) + return -1; + + if (tcProps && + qemuBuildObjectCommandlineFromJSON(cmd, tcProps) < 0) + return -1; + + if (qemuBuildObjectCommandlineFromJSON(cmd, props) < 0) + return -1; + } + + if (isEGM) { + g_autofree char *egmObjStr = NULL; + g_auto(virBuffer) buf = VIR_BUFFER_INITIALIZER; + + virBufferAsprintf(&buf, "acpi-egm-memory,id=%s", mem->info.alias); + + if (mem->target.egm.pciDev) + virBufferAsprintf(&buf, ",pci-dev=%s", mem->target.egm.pciDev); + + if (mem->targetNode >= 0) + virBufferAsprintf(&buf, ",node=%d", mem->targetNode); + + egmObjStr = virBufferContentAndReset(&buf); + + virCommandAddArgList(cmd, "-object", egmObjStr, NULL); + }
return 0; }
[...]
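For readers following the hunk above, here is a minimal standalone sketch of how the `acpi-egm-memory` object string is assembled from the alias, the optional `pci-dev`, and the optional `node`. The `struct egm_dev` type and `egm_format_object()` are hypothetical stand-ins for the `virDomainMemoryDef` fields and the `virBuffer` calls in the patch; they are not libvirt API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the virDomainMemoryDef fields used by the patch. */
struct egm_dev {
    const char *alias;   /* mem->info.alias, e.g. "egm0" */
    const char *pci_dev; /* mem->target.egm.pciDev, may be NULL */
    int target_node;     /* mem->targetNode, negative when unset */
};

/* Format the value passed to -object.
 * Returns the string length, or -1 on bad input or truncation. */
int
egm_format_object(const struct egm_dev *dev, char *out, size_t outlen)
{
    size_t used = 0;
    int n;

    if (!dev || !dev->alias || !out || outlen == 0)
        return -1;

    n = snprintf(out, outlen, "acpi-egm-memory,id=%s", dev->alias);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    used = (size_t)n;

    /* pci-dev is only emitted when the XML provided one */
    if (dev->pci_dev) {
        n = snprintf(out + used, outlen - used, ",pci-dev=%s", dev->pci_dev);
        if (n < 0 || (size_t)n >= outlen - used)
            return -1;
        used += (size_t)n;
    }

    /* node is only emitted for a non-negative target node */
    if (dev->target_node >= 0) {
        n = snprintf(out + used, outlen - used, ",node=%d", dev->target_node);
        if (n < 0 || (size_t)n >= outlen - used)
            return -1;
        used += (size_t)n;
    }

    assert(strlen(out) == used);
    return (int)used;
}
```

With alias `egm0`, PCI device `ua-hostdev0`, and node 0, this yields the same argument shape as the cover letter example: `acpi-egm-memory,id=egm0,pci-dev=ua-hostdev0,node=0`.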
@@ -7835,6 +7910,37 @@ qemuBuildNumaCommandLine(virQEMUDriverConfig *cfg, hmat = true; }
+ /* Pre-scan EGM devices to group by path and calculate total sizes */ + egmBackends = g_hash_table_new_full(g_str_hash, g_str_equal, g_free, + (GDestroyNotify)qemuEGMBackendInfoFree);
For creating hash tables use virHashNew, which uses our hashing function.
+ + for (i = 0; i < def->nmems; i++) { + if (def->mems[i]->model == VIR_DOMAIN_MEMORY_MODEL_EGM) { + const char *egmPath = def->mems[i]->source.egm.path; + qemuEGMBackendInfo *info = g_hash_table_lookup(egmBackends, egmPath); + + if (!info) { + info = g_new0(qemuEGMBackendInfo, 1); + info->alias = g_strdup_printf("memegm%zu", egmBackendCount); + egmBackendCount++; + info->totalSize = def->mems[i]->size; + info->created = false; + info->firstMem = def->mems[i]; + g_hash_table_insert(egmBackends, g_strdup(egmPath), info); + } else { + info->totalSize += def->mems[i]->size; + } + } + } + + /* Build the actual backend and device objects */ + for (i = 0; i < def->nmems; i++) { + if (def->mems[i]->model == VIR_DOMAIN_MEMORY_MODEL_EGM) { + if (qemuBuildMemoryDimmBackendStr(cmd, def->mems[i], def, cfg, priv, egmBackends) < 0) + goto cleanup; + } + } + nodeBackends = g_new0(virJSONValue *, ncells); nodemask = g_new0(virBitmap *, ncells);
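The pre-scan in the hunk above groups EGM devices by backing path and sums their sizes, so that one `memory-backend-file` can serve every device sharing a chardev. A minimal standalone sketch of that grouping follows, assuming hypothetical `struct egm_mem` and `struct egm_backend` types and using a linear scan over a small array rather than a hash table; inside libvirt the table would presumably be created with `virHashNew`, as requested in the review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified view of one <memory model='egm'> device. */
struct egm_mem {
    const char *path;            /* source path, e.g. "/dev/egm4" */
    unsigned long long size_kib; /* target size in KiB */
};

/* One consolidated backend per distinct path. */
struct egm_backend {
    const char *path;
    char alias[32];               /* "memegm<N>", numbered in first-seen order */
    unsigned long long total_kib; /* sum of all device sizes for this path */
};

/* Group devices by path. Returns the number of backends written to 'out',
 * or -1 if 'max' entries are not enough. */
int
egm_group_by_path(const struct egm_mem *mems, size_t nmems,
                  struct egm_backend *out, size_t max)
{
    size_t nout = 0;
    size_t i, j;

    assert(mems || nmems == 0);

    for (i = 0; i < nmems; i++) {
        /* look for an existing backend with the same path */
        for (j = 0; j < nout; j++) {
            if (strcmp(out[j].path, mems[i].path) == 0)
                break;
        }

        if (j == nout) {
            /* first device seen for this path: open a new backend */
            if (nout == max)
                return -1;
            out[nout].path = mems[i].path;
            snprintf(out[nout].alias, sizeof(out[nout].alias),
                     "memegm%zu", nout);
            out[nout].total_kib = 0;
            nout++;
        }

        /* j indexes either the match or the entry just created */
        out[j].total_kib += mems[i].size_kib;
    }

    return (int)nout;
}
```

Fed the four 131072 KiB devices from the test XML in this series (two on `/dev/egm4`, two on `/dev/egm5`), the sketch produces two backends of 262144 KiB each, which is 268435456 bytes and agrees with the `size` values in the expected `.args` file.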
@@ -7870,8 +7976,18 @@ qemuBuildNumaCommandLine(virQEMUDriverConfig *cfg, for (i = 0; i < ncells; i++) { ssize_t initiator = virDomainNumaGetNodeInitiator(def->numa, i); unsigned long long memSize = virDomainNumaGetNodeMemorySize(def->numa, i); + bool egmBacked = false; + size_t k; + + for (k = 0; k < def->nmems; k++) { + if (def->mems[k]->model == VIR_DOMAIN_MEMORY_MODEL_EGM && + def->mems[k]->targetNode == (int)i) { + egmBacked = true; + break; + } + }
- if (needBackend && memSize > 0) { + if (needBackend && memSize > 0 && !egmBacked) { g_autoptr(virJSONValue) tcProps = NULL;
if (qemuBuildThreadContextProps(&tcProps, &nodeBackends[i], @@ -7901,7 +8017,15 @@ qemuBuildNumaCommandLine(virQEMUDriverConfig *cfg,
if (memSize > 0) { if (needBackend) { - virBufferAsprintf(&buf, ",memdev=ram-node%zu", i); + if (egmBacked) { + /* Look up the actual backend alias for EGM */ + const char *egmPath = def->mems[k]->source.egm.path; + qemuEGMBackendInfo *egmInfo = g_hash_table_lookup(egmBackends, egmPath); + const char *backendAlias = egmInfo ? egmInfo->alias : def->mems[k]->info.alias; + virBufferAsprintf(&buf, ",memdev=%s", backendAlias); + } else { + virBufferAsprintf(&buf, ",memdev=ram-node%zu", i); + } } else { virBufferAsprintf(&buf, ",mem=%llu", memSize / 1024); }
Hmm, 'qemuBuildMemoryDimmBackendStr' and 'qemuBuildNumaCommandLine' are getting quite out of hand with these patches (they're kind of a mess already in their current state), and all of that to support some niche hardware. Please consider refactoring the code first to simplify it. This will be a nightmare to review otherwise.
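One possible direction for such a refactor, sketched here purely as an illustration and not as the approach the authors will take, is to move the per-node EGM lookup and the `memdev=` selection out of the NUMA loop into small helpers. The `struct numa_mem_dev` type and both function names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified memory device record. */
struct numa_mem_dev {
    int is_egm;                /* model == VIR_DOMAIN_MEMORY_MODEL_EGM */
    int target_node;           /* guest NUMA node, negative when unset */
    const char *backend_alias; /* alias of the shared backend, e.g. "memegm0" */
};

/* Return the first EGM device targeting 'node', or NULL if there is none. */
const struct numa_mem_dev *
numa_find_egm_for_node(const struct numa_mem_dev *mems, size_t nmems,
                       size_t node)
{
    size_t k;

    for (k = 0; k < nmems; k++) {
        if (mems[k].is_egm && mems[k].target_node >= 0 &&
            (size_t)mems[k].target_node == node)
            return &mems[k];
    }
    return NULL;
}

/* Write the ",memdev=..." fragment for one NUMA cell into 'out'.
 * Returns the fragment length, or -1 on truncation. */
int
numa_format_memdev(const struct numa_mem_dev *mems, size_t nmems,
                   size_t node, char *out, size_t outlen)
{
    const struct numa_mem_dev *egm = numa_find_egm_for_node(mems, nmems, node);
    int n;

    assert(out && outlen > 0);

    if (egm) {
        /* EGM-backed cell: reuse the consolidated backend */
        assert(egm->backend_alias && strlen(egm->backend_alias) > 0);
        n = snprintf(out, outlen, ",memdev=%s", egm->backend_alias);
    } else {
        /* regular cell: keep the existing ram-node<N> naming */
        n = snprintf(out, outlen, ",memdev=ram-node%zu", node);
    }

    return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

Keeping the lookup in its own function also avoids relying on the loop index `k` after the search loop has finished, which the posted hunk does when it dereferences `def->mems[k]`.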
On 11/25/2025 1:12 PM, Peter Krempa wrote:
On Tue, Nov 25, 2025 at 11:17:03 -0800, Nathan Chen via Devel wrote:
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index f180844e66..3eb12235f4 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -730,6 +730,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */ QEMU_CAPS_DISK_TIMED_STATS, /* timed stats support ('stats-intervals' property of disk frontends) */ QEMU_CAPS_QUERY_ACCELERATORS, /* query-accelerators command */ QEMU_CAPS_MSHV, /* -accel mshv */ + QEMU_CAPS_DEVICE_ACPI_EGM_MEMORY, /* For using extended GPU memory */
QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags;
For any further posting, please separate the addition of the capability, along with the detection and any change to the detected capabilities, into a separate patch.
Ok, I will separate it out in the next revision.
+ /* Pre-scan EGM devices to group by path and calculate total sizes */ + egmBackends = g_hash_table_new_full(g_str_hash, g_str_equal, g_free, + (GDestroyNotify)qemuEGMBackendInfoFree); For creating hash tables use virHashNew, which uses our hashing function.
Will do, thanks.
if (memSize > 0) { if (needBackend) { - virBufferAsprintf(&buf, ",memdev=ram-node%zu", i); + if (egmBacked) { + /* Look up the actual backend alias for EGM */ + const char *egmPath = def->mems[k]->source.egm.path; + qemuEGMBackendInfo *egmInfo = g_hash_table_lookup(egmBackends, egmPath); + const char *backendAlias = egmInfo ? egmInfo->alias : def->mems[k]->info.alias; + virBufferAsprintf(&buf, ",memdev=%s", backendAlias); + } else { + virBufferAsprintf(&buf, ",memdev=ram-node%zu", i); + } } else { virBufferAsprintf(&buf, ",mem=%llu", memSize / 1024); }
Hmm, 'qemuBuildMemoryDimmBackendStr' and 'qemuBuildNumaCommandLine' are getting quite out of hand with these patches (they're kind of a mess already in their current state), and all of that to support some niche hardware.
Please consider refactoring the code first to simplify it. This will be a nightmare to review otherwise.
Ok, I will take some time to simplify the structure here in the next revision, either separating out the logic from qemuBuildMemoryDimmBackendStr and qemuBuildNumaCommandLine or making the implementation simpler within these functions. Thanks for taking a preliminary look at these! -Nathan
From: Ian May <ianm@nvidia.com> Add test coverage for the ACPI EGM memory device feature: - Add test case to qemuxmlconftest.c for aarch64 architecture - Add acpi-egm-memory capability to QEMU 10.0.0 aarch64 capabilities - Create test input XML with EGM device configuration - Generate expected output XML and QEMU command line args - Update validation to skip filesystem checks during tests The test validates XML parsing, formatting, device validation, and QEMU command line generation for the EGM device. Signed-off-by: Ian May <ianm@nvidia.com> Signed-off-by: Nathan Chen <nathanc@nvidia.com> --- src/conf/domain_conf.c | 5 +- src/conf/domain_postparse.c | 5 +- src/qemu/qemu_domain.c | 2 + src/util/virfile.h | 2 +- tests/meson.build | 1 + tests/qemuegmmock.c | 67 ++++++++++ .../acpi-egm-memory.aarch64-latest.args | 47 +++++++ .../acpi-egm-memory.aarch64-latest.xml | 124 ++++++++++++++++++ tests/qemuxmlconfdata/acpi-egm-memory.xml | 124 ++++++++++++++++++ tests/qemuxmlconftest.c | 8 +- 10 files changed, 381 insertions(+), 4 deletions(-) create mode 100644 tests/qemuegmmock.c create mode 100644 tests/qemuxmlconfdata/acpi-egm-memory.aarch64-latest.args create mode 100644 tests/qemuxmlconfdata/acpi-egm-memory.aarch64-latest.xml create mode 100644 tests/qemuxmlconfdata/acpi-egm-memory.xml diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 2cda32fa6e..3fba40476f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -8813,8 +8813,11 @@ virDomainDefGetMemoryInitial(const virDomainDef *def) size_t i; unsigned long long ret = def->mem.total_memory; - for (i = 0; i < def->nmems; i++) + for (i = 0; i < def->nmems; i++) { + if (def->mems[i]->model == VIR_DOMAIN_MEMORY_MODEL_EGM) + continue; ret -= def->mems[i]->size; + } return ret; } diff --git a/src/conf/domain_postparse.c b/src/conf/domain_postparse.c index 0181d21f0e..bb4a61b7d8 100644 --- a/src/conf/domain_postparse.c +++ b/src/conf/domain_postparse.c @@ -44,8 +44,11 @@ 
virDomainDefPostParseMemory(virDomainDef *def, numaMemory = virDomainNumaGetMemorySize(def->numa); /* calculate the sizes of hotplug memory */ - for (i = 0; i < def->nmems; i++) + for (i = 0; i < def->nmems; i++) { + if (def->mems[i]->model == VIR_DOMAIN_MEMORY_MODEL_EGM) + continue; hotplugMemory += def->mems[i]->size; + } if (numaMemory) { /* update the sizes in XML if nothing was set in the XML or ABI update diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 14f2b3ec5d..1b0b375a49 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -8038,6 +8038,8 @@ qemuDomainDefValidateMemoryHotplug(const virDomainDef *def, } for (i = 0; i < def->nmems; i++) { + if (def->mems[i]->model == VIR_DOMAIN_MEMORY_MODEL_EGM) + continue; hotplugMemory += def->mems[i]->size; switch (def->mems[i]->model) { diff --git a/src/util/virfile.h b/src/util/virfile.h index ce2ffb8ed4..5203ef4425 100644 --- a/src/util/virfile.h +++ b/src/util/virfile.h @@ -167,7 +167,7 @@ int virFileReadHeaderQuiet(const char *path, int maxlen, char **buf) int virFileReadLimFD(int fd, int maxlen, char **buf) G_GNUC_WARN_UNUSED_RESULT ATTRIBUTE_NONNULL(3); int virFileReadAll(const char *path, int maxlen, char **buf) - G_GNUC_WARN_UNUSED_RESULT ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3); + G_GNUC_WARN_UNUSED_RESULT ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3) ATTRIBUTE_MOCKABLE; int virFileReadAllQuiet(const char *path, int maxlen, char **buf) G_GNUC_WARN_UNUSED_RESULT ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3); int virFileReadBufQuiet(const char *file, char *buf, int len) diff --git a/tests/meson.build b/tests/meson.build index bb6ee6b4ee..28ee8591a3 100644 --- a/tests/meson.build +++ b/tests/meson.build @@ -174,6 +174,7 @@ if conf.has('WITH_QEMU') { 'name': 'qemucaps2xmlmock' }, { 'name': 'qemucapsprobemock', 'link_with': [ test_qemu_driver_lib ] }, { 'name': 'qemucpumock' }, + { 'name': 'qemuegmmock' }, { 'name': 'qemuhotplugmock', 'link_with': [ test_qemu_driver_lib, 
test_utils_qemu_lib, test_utils_lib ] }, { 'name': 'qemuxml2argvmock' }, { 'name': 'virhostidmock' }, diff --git a/tests/qemuegmmock.c b/tests/qemuegmmock.c new file mode 100644 index 0000000000..c915212f45 --- /dev/null +++ b/tests/qemuegmmock.c @@ -0,0 +1,67 @@ +/* + * Copyright (C) 2024 Red Hat, Inc. + * SPDX-License-Identifier: LGPL-2.1-or-later + */ + +#include <config.h> +#include <unistd.h> + +#include "internal.h" +#include "virfile.h" +#include "virmock.h" + +static bool (*real_virFileExists)(const char *path); +static int (*real_access)(const char *path, int mode); +static int (*real_virFileReadAll)(const char *path, int maxlen, char **buf); + +static void +init_syms(void) +{ + if (real_virFileExists && real_access && real_virFileReadAll) + return; + + VIR_MOCK_REAL_INIT(virFileExists); + VIR_MOCK_REAL_INIT(access); + VIR_MOCK_REAL_INIT(virFileReadAll); +} + +bool +virFileExists(const char *path) +{ + init_syms(); + + /* Mock EGM device paths for testing */ + if (g_str_has_prefix(path, "/dev/egm") || + g_str_has_prefix(path, "/sys/class/egm/")) + return true; + + return real_virFileExists(path); +} + +int +access(const char *path, int mode) +{ + init_syms(); + + /* Mock EGM device paths for testing */ + if (g_str_has_prefix(path, "/dev/egm") || + g_str_has_prefix(path, "/sys/class/egm/")) + return 0; /* success */ + + return real_access(path, mode); +} + +int +virFileReadAll(const char *path, int maxlen, char **buf) +{ + init_syms(); + + /* Mock EGM GPU device file for testing */ + if (g_str_has_prefix(path, "/sys/class/egm/") && + g_str_has_suffix(path, "/gpu_devices")) { + *buf = g_strdup("0000:01:00.0\n"); + return strlen(*buf); + } + + return real_virFileReadAll(path, maxlen, buf); +} diff --git a/tests/qemuxmlconfdata/acpi-egm-memory.aarch64-latest.args b/tests/qemuxmlconfdata/acpi-egm-memory.aarch64-latest.args new file mode 100644 index 0000000000..41d1fcc026 --- /dev/null +++ b/tests/qemuxmlconfdata/acpi-egm-memory.aarch64-latest.args @@ -0,0 
+1,47 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/var/lib/libvirt/qemu/domain--1-egm \ +USER=test \ +LOGNAME=test \ +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain--1-egm/.local/share \ +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain--1-egm/.cache \ +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain--1-egm/.config \ +/usr/bin/qemu-system-aarch64 \ +-name guest=egm,debug-threads=on \ +-S \ +-object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain--1-egm/master-key.aes"}' \ +-machine virt,usb=off,gic-version=3,dump-guest-core=off,acpi=off \ +-accel kvm \ +-cpu host \ +-m size=524288k,maxmem=524288k \ +-overcommit mem-lock=off \ +-smp 4,sockets=2,dies=1,clusters=1,cores=2,threads=1 \ +-object '{"qom-type":"memory-backend-file","id":"memegm0","mem-path":"/dev/egm4","share":true,"size":268435456}' \ +-object acpi-egm-memory,id=egm0,pci-dev=ua-hostdev0,node=0 \ +-object acpi-egm-memory,id=egm1,pci-dev=ua-hostdev1,node=0 \ +-object '{"qom-type":"memory-backend-file","id":"memegm1","mem-path":"/dev/egm5","share":true,"size":268435456}' \ +-object acpi-egm-memory,id=egm2,pci-dev=ua-hostdev2,node=1 \ +-object acpi-egm-memory,id=egm3,pci-dev=ua-hostdev3,node=1 \ +-numa node,nodeid=0,cpus=0-1,memdev=memegm0 \ +-numa node,nodeid=1,cpus=2-3,memdev=memegm1 \ +-uuid 00010203-0405-4607-8809-0a0b0c0d0e0f \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,fd=1729,server=on,wait=off \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-boot strict=on \ +-device '{"driver":"pcie-root-port","port":8,"chassis":1,"id":"pci.1","bus":"pcie.0","multifunction":true,"addr":"0x1"}' \ +-device '{"driver":"pcie-root-port","port":9,"chassis":2,"id":"pci.2","bus":"pcie.0","addr":"0x1.0x1"}' \ +-device '{"driver":"pcie-root-port","port":10,"chassis":3,"id":"pci.3","bus":"pcie.0","addr":"0x2"}' \ +-device '{"driver":"pcie-root-port","port":11,"chassis":4,"id":"pci.4","bus":"pcie.0","addr":"0x3"}' \ +-device 
'{"driver":"pcie-root-port","port":12,"chassis":5,"id":"pci.5","bus":"pcie.0","addr":"0x4"}' \ +-audiodev '{"id":"audio1","driver":"none"}' \ +-device '{"driver":"vfio-pci","host":"0000:01:00.0","id":"ua-hostdev0","bus":"pci.1","addr":"0x0"}' \ +-device '{"driver":"vfio-pci","host":"0000:02:00.0","id":"ua-hostdev1","bus":"pci.3","addr":"0x0"}' \ +-device '{"driver":"vfio-pci","host":"0000:03:00.0","id":"ua-hostdev2","bus":"pci.4","addr":"0x0"}' \ +-device '{"driver":"vfio-pci","host":"0000:04:00.0","id":"ua-hostdev3","bus":"pci.5","addr":"0x0"}' \ +-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \ +-msg timestamp=on diff --git a/tests/qemuxmlconfdata/acpi-egm-memory.aarch64-latest.xml b/tests/qemuxmlconfdata/acpi-egm-memory.aarch64-latest.xml new file mode 100644 index 0000000000..bd73d613e5 --- /dev/null +++ b/tests/qemuxmlconfdata/acpi-egm-memory.aarch64-latest.xml @@ -0,0 +1,124 @@ +<domain type='kvm'> + <name>egm</name> + <uuid>00010203-0405-4607-8809-0a0b0c0d0e0f</uuid> + <maxMemory unit='KiB'>524288</maxMemory> + <memory unit='KiB'>524288</memory> + <currentMemory unit='KiB'>524288</currentMemory> + <vcpu placement='static'>4</vcpu> + <os> + <type arch='aarch64' machine='virt'>hvm</type> + <boot dev='hd'/> + </os> + <features> + <gic version='3'/> + </features> + <cpu mode='host-passthrough' check='none'> + <topology sockets='2' dies='1' clusters='1' cores='2' threads='1'/> + <numa> + <cell id='0' cpus='0-1' memory='262144' unit='KiB'/> + <cell id='1' cpus='2-3' memory='262144' unit='KiB'/> + </numa> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-aarch64</emulator> + <controller type='pci' index='0' model='pcie-root'/> + <controller type='pci' index='1' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='1' port='0x8'/> + <address type='pci' domain='0x0000' bus='0x00' 
slot='0x01' function='0x0' multifunction='on'/> + </controller> + <controller type='pci' index='2' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='2' port='0x9'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <controller type='pci' index='3' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='3' port='0xa'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> + </controller> + <controller type='pci' index='4' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='4' port='0xb'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </controller> + <controller type='pci' index='5' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='5' port='0xc'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </controller> + <audio id='1' type='none'/> + <hostdev mode='subsystem' type='pci' managed='yes'> + <source> + <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/> + </source> + <alias name='ua-hostdev0'/> + <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <source> + <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/> + </source> + <alias name='ua-hostdev1'/> + <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <source> + <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/> + </source> + <alias name='ua-hostdev2'/> + <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <source> + <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/> + </source> + <alias name='ua-hostdev3'/> + <address type='pci' domain='0x0000' bus='0x05' 
slot='0x00' function='0x0'/> + </hostdev> + <memory model='egm' access='shared'> + <source> + <path>/dev/egm4</path> + </source> + <target> + <size unit='KiB'>131072</size> + <node>0</node> + <pciDev>ua-hostdev0</pciDev> + </target> + </memory> + <memory model='egm' access='shared'> + <source> + <path>/dev/egm4</path> + </source> + <target> + <size unit='KiB'>131072</size> + <node>0</node> + <pciDev>ua-hostdev1</pciDev> + </target> + </memory> + <memory model='egm' access='shared'> + <source> + <path>/dev/egm5</path> + </source> + <target> + <size unit='KiB'>131072</size> + <node>1</node> + <pciDev>ua-hostdev2</pciDev> + </target> + </memory> + <memory model='egm' access='shared'> + <source> + <path>/dev/egm5</path> + </source> + <target> + <size unit='KiB'>131072</size> + <node>1</node> + <pciDev>ua-hostdev3</pciDev> + </target> + </memory> + </devices> +</domain> diff --git a/tests/qemuxmlconfdata/acpi-egm-memory.xml b/tests/qemuxmlconfdata/acpi-egm-memory.xml new file mode 100644 index 0000000000..bd73d613e5 --- /dev/null +++ b/tests/qemuxmlconfdata/acpi-egm-memory.xml @@ -0,0 +1,124 @@ +<domain type='kvm'> + <name>egm</name> + <uuid>00010203-0405-4607-8809-0a0b0c0d0e0f</uuid> + <maxMemory unit='KiB'>524288</maxMemory> + <memory unit='KiB'>524288</memory> + <currentMemory unit='KiB'>524288</currentMemory> + <vcpu placement='static'>4</vcpu> + <os> + <type arch='aarch64' machine='virt'>hvm</type> + <boot dev='hd'/> + </os> + <features> + <gic version='3'/> + </features> + <cpu mode='host-passthrough' check='none'> + <topology sockets='2' dies='1' clusters='1' cores='2' threads='1'/> + <numa> + <cell id='0' cpus='0-1' memory='262144' unit='KiB'/> + <cell id='1' cpus='2-3' memory='262144' unit='KiB'/> + </numa> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-aarch64</emulator> + <controller type='pci' index='0' model='pcie-root'/> + 
<controller type='pci' index='1' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='1' port='0x8'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/> + </controller> + <controller type='pci' index='2' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='2' port='0x9'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <controller type='pci' index='3' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='3' port='0xa'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> + </controller> + <controller type='pci' index='4' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='4' port='0xb'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </controller> + <controller type='pci' index='5' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='5' port='0xc'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </controller> + <audio id='1' type='none'/> + <hostdev mode='subsystem' type='pci' managed='yes'> + <source> + <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/> + </source> + <alias name='ua-hostdev0'/> + <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <source> + <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/> + </source> + <alias name='ua-hostdev1'/> + <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/> + </hostdev> + <hostdev mode='subsystem' type='pci' managed='yes'> + <source> + <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/> + </source> + <alias name='ua-hostdev2'/> + <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/> + </hostdev> + <hostdev mode='subsystem' type='pci' 
managed='yes'> + <source> + <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/> + </source> + <alias name='ua-hostdev3'/> + <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/> + </hostdev> + <memory model='egm' access='shared'> + <source> + <path>/dev/egm4</path> + </source> + <target> + <size unit='KiB'>131072</size> + <node>0</node> + <pciDev>ua-hostdev0</pciDev> + </target> + </memory> + <memory model='egm' access='shared'> + <source> + <path>/dev/egm4</path> + </source> + <target> + <size unit='KiB'>131072</size> + <node>0</node> + <pciDev>ua-hostdev1</pciDev> + </target> + </memory> + <memory model='egm' access='shared'> + <source> + <path>/dev/egm5</path> + </source> + <target> + <size unit='KiB'>131072</size> + <node>1</node> + <pciDev>ua-hostdev2</pciDev> + </target> + </memory> + <memory model='egm' access='shared'> + <source> + <path>/dev/egm5</path> + </source> + <target> + <size unit='KiB'>131072</size> + <node>1</node> + <pciDev>ua-hostdev3</pciDev> + </target> + </memory> + </devices> +</domain> diff --git a/tests/qemuxmlconftest.c b/tests/qemuxmlconftest.c index de03c58c8a..07ffdf164c 100644 --- a/tests/qemuxmlconftest.c +++ b/tests/qemuxmlconftest.c @@ -3211,6 +3211,10 @@ mymain(void) DO_TEST_CAPS_LATEST("devices-acpi-index"); + DO_TEST_CAPS_ARCH_LATEST_FULL("acpi-egm-memory", "aarch64", + ARG_QEMU_CAPS, QEMU_CAPS_DEVICE_ACPI_EGM_MEMORY, + ARG_END); + DO_TEST_CAPS_ARCH_LATEST_FULL("hvf-x86_64-q35-headless", "x86_64", ARG_CAPS_VARIANT, "+hvf", ARG_END); DO_TEST_CAPS_ARCH_LATEST_FULL("hvf-aarch64-virt-headless", "aarch64", ARG_CAPS_VARIANT, "+hvf", ARG_END); /* HVF guests should not work on Linux with KVM */ @@ -3317,7 +3321,9 @@ VIR_TEST_MAIN_PRELOAD(mymain, VIR_TEST_MOCK("domaincaps"), VIR_TEST_MOCK("virrandom"), VIR_TEST_MOCK("qemucpu"), - VIR_TEST_MOCK("virnuma")) + VIR_TEST_MOCK("virnuma"), + VIR_TEST_MOCK("virpci"), + VIR_TEST_MOCK("qemuegm")) #else -- 2.43.0
On Tue, Nov 25, 2025 at 11:17:00 -0800, Nathan Chen via Devel wrote: [...]
The corresponding qemu command line will include the following arguments:
-object '{"qom-type":"memory-backend-file","id":"memegm0","mem-path":"/dev/egm4","share":true,"prealloc":true,"size":17179869184}' \
-object acpi-egm-memory,id=egm0,pci-dev=ua-hostdev0,node=0 \
-object acpi-egm-memory,id=egm1,pci-dev=ua-hostdev1,node=0 \
-numa node,nodeid=0,cpus=0-4,memdev=memegm0 \
Changes from RFCv2:
- Decouple host EGM chardev name from guest EGM ID
- Consolidate all acpi-egm-memory objects' memory into a single
  memory-backend-file per EGM chardev specified.

Changes from RFCv1:
- Use existing memory device infrastructure to represent EGM configuration
- Added support for multiple EGM devices
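The per-chardev consolidation listed under the RFCv2 changes can be sketched in C as follows. This is a minimal illustration, not the code from the series: the struct names `egm_mem` and `egm_backend` and the function `egm_consolidate` are hypothetical, and the sketch assumes that the consolidated backend size is simply the sum of the sizes of all EGM memory devices sharing one host path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one <memory model='egm'> element. */
struct egm_mem {
    const char *path;            /* host EGM chardev, e.g. "/dev/egm4" */
    unsigned long long size_kib; /* <size unit='KiB'> */
    int node;                    /* guest NUMA node */
    const char *pci_dev;         /* alias of the associated hostdev */
};

/* Hypothetical record for one consolidated memory-backend-file object. */
struct egm_backend {
    const char *path;
    unsigned long long size_bytes;
    int node;
};

/*
 * Group EGM memory devices by host path and sum their sizes, so that each
 * distinct path ends up with exactly one backend entry.
 * Returns the number of backends written to 'out', or -1 if 'out' is too small.
 * Assumption: devices sharing a path also share a guest NUMA node; the node
 * of the first device seen is kept.
 */
int egm_consolidate(const struct egm_mem *devs, size_t ndevs,
                    struct egm_backend *out, size_t out_cap)
{
    size_t nout = 0;

    assert(devs != NULL || ndevs == 0);
    assert(out != NULL || out_cap == 0);

    for (size_t i = 0; i < ndevs; i++) {
        size_t j;

        /* Look for an existing backend with the same host path. */
        for (j = 0; j < nout; j++) {
            if (strcmp(out[j].path, devs[i].path) == 0)
                break;
        }

        /* First device for this path: create a new backend entry. */
        if (j == nout) {
            if (nout == out_cap)
                return -1;
            out[nout].path = devs[i].path;
            out[nout].size_bytes = 0;
            out[nout].node = devs[i].node;
            nout++;
        }

        /* KiB to bytes, accumulated per host path. */
        out[j].size_bytes += devs[i].size_kib * 1024ULL;
    }

    return (int)nout;
}
```

Under that assumption, two 8388608 KiB devices backed by /dev/egm4 produce one backend of 17179869184 bytes, which is the size shown in the quoted command line, while each device would still get its own acpi-egm-memory object.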
This series is on Github: https://github.com/NathanChenNVIDIA/libvirt/tree/egm-11-24-25
Thanks, Nathan
[0] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/6RU7R...
[1] https://github.com/ianm-nv/qemu/tree/6.8_ghvirt_egm_may2025
[2] https://github.com/NVIDIA/QEMU/commit/32db1b74fb99c0571724c7e69485e89098c148...
What is the state of this qemu series? I was looking at the aarch64 capability update and didn't see any mention of 'acpi-egm-memory', so I went looking at the qemu mailing list and didn't find any submission adding the aforementioned object type.

Note that unless the qemu functionality is committed to the upstream repository, we will not be taking any patches using it.
On 11/27/2025 1:05 AM, Peter Krempa wrote:
[0] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/6RU7R2NQEDEUU7JFPM6DTXJBWUDXTYWE/
[1] https://github.com/ianm-nv/qemu/tree/6.8_ghvirt_egm_may2025
[2] https://github.com/NVIDIA/QEMU/commit/32db1b74fb99c0571724c7e69485e89098c14874

What is the state of this qemu series? I was looking at the aarch64 capability update and didn't see any mention of 'acpi-egm-memory', so I went looking at the qemu mailing list and didn't find any submission adding the aforementioned object type.
The associated qemu series has not been submitted for upstream feedback yet, and the underlying kernel series had its first RFC [0] submitted for upstream review in September of this year.

[0] https://lore.kernel.org/all/20250904040828.319452-1-ankita@nvidia.com/
Note that unless the qemu functionality is committed to the upstream repository, we will not be taking any patches using it.
Understood. We are submitting these patches early to identify potential issues with our libvirt implementation while the qemu portion is still being developed.
Participants (2): Nathan Chen, Peter Krempa