[PATCH 00/10] Introduce virtio-mem <memory/> model

Technically, this is another version of: https://www.redhat.com/archives/libvir-list/2020-December/msg00199.html But since virtio-pmem part is pushed now, I've reworked virtio-mem a bit and sending it as a new series. For curious ones, David summarized behaviour well when implementing virtio-mem support in kernel: https://lwn.net/Articles/755423/ For less curious ones: # virsh update-memory $dom --requested-size 4G adds additional 4GiB of RAM to guest; # virsh update-memory $dom --requested-size 0 removes those 4GiB added earlier. Patches are also available on my GitLab: https://gitlab.com/MichalPrivoznik/libvirt/-/tree/virtio_mem_v3 Michal Prívozník (10): virhostmem: Introduce virHostMemGetTHPSize() qemu_capabilities: Introduce QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI conf: Introduce virtio-mem <memory/> model qemu: Build command line for virtio-mem qemu: Wire up <memory/> live update qemu: Wire up MEMORY_DEVICE_SIZE_CHANGE event qemu: Refresh the actual size of virtio-mem on monitor reconnect qemu: Recalculate balloon on MEMORY_DEVICE_SIZE_CHANGE event and reconnect virsh: Introduce update-memory command news: document recent virtio memory addition NEWS.rst | 7 + docs/formatdomain.rst | 42 +++- docs/manpages/virsh.rst | 31 +++ docs/schemas/domaincommon.rng | 16 ++ src/conf/domain_conf.c | 100 ++++++++- src/conf/domain_conf.h | 13 ++ src/conf/domain_validate.c | 39 ++++ src/libvirt_private.syms | 3 + src/qemu/qemu_alias.c | 10 +- src/qemu/qemu_capabilities.c | 2 + src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 13 +- src/qemu/qemu_domain.c | 50 ++++- src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 37 ++- src/qemu/qemu_driver.c | 211 +++++++++++++++++- src/qemu/qemu_hotplug.c | 18 ++ src/qemu/qemu_hotplug.h | 5 + src/qemu/qemu_monitor.c | 37 +++ src/qemu/qemu_monitor.h | 27 +++ src/qemu/qemu_monitor_json.c | 94 ++++++-- src/qemu/qemu_monitor_json.h | 5 + src/qemu/qemu_process.c | 101 ++++++++- src/qemu/qemu_validate.c | 8 + src/security/security_apparmor.c | 1 + src/security/security_dac.c | 2 + src/security/security_selinux.c | 2 + src/util/virhostmem.c | 63 ++++++ src/util/virhostmem.h | 3 + tests/domaincapsmock.c | 9 + .../caps_5.1.0.x86_64.xml | 1 + .../caps_5.2.0.x86_64.xml | 1 + ...mory-hotplug-virtio-mem.x86_64-latest.args | 49 ++++ .../memory-hotplug-virtio-mem.xml | 66 ++++++ tests/qemuxml2argvtest.c | 1 + ...emory-hotplug-virtio-mem.x86_64-latest.xml | 1 + tests/qemuxml2xmltest.c | 1 + tools/virsh-domain.c | 154 +++++++++++++ 38 files changed, 1165 insertions(+), 60 deletions(-) create mode 100644 tests/qemuxml2argvdata/memory-hotplug-virtio-mem.x86_64-latest.args create mode 100644 tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml create mode 120000 tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml -- 2.26.2

New virHostMemGetTHPSize() is introduced which allows caller to obtain THP PMD (Page Middle Directory) size, which is equal to the minimal size that THP can use, taken from kernel doc (Documentation/admin-guide/mm/transhuge.rst): Some userspace (such as a test program, or an optimized memory allocation library) may want to know the size (in bytes) of a transparent hugepage:: cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size Since this size depends on the host architecture and the kernel it won't change whilst libvirtd is running. Therefore, we can use virOnce() and cache the value. Of course, we can be running under kernel that has THP disabled or has no notion of THP at all. In that case a negative value is returned to signal error. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virhostmem.c | 63 ++++++++++++++++++++++++++++++++++++++++ src/util/virhostmem.h | 3 ++ 3 files changed, 67 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index fbaf16704b..962d82680e 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2365,6 +2365,7 @@ virHostMemGetFreePages; virHostMemGetInfo; virHostMemGetParameters; virHostMemGetStats; +virHostMemGetTHPSize; virHostMemSetParameters; diff --git a/src/util/virhostmem.c b/src/util/virhostmem.c index ae42978ed2..ef7b97806f 100644 --- a/src/util/virhostmem.c +++ b/src/util/virhostmem.c @@ -45,11 +45,14 @@ #include "virstring.h" #include "virnuma.h" #include "virlog.h" +#include "virthread.h" #define VIR_FROM_THIS VIR_FROM_NONE VIR_LOG_INIT("util.hostmem"); +static unsigned long long virHostTHPPMDSize; +static virOnceControl virHostMemGetTHPSizeOnce = VIR_ONCE_CONTROL_INITIALIZER; #ifdef __FreeBSD__ # define BSD_MEMORY_STATS_ALL 4 @@ -920,3 +923,63 @@ virHostMemAllocPages(unsigned int npages, return ncounts; } + +#if defined(__linux__) +# define HPAGE_PMD_SIZE_PATH "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size" +static int +virHostMemGetTHPSizeSysfs(unsigned long long *size) +{ + g_autofree char *buf = NULL; + + /* 1KiB limit is more than enough. */ + if (virFileReadAll(HPAGE_PMD_SIZE_PATH, 1024, &buf) < 0) + return -1; + + virStringTrimOptionalNewline(buf); + if (virStrToLong_ull(buf, NULL, 10, size) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("unable to parse THP PMD size: %s"), buf); + return -1; + } + + /* Size is now in bytes. Convert to KiB. */ + *size >>= 10; + return 0; +} +#endif /* defined(__linux__) */ + + +static void +virHostMemGetTHPSizeOnceInit(void) +{ +#if defined(__linux__) + virHostMemGetTHPSizeSysfs(&virHostTHPPMDSize); +#else /* !defined(__linux__) */ + VIR_WARN("Getting THP size not ported yet"); +#endif /* !defined(__linux__) */ +} + + +/** + * virHostMemGetTHPSize: + * @size: returned size of THP in kibibytes + * + * Obtain Transparent Huge Page size in kibibytes. The size + * depends on host architecture and kernel. Because of virOnce(), + * do not rely on errno in case of failure. + * + * Returns: 0 on success, + * -1 on failure. + */ +int +virHostMemGetTHPSize(unsigned long long *size) +{ + if (virOnce(&virHostMemGetTHPSizeOnce, virHostMemGetTHPSizeOnceInit) < 0) + return -1; + + if (virHostTHPPMDSize == 0) + return -1; + + *size = virHostTHPPMDSize; + return 0; +} diff --git a/src/util/virhostmem.h b/src/util/virhostmem.h index 1369829807..bf15c40698 100644 --- a/src/util/virhostmem.h +++ b/src/util/virhostmem.h @@ -53,3 +53,6 @@ int virHostMemAllocPages(unsigned int npages, int startCell, unsigned int cellCount, bool add); + +int virHostMemGetTHPSize(unsigned long long *size) + G_GNUC_NO_INLINE; -- 2.26.2

This commit introduces a new capability that reflects virtio-mem-pci device support in QEMU: QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI, /* -device virtio-mem-pci */ The virtio-mem-pci device was introduced in QEMU 5.1. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml | 1 + 4 files changed, 5 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index d656732c3e..0be7a24c9a 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -610,6 +610,7 @@ VIR_ENUM_IMPL(virQEMUCaps, "dc390", "am53c974", "virtio-pmem-pci", + "virtio-mem-pci", ); @@ -1327,6 +1328,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = { { "dc390", QEMU_CAPS_SCSI_DC390 }, { "am53c974", QEMU_CAPS_SCSI_AM53C974 }, { "virtio-pmem-pci", QEMU_CAPS_DEVICE_VIRTIO_PMEM_PCI }, + { "virtio-mem-pci", QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI }, }; diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index a14a78f959..76c7465c0e 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -590,6 +590,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */ QEMU_CAPS_SCSI_DC390, /* -device dc-390 */ QEMU_CAPS_SCSI_AM53C974, /* -device am53c974 */ QEMU_CAPS_DEVICE_VIRTIO_PMEM_PCI, /* -device virtio-pmem-pci */ + QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI, /* -device virtio-mem-pci */ QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml index 07466093c9..a54806fb0c 100644 --- a/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml @@ -250,6 +250,7 @@ <flag name='dc390'/> <flag name='am53c974'/> <flag name='virtio-pmem-pci'/> + <flag name='virtio-mem-pci'/> <version>5001000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>43100242</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml index dea2ff4b54..1bbc38297d 100644 --- a/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml @@ -251,6 +251,7 @@ <flag name='dc390'/> <flag name='am53c974'/> <flag name='virtio-pmem-pci'/> + <flag name='virtio-mem-pci'/> <version>5002000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>43100243</microcodeVersion> -- 2.26.2

On Fri, Jan 22, 2021 at 13:50:24 +0100, Michal Privoznik wrote:
This commit introduces a new capability that reflects virtio-mem-pci device support in QEMU:
QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI, /* -device virtio-mem-pci */
The virtio-mem-pci device was introduced in QEMU 5.1.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml | 1 + 4 files changed, 5 insertions(+)
Don't forget to update this since we now have 6.0.0 caps.

The virtio-mem is paravirtualized mechanism of adding/removing memory to/from a VM. A virtio-mem-pci device is split into blocks of equal size which are then exposed (all or only a requested portion of them) to the guest kernel to use as regular memory. Therefore, the device has two important attributes: 1) block-size, which defines the size of a block 2) requested-size, which defines how much memory (in bytes) is the device requested to expose to the guest. The 'block-size' is configured on command line and immutable throughout device's lifetime. The 'requested-size' can be set on the command line too, but also is adjustable via monitor. In fact, that is how management software places its requests to change the memory allocation. If it wants to give more memory to the guest it changes 'requested-size' to a bigger value, and if it wants to shrink guest memory it changes the 'requested-size' to a smaller value. Note, value of zero means that guest should release all memory offered by the device. Of course, guest has to cooperate. Therefore, there is a third attribute 'size' which is read only and reflects how much memory the guest still has. This can be different to 'requested-size', obviously. Because of name clash, I've named it 'actualsize' and it is dealt with in future commits (it is a runtime information anyway). In the backend, memory for virtio-mem is backed by usual objects: memory-backend-{ram,file,memfd} and their size puts the cap on the amount of memory that a virtio-mem device can offer to a guest. But we are already able to express this info using <size/> under <target/>. Therefore, we need only two more elements to cover 'block-size' and 'requested-size' attributes. This is the XML I've came up with: <memory model='virtio-mem'> <source> <nodemask>1-3</nodemask> <pagesize unit='KiB'>2048</pagesize> </source> <target> <size unit='KiB'>2097152</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>1048576</requested> </target> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </memory> I hope by now it is obvious that: 1) 'requested-size' must be an integer multiple of 'block-size', and 2) virtio-mem-pci device goes onto PCI bus and thus needs PCI address. Then there is a limitation that the minimal 'block-size' is transparent huge page size (I'll leave this without explanation). Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/formatdomain.rst | 35 ++++++++-- docs/schemas/domaincommon.rng | 11 ++++ src/conf/domain_conf.c | 53 ++++++++++++++- src/conf/domain_conf.h | 3 + src/conf/domain_validate.c | 39 +++++++++++ src/qemu/qemu_alias.c | 1 + src/qemu/qemu_command.c | 1 + src/qemu/qemu_domain.c | 10 +++ src/qemu/qemu_domain_address.c | 37 ++++++++--- src/qemu/qemu_validate.c | 8 +++ src/security/security_apparmor.c | 1 + src/security/security_dac.c | 2 + src/security/security_selinux.c | 2 + tests/domaincapsmock.c | 9 +++ .../memory-hotplug-virtio-mem.xml | 66 +++++++++++++++++++ ...emory-hotplug-virtio-mem.x86_64-latest.xml | 1 + tests/qemuxml2xmltest.c | 1 + 17 files changed, 264 insertions(+), 16 deletions(-) create mode 100644 tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml create mode 120000 tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index af540391db..2938758ec2 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -7267,6 +7267,18 @@ Example: usage of the memory devices <size unit='KiB'>524288</size> </target> </memory> + <memory model='virtio-mem'> + <source> + <nodemask>1-3</nodemask> + <pagesize unit='KiB'>2048</pagesize> + </source> + <target> + <size unit='KiB'>2097152</size> + <node>0</node> + <block unit='KiB'>2048</block> + <requested unit='KiB'>1048576</requested> + </target> + </memory> </devices> ... @@ -7274,7 +7286,8 @@ Example: usage of the memory devices Provide ``dimm`` to add a virtual DIMM module to the guest. :since:`Since 1.2.14` Provide ``nvdimm`` model that adds a Non-Volatile DIMM module. :since:`Since 3.2.0` Provide ``virtio-pmem`` model to add a paravirtualized - persistent memory device. :since:`Since 7.1.0` + persistent memory device. :since:`Since 7.1.0` Provide ``virtio-mem`` model + to add paravirtualized memory device. :since: `Since 7.1.0` ``access`` An optional attribute ``access`` ( :since:`since 3.2.0` ) that provides @@ -7297,10 +7310,11 @@ Example: usage of the memory devices allowed only for ``model='nvdimm'`` for pSeries guests. :since:`Since 6.2.0` ``source`` - For model ``dimm`` this element is optional and allows to fine tune the - source of the memory used for the given memory device. If the element is not - provided defaults configured via ``numatune`` are used. If ``dimm`` is - provided, then the following optional elements can be provided as well: + For model ``dimm`` and model ``virtio-mem`` this element is optional and + allows to fine tune the source of the memory used for the given memory + device. If the element is not provided defaults configured via ``numatune`` + are used. If the element is provided, then the following optional elements + can be provided: ``pagesize`` This element can be used to override the default host page size used for @@ -7366,6 +7380,17 @@ Example: usage of the memory devices so other backend types should use the ``readonly`` element. :since:`Since 5.0.0` + ``block`` + For ``virtio-mem`` only. + The size of an individual block, granularity of division of memory module. + Must be power of two and at least equal to size of a transparent hugepage + (2MiB on x84_64). The default is hypervisor dependant. + + ``requested`` + For ``virtio-mem`` only. + The total size of blocks exposed to the guest. Must respect ``block`` + granularity. + :anchor:`<a id="elementsIommu"/>` IOMMU devices diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index a4bddcf132..5bc120073e 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -6020,6 +6020,7 @@ <value>dimm</value> <value>nvdimm</value> <value>virtio-pmem</value> + <value>virtio-mem</value> </choice> </attribute> <optional> @@ -6104,6 +6105,16 @@ <ref name="unsignedInt"/> </element> </optional> + <optional> + <element name="block"> + <ref name="scaledInteger"/> + </element> + </optional> + <optional> + <element name="requested"> + <ref name="scaledInteger"/> + </element> + </optional> <optional> <element name="label"> <element name="size"> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index dab4f10326..f8c5a40b24 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -1310,6 +1310,7 @@ VIR_ENUM_IMPL(virDomainMemoryModel, "dimm", "nvdimm", "virtio-pmem", + "virtio-mem", ); VIR_ENUM_IMPL(virDomainShmemModel, @@ -5359,6 +5360,7 @@ virDomainMemoryDefPostParse(virDomainMemoryDefPtr mem, } break; + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_DIMM: case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_LAST: @@ -15322,6 +15324,7 @@ virDomainMemorySourceDefParseXML(xmlNodePtr node, switch (def->model) { case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: if (virDomainParseMemory("./pagesize", "./pagesize/@unit", ctxt, &def->pagesize, false, false) < 0) return -1; @@ -15388,7 +15391,8 @@ virDomainMemoryTargetDefParseXML(xmlNodePtr node, &def->size, true, false) < 0) return -1; - if (def->model == VIR_DOMAIN_MEMORY_MODEL_NVDIMM) { + switch (def->model) { + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: if (virDomainParseMemory("./label/size", "./label/size/@unit", ctxt, &def->labelsize, false, false) < 0) return -1; @@ -15407,6 +15411,23 @@ virDomainMemoryTargetDefParseXML(xmlNodePtr node, if (virXPathBoolean("boolean(./readonly)", ctxt)) def->readonly = true; + break; + + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + if (virDomainParseMemory("./block", "./block/@unit", ctxt, + &def->blocksize, false, false) < 0) + return -1; + + if (virDomainParseMemory("./requested", "./requested/@unit", ctxt, + &def->requestedsize, false, false) < 0) + return -1; + break; + + case VIR_DOMAIN_MEMORY_MODEL_NONE: + case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_LAST: + break; } return 0; @@ -17214,11 +17235,14 @@ virDomainMemoryFindByDefInternal(virDomainDefPtr def, /* target info -> always present */ if (tmp->model != mem->model || tmp->targetNode != mem->targetNode || - tmp->size != mem->size) + tmp->size != mem->size || + tmp->blocksize != mem->blocksize || + tmp->requestedsize != mem->requestedsize) continue; switch (mem->model) { case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: /* source stuff -> match with device */ if (tmp->pagesize != mem->pagesize) continue; @@ -22784,6 +22808,22 @@ virDomainMemoryDefCheckABIStability(virDomainMemoryDefPtr src, return false; } + if (src->blocksize != dst->blocksize) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target memory device block size '%llu' doesn't match " + "source memory device block size '%llu'"), + dst->blocksize, src->blocksize); + return false; + } + + if (src->requestedsize != dst->requestedsize) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target memory device requested size '%llu' doesn't match " + "source memory device requested size '%llu'"), + dst->requestedsize, src->requestedsize); + return false; + } + if (src->model == VIR_DOMAIN_MEMORY_MODEL_NVDIMM) { if (src->labelsize != dst->labelsize) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -26507,6 +26547,7 @@ virDomainMemorySourceDefFormat(virBufferPtr buf, switch (def->model) { case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: if (def->sourceNodes) { if (!(bitmap = virBitmapFormat(def->sourceNodes))) return -1; @@ -26563,6 +26604,14 @@ virDomainMemoryTargetDefFormat(virBufferPtr buf, if (def->readonly) virBufferAddLit(&childBuf, "<readonly/>\n"); + if (def->blocksize) { + virBufferAsprintf(&childBuf, "<block unit='KiB'>%llu</block>\n", + def->blocksize); + + virBufferAsprintf(&childBuf, "<requested unit='KiB'>%llu</requested>\n", + def->requestedsize); + } + virXMLFormatElement(buf, "target", NULL, &childBuf); } diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 95ad052891..5d89ecfe9d 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2308,6 +2308,7 @@ typedef enum { VIR_DOMAIN_MEMORY_MODEL_DIMM, /* dimm hotpluggable memory device */ VIR_DOMAIN_MEMORY_MODEL_NVDIMM, /* nvdimm memory device */ VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM, /* virtio-pmem memory device */ + VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM, /* virtio-mem memory device */ VIR_DOMAIN_MEMORY_MODEL_LAST } virDomainMemoryModel; @@ -2328,6 +2329,8 @@ struct _virDomainMemoryDef { int targetNode; unsigned long long size; /* kibibytes */ unsigned long long labelsize; /* kibibytes; valid only for NVDIMM */ + unsigned long long blocksize; /* kibibytes; valid only for VIRTIO_MEM */ + unsigned long long requestedsize; /* kibibytes; valid only for VIRTIO_MEM */ bool readonly; /* valid only for NVDIMM */ /* required for QEMU NVDIMM ppc64 support */ diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c index 649fc335ac..b5a0c09468 100644 --- a/src/conf/domain_validate.c +++ b/src/conf/domain_validate.c @@ -25,6 +25,7 @@ #include "virconftypes.h" #include "virlog.h" #include "virutil.h" +#include "virhostmem.h" #define VIR_FROM_THIS VIR_FROM_DOMAIN @@ -1389,6 +1390,8 @@ static int virDomainMemoryDefValidate(const virDomainMemoryDef *mem, const virDomainDef *def) { + unsigned long long thpSize; + switch (mem->model) { case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: if (!mem->nvdimmPath) { @@ -1442,6 +1445,42 @@ virDomainMemoryDefValidate(const virDomainMemoryDef *mem, _("virtio-pmem does not support NUMA nodes")); return -1; } + break; + + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + if (mem->requestedsize > mem->size) { + virReportError(VIR_ERR_XML_DETAIL, "%s", + _("requested size must be smaller than @size")); + return -1; + } + + if (!VIR_IS_POW2(mem->blocksize)) { + virReportError(VIR_ERR_XML_DETAIL, "%s", + _("block size must be a power of two")); + return -1; + } + + if (virHostMemGetTHPSize(&thpSize) < 0) { + /* We failed to get THP size, fall back to a sane default. On + * almost every architecture the size will be 2MiB, except for some + * funky arches like sparc and m68k. Use 2MiB and refine later if + * somebody complains. */ + thpSize = 2048; + } + + if (mem->blocksize < thpSize) { + virReportError(VIR_ERR_XML_DETAIL, + _("block size too small, must be at least %lluKiB"), + thpSize); + return -1; + } + + if (mem->requestedsize % mem->blocksize != 0) { + virReportError(VIR_ERR_XML_DETAIL, "%s", + _("requested size must be an integer multiple of block size")); + return -1; + } + break; case VIR_DOMAIN_MEMORY_MODEL_DIMM: break; diff --git a/src/qemu/qemu_alias.c b/src/qemu/qemu_alias.c index b3492d6e85..2a4e2393c8 100644 --- a/src/qemu/qemu_alias.c +++ b/src/qemu/qemu_alias.c @@ -522,6 +522,7 @@ qemuAssignDeviceMemoryAlias(virDomainDefPtr def, case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: prefix = "virtiopmem"; break; + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_LAST: default: diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 1ec302d4ac..0a4a8f2646 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3303,6 +3303,7 @@ qemuBuildMemoryDeviceStr(const virDomainDef *def, device = "virtio-pmem-pci"; break; + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_LAST: default: diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 0c078a9388..233adc996f 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -8682,6 +8682,16 @@ qemuDomainDefValidateMemoryHotplugDevice(const virDomainMemoryDef *mem, needsNuma = false; break; + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + if (mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI && + mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("only 'pci' addresses are supported for the %s device"), + virDomainMemoryModelTypeToString(mem->model)); + return -1; + } + break; + case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_LAST: return -1; diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index d40edadb4d..396acc7a09 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -380,9 +380,18 @@ qemuDomainPrimeVirtioDeviceAddresses(virDomainDefPtr def, } for (i = 0; i < def->nmems; i++) { - if (def->mems[i]->model == VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM && - def->mems[i]->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) - def->mems[i]->info.type = type; + switch (def->mems[i]->model) { + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + if (def->mems[i]->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) + def->mems[i]->info.type = type; + break; + case VIR_DOMAIN_MEMORY_MODEL_NONE: + case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: + case VIR_DOMAIN_MEMORY_MODEL_LAST: + break; + } } if (type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW) { @@ -1014,6 +1023,7 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, case VIR_DOMAIN_DEVICE_MEMORY: switch (dev->data.memory->model) { case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: return virtioFlags; case VIR_DOMAIN_MEMORY_MODEL_NONE: @@ -2441,12 +2451,19 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def, for (i = 0; i < def->nmems; i++) { virDomainMemoryDefPtr mem = def->mems[i]; - if (mem->model != VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM || - !virDeviceInfoPCIAddressIsWanted(&mem->info)) - continue; - - if (qemuDomainPCIAddressReserveNextAddr(addrs, &mem->info) < 0) - return -1; + switch (mem->model) { + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + if (virDeviceInfoPCIAddressIsWanted(&mem->info) && + qemuDomainPCIAddressReserveNextAddr(addrs, &mem->info) < 0) + return -1; + break; + case VIR_DOMAIN_MEMORY_MODEL_NONE: + case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: + case VIR_DOMAIN_MEMORY_MODEL_LAST: + break; + } } return 0; @@ -3113,6 +3130,7 @@ qemuDomainAssignMemoryDeviceSlot(virQEMUDriverPtr driver, break; case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: return qemuDomainEnsurePCIAddress(vm, &dev, driver); break; @@ -3149,6 +3167,7 @@ qemuDomainAssignMemorySlots(virDomainDefPtr def) break; case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: /* handled in qemuDomainAssignPCIAddresses() */ break; case VIR_DOMAIN_MEMORY_MODEL_NONE: diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c index a060bd98ba..0f7c89b972 100644 --- a/src/qemu/qemu_validate.c +++ b/src/qemu/qemu_validate.c @@ -4632,6 +4632,14 @@ qemuValidateDomainDeviceDefMemory(virDomainMemoryDefPtr mem, } break; + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("virtio-mem isn't supported by this QEMU binary")); + return -1; + } + break; + case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_LAST: break; diff --git a/src/security/security_apparmor.c b/src/security/security_apparmor.c index 1f309c0c9f..60ffc814d8 100644 --- a/src/security/security_apparmor.c +++ b/src/security/security_apparmor.c @@ -693,6 +693,7 @@ AppArmorSetMemoryLabel(virSecurityManagerPtr mgr, return reload_profile(mgr, def, mem->nvdimmPath, true); case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_LAST: break; } diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 71d58758c4..118d9961dc 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -1894,6 +1894,7 @@ virSecurityDACRestoreMemoryLabel(virSecurityManagerPtr mgr, break; case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_LAST: case VIR_DOMAIN_MEMORY_MODEL_NONE: ret = 0; @@ -2077,6 +2078,7 @@ virSecurityDACSetMemoryLabel(virSecurityManagerPtr mgr, break; case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_LAST: case VIR_DOMAIN_MEMORY_MODEL_NONE: ret = 0; diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index 3563dbfb86..b6744efcd3 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -1584,6 +1584,7 @@ virSecuritySELinuxSetMemoryLabel(virSecurityManagerPtr mgr, case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_LAST: break; } @@ -1611,6 +1612,7 @@ virSecuritySELinuxRestoreMemoryLabel(virSecurityManagerPtr mgr, break; case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_LAST: ret = 0; diff --git a/tests/domaincapsmock.c b/tests/domaincapsmock.c index d81a898dc0..34a2f9ad81 100644 --- a/tests/domaincapsmock.c +++ b/tests/domaincapsmock.c @@ -17,6 +17,7 @@ #include <config.h> #include "virhostcpu.h" +#include "virhostmem.h" #ifdef WITH_LIBXL # include "libxl/libxl_capabilities.h" #endif @@ -40,3 +41,11 @@ virHostCPUGetMicrocodeVersion(virArch hostArch G_GNUC_UNUSED) { return 0; } + +int +virHostMemGetTHPSize(unsigned long long *size) +{ + /* Pretend Transparent Huge Page size is 2MiB. */ + *size = 2048; + return 0; +} diff --git a/tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml b/tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml new file mode 100644 index 0000000000..7c9db0c808 --- /dev/null +++ b/tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml @@ -0,0 +1,66 @@ +<domain type='kvm'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <maxMemory slots='16' unit='KiB'>1099511627776</maxMemory> + <memory unit='KiB'>8388608</memory> + <currentMemory unit='KiB'>8388608</currentMemory> + <vcpu placement='static' cpuset='0-1'>2</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <cpu mode='custom' match='exact' check='none'> + <model fallback='forbid'>qemu64</model> + <topology sockets='2' dies='1' cores='1' threads='1'/> + <numa> + <cell id='0' cpus='0-1' memory='2095104' unit='KiB'/> + </numa> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i386</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest1'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='ide' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <controller type='usb' index='0' model='piix3-uhci'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </memballoon> + <memory model='virtio-mem'> + <target> + <size unit='KiB'>1048576</size> + <node>0</node> + <block unit='KiB'>2048</block> + <requested unit='KiB'>524288</requested> + </target> + <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> + </memory> + <memory model='virtio-mem'> + <source> + <nodemask>1-3</nodemask> + <pagesize unit='KiB'>2048</pagesize> + </source> + <target> + <size unit='KiB'>2097152</size> + <node>0</node> + <block unit='KiB'>2048</block> + <requested unit='KiB'>1048576</requested> + </target> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </memory> + </devices> +</domain> diff --git a/tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml b/tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml new file mode 120000 index 0000000000..a9d298129c --- /dev/null +++ b/tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml @@ -0,0 +1 @@ +../qemuxml2argvdata/memory-hotplug-virtio-mem.xml \ No newline at end of file diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 9553a8a4f8..ba23fdda00 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -1262,6 +1262,7 @@ mymain(void) QEMU_CAPS_DEVICE_NVDIMM, QEMU_CAPS_LAST); DO_TEST_CAPS_LATEST("memory-hotplug-virtio-pmem"); + DO_TEST_CAPS_LATEST("memory-hotplug-virtio-mem"); DO_TEST("net-udp", NONE); -- 2.26.2

On 1/22/21 9:50 AM, Michal Privoznik wrote:
The virtio-mem is paravirtualized mechanism of adding/removing memory to/from a VM. A virtio-mem-pci device is split into blocks of equal size which are then exposed (all or only a requested portion of them) to the guest kernel to use as regular memory. Therefore, the device has two important attributes:
1) block-size, which defines the size of a block 2) requested-size, which defines how much memory (in bytes) is the device requested to expose to the guest.
The 'block-size' is configured on command line and immutable throughout device's lifetime. The 'requested-size' can be set on the command line too, but also is adjustable via monitor. In fact, that is how management software places its requests to change the memory allocation. If it wants to give more memory to the guest it changes 'requested-size' to a bigger value, and if it wants to shrink guest memory it changes the 'requested-size' to a smaller value. Note, value of zero means that guest should release all memory offered by the device. Of course, guest has to cooperate. Therefore, there is a third attribute 'size' which is read only and reflects how much memory the guest still has. This can be different to 'requested-size', obviously. Because of name clash, I've named it 'actualsize' and it is dealt with in future commits (it is a runtime information anyway).
In the backend, memory for virtio-mem is backed by usual objects: memory-backend-{ram,file,memfd} and their size puts the cap on the amount of memory that a virtio-mem device can offer to a guest. But we are already able to express this info using <size/> under <target/>.
Therefore, we need only two more elements to cover 'block-size' and 'requested-size' attributes. This is the XML I've came up with:
<memory model='virtio-mem'> <source> <nodemask>1-3</nodemask> <pagesize unit='KiB'>2048</pagesize> </source> <target> <size unit='KiB'>2097152</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>1048576</requested> </target> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </memory>
I hope by now it is obvious that:
1) 'requested-size' must be an integer multiple of 'block-size', and 2) virtio-mem-pci device goes onto PCI bus and thus needs PCI address.
Then there is a limitation that the minimal 'block-size' is transparent huge page size (I'll leave this without explanation).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/formatdomain.rst | 35 ++++++++-- docs/schemas/domaincommon.rng | 11 ++++ src/conf/domain_conf.c | 53 ++++++++++++++- src/conf/domain_conf.h | 3 + src/conf/domain_validate.c | 39 +++++++++++ src/qemu/qemu_alias.c | 1 + src/qemu/qemu_command.c | 1 + src/qemu/qemu_domain.c | 10 +++ src/qemu/qemu_domain_address.c | 37 ++++++++--- src/qemu/qemu_validate.c | 8 +++ src/security/security_apparmor.c | 1 + src/security/security_dac.c | 2 + src/security/security_selinux.c | 2 + tests/domaincapsmock.c | 9 +++ .../memory-hotplug-virtio-mem.xml | 66 +++++++++++++++++++ ...emory-hotplug-virtio-mem.x86_64-latest.xml | 1 + tests/qemuxml2xmltest.c | 1 + 17 files changed, 264 insertions(+), 16 deletions(-) create mode 100644 tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml create mode 120000 tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index af540391db..2938758ec2 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -7267,6 +7267,18 @@ Example: usage of the memory devices <size unit='KiB'>524288</size> </target> </memory> + <memory model='virtio-mem'> + <source> + <nodemask>1-3</nodemask> + <pagesize unit='KiB'>2048</pagesize> + </source> + <target> + <size unit='KiB'>2097152</size> + <node>0</node> + <block unit='KiB'>2048</block> + <requested unit='KiB'>1048576</requested> + </target> + </memory> </devices> ...
@@ -7274,7 +7286,8 @@ Example: usage of the memory devices Provide ``dimm`` to add a virtual DIMM module to the guest. :since:`Since 1.2.14` Provide ``nvdimm`` model that adds a Non-Volatile DIMM module. :since:`Since 3.2.0` Provide ``virtio-pmem`` model to add a paravirtualized - persistent memory device. :since:`Since 7.1.0` + persistent memory device. :since:`Since 7.1.0` Provide ``virtio-mem`` model + to add paravirtualized memory device. :since: `Since 7.1.0`
``access`` An optional attribute ``access`` ( :since:`since 3.2.0` ) that provides @@ -7297,10 +7310,11 @@ Example: usage of the memory devices allowed only for ``model='nvdimm'`` for pSeries guests. :since:`Since 6.2.0`
``source`` - For model ``dimm`` this element is optional and allows to fine tune the - source of the memory used for the given memory device. If the element is not - provided defaults configured via ``numatune`` are used. If ``dimm`` is - provided, then the following optional elements can be provided as well: + For model ``dimm`` and model ``virtio-mem`` this element is optional and + allows to fine tune the source of the memory used for the given memory + device. If the element is not provided defaults configured via ``numatune`` + are used. If the element is provided, then the following optional elements + can be provided:
``pagesize`` This element can be used to override the default host page size used for @@ -7366,6 +7380,17 @@ Example: usage of the memory devices so other backend types should use the ``readonly`` element. :since:`Since 5.0.0`
+ ``block`` + For ``virtio-mem`` only. + The size of an individual block, granularity of division of memory module. + Must be power of two and at least equal to size of a transparent hugepage + (2MiB on x84_64). The default is hypervisor dependant.
I don't think that 'dependant' is wrong in this context but 'dependent' is more common.
+ + ``requested`` + For ``virtio-mem`` only. + The total size of blocks exposed to the guest. Must respect ``block`` + granularity. + :anchor:`<a id="elementsIommu"/>`
IOMMU devices diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index a4bddcf132..5bc120073e 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -6020,6 +6020,7 @@ <value>dimm</value> <value>nvdimm</value> <value>virtio-pmem</value> + <value>virtio-mem</value> </choice> </attribute> <optional> @@ -6104,6 +6105,16 @@ <ref name="unsignedInt"/> </element> </optional> + <optional> + <element name="block"> + <ref name="scaledInteger"/> + </element> + </optional> + <optional> + <element name="requested"> + <ref name="scaledInteger"/> + </element> + </optional> <optional> <element name="label"> <element name="size"> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index dab4f10326..f8c5a40b24 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -1310,6 +1310,7 @@ VIR_ENUM_IMPL(virDomainMemoryModel, "dimm", "nvdimm", "virtio-pmem", + "virtio-mem", );
VIR_ENUM_IMPL(virDomainShmemModel, @@ -5359,6 +5360,7 @@ virDomainMemoryDefPostParse(virDomainMemoryDefPtr mem, } break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_DIMM: case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_LAST: @@ -15322,6 +15324,7 @@ virDomainMemorySourceDefParseXML(xmlNodePtr node,
switch (def->model) { case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: if (virDomainParseMemory("./pagesize", "./pagesize/@unit", ctxt, &def->pagesize, false, false) < 0) return -1; @@ -15388,7 +15391,8 @@ virDomainMemoryTargetDefParseXML(xmlNodePtr node, &def->size, true, false) < 0) return -1;
- if (def->model == VIR_DOMAIN_MEMORY_MODEL_NVDIMM) { + switch (def->model) { + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: if (virDomainParseMemory("./label/size", "./label/size/@unit", ctxt, &def->labelsize, false, false) < 0) return -1; @@ -15407,6 +15411,23 @@ virDomainMemoryTargetDefParseXML(xmlNodePtr node,
if (virXPathBoolean("boolean(./readonly)", ctxt)) def->readonly = true; + break; + + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + if (virDomainParseMemory("./block", "./block/@unit", ctxt, + &def->blocksize, false, false) < 0) + return -1; + + if (virDomainParseMemory("./requested", "./requested/@unit", ctxt, + &def->requestedsize, false, false) < 0) + return -1; + break; + + case VIR_DOMAIN_MEMORY_MODEL_NONE: + case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_LAST: + break; }
return 0; @@ -17214,11 +17235,14 @@ virDomainMemoryFindByDefInternal(virDomainDefPtr def, /* target info -> always present */ if (tmp->model != mem->model || tmp->targetNode != mem->targetNode || - tmp->size != mem->size) + tmp->size != mem->size || + tmp->blocksize != mem->blocksize || + tmp->requestedsize != mem->requestedsize) continue;
switch (mem->model) { case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: /* source stuff -> match with device */ if (tmp->pagesize != mem->pagesize) continue; @@ -22784,6 +22808,22 @@ virDomainMemoryDefCheckABIStability(virDomainMemoryDefPtr src, return false; }
+ if (src->blocksize != dst->blocksize) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target memory device block size '%llu' doesn't match " + "source memory device block size '%llu'"), + dst->blocksize, src->blocksize); + return false; + } + + if (src->requestedsize != dst->requestedsize) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target memory device requested size '%llu' doesn't match " + "source memory device requested size '%llu'"), + dst->requestedsize, src->requestedsize); + return false; + } + if (src->model == VIR_DOMAIN_MEMORY_MODEL_NVDIMM) { if (src->labelsize != dst->labelsize) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -26507,6 +26547,7 @@ virDomainMemorySourceDefFormat(virBufferPtr buf,
switch (def->model) { case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: if (def->sourceNodes) { if (!(bitmap = virBitmapFormat(def->sourceNodes))) return -1; @@ -26563,6 +26604,14 @@ virDomainMemoryTargetDefFormat(virBufferPtr buf, if (def->readonly) virBufferAddLit(&childBuf, "<readonly/>\n");
+ if (def->blocksize) { + virBufferAsprintf(&childBuf, "<block unit='KiB'>%llu</block>\n", + def->blocksize); + + virBufferAsprintf(&childBuf, "<requested unit='KiB'>%llu</requested>\n", + def->requestedsize); + } + virXMLFormatElement(buf, "target", NULL, &childBuf); }
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 95ad052891..5d89ecfe9d 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2308,6 +2308,7 @@ typedef enum { VIR_DOMAIN_MEMORY_MODEL_DIMM, /* dimm hotpluggable memory device */ VIR_DOMAIN_MEMORY_MODEL_NVDIMM, /* nvdimm memory device */ VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM, /* virtio-pmem memory device */ + VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM, /* virtio-mem memory device */
VIR_DOMAIN_MEMORY_MODEL_LAST } virDomainMemoryModel; @@ -2328,6 +2329,8 @@ struct _virDomainMemoryDef { int targetNode; unsigned long long size; /* kibibytes */ unsigned long long labelsize; /* kibibytes; valid only for NVDIMM */ + unsigned long long blocksize; /* kibibytes; valid only for VIRTIO_MEM */ + unsigned long long requestedsize; /* kibibytes; valid only for VIRTIO_MEM */ bool readonly; /* valid only for NVDIMM */
/* required for QEMU NVDIMM ppc64 support */ diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c index 649fc335ac..b5a0c09468 100644 --- a/src/conf/domain_validate.c +++ b/src/conf/domain_validate.c @@ -25,6 +25,7 @@ #include "virconftypes.h" #include "virlog.h" #include "virutil.h" +#include "virhostmem.h"
#define VIR_FROM_THIS VIR_FROM_DOMAIN
@@ -1389,6 +1390,8 @@ static int virDomainMemoryDefValidate(const virDomainMemoryDef *mem, const virDomainDef *def) { + unsigned long long thpSize; + switch (mem->model) { case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: if (!mem->nvdimmPath) { @@ -1442,6 +1445,42 @@ virDomainMemoryDefValidate(const virDomainMemoryDef *mem, _("virtio-pmem does not support NUMA nodes")); return -1; } + break; + + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + if (mem->requestedsize > mem->size) { + virReportError(VIR_ERR_XML_DETAIL, "%s", + _("requested size must be smaller than @size")); + return -1; + } + + if (!VIR_IS_POW2(mem->blocksize)) { + virReportError(VIR_ERR_XML_DETAIL, "%s", + _("block size must be a power of two")); + return -1; + } + + if (virHostMemGetTHPSize(&thpSize) < 0) { + /* We failed to get THP size, fall back to a sane default. On + * almost every architecture the size will be 2MiB, except for some + * funky arches like sparc and m68k. Use 2MiB and refine later if + * somebody complains. */ + thpSize = 2048;
FWIW, a Power 9 server uses 2MiB too: $ cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size 2097152 I don't think you should worry about too much since x86 is the only arch that is supporting virtio-mem (for now). Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
+ } + + if (mem->blocksize < thpSize) { + virReportError(VIR_ERR_XML_DETAIL, + _("block size too small, must be at least %lluKiB"), + thpSize); + return -1; + } + + if (mem->requestedsize % mem->blocksize != 0) { + virReportError(VIR_ERR_XML_DETAIL, "%s", + _("requested size must be an integer multiple of block size")); + return -1; + } + break;
case VIR_DOMAIN_MEMORY_MODEL_DIMM: break; diff --git a/src/qemu/qemu_alias.c b/src/qemu/qemu_alias.c index b3492d6e85..2a4e2393c8 100644 --- a/src/qemu/qemu_alias.c +++ b/src/qemu/qemu_alias.c @@ -522,6 +522,7 @@ qemuAssignDeviceMemoryAlias(virDomainDefPtr def, case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: prefix = "virtiopmem"; break; + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_LAST: default: diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 1ec302d4ac..0a4a8f2646 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3303,6 +3303,7 @@ qemuBuildMemoryDeviceStr(const virDomainDef *def, device = "virtio-pmem-pci"; break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_LAST: default: diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 0c078a9388..233adc996f 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -8682,6 +8682,16 @@ qemuDomainDefValidateMemoryHotplugDevice(const virDomainMemoryDef *mem, needsNuma = false; break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + if (mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI && + mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("only 'pci' addresses are supported for the %s device"), + virDomainMemoryModelTypeToString(mem->model)); + return -1; + } + break; + case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_LAST: return -1; diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index d40edadb4d..396acc7a09 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -380,9 +380,18 @@ qemuDomainPrimeVirtioDeviceAddresses(virDomainDefPtr def, }
for (i = 0; i < def->nmems; i++) { - if (def->mems[i]->model == VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM && - def->mems[i]->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) - def->mems[i]->info.type = type; + switch (def->mems[i]->model) { + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + if (def->mems[i]->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) + def->mems[i]->info.type = type; + break; + case VIR_DOMAIN_MEMORY_MODEL_NONE: + case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: + case VIR_DOMAIN_MEMORY_MODEL_LAST: + break; + } }
if (type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW) { @@ -1014,6 +1023,7 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, case VIR_DOMAIN_DEVICE_MEMORY: switch (dev->data.memory->model) { case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: return virtioFlags;
case VIR_DOMAIN_MEMORY_MODEL_NONE: @@ -2441,12 +2451,19 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def, for (i = 0; i < def->nmems; i++) { virDomainMemoryDefPtr mem = def->mems[i];
- if (mem->model != VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM || - !virDeviceInfoPCIAddressIsWanted(&mem->info)) - continue; - - if (qemuDomainPCIAddressReserveNextAddr(addrs, &mem->info) < 0) - return -1; + switch (mem->model) { + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + if (virDeviceInfoPCIAddressIsWanted(&mem->info) && + qemuDomainPCIAddressReserveNextAddr(addrs, &mem->info) < 0) + return -1; + break; + case VIR_DOMAIN_MEMORY_MODEL_NONE: + case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: + case VIR_DOMAIN_MEMORY_MODEL_LAST: + break; + } }
return 0; @@ -3113,6 +3130,7 @@ qemuDomainAssignMemoryDeviceSlot(virQEMUDriverPtr driver, break;
case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: return qemuDomainEnsurePCIAddress(vm, &dev, driver); break;
@@ -3149,6 +3167,7 @@ qemuDomainAssignMemorySlots(virDomainDefPtr def) break;
case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: /* handled in qemuDomainAssignPCIAddresses() */ break; case VIR_DOMAIN_MEMORY_MODEL_NONE: diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c index a060bd98ba..0f7c89b972 100644 --- a/src/qemu/qemu_validate.c +++ b/src/qemu/qemu_validate.c @@ -4632,6 +4632,14 @@ qemuValidateDomainDeviceDefMemory(virDomainMemoryDefPtr mem, } break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("virtio-mem isn't supported by this QEMU binary")); + return -1; + } + break; + case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_LAST: break; diff --git a/src/security/security_apparmor.c b/src/security/security_apparmor.c index 1f309c0c9f..60ffc814d8 100644 --- a/src/security/security_apparmor.c +++ b/src/security/security_apparmor.c @@ -693,6 +693,7 @@ AppArmorSetMemoryLabel(virSecurityManagerPtr mgr, return reload_profile(mgr, def, mem->nvdimmPath, true); case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_LAST: break; } diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 71d58758c4..118d9961dc 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -1894,6 +1894,7 @@ virSecurityDACRestoreMemoryLabel(virSecurityManagerPtr mgr, break;
case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_LAST: case VIR_DOMAIN_MEMORY_MODEL_NONE: ret = 0; @@ -2077,6 +2078,7 @@ virSecurityDACSetMemoryLabel(virSecurityManagerPtr mgr, break;
case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_LAST: case VIR_DOMAIN_MEMORY_MODEL_NONE: ret = 0; diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c index 3563dbfb86..b6744efcd3 100644 --- a/src/security/security_selinux.c +++ b/src/security/security_selinux.c @@ -1584,6 +1584,7 @@ virSecuritySELinuxSetMemoryLabel(virSecurityManagerPtr mgr,
case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_LAST: break; } @@ -1611,6 +1612,7 @@ virSecuritySELinuxRestoreMemoryLabel(virSecurityManagerPtr mgr, break;
case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_LAST: ret = 0; diff --git a/tests/domaincapsmock.c b/tests/domaincapsmock.c index d81a898dc0..34a2f9ad81 100644 --- a/tests/domaincapsmock.c +++ b/tests/domaincapsmock.c @@ -17,6 +17,7 @@ #include <config.h>
#include "virhostcpu.h" +#include "virhostmem.h" #ifdef WITH_LIBXL # include "libxl/libxl_capabilities.h" #endif @@ -40,3 +41,11 @@ virHostCPUGetMicrocodeVersion(virArch hostArch G_GNUC_UNUSED) { return 0; } + +int +virHostMemGetTHPSize(unsigned long long *size) +{ + /* Pretend Transparent Huge Page size is 2MiB. */ + *size = 2048; + return 0; +} diff --git a/tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml b/tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml new file mode 100644 index 0000000000..7c9db0c808 --- /dev/null +++ b/tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml @@ -0,0 +1,66 @@ +<domain type='kvm'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <maxMemory slots='16' unit='KiB'>1099511627776</maxMemory> + <memory unit='KiB'>8388608</memory> + <currentMemory unit='KiB'>8388608</currentMemory> + <vcpu placement='static' cpuset='0-1'>2</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <cpu mode='custom' match='exact' check='none'> + <model fallback='forbid'>qemu64</model> + <topology sockets='2' dies='1' cores='1' threads='1'/> + <numa> + <cell id='0' cpus='0-1' memory='2095104' unit='KiB'/> + </numa> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i386</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest1'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='ide' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <controller type='usb' index='0' model='piix3-uhci'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='virtio'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </memballoon> + <memory model='virtio-mem'> + <target> + <size unit='KiB'>1048576</size> + <node>0</node> + <block unit='KiB'>2048</block> + <requested unit='KiB'>524288</requested> + </target> + <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> + </memory> + <memory model='virtio-mem'> + <source> + <nodemask>1-3</nodemask> + <pagesize unit='KiB'>2048</pagesize> + </source> + <target> + <size unit='KiB'>2097152</size> + <node>0</node> + <block unit='KiB'>2048</block> + <requested unit='KiB'>1048576</requested> + </target> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </memory> + </devices> +</domain> diff --git a/tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml b/tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml new file mode 120000 index 0000000000..a9d298129c --- /dev/null +++ b/tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml @@ -0,0 +1 @@ +../qemuxml2argvdata/memory-hotplug-virtio-mem.xml \ No newline at end of file diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 9553a8a4f8..ba23fdda00 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -1262,6 +1262,7 @@ mymain(void) QEMU_CAPS_DEVICE_NVDIMM, QEMU_CAPS_LAST); DO_TEST_CAPS_LATEST("memory-hotplug-virtio-pmem"); + DO_TEST_CAPS_LATEST("memory-hotplug-virtio-mem");
DO_TEST("net-udp", NONE);

On 22.01.21 19:16, Daniel Henrique Barboza wrote:
On 1/22/21 9:50 AM, Michal Privoznik wrote:
The virtio-mem is paravirtualized mechanism of adding/removing memory to/from a VM. A virtio-mem-pci device is split into blocks of equal size which are then exposed (all or only a requested portion of them) to the guest kernel to use as regular memory. Therefore, the device has two important attributes:
1) block-size, which defines the size of a block 2) requested-size, which defines how much memory (in bytes) is the device requested to expose to the guest.
The 'block-size' is configured on command line and immutable throughout device's lifetime. The 'requested-size' can be set on the command line too, but also is adjustable via monitor. In fact, that is how management software places its requests to change the memory allocation. If it wants to give more memory to the guest it changes 'requested-size' to a bigger value, and if it wants to shrink guest memory it changes the 'requested-size' to a smaller value. Note, value of zero means that guest should release all memory offered by the device. Of course, guest has to cooperate. Therefore, there is a third attribute 'size' which is read only and reflects how much memory the guest still has. This can be different to 'requested-size', obviously. Because of name clash, I've named it 'actualsize' and it is dealt with in future commits (it is a runtime information anyway).
In the backend, memory for virtio-mem is backed by usual objects: memory-backend-{ram,file,memfd} and their size puts the cap on the amount of memory that a virtio-mem device can offer to a guest. But we are already able to express this info using <size/> under <target/>.
Therefore, we need only two more elements to cover 'block-size' and 'requested-size' attributes. This is the XML I've came up with:
<memory model='virtio-mem'> <source> <nodemask>1-3</nodemask> <pagesize unit='KiB'>2048</pagesize> </source> <target> <size unit='KiB'>2097152</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>1048576</requested> </target> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </memory>
I hope by now it is obvious that:
1) 'requested-size' must be an integer multiple of 'block-size', and 2) virtio-mem-pci device goes onto PCI bus and thus needs PCI address.
Then there is a limitation that the minimal 'block-size' is transparent huge page size (I'll leave this without explanation).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/formatdomain.rst | 35 ++++++++-- docs/schemas/domaincommon.rng | 11 ++++ src/conf/domain_conf.c | 53 ++++++++++++++- src/conf/domain_conf.h | 3 + src/conf/domain_validate.c | 39 +++++++++++ src/qemu/qemu_alias.c | 1 + src/qemu/qemu_command.c | 1 + src/qemu/qemu_domain.c | 10 +++ src/qemu/qemu_domain_address.c | 37 ++++++++--- src/qemu/qemu_validate.c | 8 +++ src/security/security_apparmor.c | 1 + src/security/security_dac.c | 2 + src/security/security_selinux.c | 2 + tests/domaincapsmock.c | 9 +++ .../memory-hotplug-virtio-mem.xml | 66 +++++++++++++++++++ ...emory-hotplug-virtio-mem.x86_64-latest.xml | 1 + tests/qemuxml2xmltest.c | 1 + 17 files changed, 264 insertions(+), 16 deletions(-) create mode 100644 tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml create mode 120000 tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index af540391db..2938758ec2 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -7267,6 +7267,18 @@ Example: usage of the memory devices <size unit='KiB'>524288</size> </target> </memory> + <memory model='virtio-mem'> + <source> + <nodemask>1-3</nodemask> + <pagesize unit='KiB'>2048</pagesize> + </source> + <target> + <size unit='KiB'>2097152</size> + <node>0</node> + <block unit='KiB'>2048</block> + <requested unit='KiB'>1048576</requested> + </target> + </memory> </devices> ...
@@ -7274,7 +7286,8 @@ Example: usage of the memory devices Provide ``dimm`` to add a virtual DIMM module to the guest. :since:`Since 1.2.14` Provide ``nvdimm`` model that adds a Non-Volatile DIMM module. :since:`Since 3.2.0` Provide ``virtio-pmem`` model to add a paravirtualized - persistent memory device. :since:`Since 7.1.0` + persistent memory device. :since:`Since 7.1.0` Provide ``virtio-mem`` model + to add paravirtualized memory device. :since: `Since 7.1.0`
``access`` An optional attribute ``access`` ( :since:`since 3.2.0` ) that provides @@ -7297,10 +7310,11 @@ Example: usage of the memory devices allowed only for ``model='nvdimm'`` for pSeries guests. :since:`Since 6.2.0`
``source`` - For model ``dimm`` this element is optional and allows to fine tune the - source of the memory used for the given memory device. If the element is not - provided defaults configured via ``numatune`` are used. If ``dimm`` is - provided, then the following optional elements can be provided as well: + For model ``dimm`` and model ``virtio-mem`` this element is optional and + allows to fine tune the source of the memory used for the given memory + device. If the element is not provided defaults configured via ``numatune`` + are used. If the element is provided, then the following optional elements + can be provided:
``pagesize`` This element can be used to override the default host page size used for @@ -7366,6 +7380,17 @@ Example: usage of the memory devices so other backend types should use the ``readonly`` element. :since:`Since 5.0.0`
+ ``block`` + For ``virtio-mem`` only. + The size of an individual block, granularity of division of memory module. + Must be power of two and at least equal to size of a transparent hugepage + (2MiB on x84_64). The default is hypervisor dependant.
I don't think that 'dependant' is wrong in this context but 'dependent' is more common.
+ + ``requested`` + For ``virtio-mem`` only. + The total size of blocks exposed to the guest. Must respect ``block`` + granularity. + :anchor:`<a id="elementsIommu"/>`
IOMMU devices diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index a4bddcf132..5bc120073e 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -6020,6 +6020,7 @@ <value>dimm</value> <value>nvdimm</value> <value>virtio-pmem</value> + <value>virtio-mem</value> </choice> </attribute> <optional> @@ -6104,6 +6105,16 @@ <ref name="unsignedInt"/> </element> </optional> + <optional> + <element name="block"> + <ref name="scaledInteger"/> + </element> + </optional> + <optional> + <element name="requested"> + <ref name="scaledInteger"/> + </element> + </optional> <optional> <element name="label"> <element name="size"> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index dab4f10326..f8c5a40b24 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -1310,6 +1310,7 @@ VIR_ENUM_IMPL(virDomainMemoryModel, "dimm", "nvdimm", "virtio-pmem", + "virtio-mem", );
VIR_ENUM_IMPL(virDomainShmemModel, @@ -5359,6 +5360,7 @@ virDomainMemoryDefPostParse(virDomainMemoryDefPtr mem, } break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: case VIR_DOMAIN_MEMORY_MODEL_DIMM: case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_LAST: @@ -15322,6 +15324,7 @@ virDomainMemorySourceDefParseXML(xmlNodePtr node,
switch (def->model) { case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: if (virDomainParseMemory("./pagesize", "./pagesize/@unit", ctxt, &def->pagesize, false, false) < 0) return -1; @@ -15388,7 +15391,8 @@ virDomainMemoryTargetDefParseXML(xmlNodePtr node, &def->size, true, false) < 0) return -1;
- if (def->model == VIR_DOMAIN_MEMORY_MODEL_NVDIMM) { + switch (def->model) { + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: if (virDomainParseMemory("./label/size", "./label/size/@unit", ctxt, &def->labelsize, false, false) < 0) return -1; @@ -15407,6 +15411,23 @@ virDomainMemoryTargetDefParseXML(xmlNodePtr node,
if (virXPathBoolean("boolean(./readonly)", ctxt)) def->readonly = true; + break; + + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + if (virDomainParseMemory("./block", "./block/@unit", ctxt, + &def->blocksize, false, false) < 0) + return -1; + + if (virDomainParseMemory("./requested", "./requested/@unit", ctxt, + &def->requestedsize, false, false) < 0) + return -1; + break; + + case VIR_DOMAIN_MEMORY_MODEL_NONE: + case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_LAST: + break; }
return 0; @@ -17214,11 +17235,14 @@ virDomainMemoryFindByDefInternal(virDomainDefPtr def, /* target info -> always present */ if (tmp->model != mem->model || tmp->targetNode != mem->targetNode || - tmp->size != mem->size) + tmp->size != mem->size || + tmp->blocksize != mem->blocksize || + tmp->requestedsize != mem->requestedsize) continue;
switch (mem->model) { case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: /* source stuff -> match with device */ if (tmp->pagesize != mem->pagesize) continue; @@ -22784,6 +22808,22 @@ virDomainMemoryDefCheckABIStability(virDomainMemoryDefPtr src, return false; }
+ if (src->blocksize != dst->blocksize) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target memory device block size '%llu' doesn't match " + "source memory device block size '%llu'"), + dst->blocksize, src->blocksize); + return false; + } + + if (src->requestedsize != dst->requestedsize) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target memory device requested size '%llu' doesn't match " + "source memory device requested size '%llu'"), + dst->requestedsize, src->requestedsize); + return false; + } + if (src->model == VIR_DOMAIN_MEMORY_MODEL_NVDIMM) { if (src->labelsize != dst->labelsize) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -26507,6 +26547,7 @@ virDomainMemorySourceDefFormat(virBufferPtr buf,
switch (def->model) { case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: if (def->sourceNodes) { if (!(bitmap = virBitmapFormat(def->sourceNodes))) return -1; @@ -26563,6 +26604,14 @@ virDomainMemoryTargetDefFormat(virBufferPtr buf, if (def->readonly) virBufferAddLit(&childBuf, "<readonly/>\n");
+ if (def->blocksize) { + virBufferAsprintf(&childBuf, "<block unit='KiB'>%llu</block>\n", + def->blocksize); + + virBufferAsprintf(&childBuf, "<requested unit='KiB'>%llu</requested>\n", + def->requestedsize); + } + virXMLFormatElement(buf, "target", NULL, &childBuf); }
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 95ad052891..5d89ecfe9d 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2308,6 +2308,7 @@ typedef enum { VIR_DOMAIN_MEMORY_MODEL_DIMM, /* dimm hotpluggable memory device */ VIR_DOMAIN_MEMORY_MODEL_NVDIMM, /* nvdimm memory device */ VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM, /* virtio-pmem memory device */ + VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM, /* virtio-mem memory device */
VIR_DOMAIN_MEMORY_MODEL_LAST } virDomainMemoryModel; @@ -2328,6 +2329,8 @@ struct _virDomainMemoryDef { int targetNode; unsigned long long size; /* kibibytes */ unsigned long long labelsize; /* kibibytes; valid only for NVDIMM */ + unsigned long long blocksize; /* kibibytes; valid only for VIRTIO_MEM */ + unsigned long long requestedsize; /* kibibytes; valid only for VIRTIO_MEM */ bool readonly; /* valid only for NVDIMM */
/* required for QEMU NVDIMM ppc64 support */ diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c index 649fc335ac..b5a0c09468 100644 --- a/src/conf/domain_validate.c +++ b/src/conf/domain_validate.c @@ -25,6 +25,7 @@ #include "virconftypes.h" #include "virlog.h" #include "virutil.h" +#include "virhostmem.h"
#define VIR_FROM_THIS VIR_FROM_DOMAIN
@@ -1389,6 +1390,8 @@ static int virDomainMemoryDefValidate(const virDomainMemoryDef *mem, const virDomainDef *def) { + unsigned long long thpSize; + switch (mem->model) { case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: if (!mem->nvdimmPath) { @@ -1442,6 +1445,42 @@ virDomainMemoryDefValidate(const virDomainMemoryDef *mem, _("virtio-pmem does not support NUMA nodes")); return -1; } + break; + + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + if (mem->requestedsize > mem->size) { + virReportError(VIR_ERR_XML_DETAIL, "%s", + _("requested size must be smaller than @size")); + return -1; + } + + if (!VIR_IS_POW2(mem->blocksize)) { + virReportError(VIR_ERR_XML_DETAIL, "%s", + _("block size must be a power of two")); + return -1; + } + + if (virHostMemGetTHPSize(&thpSize) < 0) { + /* We failed to get THP size, fall back to a sane default. On + * almost every architecture the size will be 2MiB, except for some + * funky arches like sparc and m68k. Use 2MiB and refine later if + * somebody complains. */ + thpSize = 2048;
FWIW, a Power 9 server uses 2MiB too:
$ cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size 2097152
I don't think you should worry about too much since x86 is the only arch that is supporting virtio-mem (for now).
So, in QEMU we use the following logic right now: get_block_size() { block_size = backend_pagesize(); if (block_size == NATIVE_PAGE_SIZE) return MAX(block_size, native_thp_size()); return MAX(block_size, 1 * MiB); } detect_thp_size() { thp_size = read_from_file(); if (!thp_size || thp_size > 16 * MiB) { if (s390x) return 1 * MiB; return 2 * MiB; } return thp_size; } Especially, we also cap big block sizes (esp. arm64 with currently 512 MiB THP), as we prefer flexibility at this point. So yes, on a x86-4 *host* we'll usually end up 2 MiB in QEMU. On arm64 it can be quite different.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
-- Thanks, David / dhildenb

On Fri, Jan 22, 2021 at 13:50:25 +0100, Michal Privoznik wrote:
The virtio-mem is paravirtualized mechanism of adding/removing memory to/from a VM. A virtio-mem-pci device is split into blocks of equal size which are then exposed (all or only a requested portion of them) to the guest kernel to use as regular memory. Therefore, the device has two important attributes:
1) block-size, which defines the size of a block 2) requested-size, which defines how much memory (in bytes) is the device requested to expose to the guest.
The 'block-size' is configured on command line and immutable throughout device's lifetime. The 'requested-size' can be set on the command line too, but also is adjustable via monitor. In fact, that is how management software places its requests to change the memory allocation. If it wants to give more memory to the guest it changes 'requested-size' to a bigger value, and if it wants to shrink guest memory it changes the 'requested-size' to a smaller value. Note, value of zero means that guest should release all memory offered by the device. Of course, guest has to cooperate. Therefore, there is a third attribute 'size' which is read only and reflects how much memory the guest still has. This can be different to 'requested-size', obviously. Because of name clash, I've named it 'actualsize' and it is dealt with in future commits (it is a runtime information anyway).
In the backend, memory for virtio-mem is backed by usual objects: memory-backend-{ram,file,memfd} and their size puts the cap on the amount of memory that a virtio-mem device can offer to a guest. But we are already able to express this info using <size/> under <target/>.
Therefore, we need only two more elements to cover 'block-size' and 'requested-size' attributes. This is the XML I've came up with:
<memory model='virtio-mem'> <source> <nodemask>1-3</nodemask> <pagesize unit='KiB'>2048</pagesize> </source> <target> <size unit='KiB'>2097152</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>1048576</requested> </target> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </memory>
I hope by now it is obvious that:
1) 'requested-size' must be an integer multiple of 'block-size', and 2) virtio-mem-pci device goes onto PCI bus and thus needs PCI address.
Then there is a limitation that the minimal 'block-size' is transparent huge page size (I'll leave this without explanation).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> ---
[...]
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index af540391db..2938758ec2 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst
[...]
@@ -7274,7 +7286,8 @@ Example: usage of the memory devices Provide ``dimm`` to add a virtual DIMM module to the guest. :since:`Since 1.2.14` Provide ``nvdimm`` model that adds a Non-Volatile DIMM module. :since:`Since 3.2.0` Provide ``virtio-pmem`` model to add a paravirtualized - persistent memory device. :since:`Since 7.1.0` + persistent memory device. :since:`Since 7.1.0` Provide ``virtio-mem`` model + to add paravirtualized memory device. :since: `Since 7.1.0`
s/: `/:`/
``access`` An optional attribute ``access`` ( :since:`since 3.2.0` ) that provides
[...]
@@ -7366,6 +7380,17 @@ Example: usage of the memory devices so other backend types should use the ``readonly`` element. :since:`Since 5.0.0`
+ ``block`` + For ``virtio-mem`` only. + The size of an individual block, granularity of division of memory module. + Must be power of two and at least equal to size of a transparent hugepage + (2MiB on x84_64). The default is hypervisor dependant. + + ``requested`` + For ``virtio-mem`` only. + The total size of blocks exposed to the guest. Must respect ``block`` + granularity.
This is a bit misleading. I'd avoid using 'blocks'. [...]
diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c index 649fc335ac..b5a0c09468 100644 --- a/src/conf/domain_validate.c +++ b/src/conf/domain_validate.c @@ -25,6 +25,7 @@ #include "virconftypes.h" #include "virlog.h" #include "virutil.h" +#include "virhostmem.h"
#define VIR_FROM_THIS VIR_FROM_DOMAIN
@@ -1389,6 +1390,8 @@ static int virDomainMemoryDefValidate(const virDomainMemoryDef *mem, const virDomainDef *def) { + unsigned long long thpSize; + switch (mem->model) { case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: if (!mem->nvdimmPath) { @@ -1442,6 +1445,42 @@ virDomainMemoryDefValidate(const virDomainMemoryDef *mem, _("virtio-pmem does not support NUMA nodes")); return -1; } + break; + + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + if (mem->requestedsize > mem->size) { + virReportError(VIR_ERR_XML_DETAIL, "%s", + _("requested size must be smaller than @size")); + return -1; + }
so mem->size is the maximum size of the virtio-mem provided memory? I don't think that it's obvious from the docs.
+ + if (!VIR_IS_POW2(mem->blocksize)) { + virReportError(VIR_ERR_XML_DETAIL, "%s", + _("block size must be a power of two")); + return -1; + } +
[...]
diff --git a/tests/domaincapsmock.c b/tests/domaincapsmock.c index d81a898dc0..34a2f9ad81 100644 --- a/tests/domaincapsmock.c +++ b/tests/domaincapsmock.c @@ -17,6 +17,7 @@ #include <config.h>
#include "virhostcpu.h" +#include "virhostmem.h" #ifdef WITH_LIBXL # include "libxl/libxl_capabilities.h" #endif @@ -40,3 +41,11 @@ virHostCPUGetMicrocodeVersion(virArch hostArch G_GNUC_UNUSED) { return 0; } + +int +virHostMemGetTHPSize(unsigned long long *size) +{ + /* Pretend Transparent Huge Page size is 2MiB. */ + *size = 2048; + return 0; +}
Put the mocking along with the implementation or separately. It seems weird to implement this in a patch which adds a new device model.

Nothing special is happening here. All important changes were done when for 'virtio-pmem' (adjusting the code to put virtio memory on PCI bus, generating alias using qemuDomainDeviceAliasIndex(). The only bit that might look suspicious is no prealloc for virtio-mem. But if you think about it, the whole purpose of this device is to change amount of memory exposed to guest on the fly. There is no point in locking the whole backend in memory. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_alias.c | 9 +++- src/qemu/qemu_command.c | 12 ++++- ...mory-hotplug-virtio-mem.x86_64-latest.args | 49 +++++++++++++++++++ tests/qemuxml2argvtest.c | 1 + 4 files changed, 68 insertions(+), 3 deletions(-) create mode 100644 tests/qemuxml2argvdata/memory-hotplug-virtio-mem.x86_64-latest.args diff --git a/src/qemu/qemu_alias.c b/src/qemu/qemu_alias.c index 2a4e2393c8..e539bf9c8e 100644 --- a/src/qemu/qemu_alias.c +++ b/src/qemu/qemu_alias.c @@ -475,8 +475,11 @@ qemuDeviceMemoryGetAliasID(virDomainDefPtr def, size_t i; int maxidx = 0; - /* virtio-pmem goes onto PCI bus and thus DIMM address is not valid */ - if (!oldAlias && mem->model != VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM) + /* virtio-pmem and virtio-mem go onto PCI bus and thus DIMM address is not + * valid */ + if (!oldAlias && + mem->model != VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM && + mem->model != VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM) return mem->info.addr.dimm.slot; for (i = 0; i < def->nmems; i++) { @@ -523,6 +526,8 @@ qemuAssignDeviceMemoryAlias(virDomainDefPtr def, prefix = "virtiopmem"; break; case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + prefix = "virtiomem"; + break; case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_LAST: default: diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 0a4a8f2646..4e50dbc0fd 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3094,7 +3094,9 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps, } else if (useHugepage) { if (qemuGetDomainHupageMemPath(priv->driver, def, pagesize, &memPath) < 0) return -1; - prealloc = true; + /* For virtio-mem backed by hugepages we don't need prealloc. */ + if (mem->model != VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM) + prealloc = true; } else { /* We can have both pagesize and mem source. If that's the case, * prefer hugepages as those are more specific. */ @@ -3304,6 +3306,9 @@ qemuBuildMemoryDeviceStr(const virDomainDef *def, break; case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + device = "virtio-mem-pci"; + break; + case VIR_DOMAIN_MEMORY_MODEL_NONE: case VIR_DOMAIN_MEMORY_MODEL_LAST: default: @@ -3320,6 +3325,11 @@ qemuBuildMemoryDeviceStr(const virDomainDef *def, if (mem->labelsize) virBufferAsprintf(&buf, "label-size=%llu,", mem->labelsize * 1024); + if (mem->blocksize) { + virBufferAsprintf(&buf, "block-size=%llu,", mem->blocksize * 1024); + virBufferAsprintf(&buf, "requested-size=%llu,", mem->requestedsize * 1024); + } + if (mem->uuid) { char uuidstr[VIR_UUID_STRING_BUFLEN]; diff --git a/tests/qemuxml2argvdata/memory-hotplug-virtio-mem.x86_64-latest.args b/tests/qemuxml2argvdata/memory-hotplug-virtio-mem.x86_64-latest.args new file mode 100644 index 0000000000..403e05865f --- /dev/null +++ b/tests/qemuxml2argvdata/memory-hotplug-virtio-mem.x86_64-latest.args @@ -0,0 +1,49 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/tmp/lib/domain--1-QEMUGuest1 \ +USER=test \ +LOGNAME=test \ +XDG_DATA_HOME=/tmp/lib/domain--1-QEMUGuest1/.local/share \ +XDG_CACHE_HOME=/tmp/lib/domain--1-QEMUGuest1/.cache \ +XDG_CONFIG_HOME=/tmp/lib/domain--1-QEMUGuest1/.config \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-i386 \ +-name guest=QEMUGuest1,debug-threads=on \ +-S \ +-object secret,id=masterKey0,format=raw,\ +file=/tmp/lib/domain--1-QEMUGuest1/master-key.aes \ +-machine pc,accel=kvm,usb=off,dump-guest-core=off \ +-cpu qemu64 \ +-m size=2095104k,slots=16,maxmem=1099511627776k \ +-overcommit mem-lock=off \ +-smp 2,sockets=2,dies=1,cores=1,threads=1 \ +-object memory-backend-ram,id=ram-node0,size=2145386496 \ +-numa node,nodeid=0,cpus=0-1,memdev=ram-node0 \ +-object memory-backend-ram,id=memvirtiomem0,size=1073741824 \ +-device virtio-mem-pci,node=0,block-size=2097152,requested-size=536870912,\ +memdev=memvirtiomem0,id=virtiomem0,bus=pci.0,addr=0x2 \ +-object memory-backend-file,id=memvirtiomem1,\ +mem-path=/dev/hugepages2M/libvirt/qemu/-1-QEMUGuest1,size=2147483648,\ +host-nodes=1-3,policy=bind \ +-device virtio-mem-pci,node=0,block-size=2097152,requested-size=1073741824,\ +memdev=memvirtiomem1,id=virtiomem1,bus=pci.0,addr=0x4 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,fd=1729,server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-no-acpi \ +-boot strict=on \ +-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \ +-blockdev '{"driver":"host_device","filename":"/dev/HostVG/QEMUGuest1",\ +"node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-1-format","read-only":false,"driver":"raw",\ +"file":"libvirt-1-storage"}' \ +-device ide-hd,bus=ide.0,unit=0,drive=libvirt-1-format,id=ide0-0-0,bootindex=1 \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 \ +-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,\ +resourcecontrol=deny \ +-msg timestamp=on diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index cf77224fc3..0e261fb321 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -3054,6 +3054,7 @@ mymain(void) QEMU_CAPS_DEVICE_NVDIMM, QEMU_CAPS_LAST); DO_TEST_CAPS_LATEST("memory-hotplug-virtio-pmem"); + DO_TEST_CAPS_LATEST("memory-hotplug-virtio-mem"); DO_TEST("machine-aeskeywrap-on-caps", QEMU_CAPS_AES_KEY_WRAP, -- 2.26.2

As advertised in one of previous commits, we want' to be able to change 'requested-size' attribute of virtio-mem on the fly. This commit does exactly that. Changing anything else is checked for and forbidden. Once guest has changed the allocation, QEMU emits an event which we will use to track the allocation. In the next commit. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/conf/domain_conf.c | 23 +++++ src/conf/domain_conf.h | 3 + src/libvirt_private.syms | 1 + src/qemu/qemu_driver.c | 175 ++++++++++++++++++++++++++++++++++- src/qemu/qemu_hotplug.c | 18 ++++ src/qemu/qemu_hotplug.h | 5 + src/qemu/qemu_monitor.c | 13 +++ src/qemu/qemu_monitor.h | 4 + src/qemu/qemu_monitor_json.c | 15 +++ src/qemu/qemu_monitor_json.h | 5 + 10 files changed, 261 insertions(+), 1 deletion(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index f8c5a40b24..b6fe5e4436 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -17297,6 +17297,29 @@ virDomainMemoryFindInactiveByDef(virDomainDefPtr def, } +ssize_t +virDomainMemoryFindByDeviceInfo(virDomainDefPtr def, + virDomainDeviceInfoPtr info) +{ + size_t i; + + for (i = 0; i < def->nmems; i++) { + virDomainMemoryDefPtr tmp = def->mems[i]; + + if (!virDomainDeviceInfoAddressIsEqual(&tmp->info, info)) + continue; + + /* alias, if present */ + if (STRNEQ_NULLABLE(tmp->info.alias, info->alias)) + continue; + + return i; + } + + return -1; +} + + /** * virDomainMemoryInsert: * diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 5d89ecfe9d..ef52328a6f 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -3609,6 +3609,9 @@ int virDomainMemoryFindByDef(virDomainDefPtr def, virDomainMemoryDefPtr mem) int virDomainMemoryFindInactiveByDef(virDomainDefPtr def, virDomainMemoryDefPtr mem) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT; +ssize_t virDomainMemoryFindByDeviceInfo(virDomainDefPtr dev, + virDomainDeviceInfoPtr info) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT; int virDomainShmemDefInsert(virDomainDefPtr def, virDomainShmemDefPtr shmem) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT; diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 962d82680e..2e7f92bcfe 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -494,6 +494,7 @@ virDomainMemballoonModelTypeFromString; virDomainMemballoonModelTypeToString; virDomainMemoryDefFree; virDomainMemoryFindByDef; +virDomainMemoryFindByDeviceInfo; virDomainMemoryFindInactiveByDef; virDomainMemoryInsert; virDomainMemoryModelTypeToString; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index ed966cf7e3..fadf0240fc 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -7087,6 +7087,168 @@ qemuDomainChangeDiskLive(virDomainObjPtr vm, return 0; } + +static bool +qemuDomainChangeMemoryLiveValidateChange(const virDomainMemoryDef *oldDef, + const virDomainMemoryDef *newDef) +{ + /* The only thing that is allowed to change is 'requestedsize' for virtio + * model. */ + if (oldDef->model != newDef->model) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory model from '%s' to '%s'"), + virDomainMemoryModelTypeToString(oldDef->model), + virDomainMemoryModelTypeToString(newDef->model)); + return false; + } + + if (oldDef->access != newDef->access) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory access from '%s' to '%s'"), + virDomainMemoryAccessTypeToString(oldDef->access), + virDomainMemoryAccessTypeToString(newDef->access)); + return false; + } + + if (oldDef->discard != newDef->discard) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory discard from '%s' to '%s'"), + virTristateBoolTypeToString(oldDef->discard), + virTristateBoolTypeToString(newDef->discard)); + return false; + } + + if (!virBitmapEqual(oldDef->sourceNodes, + newDef->sourceNodes)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("cannot modify memory source nodes")); + return false; + } + + if (oldDef->pagesize != newDef->pagesize) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory pagesize from '%llu' to '%llu'"), + oldDef->pagesize, + newDef->pagesize); + return false; + } + + if (STRNEQ_NULLABLE(oldDef->nvdimmPath, newDef->nvdimmPath)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory path from '%s' to '%s'"), + NULLSTR(oldDef->nvdimmPath), + NULLSTR(newDef->nvdimmPath)); + return false; + } + + if (oldDef->alignsize != newDef->alignsize) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory align size from '%llu' to '%llu'"), + oldDef->alignsize, newDef->alignsize); + return false; + } + + if (oldDef->nvdimmPmem != newDef->nvdimmPmem) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory pmem from '%d' to '%d'"), + oldDef->nvdimmPmem, newDef->nvdimmPmem); + return false; + } + + if (oldDef->targetNode != newDef->targetNode) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory targetNode from '%d' to '%d'"), + oldDef->targetNode, newDef->targetNode); + return false; + } + + if (oldDef->size != newDef->size) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory size from '%llu' to '%llu'"), + oldDef->size, newDef->size); + return false; + } + + if (oldDef->labelsize != newDef->labelsize) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory label size from '%llu' to '%llu'"), + oldDef->labelsize, newDef->labelsize); + return false; + } + if (oldDef->blocksize != newDef->blocksize) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory block size from '%llu' to '%llu'"), + oldDef->blocksize, newDef->blocksize); + return false; + } + + /* requestedsize can change */ + + if (oldDef->readonly != newDef->readonly) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("cannot modify memory pmem flag")); + return false; + } + + if ((oldDef->uuid || newDef->uuid) && + !(oldDef->uuid && newDef->uuid && + memcmp(oldDef->uuid, newDef->uuid, VIR_UUID_BUFLEN) == 0)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("cannot modify memory UUID")); + return false; + } + + switch (oldDef->model) { + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + break; + + case VIR_DOMAIN_MEMORY_MODEL_NONE: + case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_LAST: + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory of model '%s'"), + virDomainMemoryModelTypeToString(oldDef->model)); + return false; + break; + } + + return true; +} + + +static int +qemuDomainChangeMemoryLive(virQEMUDriverPtr driver G_GNUC_UNUSED, + virDomainObjPtr vm, + virDomainDeviceDefPtr dev) +{ + virDomainMemoryDefPtr newDef = dev->data.memory; + virDomainMemoryDefPtr oldDef = NULL; + ssize_t idx; + + idx = virDomainMemoryFindByDeviceInfo(vm->def, &dev->data.memory->info); + if (idx < 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("memory '%s' not found"), dev->data.memory->info.alias); + return -1; + } + + oldDef = vm->def->mems[idx]; + + if (!qemuDomainChangeMemoryLiveValidateChange(oldDef, newDef)) + return -1; + + if (qemuDomainChangeMemoryRequestedSize(driver, vm, + newDef->info.alias, + newDef->requestedsize) < 0) + return -1; + + oldDef->requestedsize = newDef->requestedsize; + return 0; +} + + static int qemuDomainUpdateDeviceLive(virDomainObjPtr vm, virDomainDeviceDefPtr dev, @@ -7128,6 +7290,18 @@ qemuDomainUpdateDeviceLive(virDomainObjPtr vm, ret = qemuDomainChangeNet(driver, vm, dev); break; + case VIR_DOMAIN_DEVICE_MEMORY: + if ((idx = virDomainMemoryFindByDeviceInfo(vm->def, &dev->data.memory->info)) >= 0) { + oldDev.data.memory = vm->def->mems[idx]; + if (virDomainDefCompatibleDevice(vm->def, dev, &oldDev, + VIR_DOMAIN_DEVICE_ACTION_UPDATE, + true) < 0) + return -1; + } + + ret = qemuDomainChangeMemoryLive(driver, vm, dev); + break; + case VIR_DOMAIN_DEVICE_FS: case VIR_DOMAIN_DEVICE_INPUT: case VIR_DOMAIN_DEVICE_SOUND: @@ -7143,7 +7317,6 @@ qemuDomainUpdateDeviceLive(virDomainObjPtr vm, case VIR_DOMAIN_DEVICE_HOSTDEV: case VIR_DOMAIN_DEVICE_CONTROLLER: case VIR_DOMAIN_DEVICE_REDIRDEV: - case VIR_DOMAIN_DEVICE_MEMORY: case VIR_DOMAIN_DEVICE_CHR: case VIR_DOMAIN_DEVICE_NONE: case VIR_DOMAIN_DEVICE_TPM: diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 882e5d2384..88213a53f7 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -6702,3 +6702,21 @@ qemuDomainSetVcpuInternal(virQEMUDriverPtr driver, virBitmapFree(livevcpus); return ret; } + + +int +qemuDomainChangeMemoryRequestedSize(virQEMUDriverPtr driver, + virDomainObjPtr vm, + const char *alias, + unsigned long long requestedsize) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + int rc; + + qemuDomainObjEnterMonitor(driver, vm); + rc = qemuMonitorChangeMemoryRequestedSize(priv->mon, alias, requestedsize); + if (qemuDomainObjExitMonitor(driver, vm) < 0) + return -1; + + return rc; +} diff --git a/src/qemu/qemu_hotplug.h b/src/qemu/qemu_hotplug.h index 6287c5b5e8..9e551a1f82 100644 --- a/src/qemu/qemu_hotplug.h +++ b/src/qemu/qemu_hotplug.h @@ -160,3 +160,8 @@ int qemuHotplugAttachDBusVMState(virQEMUDriverPtr driver, int qemuHotplugRemoveDBusVMState(virQEMUDriverPtr driver, virDomainObjPtr vm, qemuDomainAsyncJob asyncJob); + +int qemuDomainChangeMemoryRequestedSize(virQEMUDriverPtr driver, + virDomainObjPtr vm, + const char *alias, + unsigned long long requestedsize); diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 09b8617097..6b3c1c2f5e 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -4713,3 +4713,16 @@ qemuMonitorTransactionBackup(virJSONValuePtr actions, return qemuMonitorJSONTransactionBackup(actions, device, jobname, target, bitmap, syncmode); } + + +int +qemuMonitorChangeMemoryRequestedSize(qemuMonitorPtr mon, + const char *alias, + unsigned long long requestedsize) +{ + VIR_DEBUG("alias=%s requestedsize=%llu", alias, requestedsize); + + QEMU_CHECK_MONITOR(mon); + + return qemuMonitorJSONChangeMemoryRequestedSize(mon, alias, requestedsize); +} diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index a07617ec28..443ddddf9b 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -1516,3 +1516,7 @@ qemuMonitorTransactionBackup(virJSONValuePtr actions, const char *target, const char *bitmap, qemuMonitorTransactionBackupSyncMode syncmode); + +int qemuMonitorChangeMemoryRequestedSize(qemuMonitorPtr mon, + const char *alias, + unsigned long long requestedsize); diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 8a75a2734e..0ab4264522 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -9452,3 +9452,18 @@ qemuMonitorJSONGetCPUMigratable(qemuMonitorPtr mon, return virJSONValueGetBoolean(virJSONValueObjectGet(reply, "return"), migratable); } + + +int +qemuMonitorJSONChangeMemoryRequestedSize(qemuMonitorPtr mon, + const char *alias, + unsigned long long requestedsize) +{ + g_autofree char *path = g_strdup_printf("/machine/peripheral/%s", alias); + qemuMonitorJSONObjectProperty prop = { + .type = QEMU_MONITOR_OBJECT_PROPERTY_ULONG, + .val.ul = requestedsize * 1024, /* monitor needs bytes */ + }; + + return qemuMonitorJSONSetObjectProperty(mon, path, "requested-size", &prop); +} diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h index ba1531fee8..53af2b4022 100644 --- a/src/qemu/qemu_monitor_json.h +++ b/src/qemu/qemu_monitor_json.h @@ -705,3 +705,8 @@ int qemuMonitorJSONSetDBusVMStateIdList(qemuMonitorPtr mon, int qemuMonitorJSONGetCPUMigratable(qemuMonitorPtr mon, bool *migratable); + +int +qemuMonitorJSONChangeMemoryRequestedSize(qemuMonitorPtr mon, + const char *alias, + unsigned long long requestedsize); -- 2.26.2

On 1/22/21 9:50 AM, Michal Privoznik wrote:
As advertised in one of previous commits, we want' to be able to
Extra " ' " after want. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
change 'requested-size' attribute of virtio-mem on the fly. This commit does exactly that. Changing anything else is checked for and forbidden.
Once guest has changed the allocation, QEMU emits an event which we will use to track the allocation. In the next commit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/conf/domain_conf.c | 23 +++++ src/conf/domain_conf.h | 3 + src/libvirt_private.syms | 1 + src/qemu/qemu_driver.c | 175 ++++++++++++++++++++++++++++++++++- src/qemu/qemu_hotplug.c | 18 ++++ src/qemu/qemu_hotplug.h | 5 + src/qemu/qemu_monitor.c | 13 +++ src/qemu/qemu_monitor.h | 4 + src/qemu/qemu_monitor_json.c | 15 +++ src/qemu/qemu_monitor_json.h | 5 + 10 files changed, 261 insertions(+), 1 deletion(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index f8c5a40b24..b6fe5e4436 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -17297,6 +17297,29 @@ virDomainMemoryFindInactiveByDef(virDomainDefPtr def, }
+ssize_t +virDomainMemoryFindByDeviceInfo(virDomainDefPtr def, + virDomainDeviceInfoPtr info) +{ + size_t i; + + for (i = 0; i < def->nmems; i++) { + virDomainMemoryDefPtr tmp = def->mems[i]; + + if (!virDomainDeviceInfoAddressIsEqual(&tmp->info, info)) + continue; + + /* alias, if present */ + if (STRNEQ_NULLABLE(tmp->info.alias, info->alias)) + continue; + + return i; + } + + return -1; +} + + /** * virDomainMemoryInsert: * diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 5d89ecfe9d..ef52328a6f 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -3609,6 +3609,9 @@ int virDomainMemoryFindByDef(virDomainDefPtr def, virDomainMemoryDefPtr mem) int virDomainMemoryFindInactiveByDef(virDomainDefPtr def, virDomainMemoryDefPtr mem) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT; +ssize_t virDomainMemoryFindByDeviceInfo(virDomainDefPtr dev, + virDomainDeviceInfoPtr info) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT;
int virDomainShmemDefInsert(virDomainDefPtr def, virDomainShmemDefPtr shmem) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT; diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 962d82680e..2e7f92bcfe 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -494,6 +494,7 @@ virDomainMemballoonModelTypeFromString; virDomainMemballoonModelTypeToString; virDomainMemoryDefFree; virDomainMemoryFindByDef; +virDomainMemoryFindByDeviceInfo; virDomainMemoryFindInactiveByDef; virDomainMemoryInsert; virDomainMemoryModelTypeToString; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index ed966cf7e3..fadf0240fc 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -7087,6 +7087,168 @@ qemuDomainChangeDiskLive(virDomainObjPtr vm, return 0; }
+ +static bool +qemuDomainChangeMemoryLiveValidateChange(const virDomainMemoryDef *oldDef, + const virDomainMemoryDef *newDef) +{ + /* The only thing that is allowed to change is 'requestedsize' for virtio + * model. */ + if (oldDef->model != newDef->model) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory model from '%s' to '%s'"), + virDomainMemoryModelTypeToString(oldDef->model), + virDomainMemoryModelTypeToString(newDef->model)); + return false; + } + + if (oldDef->access != newDef->access) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory access from '%s' to '%s'"), + virDomainMemoryAccessTypeToString(oldDef->access), + virDomainMemoryAccessTypeToString(newDef->access)); + return false; + } + + if (oldDef->discard != newDef->discard) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory discard from '%s' to '%s'"), + virTristateBoolTypeToString(oldDef->discard), + virTristateBoolTypeToString(newDef->discard)); + return false; + } + + if (!virBitmapEqual(oldDef->sourceNodes, + newDef->sourceNodes)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("cannot modify memory source nodes")); + return false; + } + + if (oldDef->pagesize != newDef->pagesize) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory pagesize from '%llu' to '%llu'"), + oldDef->pagesize, + newDef->pagesize); + return false; + } + + if (STRNEQ_NULLABLE(oldDef->nvdimmPath, newDef->nvdimmPath)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory path from '%s' to '%s'"), + NULLSTR(oldDef->nvdimmPath), + NULLSTR(newDef->nvdimmPath)); + return false; + } + + if (oldDef->alignsize != newDef->alignsize) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory align size from '%llu' to '%llu'"), + oldDef->alignsize, newDef->alignsize); + return false; + } + + if (oldDef->nvdimmPmem != newDef->nvdimmPmem) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory pmem from '%d' to '%d'"), + oldDef->nvdimmPmem, newDef->nvdimmPmem); + return false; + } + + if (oldDef->targetNode != newDef->targetNode) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory targetNode from '%d' to '%d'"), + oldDef->targetNode, newDef->targetNode); + return false; + } + + if (oldDef->size != newDef->size) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory size from '%llu' to '%llu'"), + oldDef->size, newDef->size); + return false; + } + + if (oldDef->labelsize != newDef->labelsize) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory label size from '%llu' to '%llu'"), + oldDef->labelsize, newDef->labelsize); + return false; + } + if (oldDef->blocksize != newDef->blocksize) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory block size from '%llu' to '%llu'"), + oldDef->blocksize, newDef->blocksize); + return false; + } + + /* requestedsize can change */ + + if (oldDef->readonly != newDef->readonly) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("cannot modify memory pmem flag")); + return false; + } + + if ((oldDef->uuid || newDef->uuid) && + !(oldDef->uuid && newDef->uuid && + memcmp(oldDef->uuid, newDef->uuid, VIR_UUID_BUFLEN) == 0)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("cannot modify memory UUID")); + return false; + } + + switch (oldDef->model) { + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + break; + + case VIR_DOMAIN_MEMORY_MODEL_NONE: + case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_LAST: + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("cannot modify memory of model '%s'"), + virDomainMemoryModelTypeToString(oldDef->model)); + return false; + break; + } + + return true; +} + + +static int +qemuDomainChangeMemoryLive(virQEMUDriverPtr driver G_GNUC_UNUSED, + virDomainObjPtr vm, + virDomainDeviceDefPtr dev) +{ + virDomainMemoryDefPtr newDef = dev->data.memory; + virDomainMemoryDefPtr oldDef = NULL; + ssize_t idx; + + idx = virDomainMemoryFindByDeviceInfo(vm->def, &dev->data.memory->info); + if (idx < 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("memory '%s' not found"), dev->data.memory->info.alias); + return -1; + } + + oldDef = vm->def->mems[idx]; + + if (!qemuDomainChangeMemoryLiveValidateChange(oldDef, newDef)) + return -1; + + if (qemuDomainChangeMemoryRequestedSize(driver, vm, + newDef->info.alias, + newDef->requestedsize) < 0) + return -1; + + oldDef->requestedsize = newDef->requestedsize; + return 0; +} + + static int qemuDomainUpdateDeviceLive(virDomainObjPtr vm, virDomainDeviceDefPtr dev, @@ -7128,6 +7290,18 @@ qemuDomainUpdateDeviceLive(virDomainObjPtr vm, ret = qemuDomainChangeNet(driver, vm, dev); break;
+ case VIR_DOMAIN_DEVICE_MEMORY: + if ((idx = virDomainMemoryFindByDeviceInfo(vm->def, &dev->data.memory->info)) >= 0) { + oldDev.data.memory = vm->def->mems[idx]; + if (virDomainDefCompatibleDevice(vm->def, dev, &oldDev, + VIR_DOMAIN_DEVICE_ACTION_UPDATE, + true) < 0) + return -1; + } + + ret = qemuDomainChangeMemoryLive(driver, vm, dev); + break; + case VIR_DOMAIN_DEVICE_FS: case VIR_DOMAIN_DEVICE_INPUT: case VIR_DOMAIN_DEVICE_SOUND: @@ -7143,7 +7317,6 @@ qemuDomainUpdateDeviceLive(virDomainObjPtr vm, case VIR_DOMAIN_DEVICE_HOSTDEV: case VIR_DOMAIN_DEVICE_CONTROLLER: case VIR_DOMAIN_DEVICE_REDIRDEV: - case VIR_DOMAIN_DEVICE_MEMORY: case VIR_DOMAIN_DEVICE_CHR: case VIR_DOMAIN_DEVICE_NONE: case VIR_DOMAIN_DEVICE_TPM: diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 882e5d2384..88213a53f7 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -6702,3 +6702,21 @@ qemuDomainSetVcpuInternal(virQEMUDriverPtr driver, virBitmapFree(livevcpus); return ret; } + + +int +qemuDomainChangeMemoryRequestedSize(virQEMUDriverPtr driver, + virDomainObjPtr vm, + const char *alias, + unsigned long long requestedsize) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + int rc; + + qemuDomainObjEnterMonitor(driver, vm); + rc = qemuMonitorChangeMemoryRequestedSize(priv->mon, alias, requestedsize); + if (qemuDomainObjExitMonitor(driver, vm) < 0) + return -1; + + return rc; +} diff --git a/src/qemu/qemu_hotplug.h b/src/qemu/qemu_hotplug.h index 6287c5b5e8..9e551a1f82 100644 --- a/src/qemu/qemu_hotplug.h +++ b/src/qemu/qemu_hotplug.h @@ -160,3 +160,8 @@ int qemuHotplugAttachDBusVMState(virQEMUDriverPtr driver, int qemuHotplugRemoveDBusVMState(virQEMUDriverPtr driver, virDomainObjPtr vm, qemuDomainAsyncJob asyncJob); + +int qemuDomainChangeMemoryRequestedSize(virQEMUDriverPtr driver, + virDomainObjPtr vm, + const char *alias, + unsigned long long requestedsize); diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 09b8617097..6b3c1c2f5e 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -4713,3 +4713,16 @@ qemuMonitorTransactionBackup(virJSONValuePtr actions, return qemuMonitorJSONTransactionBackup(actions, device, jobname, target, bitmap, syncmode); } + + +int +qemuMonitorChangeMemoryRequestedSize(qemuMonitorPtr mon, + const char *alias, + unsigned long long requestedsize) +{ + VIR_DEBUG("alias=%s requestedsize=%llu", alias, requestedsize); + + QEMU_CHECK_MONITOR(mon); + + return qemuMonitorJSONChangeMemoryRequestedSize(mon, alias, requestedsize); +} diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index a07617ec28..443ddddf9b 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -1516,3 +1516,7 @@ qemuMonitorTransactionBackup(virJSONValuePtr actions, const char *target, const char *bitmap, qemuMonitorTransactionBackupSyncMode syncmode); + +int qemuMonitorChangeMemoryRequestedSize(qemuMonitorPtr mon, + const char *alias, + unsigned long long requestedsize); diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 8a75a2734e..0ab4264522 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -9452,3 +9452,18 @@ qemuMonitorJSONGetCPUMigratable(qemuMonitorPtr mon, return virJSONValueGetBoolean(virJSONValueObjectGet(reply, "return"), migratable); } + + +int +qemuMonitorJSONChangeMemoryRequestedSize(qemuMonitorPtr mon, + const char *alias, + unsigned long long requestedsize) +{ + g_autofree char *path = g_strdup_printf("/machine/peripheral/%s", alias); + qemuMonitorJSONObjectProperty prop = { + .type = QEMU_MONITOR_OBJECT_PROPERTY_ULONG, + .val.ul = requestedsize * 1024, /* monitor needs bytes */ + }; + + return qemuMonitorJSONSetObjectProperty(mon, path, "requested-size", &prop); +} diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h index ba1531fee8..53af2b4022 100644 --- a/src/qemu/qemu_monitor_json.h +++ b/src/qemu/qemu_monitor_json.h @@ -705,3 +705,8 @@ int qemuMonitorJSONSetDBusVMStateIdList(qemuMonitorPtr mon, int qemuMonitorJSONGetCPUMigratable(qemuMonitorPtr mon, bool *migratable); + +int +qemuMonitorJSONChangeMemoryRequestedSize(qemuMonitorPtr mon, + const char *alias, + unsigned long long requestedsize);

On Fri, Jan 22, 2021 at 13:50:27 +0100, Michal Privoznik wrote:
As advertised in one of previous commits, we want' to be able to change 'requested-size' attribute of virtio-mem on the fly. This commit does exactly that. Changing anything else is checked for and forbidden.
Once guest has changed the allocation, QEMU emits an event which we will use to track the allocation. In the next commit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> ---
[...]
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index f8c5a40b24..b6fe5e4436 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -17297,6 +17297,29 @@ virDomainMemoryFindInactiveByDef(virDomainDefPtr def, }
+ssize_t +virDomainMemoryFindByDeviceInfo(virDomainDefPtr def, + virDomainDeviceInfoPtr info) +{ + size_t i; + + for (i = 0; i < def->nmems; i++) { + virDomainMemoryDefPtr tmp = def->mems[i]; + + if (!virDomainDeviceInfoAddressIsEqual(&tmp->info, info)) + continue; + + /* alias, if present */
This comment makes it look ...
+ if (STRNEQ_NULLABLE(tmp->info.alias, info->alias))
... as if you wanted to check alias only optionally ... when provided by the user, but STREQ_NULLABLE will reject it if user doesn't provide it while definition does. The intentions are also unclear because the expected function semantics are undocumented.

As advertised in previous commit, this event is delivered to us when virtio-mem module changes the allocation inside the guest. It comes with one attribute - size - which holds the new size of the virtio-mem (well, allocated size), in bytes. Mind you, this is not necessarily the same number as 'requested size'. It almost certainly will be when sizing the memory up, but it might not be when sizing the memory down - the guest kernel might be unable to free some blocks. This actual size is reported in the domain XML as an output element only. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/formatdomain.rst | 7 ++++++ docs/schemas/domaincommon.rng | 5 +++++ src/conf/domain_conf.c | 24 ++++++++++++++++++-- src/conf/domain_conf.h | 7 ++++++ src/libvirt_private.syms | 1 + src/qemu/qemu_domain.c | 3 +++ src/qemu/qemu_domain.h | 1 + src/qemu/qemu_driver.c | 33 ++++++++++++++++++++++++++++ src/qemu/qemu_monitor.c | 24 ++++++++++++++++++++ src/qemu/qemu_monitor.h | 20 +++++++++++++++++ src/qemu/qemu_monitor_json.c | 24 ++++++++++++++++++++ src/qemu/qemu_process.c | 41 +++++++++++++++++++++++++++++++++++ 12 files changed, 188 insertions(+), 2 deletions(-) diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 2938758ec2..3088da1243 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -7277,6 +7277,7 @@ Example: usage of the memory devices <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>1048576</requested> + <actual unit='KiB'>524288</requested> </target> </memory> </devices> @@ -7391,6 +7392,12 @@ Example: usage of the memory devices The total size of blocks exposed to the guest. Must respect ``block`` granularity. + ``actual`` + Active XML for ``virtio-mem`` model may contain ``actual`` element that + reflects the actual size of the corresponding virtio memory device. The + element is formatted into live XML and never parsed, i.e. it is + output-only element. + :anchor:`<a id="elementsIommu"/>` IOMMU devices diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 5bc120073e..6d5c983bc6 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -6115,6 +6115,11 @@ <ref name="scaledInteger"/> </element> </optional> + <optional> + <element name="actual"> + <ref name="scaledInteger"/> + </element> + </optional> <optional> <element name="label"> <element name="size"> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index b6fe5e4436..74c897b53e 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -17320,6 +17320,21 @@ virDomainMemoryFindByDeviceInfo(virDomainDefPtr def, } +ssize_t +virDomainMemoryFindByDeviceAlias(virDomainDefPtr def, + const char *alias) +{ + size_t i; + + for (i = 0; i < def->nmems; i++) { + if (STREQ_NULLABLE(def->mems[i]->info.alias, alias)) + return i; + } + + return -1; +} + + /** * virDomainMemoryInsert: * @@ -26611,7 +26626,8 @@ virDomainMemorySourceDefFormat(virBufferPtr buf, static void virDomainMemoryTargetDefFormat(virBufferPtr buf, - virDomainMemoryDefPtr def) + virDomainMemoryDefPtr def, + unsigned int flags) { g_auto(virBuffer) childBuf = VIR_BUFFER_INIT_CHILD(buf); @@ -26633,6 +26649,10 @@ virDomainMemoryTargetDefFormat(virBufferPtr buf, virBufferAsprintf(&childBuf, "<requested unit='KiB'>%llu</requested>\n", def->requestedsize); + if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) { + virBufferAsprintf(&childBuf, "<actual unit='KiB'>%llu</actual>\n", + def->actualsize); + } } virXMLFormatElement(buf, "target", NULL, &childBuf); @@ -26665,7 +26685,7 @@ virDomainMemoryDefFormat(virBufferPtr buf, if (virDomainMemorySourceDefFormat(buf, def) < 0) return -1; - virDomainMemoryTargetDefFormat(buf, def); + virDomainMemoryTargetDefFormat(buf, def, flags); virDomainDeviceInfoFormat(buf, &def->info, flags); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index ef52328a6f..5f4a455963 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2331,6 +2331,9 @@ struct _virDomainMemoryDef { unsigned long long labelsize; /* kibibytes; valid only for NVDIMM */ unsigned long long blocksize; /* kibibytes; valid only for VIRTIO_MEM */ unsigned long long requestedsize; /* kibibytes; valid only for VIRTIO_MEM */ + unsigned long long actualsize; /* kibibytes, valid for VIRTIO_MEM and + active domain only, only to report never + parse */ bool readonly; /* valid only for NVDIMM */ /* required for QEMU NVDIMM ppc64 support */ @@ -3613,6 +3616,10 @@ ssize_t virDomainMemoryFindByDeviceInfo(virDomainDefPtr dev, virDomainDeviceInfoPtr info) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT; +ssize_t virDomainMemoryFindByDeviceAlias(virDomainDefPtr def, + const char *alias) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT; + int virDomainShmemDefInsert(virDomainDefPtr def, virDomainShmemDefPtr shmem) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT; bool virDomainShmemDefEquals(virDomainShmemDefPtr src, virDomainShmemDefPtr dst) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 2e7f92bcfe..f3556cee82 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -494,6 +494,7 @@ virDomainMemballoonModelTypeFromString; virDomainMemballoonModelTypeToString; virDomainMemoryDefFree; virDomainMemoryFindByDef; +virDomainMemoryFindByDeviceAlias; virDomainMemoryFindByDeviceInfo; virDomainMemoryFindInactiveByDef; virDomainMemoryInsert; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 233adc996f..783756b191 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -10771,6 +10771,9 @@ qemuProcessEventFree(struct qemuProcessEvent *event) case QEMU_PROCESS_EVENT_JOB_STATUS_CHANGE: virObjectUnref(event->data); break; + case QEMU_PROCESS_EVENT_MEMORY_DEVICE_SIZE_CHANGE: + qemuMonitorMemoryDeviceSizeChangeFree(event->data); + break; case QEMU_PROCESS_EVENT_PR_DISCONNECT: case QEMU_PROCESS_EVENT_LAST: break; diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 154339ef8f..79ccccebcb 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -441,6 +441,7 @@ typedef enum { QEMU_PROCESS_EVENT_PR_DISCONNECT, QEMU_PROCESS_EVENT_RDMA_GID_STATUS_CHANGED, QEMU_PROCESS_EVENT_GUEST_CRASHLOADED, + QEMU_PROCESS_EVENT_MEMORY_DEVICE_SIZE_CHANGE, QEMU_PROCESS_EVENT_LAST } qemuProcessEventType; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index fadf0240fc..d64eb4d399 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -4273,6 +4273,36 @@ processGuestCrashloadedEvent(virQEMUDriverPtr driver, } +static void +processMemoryDeviceSizeChange(virQEMUDriverPtr driver, + virDomainObjPtr vm, + qemuMonitorMemoryDeviceSizeChangePtr info) +{ + virDomainMemoryDefPtr mem = NULL; + ssize_t idx; + + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) + return; + + if (!virDomainObjIsActive(vm)) { + VIR_DEBUG("Domain is not running"); + goto endjob; + } + + idx = virDomainMemoryFindByDeviceAlias(vm->def, info->devAlias); + if (idx < 0) { + VIR_DEBUG("Memory device '%s' not found", info->devAlias); + goto endjob; + } + + mem = vm->def->mems[idx]; + mem->actualsize = VIR_DIV_UP(info->size, 1024); + + endjob: + qemuDomainObjEndJob(driver, vm); +} + + static void qemuProcessEventHandler(void *data, void *opaque) { struct qemuProcessEvent *processEvent = data; @@ -4322,6 +4352,9 @@ static void qemuProcessEventHandler(void *data, void *opaque) case QEMU_PROCESS_EVENT_GUEST_CRASHLOADED: processGuestCrashloadedEvent(driver, vm); break; + case QEMU_PROCESS_EVENT_MEMORY_DEVICE_SIZE_CHANGE: + processMemoryDeviceSizeChange(driver, vm, processEvent->data); + break; case QEMU_PROCESS_EVENT_LAST: break; } diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 6b3c1c2f5e..59f0265804 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -1437,6 +1437,20 @@ qemuMonitorEmitSpiceMigrated(qemuMonitorPtr mon) } +int +qemuMonitorEmitMemoryDeviceSizeChange(qemuMonitorPtr mon, + const char *devAlias, + unsigned long long size) +{ + int ret = -1; + VIR_DEBUG("mon=%p, devAlias='%s', size=%llu", mon, devAlias, size); + + QEMU_MONITOR_CALLBACK(mon, ret, domainMemoryDeviceSizeChange, mon->vm, devAlias, size); + + return ret; +} + + int qemuMonitorEmitMemoryFailure(qemuMonitorPtr mon, qemuMonitorEventMemoryFailurePtr mfp) @@ -4400,6 +4414,16 @@ qemuMonitorEventRdmaGidStatusFree(qemuMonitorRdmaGidStatusPtr info) } +void +qemuMonitorMemoryDeviceSizeChangeFree(qemuMonitorMemoryDeviceSizeChangePtr info) +{ + if (!info) + return; + + VIR_FREE(info->devAlias); +} + + int qemuMonitorSetWatchdogAction(qemuMonitorPtr mon, const char *action) diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index 443ddddf9b..00428c14d2 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -108,6 +108,14 @@ struct _qemuMonitorRdmaGidStatus { }; +typedef struct _qemuMonitorMemoryDeviceSizeChange qemuMonitorMemoryDeviceSizeChange; +typedef qemuMonitorMemoryDeviceSizeChange *qemuMonitorMemoryDeviceSizeChangePtr; +struct _qemuMonitorMemoryDeviceSizeChange { + char *devAlias; + unsigned long long size; +}; + + typedef enum { QEMU_MONITOR_JOB_TYPE_UNKNOWN, /* internal value, not exposed by qemu */ QEMU_MONITOR_JOB_TYPE_COMMIT, @@ -153,6 +161,7 @@ struct _qemuMonitorJobInfo { char *qemuMonitorGuestPanicEventInfoFormatMsg(qemuMonitorEventPanicInfoPtr info); void qemuMonitorEventPanicInfoFree(qemuMonitorEventPanicInfoPtr info); void qemuMonitorEventRdmaGidStatusFree(qemuMonitorRdmaGidStatusPtr info); +void qemuMonitorMemoryDeviceSizeChangeFree(qemuMonitorMemoryDeviceSizeChangePtr info); typedef void (*qemuMonitorDestroyCallback)(qemuMonitorPtr mon, virDomainObjPtr vm, @@ -374,6 +383,12 @@ typedef int (*qemuMonitorDomainMemoryFailureCallback)(qemuMonitorPtr mon, qemuMonitorEventMemoryFailurePtr mfp, void *opaque); +typedef int (*qemuMonitorDomainMemoryDeviceSizeChange)(qemuMonitorPtr mon, + virDomainObjPtr vm, + const char *alias, + unsigned long long size, + void *opaque); + typedef struct _qemuMonitorCallbacks qemuMonitorCallbacks; typedef qemuMonitorCallbacks *qemuMonitorCallbacksPtr; struct _qemuMonitorCallbacks { @@ -411,6 +426,7 @@ struct _qemuMonitorCallbacks { qemuMonitorDomainRdmaGidStatusChangedCallback domainRdmaGidStatusChanged; qemuMonitorDomainGuestCrashloadedCallback domainGuestCrashloaded; qemuMonitorDomainMemoryFailureCallback domainMemoryFailure; + qemuMonitorDomainMemoryDeviceSizeChange domainMemoryDeviceSizeChange; }; qemuMonitorPtr qemuMonitorOpen(virDomainObjPtr vm, @@ -507,6 +523,10 @@ int qemuMonitorEmitSerialChange(qemuMonitorPtr mon, bool connected); int qemuMonitorEmitSpiceMigrated(qemuMonitorPtr mon); +int qemuMonitorEmitMemoryDeviceSizeChange(qemuMonitorPtr mon, + const char *devAlias, + unsigned long long size); + int qemuMonitorEmitMemoryFailure(qemuMonitorPtr mon, qemuMonitorEventMemoryFailurePtr mfp); diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 0ab4264522..b1dc527e8b 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -113,6 +113,7 @@ static void qemuMonitorJSONHandleDumpCompleted(qemuMonitorPtr mon, virJSONValueP static void qemuMonitorJSONHandlePRManagerStatusChanged(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleRdmaGidStatusChanged(qemuMonitorPtr mon, virJSONValuePtr data); static void qemuMonitorJSONHandleMemoryFailure(qemuMonitorPtr mon, virJSONValuePtr data); +static void qemuMonitorJSONHandleMemoryDeviceSizeChange(qemuMonitorPtr mon, virJSONValuePtr data); typedef struct { const char *type; @@ -133,6 +134,7 @@ static qemuEventHandler eventHandlers[] = { { "GUEST_CRASHLOADED", qemuMonitorJSONHandleGuestCrashloaded, }, { "GUEST_PANICKED", qemuMonitorJSONHandleGuestPanic, }, { "JOB_STATUS_CHANGE", qemuMonitorJSONHandleJobStatusChange, }, + { "MEMORY_DEVICE_SIZE_CHANGE", qemuMonitorJSONHandleMemoryDeviceSizeChange, }, { "MEMORY_FAILURE", qemuMonitorJSONHandleMemoryFailure, }, { "MIGRATION", qemuMonitorJSONHandleMigrationStatus, }, { "MIGRATION_PASS", qemuMonitorJSONHandleMigrationPass, }, @@ -1333,6 +1335,28 @@ qemuMonitorJSONHandleSpiceMigrated(qemuMonitorPtr mon, } +static void +qemuMonitorJSONHandleMemoryDeviceSizeChange(qemuMonitorPtr mon, + virJSONValuePtr data) +{ + const char *name; + unsigned long long size; + + if (!(name = virJSONValueObjectGetString(data, "id"))) { + VIR_WARN("missing device alias in MEMORY_DEVICE_SIZE_CHANGE event"); + return; + } + + if (virJSONValueObjectGetNumberUlong(data, "size", &size) < 0) { + VIR_WARN("missing new size for '%s' in MEMORY_DEVICE_SIZE_CHANGE event", name); + return; + } + + + qemuMonitorEmitMemoryDeviceSizeChange(mon, name, size); +} + + static void qemuMonitorJSONHandleMemoryFailure(qemuMonitorPtr mon, virJSONValuePtr data) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index f87a3c0f60..37cc6a28e5 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1937,6 +1937,46 @@ qemuProcessHandleMemoryFailure(qemuMonitorPtr mon G_GNUC_UNUSED, } +static int +qemuProcessHandleMemoryDeviceSizeChange(qemuMonitorPtr mon G_GNUC_UNUSED, + virDomainObjPtr vm, + const char *devAlias, + unsigned long long size, + void *opaque) +{ + virQEMUDriverPtr driver = opaque; + struct qemuProcessEvent *processEvent = NULL; + qemuMonitorMemoryDeviceSizeChangePtr info = NULL; + int ret = -1; + + virObjectLock(vm); + + VIR_DEBUG("Memory device '%s' changed size to '%llu' in domain '%s'", + devAlias, size, vm->def->name); + + info = g_new0(qemuMonitorMemoryDeviceSizeChange, 1); + info->devAlias = g_strdup(devAlias); + info->size = size; + + processEvent = g_new0(struct qemuProcessEvent, 1); + processEvent->eventType = QEMU_PROCESS_EVENT_MEMORY_DEVICE_SIZE_CHANGE; + processEvent->vm = virObjectRef(vm); + processEvent->data = g_steal_pointer(&info); + + if (virThreadPoolSendJob(driver->workerPool, 0, processEvent) < 0) { + qemuProcessEventFree(processEvent); + virObjectUnref(vm); + goto cleanup; + } + + ret = 0; + cleanup: + qemuMonitorMemoryDeviceSizeChangeFree(info); + virObjectUnlock(vm); + return ret; +} + + static qemuMonitorCallbacks monitorCallbacks = { .eofNotify = qemuProcessHandleMonitorEOF, .errorNotify = qemuProcessHandleMonitorError, @@ -1970,6 +2010,7 @@ static qemuMonitorCallbacks monitorCallbacks = { .domainRdmaGidStatusChanged = qemuProcessHandleRdmaGidStatusChanged, .domainGuestCrashloaded = qemuProcessHandleGuestCrashloaded, .domainMemoryFailure = qemuProcessHandleMemoryFailure, + .domainMemoryDeviceSizeChange = qemuProcessHandleMemoryDeviceSizeChange, }; static void -- 2.26.2

On Fri, Jan 22, 2021 at 13:50:28 +0100, Michal Privoznik wrote:
As advertised in previous commit, this event is delivered to us when virtio-mem module changes the allocation inside the guest. It comes with one attribute - size - which holds the new size of the virtio-mem (well, allocated size), in bytes. Mind you, this is not necessarily the same number as 'requested size'. It almost certainly will be when sizing the memory up, but it might not be when sizing the memory down - the guest kernel might be unable to free some blocks.
This actual size is reported in the domain XML as an output element only.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> ---
[...]
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index b6fe5e4436..74c897b53e 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -17320,6 +17320,21 @@ virDomainMemoryFindByDeviceInfo(virDomainDefPtr def, }
+ssize_t +virDomainMemoryFindByDeviceAlias(virDomainDefPtr def, + const char *alias)
This is finding the memory index rather than the memory object itself but the name seems to suggest otherwise.
+{ + size_t i; + + for (i = 0; i < def->nmems; i++) { + if (STREQ_NULLABLE(def->mems[i]->info.alias, alias)) + return i; + } + + return -1; +} + + /** * virDomainMemoryInsert: *
[...]
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index fadf0240fc..d64eb4d399 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -4273,6 +4273,36 @@ processGuestCrashloadedEvent(virQEMUDriverPtr driver, }
+static void +processMemoryDeviceSizeChange(virQEMUDriverPtr driver, + virDomainObjPtr vm, + qemuMonitorMemoryDeviceSizeChangePtr info) +{ + virDomainMemoryDefPtr mem = NULL; + ssize_t idx; + + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) + return; + + if (!virDomainObjIsActive(vm)) { + VIR_DEBUG("Domain is not running"); + goto endjob; + } + + idx = virDomainMemoryFindByDeviceAlias(vm->def, info->devAlias); + if (idx < 0) { + VIR_DEBUG("Memory device '%s' not found", info->devAlias); + goto endjob; + } + + mem = vm->def->mems[idx];
Why didn't you add a helper which doesn't require doing this weird lookup instead?
+ mem->actualsize = VIR_DIV_UP(info->size, 1024); + + endjob: + qemuDomainObjEndJob(driver, vm); +} + + static void qemuProcessEventHandler(void *data, void *opaque) { struct qemuProcessEvent *processEvent = data;
[...]
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 6b3c1c2f5e..59f0265804 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c
[...]
@@ -4400,6 +4414,16 @@ qemuMonitorEventRdmaGidStatusFree(qemuMonitorRdmaGidStatusPtr info) }
+void +qemuMonitorMemoryDeviceSizeChangeFree(qemuMonitorMemoryDeviceSizeChangePtr info) +{ + if (!info) + return; + + VIR_FREE(info->devAlias);
Missing free of the whole struct? Also preferably use g_free nowadays. [...]
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index f87a3c0f60..37cc6a28e5 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1937,6 +1937,46 @@ qemuProcessHandleMemoryFailure(qemuMonitorPtr mon G_GNUC_UNUSED, }
+static int +qemuProcessHandleMemoryDeviceSizeChange(qemuMonitorPtr mon G_GNUC_UNUSED, + virDomainObjPtr vm, + const char *devAlias, + unsigned long long size, + void *opaque) +{ + virQEMUDriverPtr driver = opaque; + struct qemuProcessEvent *processEvent = NULL; + qemuMonitorMemoryDeviceSizeChangePtr info = NULL; + int ret = -1; + + virObjectLock(vm); + + VIR_DEBUG("Memory device '%s' changed size to '%llu' in domain '%s'", + devAlias, size, vm->def->name); + + info = g_new0(qemuMonitorMemoryDeviceSizeChange, 1); + info->devAlias = g_strdup(devAlias); + info->size = size; + + processEvent = g_new0(struct qemuProcessEvent, 1); + processEvent->eventType = QEMU_PROCESS_EVENT_MEMORY_DEVICE_SIZE_CHANGE; + processEvent->vm = virObjectRef(vm); + processEvent->data = g_steal_pointer(&info); + + if (virThreadPoolSendJob(driver->workerPool, 0, processEvent) < 0) { + qemuProcessEventFree(processEvent); + virObjectUnref(vm); + goto cleanup; + } + + ret = 0; + cleanup: + qemuMonitorMemoryDeviceSizeChangeFree(info);
As noted above, info is leaked since the function doesn't fre it.
+ virObjectUnlock(vm); + return ret; +}
A side note, do we get this event e.g. on VM reset? If no we need to wire up the reset of 'actual' size in such case as it would wrongly suggest that the VM is using it and it may not even get to loading the driver.

[...]
A side note, do we get this event e.g. on VM reset? If no we need to wire up the reset of 'actual' size in such case as it would wrongly suggest that the VM is using it and it may not even get to loading the driver.
QEMU fires the event whenever the value changes - including during system resets. -- Thanks, David / dhildenb

If the QEMU driver restarts it loses the track of the actual size of virtio-mem (because it's runtime type of information and thus not stored in XML) and therefore, we have to refresh it when reconnecting to the domain monitor. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 37 ++++++++++++++++++++---- src/qemu/qemu_monitor.h | 3 ++ src/qemu/qemu_monitor_json.c | 55 +++++++++++++++++++++--------------- src/qemu/qemu_process.c | 3 ++ 4 files changed, 70 insertions(+), 28 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 783756b191..5c40c02180 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -7992,9 +7992,21 @@ qemuDomainUpdateMemoryDeviceInfo(virQEMUDriverPtr driver, return -1; } - /* if qemu doesn't support the info request, just carry on */ - if (rc == -2) + /* If qemu doesn't support the info request, just carry on, unless we + * really need it. */ + if (rc == -2) { + for (i = 0; i < vm->def->nmems; i++) { + virDomainMemoryDefPtr mem = vm->def->mems[i]; + + if (mem->model == VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("qemu did not return info on vitio-mem device")); + return -1; + } + } + return 0; + } if (rc < 0) return -1; @@ -8009,9 +8021,24 @@ qemuDomainUpdateMemoryDeviceInfo(virQEMUDriverPtr driver, if (!(dimm = virHashLookup(meminfo, mem->info.alias))) continue; - mem->info.type = VIR_DOMAIN_DEVICE_ADDRESS_TYPE_DIMM; - mem->info.addr.dimm.slot = dimm->slot; - mem->info.addr.dimm.base = dimm->address; + switch (mem->model) { + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + mem->actualsize = VIR_DIV_UP(dimm->size, 1024); + break; + + case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: + mem->info.type = VIR_DOMAIN_DEVICE_ADDRESS_TYPE_DIMM; + mem->info.addr.dimm.slot = dimm->slot; + mem->info.addr.dimm.base = dimm->address; + break; + + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_NONE: + case VIR_DOMAIN_MEMORY_MODEL_LAST: + /* nada */ + break; + } } virHashFree(meminfo); diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index 00428c14d2..9668e287a1 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -1391,10 +1391,13 @@ typedef struct _qemuMonitorMemoryDeviceInfo qemuMonitorMemoryDeviceInfo; typedef qemuMonitorMemoryDeviceInfo *qemuMonitorMemoryDeviceInfoPtr; struct _qemuMonitorMemoryDeviceInfo { + /* For pc-dimm */ unsigned long long address; unsigned int slot; bool hotplugged; bool hotpluggable; + /* For virtio-mem */ + unsigned long long size; }; int qemuMonitorGetMemoryDeviceInfo(qemuMonitorPtr mon, diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index b1dc527e8b..1922f84a5e 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -8228,7 +8228,6 @@ qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitorPtr mon, virJSONValuePtr cmd; virJSONValuePtr reply = NULL; virJSONValuePtr data = NULL; - qemuMonitorMemoryDeviceInfoPtr meminfo = NULL; size_t i; if (!(cmd = qemuMonitorJSONMakeCommand("query-memory-devices", NULL))) @@ -8249,6 +8248,9 @@ qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitorPtr mon, for (i = 0; i < virJSONValueArraySize(data); i++) { virJSONValuePtr elem = virJSONValueArrayGet(data, i); + g_autofree qemuMonitorMemoryDeviceInfoPtr meminfo = NULL; + virJSONValuePtr dimminfo; + const char *devalias; const char *type; if (!(type = virJSONValueObjectGetString(elem, "type"))) { @@ -8258,26 +8260,23 @@ qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitorPtr mon, goto cleanup; } + if (!(dimminfo = virJSONValueObjectGetObject(elem, "data"))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("query-memory-devices reply data doesn't " + "contain enum data")); + goto cleanup; + } + + if (!(devalias = virJSONValueObjectGetString(dimminfo, "id"))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("dimm memory info data is missing 'id'")); + goto cleanup; + } + + meminfo = g_new0(qemuMonitorMemoryDeviceInfo, 1); + /* dimm memory devices */ if (STREQ(type, "dimm")) { - virJSONValuePtr dimminfo; - const char *devalias; - - if (!(dimminfo = virJSONValueObjectGetObject(elem, "data"))) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("query-memory-devices reply data doesn't " - "contain enum data")); - goto cleanup; - } - - if (!(devalias = virJSONValueObjectGetString(dimminfo, "id"))) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("dimm memory info data is missing 'id'")); - goto cleanup; - } - - meminfo = g_new0(qemuMonitorMemoryDeviceInfo, 1); - if (virJSONValueObjectGetNumberUlong(dimminfo, "addr", &meminfo->address) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", @@ -8308,17 +8307,27 @@ qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitorPtr mon, } - if (virHashAddEntry(info, devalias, meminfo) < 0) + } else if (STREQ(type, "virtio-mem")) { + if (virJSONValueObjectGetNumberUlong(dimminfo, "size", + &meminfo->size) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("malformed/missing slot in dimm memory info")); goto cleanup; - - meminfo = NULL; + } + } else { + /* type not handled yet */ + continue; } + + if (virHashAddEntry(info, devalias, meminfo) < 0) + goto cleanup; + + meminfo = NULL; } ret = 0; cleanup: - VIR_FREE(meminfo); virJSONValueFree(cmd); virJSONValueFree(reply); return ret; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 37cc6a28e5..8d41f947af 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -8494,6 +8494,9 @@ qemuProcessReconnect(void *opaque) qemuDomainVcpuPersistOrder(obj->def); + if (qemuDomainUpdateMemoryDeviceInfo(driver, obj, QEMU_ASYNC_JOB_NONE) < 0) + goto error; + if (qemuProcessDetectIOThreadPIDs(driver, obj, QEMU_ASYNC_JOB_NONE) < 0) goto error; -- 2.26.2

On Fri, Jan 22, 2021 at 13:50:29 +0100, Michal Privoznik wrote:
If the QEMU driver restarts it loses the track of the actual size of virtio-mem (because it's runtime type of information and thus not stored in XML) and therefore, we have to refresh it when reconnecting to the domain monitor.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_domain.c | 37 ++++++++++++++++++++---- src/qemu/qemu_monitor.h | 3 ++ src/qemu/qemu_monitor_json.c | 55 +++++++++++++++++++++--------------- src/qemu/qemu_process.c | 3 ++ 4 files changed, 70 insertions(+), 28 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 783756b191..5c40c02180 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c
[...]
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index 00428c14d2..9668e287a1 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -1391,10 +1391,13 @@ typedef struct _qemuMonitorMemoryDeviceInfo qemuMonitorMemoryDeviceInfo; typedef qemuMonitorMemoryDeviceInfo *qemuMonitorMemoryDeviceInfoPtr;
struct _qemuMonitorMemoryDeviceInfo { + /* For pc-dimm */ unsigned long long address; unsigned int slot; bool hotplugged; bool hotpluggable; + /* For virtio-mem */ + unsigned long long size;
Menion the unit here. [...]
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index b1dc527e8b..1922f84a5e 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -8228,7 +8228,6 @@ qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitorPtr mon, virJSONValuePtr cmd; virJSONValuePtr reply = NULL; virJSONValuePtr data = NULL; - qemuMonitorMemoryDeviceInfoPtr meminfo = NULL; size_t i;
if (!(cmd = qemuMonitorJSONMakeCommand("query-memory-devices", NULL))) @@ -8249,6 +8248,9 @@ qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitorPtr mon,
for (i = 0; i < virJSONValueArraySize(data); i++) { virJSONValuePtr elem = virJSONValueArrayGet(data, i); + g_autofree qemuMonitorMemoryDeviceInfoPtr meminfo = NULL; + virJSONValuePtr dimminfo; + const char *devalias; const char *type;
if (!(type = virJSONValueObjectGetString(elem, "type"))) { @@ -8258,26 +8260,23 @@ qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitorPtr mon, goto cleanup; }
+ if (!(dimminfo = virJSONValueObjectGetObject(elem, "data"))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("query-memory-devices reply data doesn't " + "contain enum data")); + goto cleanup; + } + + if (!(devalias = virJSONValueObjectGetString(dimminfo, "id"))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("dimm memory info data is missing 'id'")); + goto cleanup; + }
This isn't future proof. The 'MemoryDeviceInfo' QAPI type which is the return value of 'query-memory-devices' doesn't have any permanent members, it's just an union of distinct sub types. The fields you are querying are re-defined for every type, but are not guaranteed to be present for any new type. Thus our code must not require them to be present, only when we are certain that we've got the correct discriminator.
+ + meminfo = g_new0(qemuMonitorMemoryDeviceInfo, 1); + /* dimm memory devices */ if (STREQ(type, "dimm")) {
[...]

Just like we are recalculating the amount of guest memory on BALLOON_CHANGE and on reconnect to the monitor, we should include the actual size of virtio-mem too. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_driver.c | 3 +++ src/qemu/qemu_process.c | 57 +++++++++++++++++++++++++++++++++-------- 2 files changed, 50 insertions(+), 10 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d64eb4d399..2fd4429ba8 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -4298,6 +4298,9 @@ processMemoryDeviceSizeChange(virQEMUDriverPtr driver, mem = vm->def->mems[idx]; mem->actualsize = VIR_DIV_UP(info->size, 1024); + /* fix the balloon size */ + ignore_value(qemuProcessRefreshBalloonState(driver, vm, QEMU_ASYNC_JOB_NONE)); + endjob: qemuDomainObjEndJob(driver, vm); } diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 8d41f947af..01d261d538 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1250,10 +1250,31 @@ qemuProcessHandleBalloonChange(qemuMonitorPtr mon G_GNUC_UNUSED, virQEMUDriverPtr driver = opaque; virObjectEventPtr event = NULL; g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + size_t i; virObjectLock(vm); event = virDomainEventBalloonChangeNewFromObj(vm, actual); + VIR_DEBUG("New balloon size before fixup: %lld", actual); + + for (i = 0; i < vm->def->nmems; i++) { + virDomainMemoryDefPtr mem = vm->def->mems[i]; + + switch (mem->model) { + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + actual += mem->actualsize; + break; + + case VIR_DOMAIN_MEMORY_MODEL_NONE: + case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_LAST: + /* nada */ + break; + } + } + VIR_DEBUG("Updating balloon from %lld to %lld kb", vm->def->mem.cur_balloon, actual); vm->def->mem.cur_balloon = actual; @@ -2451,21 +2472,37 @@ qemuProcessRefreshBalloonState(virQEMUDriverPtr driver, int asyncJob) { unsigned long long balloon; + size_t i; int rc; - /* if no ballooning is available, the current size equals to the current - * full memory size */ - if (!virDomainDefHasMemballoon(vm->def)) { - vm->def->mem.cur_balloon = virDomainDefGetMemoryTotal(vm->def); - return 0; + if (virDomainDefHasMemballoon(vm->def)) { + if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) + return -1; + + rc = qemuMonitorGetBalloonInfo(qemuDomainGetMonitor(vm), &balloon); + if (qemuDomainObjExitMonitor(driver, vm) < 0 || rc < 0) + return -1; + } else { + balloon = virDomainDefGetMemoryTotal(vm->def); } - if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) - return -1; + for (i = 0; i < vm->def->nmems; i++) { + virDomainMemoryDefPtr mem = vm->def->mems[i]; + + switch (mem->model) { + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + balloon += mem->actualsize; + break; - rc = qemuMonitorGetBalloonInfo(qemuDomainGetMonitor(vm), &balloon); - if (qemuDomainObjExitMonitor(driver, vm) < 0 || rc < 0) - return -1; + case VIR_DOMAIN_MEMORY_MODEL_NONE: + case VIR_DOMAIN_MEMORY_MODEL_DIMM: + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_PMEM: + case VIR_DOMAIN_MEMORY_MODEL_LAST: + /* nada */ + break; + } + } vm->def->mem.cur_balloon = balloon; -- 2.26.2

On Fri, Jan 22, 2021 at 13:50:30 +0100, Michal Privoznik wrote:
Just like we are recalculating the amount of guest memory on BALLOON_CHANGE and on reconnect to the monitor, we should include the actual size of virtio-mem too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_driver.c | 3 +++ src/qemu/qemu_process.c | 57 +++++++++++++++++++++++++++++++++-------- 2 files changed, 50 insertions(+), 10 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d64eb4d399..2fd4429ba8 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -4298,6 +4298,9 @@ processMemoryDeviceSizeChange(virQEMUDriverPtr driver, mem = vm->def->mems[idx]; mem->actualsize = VIR_DIV_UP(info->size, 1024);
+ /* fix the balloon size */ + ignore_value(qemuProcessRefreshBalloonState(driver, vm, QEMU_ASYNC_JOB_NONE));
During VM lifetime, qemu is actively notifying us about balloon changes via an event, so explicit refresh is wrong here. You probably want to do the calculation here depending on the old and new value instead.
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 8d41f947af..01d261d538 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1250,10 +1250,31 @@ qemuProcessHandleBalloonChange(qemuMonitorPtr mon G_GNUC_UNUSED, virQEMUDriverPtr driver = opaque; virObjectEventPtr event = NULL; g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + size_t i;
virObjectLock(vm); event = virDomainEventBalloonChangeNewFromObj(vm, actual);
+ VIR_DEBUG("New balloon size before fixup: %lld", actual); + + for (i = 0; i < vm->def->nmems; i++) { + virDomainMemoryDefPtr mem = vm->def->mems[i]; + + switch (mem->model) { + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + actual += mem->actualsize; + break;
So this means that the balloon driver never counts the memory provided by any virtio-mem device? If that is so then this is okay, but in that case it would be worth explaining it here. If it's not the case it's wrong to add the size of the additional devices here since you'd count them twice. [...]
@@ -2451,21 +2472,37 @@ qemuProcessRefreshBalloonState(virQEMUDriverPtr driver, int asyncJob) { unsigned long long balloon; + size_t i; int rc;
- /* if no ballooning is available, the current size equals to the current - * full memory size */ - if (!virDomainDefHasMemballoon(vm->def)) { - vm->def->mem.cur_balloon = virDomainDefGetMemoryTotal(vm->def); - return 0; + if (virDomainDefHasMemballoon(vm->def)) { + if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) + return -1; + + rc = qemuMonitorGetBalloonInfo(qemuDomainGetMonitor(vm), &balloon); + if (qemuDomainObjExitMonitor(driver, vm) < 0 || rc < 0) + return -1; + } else { + balloon = virDomainDefGetMemoryTotal(vm->def);
So the following code is wrong at least in case when this branch is taken: virDomainDefGetMemoryTotal: * Returns the current maximum memory size usable by the domain described by * @def. This size includes possible additional memory devices. The total_memory size is updated in virDomainDefPostParseMemory by adding up sizes of all 'mem' devices ...
}
- if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) - return -1; + for (i = 0; i < vm->def->nmems; i++) { + virDomainMemoryDefPtr mem = vm->def->mems[i]; + + switch (mem->model) { + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + balloon += mem->actualsize; + break;
... and you then add the 'actualsize' component which is a fraction of the 'size' of a mem counted previously here. Thus when reconnecting to a VM which doesn't have ballon, but uses virtio-mem, the virtio-mem is counted twice. (This doesn't happen on refresh when starting a new VM since no guest code executed yet)

On 2/2/21 2:32 PM, Peter Krempa wrote:
On Fri, Jan 22, 2021 at 13:50:30 +0100, Michal Privoznik wrote:
Just like we are recalculating the amount of guest memory on BALLOON_CHANGE and on reconnect to the monitor, we should include the actual size of virtio-mem too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_driver.c | 3 +++ src/qemu/qemu_process.c | 57 +++++++++++++++++++++++++++++++++-------- 2 files changed, 50 insertions(+), 10 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d64eb4d399..2fd4429ba8 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -4298,6 +4298,9 @@ processMemoryDeviceSizeChange(virQEMUDriverPtr driver, mem = vm->def->mems[idx]; mem->actualsize = VIR_DIV_UP(info->size, 1024);
+ /* fix the balloon size */ + ignore_value(qemuProcessRefreshBalloonState(driver, vm, QEMU_ASYNC_JOB_NONE));
During VM lifetime, qemu is actively notifying us about balloon changes via an event, so explicit refresh is wrong here.
You probably want to do the calculation here depending on the old and new value instead.
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 8d41f947af..01d261d538 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -1250,10 +1250,31 @@ qemuProcessHandleBalloonChange(qemuMonitorPtr mon G_GNUC_UNUSED, virQEMUDriverPtr driver = opaque; virObjectEventPtr event = NULL; g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + size_t i;
virObjectLock(vm); event = virDomainEventBalloonChangeNewFromObj(vm, actual);
+ VIR_DEBUG("New balloon size before fixup: %lld", actual); + + for (i = 0; i < vm->def->nmems; i++) { + virDomainMemoryDefPtr mem = vm->def->mems[i]; + + switch (mem->model) { + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + actual += mem->actualsize; + break;
So this means that the balloon driver never counts the memory provided by any virtio-mem device? If that is so then this is okay, but in that case it would be worth explaining it here.
No, balloon does not report virtio-mem memory.
If it's not the case it's wrong to add the size of the additional devices here since you'd count them twice.
[...]
@@ -2451,21 +2472,37 @@ qemuProcessRefreshBalloonState(virQEMUDriverPtr driver, int asyncJob) { unsigned long long balloon; + size_t i; int rc;
- /* if no ballooning is available, the current size equals to the current - * full memory size */ - if (!virDomainDefHasMemballoon(vm->def)) { - vm->def->mem.cur_balloon = virDomainDefGetMemoryTotal(vm->def); - return 0; + if (virDomainDefHasMemballoon(vm->def)) { + if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) + return -1; + + rc = qemuMonitorGetBalloonInfo(qemuDomainGetMonitor(vm), &balloon); + if (qemuDomainObjExitMonitor(driver, vm) < 0 || rc < 0) + return -1; + } else { + balloon = virDomainDefGetMemoryTotal(vm->def);
So the following code is wrong at least in case when this branch is taken:
virDomainDefGetMemoryTotal:
* Returns the current maximum memory size usable by the domain described by * @def. This size includes possible additional memory devices.
The total_memory size is updated in virDomainDefPostParseMemory by adding up sizes of all 'mem' devices ...
}
- if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) - return -1; + for (i = 0; i < vm->def->nmems; i++) { + virDomainMemoryDefPtr mem = vm->def->mems[i]; + + switch (mem->model) { + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: + balloon += mem->actualsize; + break;
... and you then add the 'actualsize' component which is a fraction of the 'size' of a mem counted previously here.
Thus when reconnecting to a VM which doesn't have ballon, but uses virtio-mem, the virtio-mem is counted twice. (This doesn't happen on refresh when starting a new VM since no guest code executed yet)
I've spent some time thinking about this patch. Do we really need it? I mean, if we did fake balloon size how do users learn the actual size of the balloon? On one hand I understand that <currentMemory/> (= def->mem.cur_balloon) should reflect the actual size that guest sees (balloon + actual size of virtio mem), but on the other - as Jing showed there might be quite a few places where we might report just balloon size (virsh dommemstat). Somehow it suddenly feels wrong to mangle with reported balloon size. Michal

New 'update-memory' command is introduced which aims on making it user friendly to change <memory/> device. So far I just need to change <requested/> so I'm introducing --requested-size only; but the idea is that this is extensible for other cases too. For instance, want to change <myElement/>? Nnew --my-element argument can be easily introduced. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/manpages/virsh.rst | 31 ++++++++ tools/virsh-domain.c | 154 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 185 insertions(+) diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst index e3afa48f7b..32639e34ff 100644 --- a/docs/manpages/virsh.rst +++ b/docs/manpages/virsh.rst @@ -4891,6 +4891,37 @@ results as some fields may be autogenerated and thus match devices other than expected. +update-memory +------------- + +**Syntax:** + +:: + + update-memory domain [--print-xml] [--alias alias] + [[--live] [--config] | [--current]] + [--requested-size size] + +Update values for a ``<memory/>`` device. Not to be confused with overall +domain memory which is tuned via ``setmem`` and ``setmaxmem``. +This command finds ``<memory/>`` device inside given *domain*, changes +requested values and passes updated device XML to daemon. If *--print-xml* is +specified then the device is not changed, but the updated device XML is printed +to stdout. If there are more than one ``<memory/>`` devices in *domain* use +*--alias* to select the desired one. + +If *--live* is specified, affect a running domain. +If *--config* is specified, affect the next startup of a persistent guest. +If *--current* is specified, it is equivalent to either *--live* or +*--config*, depending on the current state of the guest. +Both *--live* and *--config* flags may be given, but *--current* is +exclusive. Not specifying any flag is the same as specifying *--current*. + +If *--requested-size* is specified then ``<requested/>`` under memory target is +changed to requested *size* (as scaled integer, see ``NOTES`` above). It +defaults to kibibytes if no suffix is provided. + + change-media ------------ diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 9746117bdb..0b32e6f408 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -9128,6 +9128,154 @@ cmdSetmaxmem(vshControl *ctl, const vshCmd *cmd) return ret; } + +/* + * "update-memory" command + */ +static const vshCmdInfo info_update_memory[] = { + {.name = "help", + .data = N_("update memory device of a domain") + }, + {.name = "desc", + .data = N_("Update values of a memory device of a domain") + }, + {.name = NULL} +}; + +static const vshCmdOptDef opts_update_memory[] = { + VIRSH_COMMON_OPT_DOMAIN_FULL(0), + VIRSH_COMMON_OPT_DOMAIN_CONFIG, + VIRSH_COMMON_OPT_DOMAIN_LIVE, + VIRSH_COMMON_OPT_DOMAIN_CURRENT, + {.name = "print-xml", + .type = VSH_OT_BOOL, + .help = N_("print updated memory device XML instead of executing the change") + }, + {.name = "alias", + .type = VSH_OT_STRING, + .completer = virshDomainDeviceAliasCompleter, + .help = N_("memory device alias"), + }, + {.name = "requested-size", + .type = VSH_OT_INT, + .help = N_("new value of <requested/> size, as scaled integer (default KiB)") + }, + {.name = NULL} +}; + +static int +virshGetUpdatedMemoryXML(char **updatedMemoryXML, + vshControl *ctl, + const vshCmd *cmd, + virDomainPtr dom, + unsigned int flags) +{ + const char *alias = NULL; + g_autoptr(xmlDoc) doc = NULL; + g_autoptr(xmlXPathContext) ctxt = NULL; + g_autofree char *xpath = NULL; + int nmems; + g_autofree xmlNodePtr *mems = NULL; + g_autoptr(xmlBuffer) xmlbuf = NULL; + unsigned int domainXMLFlags = 0; + + if (flags & VIR_DOMAIN_AFFECT_CONFIG) + domainXMLFlags |= VIR_DOMAIN_XML_INACTIVE; + + if (virshDomainGetXMLFromDom(ctl, dom, domainXMLFlags, &doc, &ctxt) < 0) + return -1; + + if (vshCommandOptStringReq(ctl, cmd, "alias", &alias) < 0) + return -1; + + if (alias) { + xpath = g_strdup_printf("/domain/devices/memory[./alias/@name='%s']", alias); + } else { + xpath = g_strdup("/domain/devices/memory"); + } + + nmems = virXPathNodeSet(xpath, ctxt, &mems); + if (nmems < 0) { + vshSaveLibvirtError(); + return -1; + } else if (nmems == 0) { + vshError(ctl, _("no memory device found")); + return -1; + } else if (nmems > 1) { + vshError(ctl, _("multiple memory devices found, use --alias to select one")); + return -1; + } + + ctxt->node = mems[0]; + + if (vshCommandOptBool(cmd, "requested-size")) { + xmlNodePtr requestedSizeNode; + g_autofree char *kibibytesStr = NULL; + unsigned long long bytes = 0; + unsigned long kibibytes = 0; + + if (vshCommandOptScaledInt(ctl, cmd, "requested-size", &bytes, 1024, ULLONG_MAX) < 0) + return -1; + kibibytes = VIR_DIV_UP(bytes, 1024); + + requestedSizeNode = virXPathNode("./target/requested", ctxt); + + if (!requestedSizeNode) { + vshError(ctl, _("virtio-mem device is missing <requested/>")); + return -1; + } + + kibibytesStr = g_strdup_printf("%lu", kibibytes); + xmlNodeSetContent(requestedSizeNode, BAD_CAST kibibytesStr); + } + + if (!(*updatedMemoryXML = virXMLNodeToString(doc, mems[0]))) { + vshSaveLibvirtError(); + return -1; + } + + return 0; +} + +static bool +cmdUpdateMemory(vshControl *ctl, const vshCmd *cmd) +{ + virDomainPtr dom; + bool ret = false; + bool config = vshCommandOptBool(cmd, "config"); + bool live = vshCommandOptBool(cmd, "live"); + bool current = vshCommandOptBool(cmd, "current"); + g_autofree char *updatedMemoryXML = NULL; + unsigned int flags = VIR_DOMAIN_AFFECT_CURRENT; + + VSH_EXCLUSIVE_OPTIONS_VAR(current, live); + VSH_EXCLUSIVE_OPTIONS_VAR(current, config); + + if (config) + flags |= VIR_DOMAIN_AFFECT_CONFIG; + if (live) + flags |= VIR_DOMAIN_AFFECT_LIVE; + + if (!(dom = virshCommandOptDomain(ctl, cmd, NULL))) + return false; + + if (virshGetUpdatedMemoryXML(&updatedMemoryXML, ctl, cmd, dom, flags) < 0) + goto cleanup; + + if (vshCommandOptBool(cmd, "print-xml")) { + vshPrint(ctl, "%s", updatedMemoryXML); + } else { + if (virDomainUpdateDeviceFlags(dom, updatedMemoryXML, flags) < 0) + goto cleanup; + } + + ret = true; + cleanup: + virshDomainFree(dom); + return ret; +} + + /* * "memtune" command */ @@ -14995,6 +15143,12 @@ const vshCmdDef domManagementCmds[] = { .info = info_update_device, .flags = 0 }, + {.name = "update-memory", + .handler = cmdUpdateMemory, + .opts = opts_update_memory, + .info = info_update_memory, + .flags = 0 + }, {.name = "vcpucount", .handler = cmdVcpucount, .opts = opts_vcpucount, -- 2.26.2

On 1/22/21 9:50 AM, Michal Privoznik wrote:
New 'update-memory' command is introduced which aims on making it user friendly to change <memory/> device. So far I just need to change <requested/> so I'm introducing --requested-size only; but the idea is that this is extensible for other cases too. For instance, want to change <myElement/>? Nnew --my-element argument can be easily introduced.
I think you meant: "... want to change <myElement/>? A new --my-element argument ..." Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/manpages/virsh.rst | 31 ++++++++ tools/virsh-domain.c | 154 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 185 insertions(+)
diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst index e3afa48f7b..32639e34ff 100644 --- a/docs/manpages/virsh.rst +++ b/docs/manpages/virsh.rst @@ -4891,6 +4891,37 @@ results as some fields may be autogenerated and thus match devices other than expected.
+update-memory +------------- + +**Syntax:** + +:: + + update-memory domain [--print-xml] [--alias alias] + [[--live] [--config] | [--current]] + [--requested-size size] + +Update values for a ``<memory/>`` device. Not to be confused with overall +domain memory which is tuned via ``setmem`` and ``setmaxmem``. +This command finds ``<memory/>`` device inside given *domain*, changes +requested values and passes updated device XML to daemon. If *--print-xml* is +specified then the device is not changed, but the updated device XML is printed +to stdout. If there are more than one ``<memory/>`` devices in *domain* use +*--alias* to select the desired one. + +If *--live* is specified, affect a running domain. +If *--config* is specified, affect the next startup of a persistent guest. +If *--current* is specified, it is equivalent to either *--live* or +*--config*, depending on the current state of the guest. +Both *--live* and *--config* flags may be given, but *--current* is +exclusive. Not specifying any flag is the same as specifying *--current*. + +If *--requested-size* is specified then ``<requested/>`` under memory target is +changed to requested *size* (as scaled integer, see ``NOTES`` above). It +defaults to kibibytes if no suffix is provided. + + change-media ------------
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 9746117bdb..0b32e6f408 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -9128,6 +9128,154 @@ cmdSetmaxmem(vshControl *ctl, const vshCmd *cmd) return ret; }
+ +/* + * "update-memory" command + */ +static const vshCmdInfo info_update_memory[] = { + {.name = "help", + .data = N_("update memory device of a domain") + }, + {.name = "desc", + .data = N_("Update values of a memory device of a domain") + }, + {.name = NULL} +}; + +static const vshCmdOptDef opts_update_memory[] = { + VIRSH_COMMON_OPT_DOMAIN_FULL(0), + VIRSH_COMMON_OPT_DOMAIN_CONFIG, + VIRSH_COMMON_OPT_DOMAIN_LIVE, + VIRSH_COMMON_OPT_DOMAIN_CURRENT, + {.name = "print-xml", + .type = VSH_OT_BOOL, + .help = N_("print updated memory device XML instead of executing the change") + }, + {.name = "alias", + .type = VSH_OT_STRING, + .completer = virshDomainDeviceAliasCompleter, + .help = N_("memory device alias"), + }, + {.name = "requested-size", + .type = VSH_OT_INT, + .help = N_("new value of <requested/> size, as scaled integer (default KiB)") + }, + {.name = NULL} +}; + +static int +virshGetUpdatedMemoryXML(char **updatedMemoryXML, + vshControl *ctl, + const vshCmd *cmd, + virDomainPtr dom, + unsigned int flags) +{ + const char *alias = NULL; + g_autoptr(xmlDoc) doc = NULL; + g_autoptr(xmlXPathContext) ctxt = NULL; + g_autofree char *xpath = NULL; + int nmems; + g_autofree xmlNodePtr *mems = NULL; + g_autoptr(xmlBuffer) xmlbuf = NULL; + unsigned int domainXMLFlags = 0; + + if (flags & VIR_DOMAIN_AFFECT_CONFIG) + domainXMLFlags |= VIR_DOMAIN_XML_INACTIVE; + + if (virshDomainGetXMLFromDom(ctl, dom, domainXMLFlags, &doc, &ctxt) < 0) + return -1; + + if (vshCommandOptStringReq(ctl, cmd, "alias", &alias) < 0) + return -1; + + if (alias) { + xpath = g_strdup_printf("/domain/devices/memory[./alias/@name='%s']", alias); + } else { + xpath = g_strdup("/domain/devices/memory"); + } + + nmems = virXPathNodeSet(xpath, ctxt, &mems); + if (nmems < 0) { + vshSaveLibvirtError(); + return -1; + } else if (nmems == 0) { + vshError(ctl, _("no memory device found")); + return -1; + } else if (nmems > 1) { + vshError(ctl, _("multiple memory devices found, use --alias to select one")); + return -1; + } + + ctxt->node = mems[0]; + + if (vshCommandOptBool(cmd, "requested-size")) { + xmlNodePtr requestedSizeNode; + g_autofree char *kibibytesStr = NULL; + unsigned long long bytes = 0; + unsigned long kibibytes = 0; + + if (vshCommandOptScaledInt(ctl, cmd, "requested-size", &bytes, 1024, ULLONG_MAX) < 0) + return -1; + kibibytes = VIR_DIV_UP(bytes, 1024); + + requestedSizeNode = virXPathNode("./target/requested", ctxt); + + if (!requestedSizeNode) { + vshError(ctl, _("virtio-mem device is missing <requested/>")); + return -1; + } + + kibibytesStr = g_strdup_printf("%lu", kibibytes); + xmlNodeSetContent(requestedSizeNode, BAD_CAST kibibytesStr); + } + + if (!(*updatedMemoryXML = virXMLNodeToString(doc, mems[0]))) { + vshSaveLibvirtError(); + return -1; + } + + return 0; +} + +static bool +cmdUpdateMemory(vshControl *ctl, const vshCmd *cmd) +{ + virDomainPtr dom; + bool ret = false; + bool config = vshCommandOptBool(cmd, "config"); + bool live = vshCommandOptBool(cmd, "live"); + bool current = vshCommandOptBool(cmd, "current"); + g_autofree char *updatedMemoryXML = NULL; + unsigned int flags = VIR_DOMAIN_AFFECT_CURRENT; + + VSH_EXCLUSIVE_OPTIONS_VAR(current, live); + VSH_EXCLUSIVE_OPTIONS_VAR(current, config); + + if (config) + flags |= VIR_DOMAIN_AFFECT_CONFIG; + if (live) + flags |= VIR_DOMAIN_AFFECT_LIVE; + + if (!(dom = virshCommandOptDomain(ctl, cmd, NULL))) + return false; + + if (virshGetUpdatedMemoryXML(&updatedMemoryXML, ctl, cmd, dom, flags) < 0) + goto cleanup; + + if (vshCommandOptBool(cmd, "print-xml")) { + vshPrint(ctl, "%s", updatedMemoryXML); + } else { + if (virDomainUpdateDeviceFlags(dom, updatedMemoryXML, flags) < 0) + goto cleanup; + } + + ret = true; + cleanup: + virshDomainFree(dom); + return ret; +} + + /* * "memtune" command */ @@ -14995,6 +15143,12 @@ const vshCmdDef domManagementCmds[] = { .info = info_update_device, .flags = 0 }, + {.name = "update-memory", + .handler = cmdUpdateMemory, + .opts = opts_update_memory, + .info = info_update_memory, + .flags = 0 + }, {.name = "vcpucount", .handler = cmdVcpucount, .opts = opts_vcpucount,

On Fri, Jan 22, 2021 at 13:50:31 +0100, Michal Privoznik wrote:
New 'update-memory' command is introduced which aims on making it user friendly to change <memory/> device. So far I just need to change <requested/> so I'm introducing --requested-size only; but the idea is that this is extensible for other cases too. For instance, want to change <myElement/>? Nnew --my-element argument can be easily introduced.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/manpages/virsh.rst | 31 ++++++++ tools/virsh-domain.c | 154 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 185 insertions(+)
diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst index e3afa48f7b..32639e34ff 100644 --- a/docs/manpages/virsh.rst +++ b/docs/manpages/virsh.rst @@ -4891,6 +4891,37 @@ results as some fields may be autogenerated and thus match devices other than expected.
+update-memory
update-memory-device-perhaps?
+------------- + +**Syntax:** + +:: + + update-memory domain [--print-xml] [--alias alias] + [[--live] [--config] | [--current]] + [--requested-size size] + +Update values for a ``<memory/>`` device. Not to be confused with overall +domain memory which is tuned via ``setmem`` and ``setmaxmem``.
So that you don't have to add this disclaimer?
+This command finds ``<memory/>`` device inside given *domain*, changes +requested values and passes updated device XML to daemon. If *--print-xml* is +specified then the device is not changed, but the updated device XML is printed +to stdout. If there are more than one ``<memory/>`` devices in *domain* use +*--alias* to select the desired one. + +If *--live* is specified, affect a running domain. +If *--config* is specified, affect the next startup of a persistent guest. +If *--current* is specified, it is equivalent to either *--live* or +*--config*, depending on the current state of the guest. +Both *--live* and *--config* flags may be given, but *--current* is +exclusive. Not specifying any flag is the same as specifying *--current*. + +If *--requested-size* is specified then ``<requested/>`` under memory target is +changed to requested *size* (as scaled integer, see ``NOTES`` above). It +defaults to kibibytes if no suffix is provided.
... this document doesn't mention that it works only for the property of virtio-mem. Users of virsh tend to not read other docs, so please add it.
+ + change-media ------------
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 9746117bdb..0b32e6f408 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -9128,6 +9128,154 @@ cmdSetmaxmem(vshControl *ctl, const vshCmd *cmd) return ret; }
+ +/* + * "update-memory" command + */ +static const vshCmdInfo info_update_memory[] = { + {.name = "help", + .data = N_("update memory device of a domain") + }, + {.name = "desc", + .data = N_("Update values of a memory device of a domain") + }, + {.name = NULL} +}; + +static const vshCmdOptDef opts_update_memory[] = { + VIRSH_COMMON_OPT_DOMAIN_FULL(0), + VIRSH_COMMON_OPT_DOMAIN_CONFIG, + VIRSH_COMMON_OPT_DOMAIN_LIVE, + VIRSH_COMMON_OPT_DOMAIN_CURRENT, + {.name = "print-xml", + .type = VSH_OT_BOOL, + .help = N_("print updated memory device XML instead of executing the change") + }, + {.name = "alias", + .type = VSH_OT_STRING, + .completer = virshDomainDeviceAliasCompleter,
This completes also non-memory devices.
+ .help = N_("memory device alias"), + }, + {.name = "requested-size", + .type = VSH_OT_INT, + .help = N_("new value of <requested/> size, as scaled integer (default KiB)") + }, + {.name = NULL} +}; + +static int +virshGetUpdatedMemoryXML(char **updatedMemoryXML, + vshControl *ctl, + const vshCmd *cmd, + virDomainPtr dom, + unsigned int flags) +{ + const char *alias = NULL; + g_autoptr(xmlDoc) doc = NULL; + g_autoptr(xmlXPathContext) ctxt = NULL; + g_autofree char *xpath = NULL; + int nmems; + g_autofree xmlNodePtr *mems = NULL; + g_autoptr(xmlBuffer) xmlbuf = NULL; + unsigned int domainXMLFlags = 0; + + if (flags & VIR_DOMAIN_AFFECT_CONFIG) + domainXMLFlags |= VIR_DOMAIN_XML_INACTIVE; + + if (virshDomainGetXMLFromDom(ctl, dom, domainXMLFlags, &doc, &ctxt) < 0) + return -1; + + if (vshCommandOptStringReq(ctl, cmd, "alias", &alias) < 0) + return -1; + + if (alias) { + xpath = g_strdup_printf("/domain/devices/memory[./alias/@name='%s']", alias); + } else { + xpath = g_strdup("/domain/devices/memory"); + } + + nmems = virXPathNodeSet(xpath, ctxt, &mems); + if (nmems < 0) { + vshSaveLibvirtError(); + return -1; + } else if (nmems == 0) { + vshError(ctl, _("no memory device found")); + return -1; + } else if (nmems > 1) { + vshError(ctl, _("multiple memory devices found, use --alias to select one"));
So if you don't have useraliases, you can't use this for inactive XML with 2 virtio mem?
+ return -1; + } + + ctxt->node = mems[0]; + + if (vshCommandOptBool(cmd, "requested-size")) { + xmlNodePtr requestedSizeNode; + g_autofree char *kibibytesStr = NULL; + unsigned long long bytes = 0; + unsigned long kibibytes = 0; + + if (vshCommandOptScaledInt(ctl, cmd, "requested-size", &bytes, 1024, ULLONG_MAX) < 0) + return -1; + kibibytes = VIR_DIV_UP(bytes, 1024); + + requestedSizeNode = virXPathNode("./target/requested", ctxt); + + if (!requestedSizeNode) { + vshError(ctl, _("virtio-mem device is missing <requested/>"));
Here you mention virtio-mem.
+ return -1; + } + + kibibytesStr = g_strdup_printf("%lu", kibibytes); + xmlNodeSetContent(requestedSizeNode, BAD_CAST kibibytesStr); + } + + if (!(*updatedMemoryXML = virXMLNodeToString(doc, mems[0]))) { + vshSaveLibvirtError(); + return -1; + } + + return 0; +} + +static bool +cmdUpdateMemory(vshControl *ctl, const vshCmd *cmd) +{ + virDomainPtr dom; + bool ret = false; + bool config = vshCommandOptBool(cmd, "config"); + bool live = vshCommandOptBool(cmd, "live"); + bool current = vshCommandOptBool(cmd, "current"); + g_autofree char *updatedMemoryXML = NULL; + unsigned int flags = VIR_DOMAIN_AFFECT_CURRENT; + + VSH_EXCLUSIVE_OPTIONS_VAR(current, live); + VSH_EXCLUSIVE_OPTIONS_VAR(current, config); + + if (config) + flags |= VIR_DOMAIN_AFFECT_CONFIG; + if (live) + flags |= VIR_DOMAIN_AFFECT_LIVE; + + if (!(dom = virshCommandOptDomain(ctl, cmd, NULL))) + return false; + + if (virshGetUpdatedMemoryXML(&updatedMemoryXML, ctl, cmd, dom, flags) < 0) + goto cleanup; + + if (vshCommandOptBool(cmd, "print-xml")) { + vshPrint(ctl, "%s", updatedMemoryXML); + } else { + if (virDomainUpdateDeviceFlags(dom, updatedMemoryXML, flags) < 0) + goto cleanup; + } + + ret = true; + cleanup: + virshDomainFree(dom); + return ret;
You can use g_autoptr(virshDomain) dom = NULL; on top instead of the cleanup section.

On 2/2/21 2:41 PM, Peter Krempa wrote:
On Fri, Jan 22, 2021 at 13:50:31 +0100, Michal Privoznik wrote:
New 'update-memory' command is introduced which aims on making it user friendly to change <memory/> device. So far I just need to change <requested/> so I'm introducing --requested-size only; but the idea is that this is extensible for other cases too. For instance, want to change <myElement/>? Nnew --my-element argument can be easily introduced.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- docs/manpages/virsh.rst | 31 ++++++++ tools/virsh-domain.c | 154 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 185 insertions(+)
diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst index e3afa48f7b..32639e34ff 100644 --- a/docs/manpages/virsh.rst +++ b/docs/manpages/virsh.rst @@ -4891,6 +4891,37 @@ results as some fields may be autogenerated and thus match devices other than expected.
+update-memory
update-memory-device-perhaps?
Okay.
+------------- + +**Syntax:** + +:: + + update-memory domain [--print-xml] [--alias alias] + [[--live] [--config] | [--current]] + [--requested-size size] + +Update values for a ``<memory/>`` device. Not to be confused with overall +domain memory which is tuned via ``setmem`` and ``setmaxmem``.
So that you don't have to add this disclaimer?
+This command finds ``<memory/>`` device inside given *domain*, changes +requested values and passes updated device XML to daemon. If *--print-xml* is +specified then the device is not changed, but the updated device XML is printed +to stdout. If there are more than one ``<memory/>`` devices in *domain* use +*--alias* to select the desired one. + +If *--live* is specified, affect a running domain. +If *--config* is specified, affect the next startup of a persistent guest. +If *--current* is specified, it is equivalent to either *--live* or +*--config*, depending on the current state of the guest. +Both *--live* and *--config* flags may be given, but *--current* is +exclusive. Not specifying any flag is the same as specifying *--current*. + +If *--requested-size* is specified then ``<requested/>`` under memory target is +changed to requested *size* (as scaled integer, see ``NOTES`` above). It +defaults to kibibytes if no suffix is provided.
... this document doesn't mention that it works only for the property of virtio-mem. Users of virsh tend to not read other docs, so please add it.
I can do that, but virsh errors out if it didn't find any <requested/>, like this: virsh # update-memory gentoo --alias virtiopmem0 --print-xml --requested-size 5 error: virtio-mem device is missing <requested/> Okay, the error message is misleading a bit (will fix it), because I was trying to modify virtio-pmem. How badly do we want to protect users from themselves?
+ + change-media ------------
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 9746117bdb..0b32e6f408 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -9128,6 +9128,154 @@ cmdSetmaxmem(vshControl *ctl, const vshCmd *cmd) return ret; }
+ +/* + * "update-memory" command + */ +static const vshCmdInfo info_update_memory[] = { + {.name = "help", + .data = N_("update memory device of a domain") + }, + {.name = "desc", + .data = N_("Update values of a memory device of a domain") + }, + {.name = NULL} +}; + +static const vshCmdOptDef opts_update_memory[] = { + VIRSH_COMMON_OPT_DOMAIN_FULL(0), + VIRSH_COMMON_OPT_DOMAIN_CONFIG, + VIRSH_COMMON_OPT_DOMAIN_LIVE, + VIRSH_COMMON_OPT_DOMAIN_CURRENT, + {.name = "print-xml", + .type = VSH_OT_BOOL, + .help = N_("print updated memory device XML instead of executing the change") + }, + {.name = "alias", + .type = VSH_OT_STRING, + .completer = virshDomainDeviceAliasCompleter,
This completes also non-memory devices.
Yes, and I am okay with that. Alias is optional and needed only if two or more <memory/> devices exist inside domain (regardless of their model). We already allow completion of mutually exclusive --options (because the exclusivity is not expressed in --options definition). We could have .completer_flags as an ORed set of virDomainDeviceType, except not really because that's an enum with continuous values and not a bitmask. I could write a new completer, but if we want to do that for every device (eventually) we would need ~20 different completers. Waste of time. Again, I don't think we should guard users from shooting themselves into foot.
+ .help = N_("memory device alias"), + }, + {.name = "requested-size", + .type = VSH_OT_INT, + .help = N_("new value of <requested/> size, as scaled integer (default KiB)") + }, + {.name = NULL} +}; + +static int +virshGetUpdatedMemoryXML(char **updatedMemoryXML, + vshControl *ctl, + const vshCmd *cmd, + virDomainPtr dom, + unsigned int flags) +{ + const char *alias = NULL; + g_autoptr(xmlDoc) doc = NULL; + g_autoptr(xmlXPathContext) ctxt = NULL; + g_autofree char *xpath = NULL; + int nmems; + g_autofree xmlNodePtr *mems = NULL; + g_autoptr(xmlBuffer) xmlbuf = NULL; + unsigned int domainXMLFlags = 0; + + if (flags & VIR_DOMAIN_AFFECT_CONFIG) + domainXMLFlags |= VIR_DOMAIN_XML_INACTIVE; + + if (virshDomainGetXMLFromDom(ctl, dom, domainXMLFlags, &doc, &ctxt) < 0) + return -1; + + if (vshCommandOptStringReq(ctl, cmd, "alias", &alias) < 0) + return -1; + + if (alias) { + xpath = g_strdup_printf("/domain/devices/memory[./alias/@name='%s']", alias); + } else { + xpath = g_strdup("/domain/devices/memory"); + } + + nmems = virXPathNodeSet(xpath, ctxt, &mems); + if (nmems < 0) { + vshSaveLibvirtError(); + return -1; + } else if (nmems == 0) { + vshError(ctl, _("no memory device found")); + return -1; + } else if (nmems > 1) { + vshError(ctl, _("multiple memory devices found, use --alias to select one"));
So if you don't have useraliases, you can't use this for inactive XML with 2 virtio mem?
Updating inactive XML is not even implemented yet, so can't use it even if you have only one virtio mem. If we will want that, then we need --address to match the device address because that is the only unique attribute. I'd leave it for future work.
+ return -1; + } + + ctxt->node = mems[0]; + + if (vshCommandOptBool(cmd, "requested-size")) { + xmlNodePtr requestedSizeNode; + g_autofree char *kibibytesStr = NULL; + unsigned long long bytes = 0; + unsigned long kibibytes = 0; + + if (vshCommandOptScaledInt(ctl, cmd, "requested-size", &bytes, 1024, ULLONG_MAX) < 0) + return -1; + kibibytes = VIR_DIV_UP(bytes, 1024); + + requestedSizeNode = virXPathNode("./target/requested", ctxt); + + if (!requestedSizeNode) { + vshError(ctl, _("virtio-mem device is missing <requested/>"));
Here you mention virtio-mem.
+ return -1; + } + + kibibytesStr = g_strdup_printf("%lu", kibibytes); + xmlNodeSetContent(requestedSizeNode, BAD_CAST kibibytesStr); + } + + if (!(*updatedMemoryXML = virXMLNodeToString(doc, mems[0]))) { + vshSaveLibvirtError(); + return -1; + } + + return 0; +} + +static bool +cmdUpdateMemory(vshControl *ctl, const vshCmd *cmd) +{ + virDomainPtr dom; + bool ret = false; + bool config = vshCommandOptBool(cmd, "config"); + bool live = vshCommandOptBool(cmd, "live"); + bool current = vshCommandOptBool(cmd, "current"); + g_autofree char *updatedMemoryXML = NULL; + unsigned int flags = VIR_DOMAIN_AFFECT_CURRENT; + + VSH_EXCLUSIVE_OPTIONS_VAR(current, live); + VSH_EXCLUSIVE_OPTIONS_VAR(current, config); + + if (config) + flags |= VIR_DOMAIN_AFFECT_CONFIG; + if (live) + flags |= VIR_DOMAIN_AFFECT_LIVE; + + if (!(dom = virshCommandOptDomain(ctl, cmd, NULL))) + return false; + + if (virshGetUpdatedMemoryXML(&updatedMemoryXML, ctl, cmd, dom, flags) < 0) + goto cleanup; + + if (vshCommandOptBool(cmd, "print-xml")) { + vshPrint(ctl, "%s", updatedMemoryXML); + } else { + if (virDomainUpdateDeviceFlags(dom, updatedMemoryXML, flags) < 0) + goto cleanup; + } + + ret = true; + cleanup: + virshDomainFree(dom); + return ret;
You can use g_autoptr(virshDomain) dom = NULL; on top instead of the cleanup section.
Yup. Michal

Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- NEWS.rst | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/NEWS.rst b/NEWS.rst index 7a2d6649b4..33f316c8d4 100644 --- a/NEWS.rst +++ b/NEWS.rst @@ -18,6 +18,13 @@ v7.1.0 (unreleased) The virtio-pmem is a virtio variant of NVDIMM and just like NVDIMM virtio-pmem also allows accessing host pages bypassing guest page cache. + * Introduce virtio-mem ``<memory/>`` model + + New virtio-mem model is introduced for ``<memory/>`` device which is a + paravirtualized mechanism of adding/removing memory to/from a VM. Use + ``virDomainUpdateDeviceFlags()`` API to adjust amount of memory or ``virsh + update-memory`` for convenience. + * **Improvements** * **Bug fixes** -- 2.26.2

On 22.01.21 13:50, Michal Privoznik wrote:
Technically, this is another version of:
https://www.redhat.com/archives/libvir-list/2020-December/msg00199.html
But since virtio-pmem part is pushed now, I've reworked virtio-mem a bit and sending it as a new series.
For curious ones, David summarized behaviour well when implementing virtio-mem support in kernel:
... and for the really curious ones (because some details in that patch are a little outdated), there is plenty more at: https://virtio-mem.gitlab.io/ Thanks for all your effort Michal! -- Thanks, David / dhildenb

On 1/22/21 9:50 AM, Michal Privoznik wrote:
Technically, this is another version of:
https://www.redhat.com/archives/libvir-list/2020-December/msg00199.html
But since virtio-pmem part is pushed now, I've reworked virtio-mem a bit and sending it as a new series.
For curious ones, David summarized behaviour well when implementing virtio-mem support in kernel:
https://lwn.net/Articles/755423/
For less curious ones:
# virsh update-memory $dom --requested-size 4G
adds additional 4GiB of RAM to guest;
# virsh update-memory $dom --requested-size 0
removes those 4GiB added earlier.
Patches are also available on my GitLab:
https://gitlab.com/MichalPrivoznik/libvirt/-/tree/virtio_mem_v3
Code LGTM: Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> A few questions about the overall design: - it is mentioned that 'requested-size' should respect the granularity of the block unit, but later on the 'actual' attribute is added to track the size that the device was expanded/shrunk. What happens if we forfeit the granularity check of the memory increments? Will QEMU error out because we're requesting an invalid value or it will silently size the device to a plausible size? - Reading the lwn article I understood that David implemented this support for s390x as well. If that's the case, then I believe you should double check later on what's the THP size that Z uses to be sure that it's the same 2MiB value you're considering in patch 03. - in patch 03 it is mentioned that: "If it wants to give more memory to the guest it changes 'requested-size' to a bigger value, and if it wants to shrink guest memory it changes the 'requested-size' to a smaller value. Note, value of zero means that guest should release all memory offered by the device." Does size zero implicates the virtio-mem device unplug? Will the device still exist in the guest even with zeroed memory, acting as a sort of 'deflated virtio-balloon'? Thanks, DHB
Michal Prívozník (10): virhostmem: Introduce virHostMemGetTHPSize() qemu_capabilities: Introduce QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI conf: Introduce virtio-mem <memory/> model qemu: Build command line for virtio-mem qemu: Wire up <memory/> live update qemu: Wire up MEMORY_DEVICE_SIZE_CHANGE event qemu: Refresh the actual size of virtio-mem on monitor reconnect qemu: Recalculate balloon on MEMORY_DEVICE_SIZE_CHANGE event and reconnect virsh: Introduce update-memory command news: document recent virtio memory addition
NEWS.rst | 7 + docs/formatdomain.rst | 42 +++- docs/manpages/virsh.rst | 31 +++ docs/schemas/domaincommon.rng | 16 ++ src/conf/domain_conf.c | 100 ++++++++- src/conf/domain_conf.h | 13 ++ src/conf/domain_validate.c | 39 ++++ src/libvirt_private.syms | 3 + src/qemu/qemu_alias.c | 10 +- src/qemu/qemu_capabilities.c | 2 + src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 13 +- src/qemu/qemu_domain.c | 50 ++++- src/qemu/qemu_domain.h | 1 + src/qemu/qemu_domain_address.c | 37 ++- src/qemu/qemu_driver.c | 211 +++++++++++++++++- src/qemu/qemu_hotplug.c | 18 ++ src/qemu/qemu_hotplug.h | 5 + src/qemu/qemu_monitor.c | 37 +++ src/qemu/qemu_monitor.h | 27 +++ src/qemu/qemu_monitor_json.c | 94 ++++++-- src/qemu/qemu_monitor_json.h | 5 + src/qemu/qemu_process.c | 101 ++++++++- src/qemu/qemu_validate.c | 8 + src/security/security_apparmor.c | 1 + src/security/security_dac.c | 2 + src/security/security_selinux.c | 2 + src/util/virhostmem.c | 63 ++++++ src/util/virhostmem.h | 3 + tests/domaincapsmock.c | 9 + .../caps_5.1.0.x86_64.xml | 1 + .../caps_5.2.0.x86_64.xml | 1 + ...mory-hotplug-virtio-mem.x86_64-latest.args | 49 ++++ .../memory-hotplug-virtio-mem.xml | 66 ++++++ tests/qemuxml2argvtest.c | 1 + ...emory-hotplug-virtio-mem.x86_64-latest.xml | 1 + tests/qemuxml2xmltest.c | 1 + tools/virsh-domain.c | 154 +++++++++++++ 38 files changed, 1165 insertions(+), 60 deletions(-) create mode 100644 tests/qemuxml2argvdata/memory-hotplug-virtio-mem.x86_64-latest.args create mode 100644 tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml create mode 120000 tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml

Am 22.01.2021 um 19:53 schrieb Daniel Henrique Barboza <danielhb413@gmail.com>:
On 1/22/21 9:50 AM, Michal Privoznik wrote: Technically, this is another version of: https://www.redhat.com/archives/libvir-list/2020-December/msg00199.html But since virtio-pmem part is pushed now, I've reworked virtio-mem a bit and sending it as a new series. For curious ones, David summarized behaviour well when implementing virtio-mem support in kernel: https://lwn.net/Articles/755423/ For less curious ones: # virsh update-memory $dom --requested-size 4G adds additional 4GiB of RAM to guest; # virsh update-memory $dom --requested-size 0 removes those 4GiB added earlier. Patches are also available on my GitLab: https://gitlab.com/MichalPrivoznik/libvirt/-/tree/virtio_mem_v3
Code LGTM:
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Hi, Let me answer your questions.
A few questions about the overall design:
- it is mentioned that 'requested-size' should respect the granularity of the block unit, but later on the 'actual' attribute is added to track the size that the device was expanded/shrunk. What happens if we forfeit the granularity check of the memory increments? Will QEMU error out because we're requesting an invalid value or it will silently size the device to a plausible size?
QEMU will error out, stating that the request-size has to be properly aligned to the block-size.
- Reading the lwn article I understood that David implemented this support for s390x as well. If that's the case, then I believe you should double check later on what's the THP size that Z uses to be sure that it's the same 2MiB value you're considering in patch 03.
In the near future we might see arm64 and s390x support. The latter might probably take a bit longer. Both are not supported yet in QEMU/kernel. The QEMU code has an advanced block-size auto-detection code - e.g., querying from the kernel but limiting it to sane values (e.g., 512 MB on some arm64 configurations). Maybe we can borrow some of that or even sense the block size via QEMU? Borrowing might be easier. :) On x86-64 we are good to go with a 2MB default.
- in patch 03 it is mentioned that:
"If it wants to give more memory to the guest it changes 'requested-size' to a bigger value, and if it wants to shrink guest memory it changes the 'requested-size' to a smaller value. Note, value of zero means that guest should release all memory offered by the device."
Does size zero implicates the virtio-mem device unplug? Will the device still exist in the guest even with zeroed memory, acting as a sort of 'deflated virtio-balloon'?
Yes, the device will still exist, to be grown again later. Hotunplugging the device itself is not supported (yet, and also not in the near future). Thanks!

On 1/22/21 4:54 PM, David Hildenbrand wrote:
Am 22.01.2021 um 19:53 schrieb Daniel Henrique Barboza <danielhb413@gmail.com>:
On 1/22/21 9:50 AM, Michal Privoznik wrote: Technically, this is another version of: https://www.redhat.com/archives/libvir-list/2020-December/msg00199.html But since virtio-pmem part is pushed now, I've reworked virtio-mem a bit and sending it as a new series. For curious ones, David summarized behaviour well when implementing virtio-mem support in kernel: https://lwn.net/Articles/755423/ For less curious ones: # virsh update-memory $dom --requested-size 4G adds additional 4GiB of RAM to guest; # virsh update-memory $dom --requested-size 0 removes those 4GiB added earlier. Patches are also available on my GitLab: https://gitlab.com/MichalPrivoznik/libvirt/-/tree/virtio_mem_v3
Code LGTM:
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Hi,
Let me answer your questions.
Thanks for the reply!
A few questions about the overall design:
- it is mentioned that 'requested-size' should respect the granularity of the block unit, but later on the 'actual' attribute is added to track the size that the device was expanded/shrunk. What happens if we forfeit the granularity check of the memory increments? Will QEMU error out because we're requesting an invalid value or it will silently size the device to a plausible size?
QEMU will error out, stating that the request-size has to be properly aligned to the block-size.
'requested-size' granularity check stays then :)
- Reading the lwn article I understood that David implemented this support for s390x as well. If that's the case, then I believe you should double check later on what's the THP size that Z uses to be sure that it's the same 2MiB value you're considering in patch 03.
In the near future we might see arm64 and s390x support. The latter might probably take a bit longer. Both are not supported yet in QEMU/kernel.
Out of curiosity: are you aware of anyone working in enabling virtio-mem for pseries/ppc64? I'm wondering if there's some kind of architecture limitation in Power or if it's just a lack of interest.
The QEMU code has an advanced block-size auto-detection code - e.g., querying from the kernel but limiting it to sane values (e.g., 512 MB on some arm64 configurations). Maybe we can borrow some of that or even sense the block size via QEMU? Borrowing might be easier. :)
I guess it's a good candidate for a fancy QMP API.
On x86-64 we are good to go with a 2MB default.
- in patch 03 it is mentioned that:
"If it wants to give more memory to the guest it changes 'requested-size' to a bigger value, and if it wants to shrink guest memory it changes the 'requested-size' to a smaller value. Note, value of zero means that guest should release all memory offered by the device."
Does size zero implicates the virtio-mem device unplug? Will the device still exist in the guest even with zeroed memory, acting as a sort of 'deflated virtio-balloon'?
Yes, the device will still exist, to be grown again later. Hotunplugging the device itself is not supported (yet, and also not in the near future).
Assuming that virtio-mem has low overhead in the guest when it's 'deflated', I don't see any urgency into implementing hotunplug for this device TBH. Thanks, DHB
Thanks!

Out of curiosity: are you aware of anyone working in enabling virtio-mem for pseries/ppc64? I'm wondering if there's some kind of architecture limitation in Power or if it's just a lack of interest.
I remember there is interest, however: - arm64 and x86-64 is used more frequently in applicable (cloud?) setups, so it has high prio - s390x doesn‘t have any proper memory hot(un)plug, and as I have a strong s399x background, it‘s rather easy for me to implement - ppc64 at least supports hot(un)plug of DIMMs There is nothing fundamental speaking against ppc64 support AFAIR. A block size of 16MB should be possible. I‘m planning on looking into it, however, there are a lot of other things on my todo list for virtio-mem.
The QEMU code has an advanced block-size auto-detection code - e.g., querying from the kernel but limiting it to sane values (e.g., 512 MB on some arm64 configurations). Maybe we can borrow some of that or even sense the block size via QEMU? Borrowing might be easier. :)
I guess it's a good candidate for a fancy QMP API.
One can at least query the block-size via „qom-get“, but that requires to spin up an QEMU instance with a virtio-mem device.
On x86-64 we are good to go with a 2MB default.
- in patch 03 it is mentioned that:
"If it wants to give more memory to the guest it changes 'requested-size' to a bigger value, and if it wants to shrink guest memory it changes the 'requested-size' to a smaller value. Note, value of zero means that guest should release all memory offered by the device."
Does size zero implicates the virtio-mem device unplug? Will the device still exist in the guest even with zeroed memory, acting as a sort of 'deflated virtio-balloon'?
Yes, the device will still exist, to be grown again later. Hotunplugging the device itself is not supported (yet, and also not in the near future).
Assuming that virtio-mem has low overhead in the guest when it's 'deflated', I don't see any urgency into implementing hotunplug for this device TBH.
There are still things to be optimized in QEMU regarding virtual memory consumption, but that‘s more general work to be tackled within the next months. After that, not too much speaks against just letting the device stick around to provide more nemory later on demand. Thanks!

On 1/22/21 6:19 PM, David Hildenbrand wrote:
Out of curiosity: are you aware of anyone working in enabling virtio-mem for pseries/ppc64? I'm wondering if there's some kind of architecture limitation in Power or if it's just a lack of interest.
I remember there is interest, however:
- arm64 and x86-64 is used more frequently in applicable (cloud?) setups, so it has high prio - s390x doesn‘t have any proper memory hot(un)plug, and as I have a strong s399x background, it‘s rather easy for me to implement - ppc64 at least supports hot(un)plug of DIMMs
There is nothing fundamental speaking against ppc64 support AFAIR.
That's good to hear.
A block size of 16MB should be possible. I‘m planning on looking into it, however, there are a lot of other things on my todo list for virtio-mem.
I'm not familiar with the 'block size' concept of the virtio-mem device that would allow for 16MB increments. My knowledge of the pseries kernel/QEMU is that the guest visible memory must always be 256MiB aligned due to PAPR mechanics that forces a memory block to be at least this size. Albeit I believe that there is no constraints of the memory this device is providing being counted as non-hotplugable, then in this case the alignment shouldn't be needed. But I digress. Thanks for the insights. I'll ping some people inside IBM and see if we have a more immediate use case for virtio-mem in Power. Perhaps we can do some sort of collaboration with your work. Thanks, DHB
The QEMU code has an advanced block-size auto-detection code - e.g., querying from the kernel but limiting it to sane values (e.g., 512 MB on some arm64 configurations). Maybe we can borrow some of that or even sense the block size via QEMU? Borrowing might be easier. :)
I guess it's a good candidate for a fancy QMP API.
One can at least query the block-size via „qom-get“, but that requires to spin up an QEMU instance with a virtio-mem device.
On x86-64 we are good to go with a 2MB default.
- in patch 03 it is mentioned that:
"If it wants to give more memory to the guest it changes 'requested-size' to a bigger value, and if it wants to shrink guest memory it changes the 'requested-size' to a smaller value. Note, value of zero means that guest should release all memory offered by the device."
Does size zero implicates the virtio-mem device unplug? Will the device still exist in the guest even with zeroed memory, acting as a sort of 'deflated virtio-balloon'?
Yes, the device will still exist, to be grown again later. Hotunplugging the device itself is not supported (yet, and also not in the near future).
Assuming that virtio-mem has low overhead in the guest when it's 'deflated', I don't see any urgency into implementing hotunplug for this device TBH.
There are still things to be optimized in QEMU regarding virtual memory consumption, but that‘s more general work to be tackled within the next months. After that, not too much speaks against just letting the device stick around to provide more nemory later on demand.
Thanks!

On 22.01.21 23:03, Daniel Henrique Barboza wrote:
On 1/22/21 6:19 PM, David Hildenbrand wrote:
Out of curiosity: are you aware of anyone working in enabling virtio-mem for pseries/ppc64? I'm wondering if there's some kind of architecture limitation in Power or if it's just a lack of interest.
I remember there is interest, however:
- arm64 and x86-64 is used more frequently in applicable (cloud?) setups, so it has high prio - s390x doesn‘t have any proper memory hot(un)plug, and as I have a strong s399x background, it‘s rather easy for me to implement - ppc64 at least supports hot(un)plug of DIMMs
There is nothing fundamental speaking against ppc64 support AFAIR.
That's good to hear.
A block size of 16MB should be possible. I‘m planning on looking into it, however, there are a lot of other things on my todo list for virtio-mem.
I'm not familiar with the 'block size' concept of the virtio-mem device that would allow for 16MB increments. My knowledge of the pseries kernel/QEMU is that the guest visible memory must always be 256MiB aligned due to PAPR mechanics that forces a memory block to be at least this size. Albeit I believe that there is no constraints of the memory this device is providing being counted as non-hotplugable, then in this case the alignment shouldn't be needed.
In Linux guests, virtio-mem adds whole memory blocks (e.g., aligned 256MB), but is able to expose only parts of a memory block dynamically to Linux mm - essentially in 16MB on ppc64 IIRC. E.g., on x86-64 (and soon arm64), we mostly add 128MB memory blocks, but can operate on (currently) 4MB blocks (MAX_ORDER - 1) inside these blocks. A little like memory ballooning ... but also quite different :) So far the theory on ppc64. I have no prototype on ppc64, so we'll have to see what's actually possible.
But I digress. Thanks for the insights. I'll ping some people inside IBM and see if we have a more immediate use case for virtio-mem in Power. Perhaps we can do some sort of collaboration with your work.
Sure, I'll be happy to assist. -- Thanks, David / dhildenb

On Fri, Jan 22, 2021 at 13:50:21 +0100, Michal Privoznik wrote:
Technically, this is another version of:
https://www.redhat.com/archives/libvir-list/2020-December/msg00199.html
But since virtio-pmem part is pushed now, I've reworked virtio-mem a bit and sending it as a new series.
For curious ones, David summarized behaviour well when implementing virtio-mem support in kernel:
https://lwn.net/Articles/755423/
For less curious ones:
# virsh update-memory $dom --requested-size 4G
adds additional 4GiB of RAM to guest;
# virsh update-memory $dom --requested-size 0
removes those 4GiB added earlier.
Patches are also available on my GitLab:
https://gitlab.com/MichalPrivoznik/libvirt/-/tree/virtio_mem_v3
Patches 1-7,9-10 (but observe some individual comments, including the rename of the virsh commands): Reviewed-by: Peter Krempa <pkrempa@redhat.com> Patch 8 has severe semantic problems.

I did some test for virtio-mem with libvirt upstream version v7.0.0-153-g5ea3ecd07d & qemu-kvm-5.2.0-0.7.rc2.fc34.x86_64 S1. Start domain with memory device 1. Domain configuration- <maxMemory slots='16' unit='KiB'>10485760</maxMemory> <memory unit='KiB'>1572864</memory> <currentMemory unit='KiB'>1572864</currentMemory> ... <cpu mode='host-model' check='partial'> <numa> <cell id='0' cpus='0-3' memory='1048576' unit='KiB'/> </numa> </cpu> ... <memory model='virtio-mem'> <target> <size unit='KiB'>524288</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>393216</requested> </target> <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/> </memory> #virsh start pc Domain 'pc' started 2. The domain is started and check mem status, the actual size is "1048576" #virsh dommemstat pc actual 1048576 swap_in 0 swap_out 0 major_fault 257 minor_fault 130540 unused 604064 available 761328 usable 578428 last_update 1612325471 disk_caches 49632 hugetlb_pgalloc 0 hugetlb_pgfail 0 rss 460260 3. Then, check the active xml - # virsh dumpxml pc ... <memory unit='KiB'>1572864</memory> <currentMemory unit='KiB'>1048576</currentMemory> .... <memory model='virtio-mem'> <target> <size unit='KiB'>524288</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>393216</requested> <actual unit='KiB'>0</actual> </target> <alias name='virtiomem0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/> </memory> Question1 : the value of actual is "0". Is it expected? S2. Also, tried to use hugepage to start the domain - <memory model='virtio-mem'> <target> <size unit='KiB'>524288</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>393216</requested> <actual unit='KiB'>0</actual> </target> <alias name='virtiomem0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/> </memory> #virsh start pc error: Failed to start domain 'pc' error: internal error: process exited while connecting to monitor: 2021-02-03T05:50:33.157836Z qemu-system-x86_64: -object memory-backend-file,id=memvirtiomem0,mem-path=/dev/hugepages/libvirt/qemu/9-pc,size=536870912,host-nodes=0,policy=bind: can't open backing store /dev/hugepages/libvirt/qemu/9-pc for guest RAM: Permission denied Question 2: any bug here? On Tue, Feb 2, 2021 at 9:44 PM Peter Krempa <pkrempa@redhat.com> wrote:
On Fri, Jan 22, 2021 at 13:50:21 +0100, Michal Privoznik wrote:
Technically, this is another version of:
https://www.redhat.com/archives/libvir-list/2020-December/msg00199.html
But since virtio-pmem part is pushed now, I've reworked virtio-mem a bit and sending it as a new series.
For curious ones, David summarized behaviour well when implementing virtio-mem support in kernel:
https://lwn.net/Articles/755423/
For less curious ones:
# virsh update-memory $dom --requested-size 4G
adds additional 4GiB of RAM to guest;
# virsh update-memory $dom --requested-size 0
removes those 4GiB added earlier.
Patches are also available on my GitLab:
https://gitlab.com/MichalPrivoznik/libvirt/-/tree/virtio_mem_v3
Patches 1-7,9-10 (but observe some individual comments, including the rename of the virsh commands):
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Patch 8 has severe semantic problems.
-- Thanks & Regards, Jing,Qi

On 2/3/21 7:11 AM, Jing Qi wrote:
I did some test for virtio-mem with libvirt upstream version v7.0.0-153-g5ea3ecd07d & qemu-kvm-5.2.0-0.7.rc2.fc34.x86_64
S1. Start domain with memory device
1. Domain configuration- <maxMemory slots='16' unit='KiB'>10485760</maxMemory> <memory unit='KiB'>1572864</memory> <currentMemory unit='KiB'>1572864</currentMemory> ... <cpu mode='host-model' check='partial'> <numa> <cell id='0' cpus='0-3' memory='1048576' unit='KiB'/> </numa> </cpu> ... <memory model='virtio-mem'> <target> <size unit='KiB'>524288</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>393216</requested> </target> <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/> </memory> #virsh start pc Domain 'pc' started
2. The domain is started and check mem status, the actual size is "1048576" #virsh dommemstat pc actual 1048576
This is a bummer. What this represents is the actual value of membaloon.
swap_in 0 swap_out 0 major_fault 257 minor_fault 130540 unused 604064 available 761328 usable 578428 last_update 1612325471 disk_caches 49632 hugetlb_pgalloc 0 hugetlb_pgfail 0 rss 460260
3. Then, check the active xml - # virsh dumpxml pc ... <memory unit='KiB'>1572864</memory> <currentMemory unit='KiB'>1048576</currentMemory> .... <memory model='virtio-mem'> <target> <size unit='KiB'>524288</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>393216</requested> <actual unit='KiB'>0</actual> </target> <alias name='virtiomem0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/> </memory>
Question1 : the value of actual is "0". Is it expected?
Yes and no. It shows that we need <actual/> because guest might ignore request to change the requested size. What you're probably experiencing is that you haven't loaded virtio_mem module and thus the virtio-mem device is ignoring requests for change (well, kernel is ignoring them, whatever).
S2. Also, tried to use hugepage to start the domain -
<memory model='virtio-mem'> <target> <size unit='KiB'>524288</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>393216</requested> <actual unit='KiB'>0</actual> </target> <alias name='virtiomem0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/> </memory>
#virsh start pc error: Failed to start domain 'pc' error: internal error: process exited while connecting to monitor: 2021-02-03T05:50:33.157836Z qemu-system-x86_64: -object memory-backend-file,id=memvirtiomem0,mem-path=/dev/hugepages/libvirt/qemu/9-pc,size=536870912,host-nodes=0,policy=bind: can't open backing store /dev/hugepages/libvirt/qemu/9-pc for guest RAM: Permission denied
Question 2: any bug here?
Ah, good catch! I'll fix this in v2. Michal

Michal, I checked the virtio_mem module and it's loaded - lsmod |grep virtio_mem virtio_mem 32768 0 And I can't make the actual value change to non-zerio. -> virsh update-memory pc --requested-size 256M or ->virsh setmem pc 1000M <memory model='virtio-mem'> <target> <size unit='KiB'>524288</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>262144</requested> <actual unit='KiB'>0</actual> </target> <alias name='virtiomem0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/> </memory> Any other suggestions ?
This is a bummer. What this represents is the actual value of membaloon. Yes and no. It shows that we need <actual/> because guest might ignore request to change the requested size. What you're probably experiencing is that you haven't loaded virtio_mem module and thus the virtio-mem device is ignoring requests for change (well, kernel is ignoring them, whatever).
On Thu, Feb 4, 2021 at 12:27 AM Michal Privoznik <mprivozn@redhat.com> wrote:
On 2/3/21 7:11 AM, Jing Qi wrote:
I did some test for virtio-mem with libvirt upstream version v7.0.0-153-g5ea3ecd07d & qemu-kvm-5.2.0-0.7.rc2.fc34.x86_64
S1. Start domain with memory device
1. Domain configuration- <maxMemory slots='16' unit='KiB'>10485760</maxMemory> <memory unit='KiB'>1572864</memory> <currentMemory unit='KiB'>1572864</currentMemory> ... <cpu mode='host-model' check='partial'> <numa> <cell id='0' cpus='0-3' memory='1048576' unit='KiB'/> </numa> </cpu> ... <memory model='virtio-mem'> <target> <size unit='KiB'>524288</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>393216</requested> </target> <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/> </memory> #virsh start pc Domain 'pc' started
2. The domain is started and check mem status, the actual size is "1048576" #virsh dommemstat pc actual 1048576
This is a bummer. What this represents is the actual value of membaloon.
swap_in 0 swap_out 0 major_fault 257 minor_fault 130540 unused 604064 available 761328 usable 578428 last_update 1612325471 disk_caches 49632 hugetlb_pgalloc 0 hugetlb_pgfail 0 rss 460260
3. Then, check the active xml - # virsh dumpxml pc ... <memory unit='KiB'>1572864</memory> <currentMemory unit='KiB'>1048576</currentMemory> .... <memory model='virtio-mem'> <target> <size unit='KiB'>524288</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>393216</requested> <actual unit='KiB'>0</actual> </target> <alias name='virtiomem0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/> </memory>
Question1 : the value of actual is "0". Is it expected?
Yes and no. It shows that we need <actual/> because guest might ignore request to change the requested size. What you're probably experiencing is that you haven't loaded virtio_mem module and thus the virtio-mem device is ignoring requests for change (well, kernel is ignoring them, whatever).
S2. Also, tried to use hugepage to start the domain -
<memory model='virtio-mem'> <target> <size unit='KiB'>524288</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>393216</requested> <actual unit='KiB'>0</actual> </target> <alias name='virtiomem0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/> </memory>
#virsh start pc error: Failed to start domain 'pc' error: internal error: process exited while connecting to monitor: 2021-02-03T05:50:33.157836Z qemu-system-x86_64: -object
memory-backend-file,id=memvirtiomem0,mem-path=/dev/hugepages/libvirt/qemu/9-pc,size=536870912,host-nodes=0,policy=bind:
can't open backing store /dev/hugepages/libvirt/qemu/9-pc for guest RAM: Permission denied
Question 2: any bug here?
Ah, good catch! I'll fix this in v2.
Michal
-- Thanks & Regards, Jing,Qi

On 2/4/21 3:33 AM, Jing Qi wrote:
Michal, I checked the virtio_mem module and it's loaded -
lsmod |grep virtio_mem virtio_mem 32768 0
And I can't make the actual value change to non-zerio. -> virsh update-memory pc --requested-size 256M or ->virsh setmem pc 1000M
This is unrelated. 'setmem' modifies balloon not virtio-mem-pci device.
<memory model='virtio-mem'> <target> <size unit='KiB'>524288</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>262144</requested> <actual unit='KiB'>0</actual> </target> <alias name='virtiomem0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/> </memory> Any other suggestions ?
Is there something in the guest dmesg? This is what I get: virsh update-memory-device gentoo --alias ua-virtiomem 256M [ 17.619060] virtio_mem virtio3: plugged size: 0x0 [ 17.619062] virtio_mem virtio3: requested size: 0x10000000 [ 17.653072] Built 4 zonelists, mobility grouping on. Total pages: 2065850 [ 17.653074] Policy zone: Normal And I can see actual size updated: <memory model='virtio-mem'> <source> <pagesize unit='KiB'>2048</pagesize> </source> <target> <size unit='KiB'>4194304</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>262144</requested> <actual unit='KiB'>262144</actual> </target> <alias name='ua-virtiomem'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/> </memory> Maybe David knows the answer. Michal

Thanks Michal. I checked the virtio_mem module in the host last time. I tried a new guest image with virtio_mem module and start the domain , then the value of actual can be set correctly. Jing Qi On Thu, Feb 4, 2021 at 4:52 PM Michal Privoznik <mprivozn@redhat.com> wrote:
On 2/4/21 3:33 AM, Jing Qi wrote:
Michal, I checked the virtio_mem module and it's loaded -
lsmod |grep virtio_mem virtio_mem 32768 0
And I can't make the actual value change to non-zerio. -> virsh update-memory pc --requested-size 256M or ->virsh setmem pc 1000M
This is unrelated. 'setmem' modifies balloon not virtio-mem-pci device.
<memory model='virtio-mem'> <target> <size unit='KiB'>524288</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>262144</requested> <actual unit='KiB'>0</actual> </target> <alias name='virtiomem0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/> </memory> Any other suggestions ?
Is there something in the guest dmesg? This is what I get:
virsh update-memory-device gentoo --alias ua-virtiomem 256M
[ 17.619060] virtio_mem virtio3: plugged size: 0x0 [ 17.619062] virtio_mem virtio3: requested size: 0x10000000 [ 17.653072] Built 4 zonelists, mobility grouping on. Total pages: 2065850 [ 17.653074] Policy zone: Normal
And I can see actual size updated:
<memory model='virtio-mem'> <source> <pagesize unit='KiB'>2048</pagesize> </source> <target> <size unit='KiB'>4194304</size> <node>0</node> <block unit='KiB'>2048</block> <requested unit='KiB'>262144</requested> <actual unit='KiB'>262144</actual> </target> <alias name='ua-virtiomem'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/> </memory>
Maybe David knows the answer.
Michal
-- Thanks & Regards, Jing,Qi

On 2/5/21 9:34 AM, Jing Qi wrote:
Thanks Michal. I checked the virtio_mem module in the host last time. I tried a new guest image with virtio_mem module and start the domain , then the value of actual can be set correctly.
Yeah, it's the guest that needs the module. Glad to hear it's working. Michal
participants (5)
-
Daniel Henrique Barboza
-
David Hildenbrand
-
Jing Qi
-
Michal Privoznik
-
Peter Krempa