On Fri, Jan 22, 2021 at 13:50:25 +0100, Michal Privoznik wrote:
The virtio-mem is paravirtualized mechanism of adding/removing
memory to/from a VM. A virtio-mem-pci device is split into blocks
of equal size which are then exposed (all or only a requested
portion of them) to the guest kernel to use as regular memory.
Therefore, the device has two important attributes:
1) block-size, which defines the size of a block
2) requested-size, which defines how much memory (in bytes)
is the device requested to expose to the guest.
The 'block-size' is configured on command line and immutable
throughout device's lifetime. The 'requested-size' can be set on
the command line too, but also is adjustable via monitor. In
fact, that is how management software places its requests to
change the memory allocation. If it wants to give more memory to
the guest it changes 'requested-size' to a bigger value, and if it
wants to shrink guest memory it changes the 'requested-size' to a
smaller value. Note, value of zero means that guest should
release all memory offered by the device. Of course, guest has to
cooperate. Therefore, there is a third attribute 'size' which is
read only and reflects how much memory the guest still has. This
can be different to 'requested-size', obviously. Because of name
clash, I've named it 'actualsize' and it is dealt with in future
commits (it is a runtime information anyway).
In the backend, memory for virtio-mem is backed by usual objects:
memory-backend-{ram,file,memfd} and their size puts the cap on
the amount of memory that a virtio-mem device can offer to a
guest. But we are already able to express this info using <size/>
under <target/>.
Therefore, we need only two more elements to cover 'block-size'
and 'requested-size' attributes. This is the XML I've came up
with:
<memory model='virtio-mem'>
<source>
<nodemask>1-3</nodemask>
<pagesize unit='KiB'>2048</pagesize>
</source>
<target>
<size unit='KiB'>2097152</size>
<node>0</node>
<block unit='KiB'>2048</block>
<requested unit='KiB'>1048576</requested>
</target>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x04' function='0x0'/>
</memory>
I hope by now it is obvious that:
1) 'requested-size' must be an integer multiple of
'block-size', and
2) virtio-mem-pci device goes onto PCI bus and thus needs PCI
address.
Then there is a limitation that the minimal 'block-size' is
transparent huge page size (I'll leave this without explanation).
Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
---
[...]
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
index af540391db..2938758ec2 100644
--- a/docs/formatdomain.rst
+++ b/docs/formatdomain.rst
[...]
@@ -7274,7 +7286,8 @@ Example: usage of the memory devices
Provide ``dimm`` to add a virtual DIMM module to the guest. :since:`Since
1.2.14` Provide ``nvdimm`` model that adds a Non-Volatile DIMM module.
:since:`Since 3.2.0` Provide ``virtio-pmem`` model to add a paravirtualized
- persistent memory device. :since:`Since 7.1.0`
+ persistent memory device. :since:`Since 7.1.0` Provide ``virtio-mem`` model
+ to add paravirtualized memory device. :since: `Since 7.1.0`
s/: `/:`/
``access``
An optional attribute ``access`` ( :since:`since 3.2.0` ) that provides
[...]
@@ -7366,6 +7380,17 @@ Example: usage of the memory devices
so other backend types should use the ``readonly`` element. :since:`Since
5.0.0`
+ ``block``
+ For ``virtio-mem`` only.
+ The size of an individual block, granularity of division of memory module.
+ Must be power of two and at least equal to size of a transparent hugepage
+ (2MiB on x84_64). The default is hypervisor dependant.
+
+ ``requested``
+ For ``virtio-mem`` only.
+ The total size of blocks exposed to the guest. Must respect ``block``
+ granularity.
This is a bit misleading. I'd avoid using 'blocks'.
[...]
diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c
index 649fc335ac..b5a0c09468 100644
--- a/src/conf/domain_validate.c
+++ b/src/conf/domain_validate.c
@@ -25,6 +25,7 @@
#include "virconftypes.h"
#include "virlog.h"
#include "virutil.h"
+#include "virhostmem.h"
#define VIR_FROM_THIS VIR_FROM_DOMAIN
@@ -1389,6 +1390,8 @@ static int
virDomainMemoryDefValidate(const virDomainMemoryDef *mem,
const virDomainDef *def)
{
+ unsigned long long thpSize;
+
switch (mem->model) {
case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
if (!mem->nvdimmPath) {
@@ -1442,6 +1445,42 @@ virDomainMemoryDefValidate(const virDomainMemoryDef *mem,
_("virtio-pmem does not support NUMA nodes"));
return -1;
}
+ break;
+
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
+ if (mem->requestedsize > mem->size) {
+ virReportError(VIR_ERR_XML_DETAIL, "%s",
+ _("requested size must be smaller than @size"));
+ return -1;
+ }
so mem->size is the maximum size of the virtio-mem provided memory? I
don't think that it's obvious from the docs.
+
+ if (!VIR_IS_POW2(mem->blocksize)) {
+ virReportError(VIR_ERR_XML_DETAIL, "%s",
+ _("block size must be a power of two"));
+ return -1;
+ }
+
[...]
diff --git a/tests/domaincapsmock.c b/tests/domaincapsmock.c
index d81a898dc0..34a2f9ad81 100644
--- a/tests/domaincapsmock.c
+++ b/tests/domaincapsmock.c
@@ -17,6 +17,7 @@
#include <config.h>
#include "virhostcpu.h"
+#include "virhostmem.h"
#ifdef WITH_LIBXL
# include "libxl/libxl_capabilities.h"
#endif
@@ -40,3 +41,11 @@ virHostCPUGetMicrocodeVersion(virArch hostArch G_GNUC_UNUSED)
{
return 0;
}
+
+int
+virHostMemGetTHPSize(unsigned long long *size)
+{
+ /* Pretend Transparent Huge Page size is 2MiB. */
+ *size = 2048;
+ return 0;
+}
Put the mocking along with the implementation or separately. It seems
weird to implement this in a patch which adds a new device model.