On Thu, Feb 18, 2021 at 14:31:00 +0100, Michal Privoznik wrote:
The virtio-mem is paravirtualized mechanism of adding/removing
memory to/from a VM. A virtio-mem-pci device is split into blocks
of equal size which are then exposed (all or only a requested
portion of them) to the guest kernel to use as regular memory.
Therefore, the device has two important attributes:
1) block-size, which defines the size of a block
2) requested-size, which defines how much memory (in bytes)
is the device requested to expose to the guest.
The 'block-size' is configured on command line and immutable
throughout device's lifetime. The 'requested-size' can be set on
the command line too, but also is adjustable via monitor. In
fact, that is how management software places its requests to
change the memory allocation. If it wants to give more memory to
the guest it changes 'requested-size' to a bigger value, and if it
wants to shrink guest memory it changes the 'requested-size' to a
smaller value. Note, value of zero means that guest should
release all memory offered by the device. Of course, guest has to
cooperate. Therefore, there is a third attribute 'size' which is
read only and reflects how much memory the guest still has. This
can be different to 'requested-size', obviously. Because of name
clash, I've named it 'actualsize' and it is dealt with in future
commits (it is a runtime information anyway).
In the backend, memory for virtio-mem is backed by usual objects:
memory-backend-{ram,file,memfd} and their size puts the cap on
the amount of memory that a virtio-mem device can offer to a
guest. But we are already able to express this info using <size/>
under <target/>.
Therefore, we need only two more elements to cover 'block-size'
and 'requested-size' attributes. This is the XML I've came up
with:
<memory model='virtio-mem'>
<source>
<nodemask>1-3</nodemask>
<pagesize unit='KiB'>2048</pagesize>
</source>
<target>
<size unit='KiB'>2097152</size>
<node>0</node>
<block unit='KiB'>2048</block>
<requested unit='KiB'>1048576</requested>
</target>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x04' function='0x0'/>
</memory>
I hope by now it is obvious that:
1) 'requested-size' must be an integer multiple of
'block-size', and
2) virtio-mem-pci device goes onto PCI bus and thus needs PCI
address.
Then there is a limitation that the minimal 'block-size' is
transparent huge page size (I'll leave this without explanation).
Since now we have (possibly) two or more devices that allow
memory inflation/deflation and accounting for all of them (and
thus keeping <currentMemory/> updated) might be hard. Therefore,
I'm deliberately forbidding memballoon. It's okay - virtio-mem is
superior to memballoon anyway. We can always reevaluate later.
Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
---
docs/formatdomain.rst | 51 +++++++++++++--
docs/schemas/domaincommon.rng | 11 ++++
src/conf/domain_conf.c | 53 ++++++++++++++-
src/conf/domain_conf.h | 3 +
src/conf/domain_validate.c | 39 +++++++++++
src/qemu/qemu_alias.c | 1 +
src/qemu/qemu_command.c | 1 +
src/qemu/qemu_domain.c | 27 +++++++-
src/qemu/qemu_domain_address.c | 38 ++++++++---
src/qemu/qemu_process.c | 2 +
src/qemu/qemu_validate.c | 22 ++++++-
src/security/security_apparmor.c | 1 +
src/security/security_dac.c | 2 +
src/security/security_selinux.c | 2 +
.../memory-hotplug-virtio-mem.xml | 64 +++++++++++++++++++
...emory-hotplug-virtio-mem.x86_64-latest.xml | 1 +
tests/qemuxml2xmltest.c | 1 +
17 files changed, 298 insertions(+), 21 deletions(-)
create mode 100644 tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml
create mode 120000 tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
index 2587106191..05d359a32d 100644
--- a/docs/formatdomain.rst
+++ b/docs/formatdomain.rst
[...]
@@ -6935,6 +6937,16 @@ Example: automatically added device with KVM
</devices>
...
+Example: automatically added device with QEMU/KVM and a ``virtio-mem`` device:
+
+::
+
+ ...
+ <devices>
The example is missing the 'virtio-mem' device which causes the balloon
to be disabled.
+ <memballoon model='none'/>
+ </devices>
+ ...
+
Example: manually added device with static PCI slot 2 requested
::
[...]
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 541d592bbe..26b9b5583e 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -3679,10 +3679,21 @@ qemuDomainDefAddDefaultDevices(virDomainDefPtr def,
}
if (addDefaultMemballoon && !def->memballoon) {
- virDomainMemballoonDefPtr memballoon;
- memballoon = g_new0(virDomainMemballoonDef, 1);
+ virDomainMemballoonDefPtr memballoon = g_new0(virDomainMemballoonDef, 1);
+ size_t i;
- memballoon->model = VIR_DOMAIN_MEMBALLOON_MODEL_VIRTIO;
+ /* To simplify virtio-mem implementation, memballoon has to be turned
+ * off if domain has a virtio-mem device. See
+ * qemuValidateDomainDeviceDefMemory() for more details. */
+ for (i = 0; i < def->nmems; i++) {
+ if (def->mems[i]->model == VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM)
+ break;
+ }
+
+ if (i == def->nmems)
+ memballoon->model = VIR_DOMAIN_MEMBALLOON_MODEL_VIRTIO;
+ else
+ memballoon->model = VIR_DOMAIN_MEMBALLOON_MODEL_NONE;
def->memballoon = memballoon;
}
This might be a point of contention. I can see why we'd want to do this
but people might have other ideas. Preferably I'd like danpb's input.
I think the best approach will be to not hide it into the massive patch
adding the boring XML bits but split it separately.
[...]
diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c
index 2541ae856a..6150083251 100644
--- a/src/qemu/qemu_validate.c
+++ b/src/qemu/qemu_validate.c
[...]
@@ -4688,6 +4689,25 @@
qemuValidateDomainDeviceDefMemory(virDomainMemoryDefPtr mem,
}
break;
+ case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM:
+ /* Accounting balloon and virtio-mem is hard. We have plenty of APIs
+ * which take balloon from QEMU and report it to users. We would have
+ * to change all that and account for virtio-mem actual size. Also,
+ * virtio-mem is supposed to be replacement for balloon. Disable
+ * coexistence of these two for now. */
+ if (virDomainDefHasMemballoon(def)) {
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("virtio-mem is not supported with memory
balloon"));
+ return -1;
+ }
+
+ if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI)) {
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("virtio-mem isn't supported by this QEMU
binary"));
+ return -1;
+ }
+ break;
And this one too. I agree with this hunk as is, but not in this commit.