Hi,
qemu recently gained support for memory hotplug (hot unplug will arrive
later), starting around commit bef3492d1169a54c966cecd0e0b1dd25e9341582
in qemu.git.
For the hotplug to work the VM needs to be started with a certain
number of "dimm" slots for plugging in virtual memory modules. The
memory of the VM at startup has to occupy at least one of the slots.
Later on the management layer can decide to plug more memory into the
guest by inserting additional virtual memory modules.
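If I read the qemu series correctly, plugging a module at runtime boils
down to two monitor commands: adding a memory backend object and then a
pc-dimm device referencing it. Roughly (a QMP sketch; the ids are
hypothetical, 536870912 bytes = 512 MiB):

  {"execute": "object-add",
   "arguments": {"qom-type": "memory-backend-ram", "id": "mem1",
                 "props": {"size": 536870912}}}
  {"execute": "device_add",
   "arguments": {"driver": "pc-dimm", "id": "dimm1", "memdev": "mem1"}}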
For representing this in libvirt I'm thinking of using the <devices>
section of our domain XML where we'd add a new device type:
<memory type="ram">
<source .../> <!-- will be elaborated below -->
<target .../> <!-- will be elaborated below -->
<address type="acpi-dimm" slot="1"/>
</memory>
type="ram" to denote that we are adding RAM memory. This will allow
possible extensions for example to add a generic pflash (type="flash",
type="rom") device or other
memory type mapped in the address space.
To enable this infrastructure qemu needs two command line options
supplied: one setting the maximum amount of supportable memory and the
other the maximum number of memory modules (capped at 256 due to ACPI).
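For illustration, these map to suboptions of qemu's -m option (syntax
as of qemu 2.1; the sizes here are arbitrary):

  qemu-system-x86_64 ... \
      -m size=512M,slots=16,maxmem=1T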
The current XML format for specifying memory looks like this:
<memory unit='KiB'>524288</memory>
<currentMemory unit='KiB'>524288</currentMemory>
I'm thinking of adding the following attributes to the <memory> element:
<memory slots='16' max-memory='1' max-memory-unit='TiB'/>
This would then be updated to the actual size after summing the sizes
of the memory modules to:
<memory slots='16' max-memory='1' max-memory-unit='TiB'
        unit='MiB'>512</memory>
This would also make it possible to specify just the line above, in
which case libvirt would add a single memory module holding the whole
guest memory.
Representing the memory module as a device will then allow us to use
the existing hot(un)plug APIs to perform the operations on the actual
VM.
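For example, plugging a module could then reuse the existing device
attach flow (virsh syntax as it exists today; the XML payload is the
proposed format above and thus hypothetical):

  # dimm.xml contains the proposed <memory type="ram"> device XML
  virsh attach-device guestname dimm.xml --live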
For the ram memory type the source and target elements will allow
specifying the following options.
For backing the guest with normal memory:
<source type="ram" size="500" unit="MiB"
host-node="2"/>
For hugepage-backed guest:
<source type="hugepage" page-size="2048" count="1024"
node="1"/>
Note: the "node" attribute targets a host NUMA node and is optional.
And possibly others for the rom/flash types:
<source type="file" path="/asdf"/>
For targeting the RAM module the target element could have the
following format:
<target model="dimm" node='2' address='0xdeadbeef'/>
"node" determines the guest numa node to connect the memory "module"
to.
The attribute is optional for non-numa guests or node 0 is assumed.
"address" determines the address in the guest's memory space where the
memory will be mapped. This is optional and not recommended being set by
the user (except for special cases).
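For illustration, such a target could map onto qemu's pc-dimm device
roughly like this (a sketch; the property names exist in current qemu,
"mem0" is a hypothetical backend id):

  -device pc-dimm,id=dimm1,memdev=mem0,node=2,slot=1,addr=0xdeadbeef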
For future expansion a model="pflash" device may be added.
For migration the target VM needs to be started with the hotplugged
modules already specified on the command line, which is in line with
how we treat devices currently.
My suggestion above contrasts with the approach Michal and Martin took
when adding the NUMA and hugepage backing capabilities, as they
describe a guest node whereas this describes the memory device beneath
it. I think the two approaches can co-exist while being mutually
exclusive: when using memory hotplug, the memory will need to be
specified using memory modules, and non-hotplug guests can keep using
the approach defined originally.
That concludes my thoughts on this subject, but I'm open to discussion
or other approaches (I haven't started the implementation yet).
Thanks
Peter