[libvirt] [RFC PATCH 00/12] Add support for memory hotplug

!! this series applies on top of the cleanup series posted earlier !! Hi, this is my try to implement memory hotplug in libvirt (for the qemu) driver. This series is almost code-complete but is lacking tests and docs as I wanted to agree on design first before attempting to write the documentation. Additionally this series is also lacking code that would fix memballoon handling but I'm waiting on a clousure how it's supposed to work from qemu's side as it appears to be broken there too. The basic XML used to add a memory device: <memory model='acpi-dimm'> <target> <size unit='KiB'>524287</size> <node>0</node> </target> </memory> The <target> subelement is mandatory, whereas the <source> subelement (that contains source numa nodes, source page size ) is optional and is inferred from the NUMA tuning for given target numa node. Please note that at least one guest numa node has to be configured for the guest for this to work (limitation of qemu). What's missing in this series: - tests - docs - commit message touch-up - code to audit the memory size changes - code to make memory balloning working with correct size Peter Krempa (12): qemu: caps: Add capability bit for the "pc-dimm" device conf: Add support for parsing and formatting max memory and slot count qemu: Implement setup of memory hotplug parameters conf: Add device address type for dimm devices conf: Add interface to parse and format memory device information qemu: memdev: Add infrastructure to load memory device information qemu: migration: Forbid migration with memory modules lacking info qemu: add support for memory devices conf: Introduce helper to find duplicate device address qemu: conf: Add support for memory device cold(un)plug qemu: Implement memory device hotplug qemu: Implement memory device hotunplug docs/formatdomain.html.in | 19 + docs/schemas/domaincommon.rng | 76 +++ src/bhyve/bhyve_domain.c | 9 +- src/conf/domain_conf.c | 655 ++++++++++++++++++++- src/conf/domain_conf.h | 63 ++ src/libvirt_private.syms | 7 + src/libxl/libxl_domain.c | 8 + src/lxc/lxc_domain.c | 8 + src/openvz/openvz_driver.c | 14 +- src/parallels/parallels_driver.c | 6 +- src/phyp/phyp_driver.c | 6 +- src/qemu/qemu_capabilities.c | 2 + src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 157 ++++- src/qemu/qemu_command.h | 15 + src/qemu/qemu_domain.c | 58 ++ src/qemu/qemu_domain.h | 6 + src/qemu/qemu_driver.c | 29 + src/qemu/qemu_hotplug.c | 178 ++++++ src/qemu/qemu_hotplug.h | 6 + src/qemu/qemu_migration.c | 14 + src/qemu/qemu_monitor.c | 45 ++ src/qemu/qemu_monitor.h | 14 + src/qemu/qemu_monitor_json.c | 116 ++++ src/qemu/qemu_monitor_json.h | 5 + src/qemu/qemu_process.c | 4 + src/uml/uml_driver.c | 9 +- src/vbox/vbox_common.c | 6 +- src/vmware/vmware_driver.c | 6 +- src/vmx/vmx.c | 6 +- src/xen/xen_driver.c | 7 + src/xenapi/xenapi_driver.c | 9 +- tests/domainschemadata/maxMemory.xml | 19 + tests/qemucapabilitiesdata/caps_2.1.1-1.caps | 1 + .../qemuxml2argv-memory-hotplug-dimm.xml | 47 ++ .../qemuxml2argv-memory-hotplug-nonuma.xml | 22 + .../qemuxml2argv-memory-hotplug.args | 6 + .../qemuxml2argv-memory-hotplug.xml | 34 ++ tests/qemuxml2argvtest.c | 4 + tests/qemuxml2xmltest.c | 4 + 40 files changed, 1682 insertions(+), 19 deletions(-) create mode 100644 tests/domainschemadata/maxMemory.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-dimm.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nonuma.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.xml -- 2.2.2

The pc-dimm device represents a RAM memory module. --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + tests/qemucapabilitiesdata/caps_2.1.1-1.caps | 1 + 3 files changed, 4 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 13f3cd3..bc190c5 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -277,6 +277,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "vmware-svga.vgamem_mb", "qxl.vgamem_mb", "qxl-vga.vgamem_mb", + "pc-dimm", ); @@ -1524,6 +1525,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = { { "usb-audio", QEMU_CAPS_OBJECT_USB_AUDIO }, { "iothread", QEMU_CAPS_OBJECT_IOTHREAD}, { "ivshmem", QEMU_CAPS_DEVICE_IVSHMEM }, + { "pc-dimm", QEMU_CAPS_DEVICE_PC_DIMM }, }; static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsVirtioBlk[] = { diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index 12e1688..66e67e6 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -223,6 +223,7 @@ typedef enum { QEMU_CAPS_VMWARE_SVGA_VGAMEM = 181, /* -device vmware-svga.vgamem_mb */ QEMU_CAPS_QXL_VGAMEM = 182, /* -device qxl.vgamem_mb */ QEMU_CAPS_QXL_VGA_VGAMEM = 183, /* -device qxl-vga.vgamem_mb */ + QEMU_CAPS_DEVICE_PC_DIMM = 184, /* pc-dimm device */ QEMU_CAPS_LAST, /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/tests/qemucapabilitiesdata/caps_2.1.1-1.caps b/tests/qemucapabilitiesdata/caps_2.1.1-1.caps index 4e040fc..5637edb 100644 --- a/tests/qemucapabilitiesdata/caps_2.1.1-1.caps +++ b/tests/qemucapabilitiesdata/caps_2.1.1-1.caps @@ -165,4 +165,5 @@ <flag name='vmware-svga.vgamem_mb'/> <flag name='qxl.vgamem_mb'/> <flag name='qxl-vga.vgamem_mb'/> + <flag name='pc-dimm'/> </qemuCaps> -- 2.2.2

On 01/30/2015 06:20 AM, Peter Krempa wrote:
The pc-dimm device represents a RAM memory module. --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + tests/qemucapabilitiesdata/caps_2.1.1-1.caps | 1 + 3 files changed, 4 insertions(+)
ACK -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Add a XML element that will allow to specify maximum supportable memory and the count of memory slots to use with memory hotplug. To avoid possible confusion and misuse of the new element this patch also explicitly forbids the use of the maxMemory setting in individual drivers's post parse callbacks. This limitation will be lifted when the support will be implemented. --- docs/formatdomain.html.in | 19 +++++++++++ docs/schemas/domaincommon.rng | 8 +++++ src/bhyve/bhyve_domain.c | 4 +++ src/conf/domain_conf.c | 64 ++++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 7 ++++ src/libvirt_private.syms | 1 + src/libxl/libxl_domain.c | 5 +++ src/lxc/lxc_domain.c | 4 +++ src/openvz/openvz_driver.c | 11 +++++-- src/parallels/parallels_driver.c | 6 +++- src/phyp/phyp_driver.c | 6 +++- src/qemu/qemu_domain.c | 4 +++ src/uml/uml_driver.c | 6 +++- src/vbox/vbox_common.c | 6 +++- src/vmware/vmware_driver.c | 6 +++- src/vmx/vmx.c | 6 +++- src/xen/xen_driver.c | 4 +++ src/xenapi/xenapi_driver.c | 6 +++- tests/domainschemadata/maxMemory.xml | 19 +++++++++++ 19 files changed, 183 insertions(+), 9 deletions(-) create mode 100644 tests/domainschemadata/maxMemory.xml diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index f8d5f89..12f7ede 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -664,6 +664,7 @@ <pre> <domain> ... + <maxMemory slots='123' unit='KiB'>1524288</maxMemory> <memory unit='KiB'>524288</memory> <currentMemory unit='KiB'>524288</currentMemory> ... @@ -697,6 +698,24 @@ <span class='since'><code>unit</code> since 0.9.11</span>, <span class='since'><code>dumpCore</code> since 0.10.2 (QEMU only)</span></dd> + <dt><code>maxMemory</code></dt> + <dd>The run time maximum memory allocation of the guest. The initial + memory specified by <code><memory></code> can be increased by + hot-plugging of memory to the limit specified by this element. + + The <code>unit</code> attribute behaves the same as for + <code>memory</code>. + + The <code>slots</code> attribute specifies the number of slots + available for adding memory to the guest. The bounds are hypervisor + specific. + + Note that due to alignment of the memory chunks added via memory + hotplug the full size allocation specified by this element may be + impossible to achieve. + <span class='since'>Since 1.2.12</span> + </dd> + <dt><code>currentMemory</code></dt> <dd>The actual allocation of memory for the guest. This value can be less than the maximum allocation, to allow for ballooning diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index d467dce..2dbfbec 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -577,6 +577,14 @@ --> <define name="resources"> <interleave> + <optional> + <element name="maxMemory"> + <ref name="scaledInteger"/> + <attribute name="slots"> + <ref name="unsignedInt"/> + </attribute> + </element> + </optional> <element name="memory"> <ref name='scaledInteger'/> <optional> diff --git a/src/bhyve/bhyve_domain.c b/src/bhyve/bhyve_domain.c index ecb1758..25ef852 100644 --- a/src/bhyve/bhyve_domain.c +++ b/src/bhyve/bhyve_domain.c @@ -67,6 +67,10 @@ bhyveDomainDefPostParse(virDomainDefPtr def, VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) < 0) return -1; + /* memory hotplug tunables are not supported by this driver */ + if (virDomainDefCheckUnsupportedMemoryHotplug(def) < 0) + return -1; + return 0; } diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index d562e1a..0cfc638 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -980,6 +980,27 @@ virDomainBlkioDeviceParseXML(xmlNodePtr root, } +/** + * virDomainDefCheckUnsupportedMemoryHotplug: + * @def: domain definition + * + * Returns -1 if the domain definition would enable memory hotplug via the + * <maxMemory> tunable and reports an error. Otherwise returns 0. + */ +int +virDomainDefCheckUnsupportedMemoryHotplug(virDomainDefPtr def) +{ + /* memory hotplug tunables are not supported by this driver */ + if (def->mem.max_memory > 0 || def->mem.memory_slots > 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("memory hotplug tunables <maxMemory> are not " + "supported by this hypervisor driver")); + return -1; + } + + return 0; +} + static void virDomainObjListDataFree(void *payload, const void *name ATTRIBUTE_UNUSED) @@ -12735,6 +12756,16 @@ virDomainDefParseXML(xmlDocPtr xml, &def->mem.cur_balloon, false, true) < 0) goto error; + if (virDomainParseMemory("./maxMemory[1]", NULL, ctxt, + &def->mem.max_memory, false, false) < 0) + goto error; + + if (virXPathUInt("string(./maxMemory[1]/@slots)", ctxt, &def->mem.memory_slots) == -2) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Failed to parse memory slot count")); + goto error; + } + /* and info about it */ if ((tmp = virXPathString("string(./memory[1]/@dumpCore)", ctxt)) && (def->mem.dump_core = virTristateSwitchTypeFromString(tmp)) <= 0) { @@ -12764,6 +12795,21 @@ virDomainDefParseXML(xmlDocPtr xml, def->mem.cur_balloon = def->mem.max_balloon; } + if ((def->mem.max_memory || def->mem.memory_slots) && + !(def->mem.max_memory && def->mem.memory_slots)) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("both maximum memory size and " + "memory slot count must be sepcified")); + goto error; + } + + if (def->mem.max_memory && + def->mem.max_memory < def->mem.max_balloon) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("maximum memory size must be equal or greater than " + "the actual memory size")); + goto error; + } if ((n = virXPathNodeSet("./memoryBacking/hugepages/page", ctxt, &nodes)) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", @@ -15643,6 +15689,18 @@ virDomainDefCheckABIStability(virDomainDefPtr src, goto error; } } + if (src->mem.memory_slots != dst->mem.memory_slots) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target domain memory slots count '%u' doesn't match source '%u"), + dst->mem.memory_slots, src->mem.memory_slots); + goto error; + } + if (src->mem.max_memory != dst->mem.max_memory) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target maximum memory size '%llu' doesn't match source '%llu'"), + dst->mem.max_memory, src->mem.max_memory); + goto error; + } if (src->vcpus != dst->vcpus) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -19311,6 +19369,12 @@ virDomainDefFormatInternal(virDomainDefPtr def, xmlIndentTreeOutput = oldIndentTreeOutput; } + if (def->mem.memory_slots && def->mem.max_memory) { + virBufferAsprintf(buf, + "<maxMemory slots='%u' unit='KiB'>%llu</maxMemory>\n", + def->mem.memory_slots, def->mem.max_memory); + } + virBufferAddLit(buf, "<memory"); if (def->mem.dump_core) virBufferAsprintf(buf, " dumpCore='%s'", diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 93f2314..97be518 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2021,6 +2021,10 @@ struct _virDomainMemtune { virDomainHugePagePtr hugepages; size_t nhugepages; + /* maximum supported memory for a guest, for hotplugging */ + unsigned long long max_memory; /* in kibibytes */ + unsigned int memory_slots; /* maximum count of RAM memory slots */ + bool nosharepages; bool locked; int dump_core; /* enum virTristateSwitch */ @@ -2312,6 +2316,9 @@ virDomainObjPtr virDomainObjListFindByName(virDomainObjListPtr doms, bool virDomainObjTaint(virDomainObjPtr obj, virDomainTaintFlags taint); + +int virDomainDefCheckUnsupportedMemoryHotplug(virDomainDefPtr def); + void virDomainPanicDefFree(virDomainPanicDefPtr panic); void virDomainResourceDefFree(virDomainResourceDefPtr resource); void virDomainGraphicsDefFree(virDomainGraphicsDefPtr def); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index cf5ccaf..4bad7c6 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -186,6 +186,7 @@ virDomainCpuPlacementModeTypeFromString; virDomainCpuPlacementModeTypeToString; virDomainDefAddImplicitControllers; virDomainDefCheckABIStability; +virDomainDefCheckUnsupportedMemoryHotplug; virDomainDefClearCCWAddresses; virDomainDefClearDeviceAliases; virDomainDefClearPCIAddresses; diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 856cfb4..01fb79c 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -564,6 +564,11 @@ libxlDomainDefPostParse(virDomainDefPtr def, def->nconsoles = 1; def->consoles[0] = chrdef; } + + /* memory hotplug tunables are not supported by this driver */ + if (virDomainDefCheckUnsupportedMemoryHotplug(def) < 0) + return -1; + return 0; } diff --git a/src/lxc/lxc_domain.c b/src/lxc/lxc_domain.c index 55f5b49..1367b0c 100644 --- a/src/lxc/lxc_domain.c +++ b/src/lxc/lxc_domain.c @@ -94,6 +94,10 @@ virLXCDomainDefPostParse(virDomainDefPtr def, !(def->emulator = virDomainDefGetDefaultEmulator(def, caps))) return -1; + /* memory hotplug tunables are not supported by this driver */ + if (virDomainDefCheckUnsupportedMemoryHotplug(def) < 0) + return -1; + return 0; } diff --git a/src/openvz/openvz_driver.c b/src/openvz/openvz_driver.c index 556f626..796044a 100644 --- a/src/openvz/openvz_driver.c +++ b/src/openvz/openvz_driver.c @@ -96,8 +96,15 @@ openvzDomainDefPostParse(virDomainDefPtr def, void *opaque ATTRIBUTE_UNUSED) { /* fill the init path */ - if (STREQ(def->os.type, "exe") && !def->os.init) - return VIR_STRDUP(def->os.init, "/sbin/init") < 0 ? -1 : 0; + if (STREQ(def->os.type, "exe") && !def->os.init) { + if (VIR_STRDUP(def->os.init, "/sbin/init") < 0) + return -1; + } + + /* memory hotplug tunables are not supported by this driver */ + if (virDomainDefCheckUnsupportedMemoryHotplug(def) < 0) + return -1; + return 0; } diff --git a/src/parallels/parallels_driver.c b/src/parallels/parallels_driver.c index b569160..334081c 100644 --- a/src/parallels/parallels_driver.c +++ b/src/parallels/parallels_driver.c @@ -154,10 +154,14 @@ parallelsConnectGetCapabilities(virConnectPtr conn) } static int -parallelsDomainDefPostParse(virDomainDefPtr def ATTRIBUTE_UNUSED, +parallelsDomainDefPostParse(virDomainDefPtr def, virCapsPtr caps ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { + /* memory hotplug tunables are not supported by this driver */ + if (virDomainDefCheckUnsupportedMemoryHotplug(def) < 0) + return -1; + return 0; } diff --git a/src/phyp/phyp_driver.c b/src/phyp/phyp_driver.c index 6c5a91e..3703eab 100644 --- a/src/phyp/phyp_driver.c +++ b/src/phyp/phyp_driver.c @@ -1094,10 +1094,14 @@ openSSHSession(virConnectPtr conn, virConnectAuthPtr auth, static int -phypDomainDefPostParse(virDomainDefPtr def ATTRIBUTE_UNUSED, +phypDomainDefPostParse(virDomainDefPtr def, virCapsPtr caps ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { + /* memory hotplug tunables are not supported by this driver */ + if (virDomainDefCheckUnsupportedMemoryHotplug(def) < 0) + return -1; + return 0; } diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 99c46d4..085bc81 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1040,6 +1040,10 @@ qemuDomainDefPostParse(virDomainDefPtr def, VIR_DOMAIN_INPUT_BUS_USB) < 0) return -1; + /* memory hotplug tunables are not supported by this driver */ + if (virDomainDefCheckUnsupportedMemoryHotplug(def) < 0) + return -1; + return 0; } diff --git a/src/uml/uml_driver.c b/src/uml/uml_driver.c index 6ca038a..546ad48 100644 --- a/src/uml/uml_driver.c +++ b/src/uml/uml_driver.c @@ -444,10 +444,14 @@ umlDomainDeviceDefPostParse(virDomainDeviceDefPtr dev, static int -umlDomainDefPostParse(virDomainDefPtr def ATTRIBUTE_UNUSED, +umlDomainDefPostParse(virDomainDefPtr def, virCapsPtr caps ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { + /* memory hotplug tunables are not supported by this driver */ + if (virDomainDefCheckUnsupportedMemoryHotplug(def) < 0) + return -1; + return 0; } diff --git a/src/vbox/vbox_common.c b/src/vbox/vbox_common.c index bd3f50c..f6ad6de 100644 --- a/src/vbox/vbox_common.c +++ b/src/vbox/vbox_common.c @@ -249,10 +249,14 @@ static char *vboxGenerateMediumName(PRUint32 storageBus, } static int -vboxDomainDefPostParse(virDomainDefPtr def ATTRIBUTE_UNUSED, +vboxDomainDefPostParse(virDomainDefPtr def, virCapsPtr caps ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { + /* memory hotplug tunables are not supported by this driver */ + if (virDomainDefCheckUnsupportedMemoryHotplug(def) < 0) + return -1; + return 0; } diff --git a/src/vmware/vmware_driver.c b/src/vmware/vmware_driver.c index 2d7ba04..f0a0b5b 100644 --- a/src/vmware/vmware_driver.c +++ b/src/vmware/vmware_driver.c @@ -83,10 +83,14 @@ vmwareDataFreeFunc(void *data) } static int -vmwareDomainDefPostParse(virDomainDefPtr def ATTRIBUTE_UNUSED, +vmwareDomainDefPostParse(virDomainDefPtr def, virCapsPtr caps ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { + /* memory hotplug tunables are not supported by this driver */ + if (virDomainDefCheckUnsupportedMemoryHotplug(def) < 0) + return -1; + return 0; } diff --git a/src/vmx/vmx.c b/src/vmx/vmx.c index 2a794c7..0262f75 100644 --- a/src/vmx/vmx.c +++ b/src/vmx/vmx.c @@ -524,10 +524,14 @@ VIR_ENUM_IMPL(virVMXControllerModelSCSI, VIR_DOMAIN_CONTROLLER_MODEL_SCSI_LAST, * Helpers */ static int -vmxDomainDefPostParse(virDomainDefPtr def ATTRIBUTE_UNUSED, +vmxDomainDefPostParse(virDomainDefPtr def, virCapsPtr caps ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { + /* memory hotplug tunables are not supported by this driver */ + if (virDomainDefCheckUnsupportedMemoryHotplug(def) < 0) + return -1; + return 0; } diff --git a/src/xen/xen_driver.c b/src/xen/xen_driver.c index 5e6ef68..f603477 100644 --- a/src/xen/xen_driver.c +++ b/src/xen/xen_driver.c @@ -390,6 +390,10 @@ xenDomainDefPostParse(virDomainDefPtr def, def->memballoon = memballoon; } + /* memory hotplug tunables are not supported by this driver */ + if (virDomainDefCheckUnsupportedMemoryHotplug(def) < 0) + return -1; + return 0; } diff --git a/src/xenapi/xenapi_driver.c b/src/xenapi/xenapi_driver.c index 0902f9a..3b44a46 100644 --- a/src/xenapi/xenapi_driver.c +++ b/src/xenapi/xenapi_driver.c @@ -70,10 +70,14 @@ xenapiDomainDeviceDefPostParse(virDomainDeviceDefPtr dev, static int -xenapiDomainDefPostParse(virDomainDefPtr def ATTRIBUTE_UNUSED, +xenapiDomainDefPostParse(virDomainDefPtr def, virCapsPtr caps ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { + /* memory hotplug tunables are not supported by this driver */ + if (virDomainDefCheckUnsupportedMemoryHotplug(def) < 0) + return -1; + return 0; } diff --git a/tests/domainschemadata/maxMemory.xml b/tests/domainschemadata/maxMemory.xml new file mode 100644 index 0000000..df2e3d8 --- /dev/null +++ b/tests/domainschemadata/maxMemory.xml @@ -0,0 +1,19 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <maxMemory slots='9' unit='KiB'>1233456789</maxMemory> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + </devices> +</domain> -- 2.2.2

On 01/30/2015 06:20 AM, Peter Krempa wrote:
Add a XML element that will allow to specify maximum supportable memory
s/a XML/an XML/
and the count of memory slots to use with memory hotplug.
Might be nice to demonstrate that XML here in the commit message, not just in formatdomain.html.
To avoid possible confusion and misuse of the new element this patch also explicitly forbids the use of the maxMemory setting in individual drivers's post parse callbacks. This limitation will be lifted when the
s/drivers's/drivers'/
support will be implemented. --- docs/formatdomain.html.in | 19 +++++++++++ docs/schemas/domaincommon.rng | 8 +++++ src/bhyve/bhyve_domain.c | 4 +++ src/conf/domain_conf.c | 64 ++++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 7 ++++ src/libvirt_private.syms | 1 + src/libxl/libxl_domain.c | 5 +++ src/lxc/lxc_domain.c | 4 +++ src/openvz/openvz_driver.c | 11 +++++-- src/parallels/parallels_driver.c | 6 +++- src/phyp/phyp_driver.c | 6 +++- src/qemu/qemu_domain.c | 4 +++ src/uml/uml_driver.c | 6 +++- src/vbox/vbox_common.c | 6 +++- src/vmware/vmware_driver.c | 6 +++- src/vmx/vmx.c | 6 +++- src/xen/xen_driver.c | 4 +++ src/xenapi/xenapi_driver.c | 6 +++- tests/domainschemadata/maxMemory.xml | 19 +++++++++++ 19 files changed, 183 insertions(+), 9 deletions(-) create mode 100644 tests/domainschemadata/maxMemory.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index f8d5f89..12f7ede 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -664,6 +664,7 @@ <pre> <domain> ... + <maxMemory slots='123' unit='KiB'>1524288</maxMemory>
123 is unusual; the example would look more realistic with a power of 2.
<memory unit='KiB'>524288</memory> <currentMemory unit='KiB'>524288</currentMemory>
Hmm. Historically, <memory> was the maximum memory, then ballooning was introduced and <currentMemory> was added to show the live difference between the ballooned current value and the boot-up maximum. But with the idea of hot-plug, I see where you are coming from - the balloon can only inflate or deflate up to the amount of memory currently plugged in, so <maxMemory> is the startup maximum, <memory> becomes the amount plugged in, and <currentMemory> reflects the balloon value (that is, current <= memory <= max). So I guess this makes sense; it may be more interesting figuring out how to expose it all through virsh.
... @@ -697,6 +698,24 @@ <span class='since'><code>unit</code> since 0.9.11</span>, <span class='since'><code>dumpCore</code> since 0.10.2 (QEMU only)</span></dd> + <dt><code>maxMemory</code></dt> + <dd>The run time maximum memory allocation of the guest. The initial + memory specified by <code><memory></code> can be increased by + hot-plugging of memory to the limit specified by this element. + + The <code>unit</code> attribute behaves the same as for + <code>memory</code>. + + The <code>slots</code> attribute specifies the number of slots + available for adding memory to the guest. The bounds are hypervisor + specific. + + Note that due to alignment of the memory chunks added via memory + hotplug the full size allocation specified by this element may be + impossible to achieve.
Is a hypervisor free to reject requests that aren't aligned properly? With <memory>, we had the odd situation that we allowed the hypervisor to round requests up, so that live numbers were different than the original startup numbers; it might be easier to not repeat that.
+ <optional> + <element name="maxMemory"> + <ref name="scaledInteger"/> + <attribute name="slots"> + <ref name="unsignedInt"/> + </attribute>
This says the slots attribute mandatory; is there ever a reason to allow it to be optional (defaulting to one slot, an all-or-none hotplug)?
+/** + * virDomainDefCheckUnsupportedMemoryHotplug: + * @def: domain definition + * + * Returns -1 if the domain definition would enable memory hotplug via the + * <maxMemory> tunable and reports an error. Otherwise returns 0. + */ +int +virDomainDefCheckUnsupportedMemoryHotplug(virDomainDefPtr def) +{ + /* memory hotplug tunables are not supported by this driver */ + if (def->mem.max_memory > 0 || def->mem.memory_slots > 0) {
Based on the XML, mem.memory_slots cannot be specified unless max_memory is also present (at least, assuming that you enforce that <maxMemory> be
= <memory>). But I guess it doesn't hurt to check both values.
@@ -12735,6 +12756,16 @@ virDomainDefParseXML(xmlDocPtr xml, &def->mem.cur_balloon, false, true) < 0) goto error;
+ if (virDomainParseMemory("./maxMemory[1]", NULL, ctxt, + &def->mem.max_memory, false, false) < 0) + goto error; + + if (virXPathUInt("string(./maxMemory[1]/@slots)", ctxt, &def->mem.memory_slots) == -2) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("Failed to parse memory slot count"));
At least this agrees with the RelaxNG that slots is mandatory, for now.
+ if ((def->mem.max_memory || def->mem.memory_slots) && + !(def->mem.max_memory && def->mem.memory_slots)) {
Shorter, but not necessarily easier to understand, as: if (!def->mem.max_memory != !def->mem.memory_slots) {
+ virReportError(VIR_ERR_XML_ERROR, "%s", + _("both maximum memory size and " + "memory slot count must be sepcified"));
s/sepcified/specified/ Overall, looks reasonable for now, although I want to get through the series first to see if anything else might be needed. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On Thu, Feb 05, 2015 at 20:54:11 -0700, Eric Blake wrote:
On 01/30/2015 06:20 AM, Peter Krempa wrote:
Add a XML element that will allow to specify maximum supportable memory
s/a XML/an XML/
and the count of memory slots to use with memory hotplug.
Might be nice to demonstrate that XML here in the commit message, not just in formatdomain.html.
To avoid possible confusion and misuse of the new element this patch also explicitly forbids the use of the maxMemory setting in individual drivers's post parse callbacks. This limitation will be lifted when the
s/drivers's/drivers'/
support will be implemented. --- docs/formatdomain.html.in | 19 +++++++++++ docs/schemas/domaincommon.rng | 8 +++++ src/bhyve/bhyve_domain.c | 4 +++ src/conf/domain_conf.c | 64 ++++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 7 ++++ src/libvirt_private.syms | 1 + src/libxl/libxl_domain.c | 5 +++ src/lxc/lxc_domain.c | 4 +++ src/openvz/openvz_driver.c | 11 +++++-- src/parallels/parallels_driver.c | 6 +++- src/phyp/phyp_driver.c | 6 +++- src/qemu/qemu_domain.c | 4 +++ src/uml/uml_driver.c | 6 +++- src/vbox/vbox_common.c | 6 +++- src/vmware/vmware_driver.c | 6 +++- src/vmx/vmx.c | 6 +++- src/xen/xen_driver.c | 4 +++ src/xenapi/xenapi_driver.c | 6 +++- tests/domainschemadata/maxMemory.xml | 19 +++++++++++ 19 files changed, 183 insertions(+), 9 deletions(-) create mode 100644 tests/domainschemadata/maxMemory.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index f8d5f89..12f7ede 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -664,6 +664,7 @@ <pre> <domain> ... + <maxMemory slots='123' unit='KiB'>1524288</maxMemory>
123 is unusual; the example would look more realistic with a power of 2.
<memory unit='KiB'>524288</memory> <currentMemory unit='KiB'>524288</currentMemory>
Hmm. Historically, <memory> was the maximum memory, then ballooning was introduced and <currentMemory> was added to show the live difference between the ballooned current value and the boot-up maximum.
But with the idea of hot-plug, I see where you are coming from - the balloon can only inflate or deflate up to the amount of memory currently plugged in, so <maxMemory> is the startup maximum, <memory> becomes the
yes, maxMemory is basically size of guest's address space
amount plugged in, and <currentMemory> reflects the balloon value (that
Now <memory> is a thing we need to clarify, there are two options: 1) <memory> will still determine the amount of startup memory thus will not include any memory added via "memory modules" or hotplug. Pros: - no change to semantics - no need to take care of changing the value when adding devices - the value will not change with hotplug or other operations Cons: - I'll have to add a way to express the "current maximum memory" - the memory amount with the ballon deflated 2) <memory> will become total of initial and added memory This will change semantics and require us to recalculate the totals every time. Pros: - you are able to see the total memory at any time - no need to introduce any new parameter for balloon setup - recalculating the total would actually fix the bug when you specify /domain/memory less than the sum of /domain/cpu/numa/cell/@memory : error: internal error: process exited while connecting to monitor: 2015-02-06T17:00:55.971851Z qemu-system-x86_64: total memory for NUMA nodes (0x40000000) should equal RAM size (0x100000) Cons: - we would need to calculate the total memory always (thus overwrite it's value by a sum of the NUMA node memory and memory devices memory - aplications would need to actually check for the change of the max memory
is, current <= memory <= max). So I guess this makes sense; it may be more interesting figuring out how to expose it all through virsh.
... @@ -697,6 +698,24 @@ <span class='since'><code>unit</code> since 0.9.11</span>, <span class='since'><code>dumpCore</code> since 0.10.2 (QEMU only)</span></dd> + <dt><code>maxMemory</code></dt> + <dd>The run time maximum memory allocation of the guest. The initial + memory specified by <code><memory></code> can be increased by + hot-plugging of memory to the limit specified by this element. + + The <code>unit</code> attribute behaves the same as for + <code>memory</code>. + + The <code>slots</code> attribute specifies the number of slots + available for adding memory to the guest. The bounds are hypervisor + specific. + + Note that due to alignment of the memory chunks added via memory + hotplug the full size allocation specified by this element may be + impossible to achieve.
Is a hypervisor free to reject requests that aren't aligned properly? With <memory>, we had the odd situation that we allowed the hypervisor to round requests up, so that live numbers were different than the original startup numbers; it might be easier to not repeat that.
+ <optional> + <element name="maxMemory"> + <ref name="scaledInteger"/> + <attribute name="slots"> + <ref name="unsignedInt"/> + </attribute>
This says the slots attribute mandatory; is there ever a reason to allow it to be optional (defaulting to one slot, an all-or-none hotplug)?
qemu enforces it that way, so I went for strictly all-or-none with the possibility to relax it once a different hypervisor will not enforce it
+/** + * virDomainDefCheckUnsupportedMemoryHotplug: + * @def: domain definition + * + * Returns -1 if the domain definition would enable memory hotplug via the + * <maxMemory> tunable and reports an error. Otherwise returns 0. + */ +int +virDomainDefCheckUnsupportedMemoryHotplug(virDomainDefPtr def) +{ + /* memory hotplug tunables are not supported by this driver */ + if (def->mem.max_memory > 0 || def->mem.memory_slots > 0) {
Based on the XML, mem.memory_slots cannot be specified unless max_memory is also present (at least, assuming that you enforce that <maxMemory> be
= <memory>). But I guess it doesn't hurt to check both values.
Hmm, yeah, the part after the logic or is dead code. Peter

On Fri, Feb 06, 2015 at 06:07:35PM +0100, Peter Krempa wrote:
On Thu, Feb 05, 2015 at 20:54:11 -0700, Eric Blake wrote:
On 01/30/2015 06:20 AM, Peter Krempa wrote:
Add a XML element that will allow to specify maximum supportable memory
s/a XML/an XML/
and the count of memory slots to use with memory hotplug.
Might be nice to demonstrate that XML here in the commit message, not just in formatdomain.html.
To avoid possible confusion and misuse of the new element this patch also explicitly forbids the use of the maxMemory setting in individual drivers's post parse callbacks. This limitation will be lifted when the
s/drivers's/drivers'/
support will be implemented. --- docs/formatdomain.html.in | 19 +++++++++++ docs/schemas/domaincommon.rng | 8 +++++ src/bhyve/bhyve_domain.c | 4 +++ src/conf/domain_conf.c | 64 ++++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 7 ++++ src/libvirt_private.syms | 1 + src/libxl/libxl_domain.c | 5 +++ src/lxc/lxc_domain.c | 4 +++ src/openvz/openvz_driver.c | 11 +++++-- src/parallels/parallels_driver.c | 6 +++- src/phyp/phyp_driver.c | 6 +++- src/qemu/qemu_domain.c | 4 +++ src/uml/uml_driver.c | 6 +++- src/vbox/vbox_common.c | 6 +++- src/vmware/vmware_driver.c | 6 +++- src/vmx/vmx.c | 6 +++- src/xen/xen_driver.c | 4 +++ src/xenapi/xenapi_driver.c | 6 +++- tests/domainschemadata/maxMemory.xml | 19 +++++++++++ 19 files changed, 183 insertions(+), 9 deletions(-) create mode 100644 tests/domainschemadata/maxMemory.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index f8d5f89..12f7ede 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -664,6 +664,7 @@ <pre> <domain> ... + <maxMemory slots='123' unit='KiB'>1524288</maxMemory>
123 is unusual; the example would look more realistic with a power of 2.
<memory unit='KiB'>524288</memory> <currentMemory unit='KiB'>524288</currentMemory>
Hmm. Historically, <memory> was the maximum memory, then ballooning was introduced and <currentMemory> was added to show the live difference between the ballooned current value and the boot-up maximum.
But with the idea of hot-plug, I see where you are coming from - the balloon can only inflate or deflate up to the amount of memory currently plugged in, so <maxMemory> is the startup maximum, <memory> becomes the
yes, maxMemory is basically size of guest's address space
amount plugged in, and <currentMemory> reflects the balloon value (that
Now <memory> is a thing we need to clarify, there are two options:
1) <memory> will still determine the amount of startup memory thus will not include any memory added via "memory modules" or hotplug.
Pros: - no change to semantics - no need to take care of changing the value when adding devices - the value will not change with hotplug or other operations
Cons: - I'll have to add a way to express the "current maximum memory" - the memory amount with the ballon deflated
2) <memory> will become total of initial and added memory
This will change semantics and require us to recalculate the totals every time.
Pros: - you are able to see the total memory at any time - no need to introduce any new parameter for balloon setup - recalculating the total would actually fix the bug when you specify /domain/memory less than the sum of /domain/cpu/numa/cell/@memory : error: internal error: process exited while connecting to monitor: 2015-02-06T17:00:55.971851Z qemu-system-x86_64: total memory for NUMA nodes (0x40000000) should equal RAM size (0x100000)
Cons: - we would need to calculate the total memory always (thus overwrite it's value by a sum of the NUMA node memory and memory devices memory - aplications would need to actually check for the change of the max memory
I think we should pick '2' - <memory> should always reflect the amount of RAM currently plugged into the guest - ie sum of all DIMMS' capacities. If apps choose to use <maxMemory> and hotplug dimms, they'll easily know to re-read <memory> periodically. Likewise the virDomainGetInfo and virDomainGetMaxMemory API will need to be updated too. Damn our API naming is annoying in this respect
is, current <= memory <= max). So I guess this makes sense; it may be more interesting figuring out how to expose it all through virsh.
... @@ -697,6 +698,24 @@ <span class='since'><code>unit</code> since 0.9.11</span>, <span class='since'><code>dumpCore</code> since 0.10.2 (QEMU only)</span></dd> + <dt><code>maxMemory</code></dt> + <dd>The run time maximum memory allocation of the guest. The initial + memory specified by <code><memory></code> can be increased by + hot-plugging of memory to the limit specified by this element. + + The <code>unit</code> attribute behaves the same as for + <code>memory</code>. + + The <code>slots</code> attribute specifies the number of slots + available for adding memory to the guest. The bounds are hypervisor + specific. + + Note that due to alignment of the memory chunks added via memory + hotplug the full size allocation specified by this element may be + impossible to achieve.
Is a hypervisor free to reject requests that aren't aligned properly? With <memory>, we had the odd situation that we allowed the hypervisor to round requests up, so that live numbers were different than the original startup numbers; it might be easier to not repeat that.
+ <optional> + <element name="maxMemory"> + <ref name="scaledInteger"/> + <attribute name="slots"> + <ref name="unsignedInt"/> + </attribute>
This says the slots attribute mandatory; is there ever a reason to allow it to be optional (defaulting to one slot, an all-or-none hotplug)?
qemu enforces it that way, so I went for strictly all-or-none with the possibility to relax it once a different hypervisor will not enforce it
While QEMU enforces that you always provide slots, is there any value in just defaulting in the driver. eg if ommitted in the XML, the number of slots just gets filled in by the driver with a sensible value for that driver. eg we could default to adding 16 slots. Guess it isn't a big deal since apps already have to add a new <maxMemory> element. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

To enable memory hotplug the maximum memory size and slot count need to be specified. As qemu supports now other units than mebibytes when specifying memory, use the new interface in this case. --- docs/formatdomain.html.in | 2 +- src/qemu/qemu_command.c | 41 ++++++++++++++++++---- src/qemu/qemu_domain.c | 4 --- .../qemuxml2argv-memory-hotplug-nonuma.xml | 22 ++++++++++++ .../qemuxml2argv-memory-hotplug.args | 6 ++++ .../qemuxml2argv-memory-hotplug.xml | 34 ++++++++++++++++++ tests/qemuxml2argvtest.c | 4 +++ tests/qemuxml2xmltest.c | 3 ++ 8 files changed, 105 insertions(+), 11 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nonuma.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.xml diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 12f7ede..c059195 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -713,7 +713,7 @@ Note that due to alignment of the memory chunks added via memory hotplug the full size allocation specified by this element may be impossible to achieve. - <span class='since'>Since 1.2.12</span> + <span class='since'>Since 1.2.12 supported by the QEMU driver.</span> </dd> <dt><code>currentMemory</code></dt> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index ec4f35b..5820fb5 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -8230,14 +8230,43 @@ qemuBuildCommandLine(virConnectPtr conn, if (qemuBuildDomainLoaderCommandLine(cmd, def, qemuCaps) < 0) goto error; - /* Set '-m MB' based on maxmem, because the lower 'memory' limit - * is set post-startup using the balloon driver. If balloon driver - * is not supported, then they're out of luck anyway. Update the - * XML to reflect our rounding. - */ virCommandAddArg(cmd, "-m"); + /* Update config to reflect our rounding of memory size as we + * specify it in mebibytes. */ def->mem.max_balloon = VIR_DIV_UP(def->mem.max_balloon, 1024) * 1024; - virCommandAddArgFormat(cmd, "%llu", def->mem.max_balloon / 1024); + + if (def->mem.max_memory || def->mem.memory_slots) { + /* round this to the nearest mebibyte, qemu requires at least alignment + * in 4KiB (page size) blocks */ + def->mem.max_memory = VIR_DIV_UP(def->mem.max_memory, 1024) * 1024; + + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_PC_DIMM)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("memory hotplug isn't supported by this QEMU binary")); + goto error; + } + + /* due to guest support, qemu would silently enable NUMA with one node + * once the memory hotplug backend is enabled. To avoid possible + * confusion we will enforce user originated numa configuration along + * with memory hotplug. */ + if (!def->cpu || def->cpu->ncells == 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("At least one numa node has to be configured when " + "enabling memory hotplug")); + goto error; + } + + /* Use the 'k' suffix to let qemu handle the units */ + virCommandAddArgFormat(cmd, "size=%lluk,slots=%u,maxmem=%lluk", + def->mem.max_balloon, + def->mem.memory_slots, + def->mem.max_memory); + + } else { + virCommandAddArgFormat(cmd, "%llu", def->mem.max_balloon / 1024); + } + if (def->mem.nhugepages && (!def->cpu || !def->cpu->ncells)) { const long system_page_size = sysconf(_SC_PAGESIZE) / 1024; char *mem_path = NULL; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 085bc81..99c46d4 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1040,10 +1040,6 @@ qemuDomainDefPostParse(virDomainDefPtr def, VIR_DOMAIN_INPUT_BUS_USB) < 0) return -1; - /* memory hotplug tunables are not supported by this driver */ - if (virDomainDefCheckUnsupportedMemoryHotplug(def) < 0) - return -1; - return 0; } diff --git a/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nonuma.xml b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nonuma.xml new file mode 100644 index 0000000..5c807ed --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nonuma.xml @@ -0,0 +1,22 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <maxMemory slots='9' unit='KiB'>1233456789</maxMemory> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <controller type='usb' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.args b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.args new file mode 100644 index 0000000..6c26586 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.args @@ -0,0 +1,6 @@ +LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \ +/usr/bin/qemu -S -M pc \ +-m size=219136k,slots=16,maxmem=1099511627776k \ +-smp 2 -numa node,nodeid=0,cpus=0-1,mem=214 \ +-nographic -monitor unix:/tmp/test-monitor,server,nowait -no-acpi -boot c \ +-usb -hda /dev/HostVG/QEMUGuest1 -net none -serial none -parallel none diff --git a/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.xml b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.xml new file mode 100644 index 0000000..567a662 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.xml @@ -0,0 +1,34 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <maxMemory slots='16' unit='KiB'>1099511627776</maxMemory> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static' cpuset='0-1'>2</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <cpu> + <topology sockets='2' cores='1' threads='1'/> + <numa> + <cell id='0' cpus='0-1' memory='219136' unit='KiB'/> + </numa> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <source dev='/dev/HostVG/QEMUGuest1'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='ide' index='0'/> + <controller type='usb' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 89afa81..2bc0ff3 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1517,6 +1517,10 @@ mymain(void) DO_TEST_PARSE_ERROR("shmem-msi-only", NONE); DO_TEST("cpu-host-passthrough-features", QEMU_CAPS_KVM, QEMU_CAPS_CPU_HOST); + DO_TEST_FAILURE("memory-hotplug-nonuma", QEMU_CAPS_DEVICE_PC_DIMM); + DO_TEST_FAILURE("memory-hotplug", 0); + DO_TEST("memory-hotplug", QEMU_CAPS_DEVICE_PC_DIMM, QEMU_CAPS_NUMA); + virObjectUnref(driver.config); virObjectUnref(driver.caps); virObjectUnref(driver.xmlopt); diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index f12983c..e840d63 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -418,6 +418,9 @@ mymain(void) DO_TEST("shmem"); DO_TEST("smbios"); + DO_TEST("memory-hotplug"); + DO_TEST("memory-hotplug-nonuma"); + virObjectUnref(driver.caps); virObjectUnref(driver.xmlopt); -- 2.2.2

On 01/30/2015 08:21 AM, Peter Krempa wrote:
To enable memory hotplug the maximum memory size and slot count need to be specified. As qemu supports now other units than mebibytes when specifying memory, use the new interface in this case. --- docs/formatdomain.html.in | 2 +- src/qemu/qemu_command.c | 41 ++++++++++++++++++---- src/qemu/qemu_domain.c | 4 --- .../qemuxml2argv-memory-hotplug-nonuma.xml | 22 ++++++++++++ .../qemuxml2argv-memory-hotplug.args | 6 ++++ .../qemuxml2argv-memory-hotplug.xml | 34 ++++++++++++++++++ tests/qemuxml2argvtest.c | 4 +++ tests/qemuxml2xmltest.c | 3 ++ 8 files changed, 105 insertions(+), 11 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nonuma.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 12f7ede..c059195 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -713,7 +713,7 @@ Note that due to alignment of the memory chunks added via memory hotplug the full size allocation specified by this element may be impossible to achieve. - <span class='since'>Since 1.2.12</span> + <span class='since'>Since 1.2.12 supported by the QEMU driver.</span> </dd>
<dt><code>currentMemory</code></dt>
My git am -3 wasn't happy with this change - didn't want to apply it: fatal: sha1 information is lacking or useless (docs/formatdomain.html.in). I just removed it and kept going... John

On Fri, Jan 30, 2015 at 02:21:00PM +0100, Peter Krempa wrote:
To enable memory hotplug the maximum memory size and slot count need to be specified. As qemu supports now other units than mebibytes when specifying memory, use the new interface in this case. --- docs/formatdomain.html.in | 2 +- src/qemu/qemu_command.c | 41 ++++++++++++++++++---- src/qemu/qemu_domain.c | 4 --- .../qemuxml2argv-memory-hotplug-nonuma.xml | 22 ++++++++++++ .../qemuxml2argv-memory-hotplug.args | 6 ++++ .../qemuxml2argv-memory-hotplug.xml | 34 ++++++++++++++++++ tests/qemuxml2argvtest.c | 4 +++ tests/qemuxml2xmltest.c | 3 ++ 8 files changed, 105 insertions(+), 11 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nonuma.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.xml
ACK Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

ACPI Dimm devices are described by the slot and base address. Add a new address type to be able to describe such address. --- docs/schemas/domaincommon.rng | 18 +++++++++++ src/conf/domain_conf.c | 69 ++++++++++++++++++++++++++++++++++++++++++- src/conf/domain_conf.h | 9 ++++++ 3 files changed, 95 insertions(+), 1 deletion(-) diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 2dbfbec..d7e27c2 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -3944,6 +3944,18 @@ </attribute> </optional> </define> + <define name="acpidimmaddress"> + <optional> + <attribute name="slot"> + <ref name="unsignedInt"/> + </attribute> + </optional> + <optional> + <attribute name="base"> + <ref name="hexuint"/> + </attribute> + </optional> + </define> <define name="devices"> <element name="devices"> <interleave> @@ -4358,6 +4370,12 @@ </attribute> <ref name="isaaddress"/> </group> + <group> + <attribute name="type"> + <value>acpi-dimm</value> + </attribute> + <ref name="acpidimmaddress"/> + </group> </choice> </element> </define> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 0cfc638..0b9fb06 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -249,7 +249,8 @@ VIR_ENUM_IMPL(virDomainDeviceAddress, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST, "virtio-s390", "ccw", "virtio-mmio", - "isa") + "isa", + "acpi-dimm") VIR_ENUM_IMPL(virDomainDiskDevice, VIR_DOMAIN_DISK_DEVICE_LAST, "disk", @@ -3484,6 +3485,12 @@ virDomainDeviceInfoFormat(virBufferPtr buf, virBufferAsprintf(buf, " irq='0x%x'", info->addr.isa.irq); break; + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ACPI_DIMM: + virBufferAsprintf(buf, " slot='%u'", info->addr.acpiDimm.slot); + virBufferAsprintf(buf, " base='0x%llx'", info->addr.acpiDimm.base); + + break; + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: @@ -3853,6 +3860,41 @@ virDomainDeviceISAAddressParseXML(xmlNodePtr node, return ret; } + +static int +virDomainDeviceACPIDimmAddressParseXML(xmlNodePtr node, + virDomainDeviceACPIDimmAddressPtr addr) +{ + int ret = -1; + char *tmp = NULL; + + if (!(tmp = virXMLPropString(node, "slot")) || + virStrToLong_uip(tmp, NULL, 10, &addr->slot) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("invalid or missing acpi-dimm slot id '%s'"), + NULLSTR(tmp)); + goto cleanup; + } + VIR_FREE(tmp); + + if (!(tmp = virXMLPropString(node, "base")) || + virStrToLong_ullp(tmp, NULL, 16, &addr->base) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("invalid or missing acpi-dimm base address '%s'"), + NULLSTR(tmp)); + goto cleanup; + } + VIR_FREE(tmp); + + ret = 0; + + cleanup: + VIR_FREE(tmp); + + return ret; +} + + /* Parse the XML definition for a device address * @param node XML nodeset to parse for device address definition */ @@ -3994,6 +4036,11 @@ virDomainDeviceInfoParseXML(xmlNodePtr node, _("virtio-s390 bus doesn't have an address")); goto cleanup; + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ACPI_DIMM: + if (virDomainDeviceACPIDimmAddressParseXML(address, &info->addr.acpiDimm) < 0) + goto cleanup; + break; + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: break; @@ -14842,6 +14889,26 @@ virDomainDeviceInfoCheckABIStability(virDomainDeviceInfoPtr src, } break; + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ACPI_DIMM: + if (src->addr.acpiDimm.slot != dst->addr.acpiDimm.slot) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target device acpi dimm slot %d does not match " + "source %d"), + dst->addr.acpiDimm.slot, + src->addr.acpiDimm.slot); + return false; + } + + if (src->addr.acpiDimm.base != dst->addr.acpiDimm.base) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target device acpi dimm base addres '%llx' does " + "not match source '%llx'"), + dst->addr.acpiDimm.base, + src->addr.acpiDimm.base); + return false; + } + break; + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_USB: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_SPAPRVIO: case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390: diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 97be518..f01c2ab 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -234,6 +234,7 @@ typedef enum { VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ISA, + VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ACPI_DIMM, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST } virDomainDeviceAddressType; @@ -309,6 +310,13 @@ struct _virDomainDeviceISAAddress { unsigned int irq; }; +typedef struct _virDomainDeviceACPIDimmAddress virDomainDeviceACPIDimmAddress; +typedef virDomainDeviceACPIDimmAddress *virDomainDeviceACPIDimmAddressPtr; +struct _virDomainDeviceACPIDimmAddress { + unsigned int slot; + unsigned long long base; +}; + typedef struct _virDomainDeviceInfo virDomainDeviceInfo; typedef virDomainDeviceInfo *virDomainDeviceInfoPtr; struct _virDomainDeviceInfo { @@ -327,6 +335,7 @@ struct _virDomainDeviceInfo { virDomainDeviceSpaprVioAddress spaprvio; virDomainDeviceCCWAddress ccw; virDomainDeviceISAAddress isa; + virDomainDeviceACPIDimmAddress acpiDimm; } addr; int mastertype; union { -- 2.2.2

On Fri, Jan 30, 2015 at 02:21:01PM +0100, Peter Krempa wrote:
ACPI Dimm devices are described by the slot and base address. Add a new address type to be able to describe such address.
} virDomainDeviceAddressType; @@ -309,6 +310,13 @@ struct _virDomainDeviceISAAddress { unsigned int irq; };
+typedef struct _virDomainDeviceACPIDimmAddress virDomainDeviceACPIDimmAddress; +typedef virDomainDeviceACPIDimmAddress *virDomainDeviceACPIDimmAddressPtr; +struct _virDomainDeviceACPIDimmAddress { + unsigned int slot; + unsigned long long base; +};
How are values for 'base' determined by apps filling in these addresses, or by libvirt ? Are there restrictions on valid values. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

WIP: TODO: docs Also forbid the new device in post parse callback in all driver that implement the callback to warn users right away that the device is not supported with their hypervisor. --- docs/schemas/domaincommon.rng | 50 ++++ src/bhyve/bhyve_domain.c | 5 +- src/conf/domain_conf.c | 317 ++++++++++++++++++++- src/conf/domain_conf.h | 33 +++ src/libvirt_private.syms | 2 + src/libxl/libxl_domain.c | 3 + src/lxc/lxc_domain.c | 4 + src/openvz/openvz_driver.c | 3 + src/qemu/qemu_domain.c | 3 + src/qemu/qemu_driver.c | 13 + src/qemu/qemu_hotplug.c | 3 + src/uml/uml_driver.c | 3 + src/xen/xen_driver.c | 3 + src/xenapi/xenapi_driver.c | 3 + .../qemuxml2argv-memory-hotplug-dimm.xml | 47 +++ 15 files changed, 490 insertions(+), 2 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-dimm.xml diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index d7e27c2..6873911 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -3985,6 +3985,7 @@ <ref name="rng"/> <ref name="tpm"/> <ref name="shmem"/> + <ref name="memorydev"/> </choice> </zeroOrMore> <optional> @@ -4396,6 +4397,55 @@ </element> </define> + <define name="memorydev"> + <element name="memory"> + <attribute name="model"> + <choice> + <value>acpi-dimm</value> + </choice> + </attribute> + <interleave> + <optional> + <ref name="memorydev-source"/> + </optional> + <ref name="memorydev-target"/> + <optional> + <ref name="address"/> + </optional> + </interleave> + </element> + </define> + + <define name="memorydev-source"> + <element name="source"> + <interleave> + <optional> + <element name="pagesize"> + <ref name="scaledInteger"/> + </element> + </optional> + <optional> + <element name="nodemask"> + <ref name="cpuset"/> + </element> + </optional> + </interleave> + </element> + </define> + + <define name="memorydev-target"> + <element name="target"> + <interleave> + <element name="size"> + <ref name="scaledInteger"/> + </element> + <element name="node"> + <ref name="unsignedInt"/> + </element> + </interleave> + </element> + </define> + <define name="rng"> <element name="rng"> <attribute name="model"> diff --git a/src/bhyve/bhyve_domain.c b/src/bhyve/bhyve_domain.c index 25ef852..890963e 100644 --- a/src/bhyve/bhyve_domain.c +++ b/src/bhyve/bhyve_domain.c @@ -75,11 +75,14 @@ bhyveDomainDefPostParse(virDomainDefPtr def, } static int -bhyveDomainDeviceDefPostParse(virDomainDeviceDefPtr dev ATTRIBUTE_UNUSED, +bhyveDomainDeviceDefPostParse(virDomainDeviceDefPtr dev, const virDomainDef *def ATTRIBUTE_UNUSED, virCapsPtr caps ATTRIBUTE_UNUSED, void *opaque ATTRIBUTE_UNUSED) { + if (virDomainDeviceDefCheckUnsupportedMemoryDevice(dev) < 0) + return -1; + return 0; } diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 0b9fb06..298b574 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -236,7 +236,8 @@ VIR_ENUM_IMPL(virDomainDevice, VIR_DOMAIN_DEVICE_LAST, "rng", "shmem", "tpm", - "panic") + "panic", + "memory") VIR_ENUM_IMPL(virDomainDeviceAddress, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST, "none", @@ -779,6 +780,12 @@ VIR_ENUM_DECL(virDomainBlockJob) VIR_ENUM_IMPL(virDomainBlockJob, VIR_DOMAIN_BLOCK_JOB_TYPE_LAST, "", "", "copy", "", "active-commit") +VIR_ENUM_IMPL(virDomainMemoryModel, VIR_DOMAIN_MEMORY_MODEL_LAST, + "", "acpi-dimm") + +#define VIR_DOMAIN_XML_WRITE_FLAGS VIR_DOMAIN_XML_SECURE +#define VIR_DOMAIN_XML_READ_FLAGS VIR_DOMAIN_XML_INACTIVE + static virClassPtr virDomainObjClass; static virClassPtr virDomainObjListClass; static virClassPtr virDomainXMLOptionClass; @@ -1003,6 +1010,27 @@ virDomainDefCheckUnsupportedMemoryHotplug(virDomainDefPtr def) } +/** + * virDomainDeviceDefCheckUnsupportedMemoryDevice: + * @dev: device definition + * + * Returns -1 if the device definition describes a memory device and reports an + * error. Otherwise returns 0. + */ +int +virDomainDeviceDefCheckUnsupportedMemoryDevice(virDomainDeviceDefPtr dev) +{ + /* This driver doesn't yet know how to handle memory devices */ + if (dev->type == VIR_DOMAIN_DEVICE_MEMORY) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("memory devices are not supported by this driver")); + return -1; + } + + return 0; +} + + static void virDomainObjListDataFree(void *payload, const void *name ATTRIBUTE_UNUSED) { @@ -1934,6 +1962,15 @@ void virDomainRedirFilterDefFree(virDomainRedirFilterDefPtr def) VIR_FREE(def); } +void virDomainMemoryDefFree(virDomainMemoryDefPtr def) +{ + if (!def) + return; + + virDomainDeviceInfoClear(&def->info); + VIR_FREE(def); +} + void virDomainDeviceDefFree(virDomainDeviceDefPtr def) { if (!def) @@ -2003,6 +2040,9 @@ void virDomainDeviceDefFree(virDomainDeviceDefPtr def) case VIR_DOMAIN_DEVICE_PANIC: virDomainPanicDefFree(def->data.panic); break; + case VIR_DOMAIN_DEVICE_MEMORY: + virDomainMemoryDefFree(def->data.memory); + break; case VIR_DOMAIN_DEVICE_LAST: case VIR_DOMAIN_DEVICE_NONE: break; @@ -2200,6 +2240,10 @@ void virDomainDefFree(virDomainDefPtr def) virDomainRNGDefFree(def->rngs[i]); VIR_FREE(def->rngs); + for (i = 0; i < def->nmems; i++) + virDomainMemoryDefFree(def->mems[i]); + VIR_FREE(def->mems); + virDomainTPMDefFree(def->tpm); virDomainPanicDefFree(def->panic); @@ -2709,6 +2753,8 @@ virDomainDeviceGetInfo(virDomainDeviceDefPtr device) return &device->data.tpm->info; case VIR_DOMAIN_DEVICE_PANIC: return &device->data.panic->info; + case VIR_DOMAIN_DEVICE_MEMORY: + return &device->data.memory->info; /* The following devices do not contain virDomainDeviceInfo */ case VIR_DOMAIN_DEVICE_LEASE: @@ -2940,6 +2986,13 @@ virDomainDeviceInfoIterateInternal(virDomainDefPtr def, return -1; } + device.type = VIR_DOMAIN_DEVICE_MEMORY; + for (i = 0; i < def->nmems; i++) { + device.data.memory = def->mems[i]; + if (cb(def, &device, &def->mems[i]->info, opaque) < 0) + return -1; + } + /* Coverity is not very happy with this - all dead_error_condition */ #if !STATIC_ANALYSIS /* This switch statement is here to trigger compiler warning when adding @@ -2972,6 +3025,7 @@ virDomainDeviceInfoIterateInternal(virDomainDefPtr def, case VIR_DOMAIN_DEVICE_PANIC: case VIR_DOMAIN_DEVICE_LAST: case VIR_DOMAIN_DEVICE_RNG: + case VIR_DOMAIN_DEVICE_MEMORY: break; } #endif @@ -11109,6 +11163,119 @@ virDomainPMStateParseXML(xmlXPathContextPtr ctxt, return ret; } + +static int +virDomainMemorySourceDefParseXML(xmlNodePtr node, + xmlXPathContextPtr ctxt, + virDomainMemoryDefPtr def) +{ + int ret = -1; + char *nodemask = NULL; + xmlNodePtr save = ctxt->node; + ctxt->node = node; + + if (virDomainParseMemory("./pagesize", "./pagesize/@unit", ctxt, + &def->pagesize, false, false) < 0) + goto cleanup; + + if ((nodemask = virXPathString("string(./nodemask)", ctxt))) { + if (virBitmapParse(nodemask, 0, &def->sourceNodes, + VIR_DOMAIN_CPUMASK_LEN) < 0) + goto cleanup; + } + + ret = 0; + + cleanup: + VIR_FREE(nodemask); + ctxt->node = save; + return ret; +} + + +static int +virDomainMemoryTargetDefParseXML(xmlNodePtr node, + xmlXPathContextPtr ctxt, + virDomainMemoryDefPtr def) +{ + int ret = -1; + xmlNodePtr save = ctxt->node; + ctxt->node = node; + + if (virXPathInt("string(./node)", ctxt, &def->targetNode) < 0) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("invalid or missing value of memory device node")); + goto cleanup; + } + + if (virDomainParseMemory("./size", "./size/@unit", ctxt, + &def->size, true, false) < 0) + goto cleanup; + + ret = 0; + + cleanup: + ctxt->node = save; + return ret; +} + + +static virDomainMemoryDefPtr +virDomainMemoryDefParseXML(xmlNodePtr memdevNode, + xmlXPathContextPtr ctxt, + unsigned int flags) +{ + char *tmp = NULL; + xmlNodePtr save = ctxt->node; + xmlNodePtr node; + virDomainMemoryDefPtr def; + + ctxt->node = memdevNode; + + if (VIR_ALLOC(def) < 0) + return NULL; + + if (!(tmp = virXMLPropString(memdevNode, "model"))) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing memory model")); + goto error; + } + + if ((def->model = virDomainMemoryModelTypeFromString(tmp)) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("invalid memory model '%s'"), tmp); + goto error; + } + VIR_FREE(tmp); + + /* source */ + if ((node = virXPathNode("./source", ctxt)) && + virDomainMemorySourceDefParseXML(node, ctxt, def) < 0) + goto error; + + /* target */ + if (!(node = virXPathNode("./target", ctxt))) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing <target> element for <memory> device")); + goto error; + } + + if (virDomainMemoryTargetDefParseXML(node, ctxt, def) < 0) + goto error; + + if (virDomainDeviceInfoParseXML(memdevNode, NULL, &def->info, flags) < 0) + goto error; + + return def; + + error: + VIR_FREE(tmp); + virDomainMemoryDefFree(def); + ctxt->node = save; + return NULL; +} + + virDomainDeviceDefPtr virDomainDeviceDefParse(const char *xmlStr, const virDomainDef *def, @@ -11243,6 +11410,10 @@ virDomainDeviceDefParse(const char *xmlStr, if (!(dev->data.panic = virDomainPanicDefParseXML(node))) goto error; break; + case VIR_DOMAIN_DEVICE_MEMORY: + if (!(dev->data.memory = virDomainMemoryDefParseXML(node, ctxt, flags))) + goto error; + break; case VIR_DOMAIN_DEVICE_NONE: case VIR_DOMAIN_DEVICE_LAST: break; @@ -14438,6 +14609,23 @@ virDomainDefParseXML(xmlDocPtr xml, ctxt->node = node; VIR_FREE(nodes); + /* analysis of memory devices */ + if ((n = virXPathNodeSet("./devices/memory", ctxt, &nodes)) < 0) + goto error; + if (n && VIR_ALLOC_N(def->mems, n) < 0) + goto error; + + for (i = 0; i < n; i++) { + virDomainMemoryDefPtr mem = virDomainMemoryDefParseXML(nodes[i], + ctxt, + flags); + if (!mem) + goto error; + + def->mems[def->nmems++] = mem; + } + VIR_FREE(nodes); + /* analysis of the user namespace mapping */ if ((n = virXPathNodeSet("./idmap/uid", ctxt, &nodes)) < 0) goto error; @@ -15677,6 +15865,39 @@ virDomainPanicDefCheckABIStability(virDomainPanicDefPtr src, return virDomainDeviceInfoCheckABIStability(&src->info, &dst->info); } +static bool +virDomainMemoryDefCheckABIStability(virDomainMemoryDefPtr src, + virDomainMemoryDefPtr dst) +{ + if (src->model != dst->model) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target memory device model '%s' " + "doesn't match source model '%s'"), + virDomainMemoryModelTypeToString(dst->model), + virDomainMemoryModelTypeToString(src->model)); + return false; + } + + if (src->targetNode != dst->targetNode) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target memory device targetNode '%d' " + "doesn't match source targetNode '%d'"), + dst->targetNode, src->targetNode); + return false; + } + + if (src->size != dst->size) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target memory device size '%llu' doesn't match " + "source memory device size '%llu'"), + dst->size, src->size); + return false; + } + + return virDomainDeviceInfoCheckABIStability(&src->info, &dst->info); +} + + /* This compares two configurations and looks for any differences * which will affect the guest ABI. This is primarily to allow * validation of custom XML config passed in during migration @@ -16114,6 +16335,18 @@ virDomainDefCheckABIStability(virDomainDefPtr src, goto error; } + if (src->nmems != dst->nmems) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target domain memory device count %zu " + "does not match source %zu"), dst->nmems, src->nmems); + goto error; + } + + for (i = 0; i < src->nmems; i++) { + if (!virDomainMemoryDefCheckABIStability(src->mems[i], dst->mems[i])) + goto error; + } + /* Coverity is not very happy with this - all dead_error_condition */ #if !STATIC_ANALYSIS /* This switch statement is here to trigger compiler warning when adding @@ -16145,6 +16378,7 @@ virDomainDefCheckABIStability(virDomainDefPtr src, case VIR_DOMAIN_DEVICE_TPM: case VIR_DOMAIN_DEVICE_PANIC: case VIR_DOMAIN_DEVICE_SHMEM: + case VIR_DOMAIN_DEVICE_MEMORY: break; } #endif @@ -18616,6 +18850,81 @@ virDomainRNGDefFree(virDomainRNGDefPtr def) VIR_FREE(def); } + +static int +virDomainMemorySourceDefFormat(virBufferPtr buf, + virDomainMemoryDefPtr def) +{ + char *bitmap = NULL; + int ret = -1; + + if (!def->pagesize && !def->sourceNodes) + return 0; + + virBufferAddLit(buf, "<source>\n"); + virBufferAdjustIndent(buf, 2); + + if (def->sourceNodes) { + if (!(bitmap = virBitmapFormat(def->sourceNodes))) + goto cleanup; + + virBufferAsprintf(buf, "<nodemask>%s</nodemask>\n", bitmap); + } + + if (def->pagesize) + virBufferAsprintf(buf, "<pagesize unit='KiB'>%llu</pagesize>\n", + def->pagesize); + + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</source>\n"); + + ret = 0; + + cleanup: + VIR_FREE(bitmap); + return ret; +} + + +static void +virDomainMemoryTargetDefFormat(virBufferPtr buf, + virDomainMemoryDefPtr def) +{ + virBufferAddLit(buf, "<target>\n"); + virBufferAdjustIndent(buf, 2); + + virBufferAsprintf(buf, "<size unit='KiB'>%llu</size>\n", def->size); + virBufferAsprintf(buf, "<node>%d</node>\n", def->targetNode); + + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</target>\n"); +} + +static int +virDomainMemoryDefFormat(virBufferPtr buf, + virDomainMemoryDefPtr def, + unsigned int flags) +{ + const char *model = virDomainMemoryModelTypeToString(def->model); + + virBufferAsprintf(buf, "<memory model='%s'>\n", model); + virBufferAdjustIndent(buf, 2); + + if (virDomainMemorySourceDefFormat(buf, def) < 0) + return -1; + + virDomainMemoryTargetDefFormat(buf, def); + + if (virDomainDeviceInfoNeedsFormat(&def->info, flags)) { + if (virDomainDeviceInfoFormat(buf, &def->info, flags) < 0) + return -1; + } + + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</memory>\n"); + return 0; +} + static void virDomainVideoAccelDefFormat(virBufferPtr buf, virDomainVideoAccelDefPtr def) @@ -20168,6 +20477,10 @@ virDomainDefFormatInternal(virDomainDefPtr def, if (virDomainShmemDefFormat(buf, def->shmems[n], flags) < 0) goto error; + for (n = 0; n < def->nmems; n++) + if (virDomainMemoryDefFormat(buf, def->mems[n], flags) < 0) + goto error; + virBufferAdjustIndent(buf, -2); virBufferAddLit(buf, "</devices>\n"); @@ -21570,6 +21883,8 @@ virDomainDeviceDefCopy(virDomainDeviceDefPtr src, break; case VIR_DOMAIN_DEVICE_PANIC: rc = virDomainPanicDefFormat(&buf, src->data.panic); + case VIR_DOMAIN_DEVICE_MEMORY: + rc = virDomainMemoryDefFormat(&buf, src->data.memory, flags); break; case VIR_DOMAIN_DEVICE_NONE: case VIR_DOMAIN_DEVICE_SMARTCARD: diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index f01c2ab..18cbe45 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -132,6 +132,9 @@ typedef virDomainIdMapDef *virDomainIdMapDefPtr; typedef struct _virDomainPanicDef virDomainPanicDef; typedef virDomainPanicDef *virDomainPanicDefPtr; +typedef struct _virDomainMemoryDef virDomainMemoryDef; +typedef virDomainMemoryDef *virDomainMemoryDefPtr; + /* forward declarations virDomainChrSourceDef, required by * virDomainNetDef */ @@ -168,6 +171,7 @@ typedef enum { VIR_DOMAIN_DEVICE_SHMEM, VIR_DOMAIN_DEVICE_TPM, VIR_DOMAIN_DEVICE_PANIC, + VIR_DOMAIN_DEVICE_MEMORY, VIR_DOMAIN_DEVICE_LAST } virDomainDeviceType; @@ -198,6 +202,7 @@ struct _virDomainDeviceDef { virDomainShmemDefPtr shmem; virDomainTPMDefPtr tpm; virDomainPanicDefPtr panic; + virDomainMemoryDefPtr memory; } data; }; @@ -1954,6 +1959,28 @@ struct _virDomainRNGDef { virDomainDeviceInfo info; }; +typedef enum { + VIR_DOMAIN_MEMORY_MODEL_NONE, + VIR_DOMAIN_MEMORY_MODEL_ACPI_DIMM, /* dimm hotpluggable memory device */ + + VIR_DOMAIN_MEMORY_MODEL_LAST +} virDomainMemoryModel; + +struct _virDomainMemoryDef { + /* source */ + virBitmapPtr sourceNodes; + unsigned long long pagesize; /* kibibytes */ + + /* target */ + int model; /* virDomainMemoryModel */ + int targetNode; + unsigned long long size; /* kibibytes */ + + virDomainDeviceInfo info; +}; + +void virDomainMemoryDefFree(virDomainMemoryDefPtr def); + struct _virDomainIdMapEntry { unsigned int start; unsigned int target; @@ -2169,6 +2196,9 @@ struct _virDomainDef { size_t nshmems; virDomainShmemDefPtr *shmems; + size_t nmems; + virDomainMemoryDefPtr *mems; + /* Only 1 */ virDomainWatchdogDefPtr watchdog; virDomainMemballoonDefPtr memballoon; @@ -2327,6 +2357,7 @@ bool virDomainObjTaint(virDomainObjPtr obj, int virDomainDefCheckUnsupportedMemoryHotplug(virDomainDefPtr def); +int virDomainDeviceDefCheckUnsupportedMemoryDevice(virDomainDeviceDefPtr dev); void virDomainPanicDefFree(virDomainPanicDefPtr panic); void virDomainResourceDefFree(virDomainResourceDefPtr resource); @@ -2860,6 +2891,8 @@ VIR_ENUM_DECL(virDomainRNGModel) VIR_ENUM_DECL(virDomainRNGBackend) VIR_ENUM_DECL(virDomainTPMModel) VIR_ENUM_DECL(virDomainTPMBackend) +VIR_ENUM_DECL(virDomainMemoryModel) +VIR_ENUM_DECL(virDomainMemoryBackingModel) /* from libvirt.h */ VIR_ENUM_DECL(virDomainState) VIR_ENUM_DECL(virDomainNostateReason) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 4bad7c6..a3d1815 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -210,6 +210,7 @@ virDomainDefPostParse; virDomainDeleteConfig; virDomainDeviceAddressIsValid; virDomainDeviceAddressTypeToString; +virDomainDeviceDefCheckUnsupportedMemoryDevice; virDomainDeviceDefCopy; virDomainDeviceDefFree; virDomainDeviceDefParse; @@ -325,6 +326,7 @@ virDomainLockFailureTypeFromString; virDomainLockFailureTypeToString; virDomainMemballoonModelTypeFromString; virDomainMemballoonModelTypeToString; +virDomainMemoryDefFree; virDomainNetAppendIpAddress; virDomainNetDefFormat; virDomainNetDefFree; diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 01fb79c..be07807 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -534,6 +534,9 @@ libxlDomainDeviceDefPostParse(virDomainDeviceDefPtr dev, } } + if (virDomainDeviceDefCheckUnsupportedMemoryDevice(dev) < 0) + return -1; + return 0; } diff --git a/src/lxc/lxc_domain.c b/src/lxc/lxc_domain.c index 1367b0c..c2180cb 100644 --- a/src/lxc/lxc_domain.c +++ b/src/lxc/lxc_domain.c @@ -113,6 +113,10 @@ virLXCDomainDeviceDefPostParse(virDomainDeviceDefPtr dev, dev->data.chr->targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_NONE) dev->data.chr->targetType = VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_LXC; + + if (virDomainDeviceDefCheckUnsupportedMemoryDevice(dev) < 0) + return -1; + return 0; } diff --git a/src/openvz/openvz_driver.c b/src/openvz/openvz_driver.c index 796044a..ed6514c 100644 --- a/src/openvz/openvz_driver.c +++ b/src/openvz/openvz_driver.c @@ -130,6 +130,9 @@ openvzDomainDeviceDefPostParse(virDomainDeviceDefPtr dev, return -1; } + if (virDomainDeviceDefCheckUnsupportedMemoryDevice(dev) < 0) + return -1; + return 0; } diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 99c46d4..25540c4 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1182,6 +1182,9 @@ qemuDomainDeviceDefPostParse(virDomainDeviceDefPtr dev, } } + if (virDomainDeviceDefCheckUnsupportedMemoryDevice(dev) < 0) + goto cleanup; + ret = 0; cleanup: diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 91fefa9..df23aaa 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -7020,6 +7020,9 @@ qemuDomainAttachDeviceLive(virDomainObjPtr vm, dev->data.chr = NULL; break; + /*TODO: implement later */ + case VIR_DOMAIN_DEVICE_MEMORY: + case VIR_DOMAIN_DEVICE_NONE: case VIR_DOMAIN_DEVICE_FS: case VIR_DOMAIN_DEVICE_INPUT: @@ -7095,6 +7098,8 @@ qemuDomainDetachDeviceLive(virDomainObjPtr vm, case VIR_DOMAIN_DEVICE_CHR: ret = qemuDomainDetachChrDevice(driver, vm, dev->data.chr); break; + case VIR_DOMAIN_DEVICE_MEMORY: + /* TODO: Implement later */ case VIR_DOMAIN_DEVICE_FS: case VIR_DOMAIN_DEVICE_INPUT: case VIR_DOMAIN_DEVICE_SOUND: @@ -7213,6 +7218,7 @@ qemuDomainUpdateDeviceLive(virConnectPtr conn, case VIR_DOMAIN_DEVICE_HOSTDEV: case VIR_DOMAIN_DEVICE_CONTROLLER: case VIR_DOMAIN_DEVICE_REDIRDEV: + case VIR_DOMAIN_DEVICE_MEMORY: case VIR_DOMAIN_DEVICE_CHR: case VIR_DOMAIN_DEVICE_NONE: case VIR_DOMAIN_DEVICE_TPM: @@ -7338,6 +7344,9 @@ qemuDomainAttachDeviceConfig(virQEMUCapsPtr qemuCaps, dev->data.fs = NULL; break; + case VIR_DOMAIN_DEVICE_MEMORY: + /* TODO: implement later */ + case VIR_DOMAIN_DEVICE_INPUT: case VIR_DOMAIN_DEVICE_SOUND: case VIR_DOMAIN_DEVICE_VIDEO: @@ -7454,6 +7463,9 @@ qemuDomainDetachDeviceConfig(virDomainDefPtr vmdef, virDomainFSDefFree(fs); break; + case VIR_DOMAIN_DEVICE_MEMORY: + /* TODO: implement later */ + case VIR_DOMAIN_DEVICE_INPUT: case VIR_DOMAIN_DEVICE_SOUND: case VIR_DOMAIN_DEVICE_VIDEO: @@ -7569,6 +7581,7 @@ qemuDomainUpdateDeviceConfig(virQEMUCapsPtr qemuCaps, case VIR_DOMAIN_DEVICE_CONTROLLER: case VIR_DOMAIN_DEVICE_REDIRDEV: case VIR_DOMAIN_DEVICE_CHR: + case VIR_DOMAIN_DEVICE_MEMORY: case VIR_DOMAIN_DEVICE_NONE: case VIR_DOMAIN_DEVICE_TPM: case VIR_DOMAIN_DEVICE_PANIC: diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 033b281..3a15e29 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -2911,6 +2911,9 @@ qemuDomainRemoveDevice(virQEMUDriverPtr driver, ret = qemuDomainRemoveChrDevice(driver, vm, dev->data.chr); break; + /* TODO: implement later */ + case VIR_DOMAIN_DEVICE_MEMORY: + case VIR_DOMAIN_DEVICE_NONE: case VIR_DOMAIN_DEVICE_LEASE: case VIR_DOMAIN_DEVICE_FS: diff --git a/src/uml/uml_driver.c b/src/uml/uml_driver.c index 546ad48..5e2a576 100644 --- a/src/uml/uml_driver.c +++ b/src/uml/uml_driver.c @@ -439,6 +439,9 @@ umlDomainDeviceDefPostParse(virDomainDeviceDefPtr dev, return -1; } + if (virDomainDeviceDefCheckUnsupportedMemoryDevice(dev) < 0) + return -1; + return 0; } diff --git a/src/xen/xen_driver.c b/src/xen/xen_driver.c index f603477..1cceacb 100644 --- a/src/xen/xen_driver.c +++ b/src/xen/xen_driver.c @@ -372,6 +372,9 @@ xenDomainDeviceDefPostParse(virDomainDeviceDefPtr dev, } } + if (virDomainDeviceDefCheckUnsupportedMemoryDevice(dev) < 0) + return -1; + return 0; } diff --git a/src/xenapi/xenapi_driver.c b/src/xenapi/xenapi_driver.c index 3b44a46..abc6c6f 100644 --- a/src/xenapi/xenapi_driver.c +++ b/src/xenapi/xenapi_driver.c @@ -65,6 +65,9 @@ xenapiDomainDeviceDefPostParse(virDomainDeviceDefPtr dev, return -1; } + if (virDomainDeviceDefCheckUnsupportedMemoryDevice(dev) < 0) + return -1; + return 0; } diff --git a/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-dimm.xml b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-dimm.xml new file mode 100644 index 0000000..2dba8a2 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-dimm.xml @@ -0,0 +1,47 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <maxMemory slots='9' unit='KiB'>1233456789</maxMemory> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <controller type='usb' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <memballoon model='virtio'/> + <memory model='acpi-dimm'> + <target> + <size unit='KiB'>123456</size> + <node>0</node> + </target> + </memory> + <memory model='acpi-dimm'> + <source> + <pagesize unit='KiB'>4</pagesize> + </source> + <target> + <size unit='KiB'>123456</size> + <node>2</node> + </target> + </memory> + <memory model='acpi-dimm'> + <source> + <nodemask>2-5,7</nodemask> + <pagesize unit='KiB'>4</pagesize> + </source> + <target> + <size unit='KiB'>123456</size> + <node>0</node> + </target> + </memory> + </devices> +</domain> -- 2.2.2

On 01/30/2015 08:21 AM, Peter Krempa wrote:
WIP: TODO: docs
Also forbid the new device in post parse callback in all driver that implement the callback to warn users right away that the device is not supported with their hypervisor. --- docs/schemas/domaincommon.rng | 50 ++++ src/bhyve/bhyve_domain.c | 5 +- src/conf/domain_conf.c | 317 ++++++++++++++++++++- src/conf/domain_conf.h | 33 +++ src/libvirt_private.syms | 2 + src/libxl/libxl_domain.c | 3 + src/lxc/lxc_domain.c | 4 + src/openvz/openvz_driver.c | 3 + src/qemu/qemu_domain.c | 3 + src/qemu/qemu_driver.c | 13 + src/qemu/qemu_hotplug.c | 3 + src/uml/uml_driver.c | 3 + src/xen/xen_driver.c | 3 + src/xenapi/xenapi_driver.c | 3 + .../qemuxml2argv-memory-hotplug-dimm.xml | 47 +++ 15 files changed, 490 insertions(+), 2 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-dimm.xml
<...snip...>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 0b9fb06..298b574 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -236,7 +236,8 @@ VIR_ENUM_IMPL(virDomainDevice, VIR_DOMAIN_DEVICE_LAST, "rng", "shmem", "tpm", - "panic") + "panic", + "memory")
VIR_ENUM_IMPL(virDomainDeviceAddress, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST, "none", @@ -779,6 +780,12 @@ VIR_ENUM_DECL(virDomainBlockJob) VIR_ENUM_IMPL(virDomainBlockJob, VIR_DOMAIN_BLOCK_JOB_TYPE_LAST, "", "", "copy", "", "active-commit")
+VIR_ENUM_IMPL(virDomainMemoryModel, VIR_DOMAIN_MEMORY_MODEL_LAST, + "", "acpi-dimm") + +#define VIR_DOMAIN_XML_WRITE_FLAGS VIR_DOMAIN_XML_SECURE +#define VIR_DOMAIN_XML_READ_FLAGS VIR_DOMAIN_XML_INACTIVE + static virClassPtr virDomainObjClass; static virClassPtr virDomainObjListClass; static virClassPtr virDomainXMLOptionClass; @@ -1003,6 +1010,27 @@ virDomainDefCheckUnsupportedMemoryHotplug(virDomainDefPtr def) }
+/** + * virDomainDeviceDefCheckUnsupportedMemoryDevice: + * @dev: device definition + * + * Returns -1 if the device definition describes a memory device and reports an + * error. Otherwise returns 0. + */ +int +virDomainDeviceDefCheckUnsupportedMemoryDevice(virDomainDeviceDefPtr dev) +{ + /* This driver doesn't yet know how to handle memory devices */ + if (dev->type == VIR_DOMAIN_DEVICE_MEMORY) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("memory devices are not supported by this driver")); + return -1; + } + + return 0; +} + + static void virDomainObjListDataFree(void *payload, const void *name ATTRIBUTE_UNUSED) { @@ -1934,6 +1962,15 @@ void virDomainRedirFilterDefFree(virDomainRedirFilterDefPtr def) VIR_FREE(def); }
+void virDomainMemoryDefFree(virDomainMemoryDefPtr def) +{ + if (!def) + return; + + virDomainDeviceInfoClear(&def->info); + VIR_FREE(def); +} + void virDomainDeviceDefFree(virDomainDeviceDefPtr def) { if (!def) @@ -2003,6 +2040,9 @@ void virDomainDeviceDefFree(virDomainDeviceDefPtr def) case VIR_DOMAIN_DEVICE_PANIC: virDomainPanicDefFree(def->data.panic); break; + case VIR_DOMAIN_DEVICE_MEMORY: + virDomainMemoryDefFree(def->data.memory); + break; case VIR_DOMAIN_DEVICE_LAST: case VIR_DOMAIN_DEVICE_NONE: break; @@ -2200,6 +2240,10 @@ void virDomainDefFree(virDomainDefPtr def) virDomainRNGDefFree(def->rngs[i]); VIR_FREE(def->rngs);
+ for (i = 0; i < def->nmems; i++) + virDomainMemoryDefFree(def->mems[i]); + VIR_FREE(def->mems); + virDomainTPMDefFree(def->tpm);
virDomainPanicDefFree(def->panic); @@ -2709,6 +2753,8 @@ virDomainDeviceGetInfo(virDomainDeviceDefPtr device) return &device->data.tpm->info; case VIR_DOMAIN_DEVICE_PANIC: return &device->data.panic->info; + case VIR_DOMAIN_DEVICE_MEMORY: + return &device->data.memory->info;
/* The following devices do not contain virDomainDeviceInfo */ case VIR_DOMAIN_DEVICE_LEASE: @@ -2940,6 +2986,13 @@ virDomainDeviceInfoIterateInternal(virDomainDefPtr def, return -1; }
+ device.type = VIR_DOMAIN_DEVICE_MEMORY; + for (i = 0; i < def->nmems; i++) { + device.data.memory = def->mems[i]; + if (cb(def, &device, &def->mems[i]->info, opaque) < 0) + return -1; + } + /* Coverity is not very happy with this - all dead_error_condition */ #if !STATIC_ANALYSIS /* This switch statement is here to trigger compiler warning when adding @@ -2972,6 +3025,7 @@ virDomainDeviceInfoIterateInternal(virDomainDefPtr def, case VIR_DOMAIN_DEVICE_PANIC: case VIR_DOMAIN_DEVICE_LAST: case VIR_DOMAIN_DEVICE_RNG: + case VIR_DOMAIN_DEVICE_MEMORY: break; } #endif @@ -11109,6 +11163,119 @@ virDomainPMStateParseXML(xmlXPathContextPtr ctxt, return ret; }
+ +static int +virDomainMemorySourceDefParseXML(xmlNodePtr node, + xmlXPathContextPtr ctxt, + virDomainMemoryDefPtr def) +{ + int ret = -1; + char *nodemask = NULL; + xmlNodePtr save = ctxt->node; + ctxt->node = node; + + if (virDomainParseMemory("./pagesize", "./pagesize/@unit", ctxt, + &def->pagesize, false, false) < 0) + goto cleanup; + + if ((nodemask = virXPathString("string(./nodemask)", ctxt))) { + if (virBitmapParse(nodemask, 0, &def->sourceNodes, + VIR_DOMAIN_CPUMASK_LEN) < 0) + goto cleanup; + } + + ret = 0; + + cleanup: + VIR_FREE(nodemask); + ctxt->node = save; + return ret; +} + + +static int +virDomainMemoryTargetDefParseXML(xmlNodePtr node, + xmlXPathContextPtr ctxt, + virDomainMemoryDefPtr def) +{ + int ret = -1; + xmlNodePtr save = ctxt->node; + ctxt->node = node; + + if (virXPathInt("string(./node)", ctxt, &def->targetNode) < 0) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("invalid or missing value of memory device node")); + goto cleanup; + } + + if (virDomainParseMemory("./size", "./size/@unit", ctxt, + &def->size, true, false) < 0) + goto cleanup; + + ret = 0; + + cleanup: + ctxt->node = save; + return ret; +} + + +static virDomainMemoryDefPtr +virDomainMemoryDefParseXML(xmlNodePtr memdevNode, + xmlXPathContextPtr ctxt, + unsigned int flags) +{ + char *tmp = NULL; + xmlNodePtr save = ctxt->node; + xmlNodePtr node; + virDomainMemoryDefPtr def; + + ctxt->node = memdevNode; + + if (VIR_ALLOC(def) < 0) + return NULL; + + if (!(tmp = virXMLPropString(memdevNode, "model"))) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing memory model")); + goto error; + } + + if ((def->model = virDomainMemoryModelTypeFromString(tmp)) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("invalid memory model '%s'"), tmp); + goto error; + } + VIR_FREE(tmp); + + /* source */ + if ((node = virXPathNode("./source", ctxt)) && + virDomainMemorySourceDefParseXML(node, ctxt, def) < 0) + goto error; + + /* target */ + if (!(node = virXPathNode("./target", ctxt))) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing <target> element for <memory> device")); + goto error; + } + + if (virDomainMemoryTargetDefParseXML(node, ctxt, def) < 0) + goto error; + + if (virDomainDeviceInfoParseXML(memdevNode, NULL, &def->info, flags) < 0) + goto error; + + return def; + + error: + VIR_FREE(tmp); + virDomainMemoryDefFree(def); + ctxt->node = save; + return NULL; +} + + virDomainDeviceDefPtr virDomainDeviceDefParse(const char *xmlStr, const virDomainDef *def, @@ -11243,6 +11410,10 @@ virDomainDeviceDefParse(const char *xmlStr, if (!(dev->data.panic = virDomainPanicDefParseXML(node))) goto error; break; + case VIR_DOMAIN_DEVICE_MEMORY: + if (!(dev->data.memory = virDomainMemoryDefParseXML(node, ctxt, flags))) + goto error; + break; case VIR_DOMAIN_DEVICE_NONE: case VIR_DOMAIN_DEVICE_LAST: break; @@ -14438,6 +14609,23 @@ virDomainDefParseXML(xmlDocPtr xml, ctxt->node = node; VIR_FREE(nodes);
+ /* analysis of memory devices */ + if ((n = virXPathNodeSet("./devices/memory", ctxt, &nodes)) < 0) + goto error; + if (n && VIR_ALLOC_N(def->mems, n) < 0) + goto error; + + for (i = 0; i < n; i++) { + virDomainMemoryDefPtr mem = virDomainMemoryDefParseXML(nodes[i], + ctxt, + flags); + if (!mem) + goto error; + + def->mems[def->nmems++] = mem; + } + VIR_FREE(nodes); + /* analysis of the user namespace mapping */ if ((n = virXPathNodeSet("./idmap/uid", ctxt, &nodes)) < 0) goto error; @@ -15677,6 +15865,39 @@ virDomainPanicDefCheckABIStability(virDomainPanicDefPtr src, return virDomainDeviceInfoCheckABIStability(&src->info, &dst->info); }
+static bool +virDomainMemoryDefCheckABIStability(virDomainMemoryDefPtr src, + virDomainMemoryDefPtr dst) +{ + if (src->model != dst->model) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target memory device model '%s' " + "doesn't match source model '%s'"), + virDomainMemoryModelTypeToString(dst->model), + virDomainMemoryModelTypeToString(src->model)); + return false; + } + + if (src->targetNode != dst->targetNode) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target memory device targetNode '%d' " + "doesn't match source targetNode '%d'"), + dst->targetNode, src->targetNode); + return false; + } + + if (src->size != dst->size) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target memory device size '%llu' doesn't match " + "source memory device size '%llu'"), + dst->size, src->size); + return false; + } + + return virDomainDeviceInfoCheckABIStability(&src->info, &dst->info); +} + + /* This compares two configurations and looks for any differences * which will affect the guest ABI. This is primarily to allow * validation of custom XML config passed in during migration @@ -16114,6 +16335,18 @@ virDomainDefCheckABIStability(virDomainDefPtr src, goto error; }
+ if (src->nmems != dst->nmems) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target domain memory device count %zu " + "does not match source %zu"), dst->nmems, src->nmems); + goto error; + } + + for (i = 0; i < src->nmems; i++) { + if (!virDomainMemoryDefCheckABIStability(src->mems[i], dst->mems[i])) + goto error; + } + /* Coverity is not very happy with this - all dead_error_condition */ #if !STATIC_ANALYSIS /* This switch statement is here to trigger compiler warning when adding @@ -16145,6 +16378,7 @@ virDomainDefCheckABIStability(virDomainDefPtr src, case VIR_DOMAIN_DEVICE_TPM: case VIR_DOMAIN_DEVICE_PANIC: case VIR_DOMAIN_DEVICE_SHMEM: + case VIR_DOMAIN_DEVICE_MEMORY: break; } #endif @@ -18616,6 +18850,81 @@ virDomainRNGDefFree(virDomainRNGDefPtr def) VIR_FREE(def); }
+ +static int +virDomainMemorySourceDefFormat(virBufferPtr buf, + virDomainMemoryDefPtr def) +{ + char *bitmap = NULL; + int ret = -1; + + if (!def->pagesize && !def->sourceNodes) + return 0; + + virBufferAddLit(buf, "<source>\n"); + virBufferAdjustIndent(buf, 2); + + if (def->sourceNodes) { + if (!(bitmap = virBitmapFormat(def->sourceNodes))) + goto cleanup; + + virBufferAsprintf(buf, "<nodemask>%s</nodemask>\n", bitmap); + } + + if (def->pagesize) + virBufferAsprintf(buf, "<pagesize unit='KiB'>%llu</pagesize>\n", + def->pagesize); + + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</source>\n"); + + ret = 0; + + cleanup: + VIR_FREE(bitmap); + return ret; +} + + +static void +virDomainMemoryTargetDefFormat(virBufferPtr buf, + virDomainMemoryDefPtr def) +{ + virBufferAddLit(buf, "<target>\n"); + virBufferAdjustIndent(buf, 2); + + virBufferAsprintf(buf, "<size unit='KiB'>%llu</size>\n", def->size); + virBufferAsprintf(buf, "<node>%d</node>\n", def->targetNode); + + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</target>\n"); +} + +static int +virDomainMemoryDefFormat(virBufferPtr buf, + virDomainMemoryDefPtr def, + unsigned int flags) +{ + const char *model = virDomainMemoryModelTypeToString(def->model); + + virBufferAsprintf(buf, "<memory model='%s'>\n", model); + virBufferAdjustIndent(buf, 2); + + if (virDomainMemorySourceDefFormat(buf, def) < 0) + return -1; + + virDomainMemoryTargetDefFormat(buf, def); + + if (virDomainDeviceInfoNeedsFormat(&def->info, flags)) { + if (virDomainDeviceInfoFormat(buf, &def->info, flags) < 0) + return -1; + } + + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</memory>\n"); + return 0; +} + static void virDomainVideoAccelDefFormat(virBufferPtr buf, virDomainVideoAccelDefPtr def) @@ -20168,6 +20477,10 @@ virDomainDefFormatInternal(virDomainDefPtr def, if (virDomainShmemDefFormat(buf, def->shmems[n], flags) < 0) goto error;
+ for (n = 0; n < def->nmems; n++) + if (virDomainMemoryDefFormat(buf, def->mems[n], flags) < 0) + goto error; + virBufferAdjustIndent(buf, -2); virBufferAddLit(buf, "</devices>\n");
@@ -21570,6 +21883,8 @@ virDomainDeviceDefCopy(virDomainDeviceDefPtr src, break; case VIR_DOMAIN_DEVICE_PANIC: rc = virDomainPanicDefFormat(&buf, src->data.panic);
Coverity found a missing break here John
+ case VIR_DOMAIN_DEVICE_MEMORY: + rc = virDomainMemoryDefFormat(&buf, src->data.memory, flags); break; case VIR_DOMAIN_DEVICE_NONE: case VIR_DOMAIN_DEVICE_SMARTCARD:

On Fri, Jan 30, 2015 at 02:21:02PM +0100, Peter Krempa wrote:
WIP: TODO: docs
Also forbid the new device in post parse callback in all driver that implement the callback to warn users right away that the device is not supported with their hypervisor. --- docs/schemas/domaincommon.rng | 50 ++++ src/bhyve/bhyve_domain.c | 5 +- src/conf/domain_conf.c | 317 ++++++++++++++++++++- src/conf/domain_conf.h | 33 +++ src/libvirt_private.syms | 2 + src/libxl/libxl_domain.c | 3 + src/lxc/lxc_domain.c | 4 + src/openvz/openvz_driver.c | 3 + src/qemu/qemu_domain.c | 3 + src/qemu/qemu_driver.c | 13 + src/qemu/qemu_hotplug.c | 3 + src/uml/uml_driver.c | 3 + src/xen/xen_driver.c | 3 + src/xenapi/xenapi_driver.c | 3 + .../qemuxml2argv-memory-hotplug-dimm.xml | 47 +++ 15 files changed, 490 insertions(+), 2 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-dimm.xml
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-dimm.xml b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-dimm.xml new file mode 100644 index 0000000..2dba8a2 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-dimm.xml @@ -0,0 +1,47 @@ +<domain type='qemu'>
...
+ <memory model='acpi-dimm'> + <source> + <nodemask>2-5,7</nodemask>
So, this is the host NUMA node that it is allocate from
+ <pagesize unit='KiB'>4</pagesize>
The host huge page size to use
+ </source> + <target> + <size unit='KiB'>123456</size>
The guest memory size
+ <node>0</node>
The guest NUMA node
+ </target> + </memory>
This isn't showing use of the <address type="acpi-dimm"> address. Is that always an output-only thing, or can apps provide that upfront like they do for other address types. All the info makes sense really. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Mon, Feb 09, 2015 at 10:28:49 +0000, Daniel Berrange wrote:
On Fri, Jan 30, 2015 at 02:21:02PM +0100, Peter Krempa wrote:
WIP: TODO: docs
Also forbid the new device in post parse callback in all driver that implement the callback to warn users right away that the device is not supported with their hypervisor. --- docs/schemas/domaincommon.rng | 50 ++++ src/bhyve/bhyve_domain.c | 5 +- src/conf/domain_conf.c | 317 ++++++++++++++++++++- src/conf/domain_conf.h | 33 +++ src/libvirt_private.syms | 2 + src/libxl/libxl_domain.c | 3 + src/lxc/lxc_domain.c | 4 + src/openvz/openvz_driver.c | 3 + src/qemu/qemu_domain.c | 3 + src/qemu/qemu_driver.c | 13 + src/qemu/qemu_hotplug.c | 3 + src/uml/uml_driver.c | 3 + src/xen/xen_driver.c | 3 + src/xenapi/xenapi_driver.c | 3 + .../qemuxml2argv-memory-hotplug-dimm.xml | 47 +++ 15 files changed, 490 insertions(+), 2 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-dimm.xml
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-dimm.xml b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-dimm.xml new file mode 100644 index 0000000..2dba8a2 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-dimm.xml @@ -0,0 +1,47 @@ +<domain type='qemu'>
...
+ <memory model='acpi-dimm'> + <source> + <nodemask>2-5,7</nodemask>
So, this is the host NUMA node that it is allocate from
+ <pagesize unit='KiB'>4</pagesize>
The host huge page size to use
+ </source>
By the way, the <source> subelement is optional and if not provided the config in <numatune> and <cpu> is used to infer the correct data. (In case it wasn't obvious as it's lacking docs)
+ <target> + <size unit='KiB'>123456</size>
The guest memory size
+ <node>0</node>
The guest NUMA node
+ </target> + </memory>
This isn't showing use of the <address type="acpi-dimm"> address. Is that always an output-only thing, or can apps provide that upfront like they do for other address types.
The apps will be able to provide it if needed, but having the physical offset of the module specified doesn't really seem to be a generally useful case. Currently the address is updated in the live definiton to carry accross migrations. In case the user specified the address it will be still queried and updated in the XML in case qemu would align the address so that migration will work even in case the alignment rules would change. I'll add a case where the address is specified to the test case. Peter

When using 'acpi-dimm' memory devices with qemu, some of the information like the slot number and base address need to be reloaded from qemu after process start so that it reflects the actual state. The state then allows to use memory devices across migrations. --- src/qemu/qemu_domain.c | 43 ++++++++++++++++ src/qemu/qemu_domain.h | 4 ++ src/qemu/qemu_monitor.c | 45 +++++++++++++++++ src/qemu/qemu_monitor.h | 14 ++++++ src/qemu/qemu_monitor_json.c | 116 +++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_monitor_json.h | 5 ++ src/qemu/qemu_process.c | 4 ++ 7 files changed, 231 insertions(+) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 25540c4..df912a6 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -2773,6 +2773,49 @@ qemuDomainUpdateDeviceList(virQEMUDriverPtr driver, return 0; } + +int +qemuDomainUpdateMemoryDeviceInfo(virQEMUDriverPtr driver, + virDomainObjPtr vm, + int asyncJob) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virHashTablePtr meminfo = NULL; + size_t i; + + if (vm->def->nmems == 0) + return 0; + + if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) + return -1; + if (qemuMonitorGetMemoryDeviceInfo(priv->mon, &meminfo) < 0) { + ignore_value(qemuDomainObjExitMonitor(driver, vm)); + return -1; + } + + if (qemuDomainObjExitMonitor(driver, vm) < 0) + return -1; + + for (i = 0; i < vm->def->nmems; i++) { + virDomainMemoryDefPtr mem = vm->def->mems[i]; + qemuMonitorMemoryDeviceInfoPtr dimm; + + if (!mem->info.alias) + continue; + + if (!(dimm = virHashLookup(meminfo, mem->info.alias))) + continue; + + mem->info.type = VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ACPI_DIMM; + mem->info.addr.acpiDimm.slot = dimm->slot; + mem->info.addr.acpiDimm.base = dimm->address; + } + + virHashFree(meminfo); + return 0; +} + + bool qemuDomainDefCheckABIStability(virQEMUDriverPtr driver, virDomainDefPtr src, diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index b2c3881..8caec2a 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -391,6 +391,10 @@ extern virDomainDefParserConfig virQEMUDriverDomainDefParserConfig; int qemuDomainUpdateDeviceList(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob); +int qemuDomainUpdateMemoryDeviceInfo(virQEMUDriverPtr driver, + virDomainObjPtr vm, + int asyncJob); + bool qemuDomainDefCheckABIStability(virQEMUDriverPtr driver, virDomainDefPtr src, virDomainDefPtr dst); diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 6882a50..3dc5d12 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -4358,3 +4358,48 @@ void qemuMonitorIOThreadsInfoFree(qemuMonitorIOThreadsInfoPtr iothread) VIR_FREE(iothread->name); VIR_FREE(iothread); } + + +/** + * qemuMonitorGetMemoryDeviceInfo: + * @mon: pointer to the monitor + * @info: Location to return the hash of qemuMonitorMemoryDeviceInfo + * + * Retrieve state and addresses of frontend memory devices present in + * the guest. + * + * Returns 0 on success and fills @info with a newly allocated struct; if the + * data can't be retrieved due to lack of support in qemu, returns -2. On + * other errors returns -1. + */ +int +qemuMonitorGetMemoryDeviceInfo(qemuMonitorPtr mon, + virHashTablePtr *info) +{ + VIR_DEBUG("mon=%p info=%p", mon, info); + int ret; + + *info = NULL; + + if (!mon) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("monitor must not be NULL")); + return -1; + } + + if (!mon->json) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("JSON monitor is required")); + return -2; + } + + if (!(*info = virHashCreate(10, virHashValueFree))) + return -1; + + if ((ret = qemuMonitorJSONGetMemoryDeviceInfo(mon, *info)) < 0) { + virHashFree(*info); + *info = NULL; + } + + return ret; +} diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index 133d42d..811ce49 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -892,6 +892,20 @@ int qemuMonitorGetIOThreads(qemuMonitorPtr mon, void qemuMonitorIOThreadsInfoFree(qemuMonitorIOThreadsInfoPtr iothread); +typedef struct _qemuMonitorMemoryDeviceInfo qemuMonitorMemoryDeviceInfo; +typedef qemuMonitorMemoryDeviceInfo *qemuMonitorMemoryDeviceInfoPtr; + +struct _qemuMonitorMemoryDeviceInfo { + unsigned long long address; + unsigned int slot; + bool hotplugged; + bool hotpluggable; +}; + +int qemuMonitorGetMemoryDeviceInfo(qemuMonitorPtr mon, + virHashTablePtr *info) + ATTRIBUTE_NONNULL(2); + /** * When running two dd process and using <> redirection, we need a * shell that will not truncate files. These two strings serve that diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index da5c14d..333a69a 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -6569,3 +6569,119 @@ qemuMonitorJSONGetIOThreads(qemuMonitorPtr mon, virJSONValueFree(reply); return ret; } + + +int +qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitorPtr mon, + virHashTablePtr info) +{ + int ret = -1; + virJSONValuePtr cmd; + virJSONValuePtr reply = NULL; + virJSONValuePtr data = NULL; + qemuMonitorMemoryDeviceInfoPtr meminfo = NULL; + ssize_t n; + size_t i; + + if (!(cmd = qemuMonitorJSONMakeCommand("query-memory-devices", NULL))) + return -1; + + ret = qemuMonitorJSONCommand(mon, cmd, &reply); + + if (ret == 0) + ret = qemuMonitorJSONCheckError(cmd, reply); + + if (ret < 0) + goto cleanup; + + ret = -1; + + if (!(data = virJSONValueObjectGet(reply, "return"))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("query-memory-devices reply was missing return data")); + goto cleanup; + } + + if ((n = virJSONValueArraySize(data)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("query-memory-devices reply data was not an array")); + goto cleanup; + } + + for (i = 0; i < n; i++) { + virJSONValuePtr elem = virJSONValueArrayGet(data, i); + const char *type; + + if (!(type = virJSONValueObjectGetString(elem, "type"))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("query-memory-devices reply data doesn't contain " + "enum type discriminator")); + goto cleanup; + } + + /* dimm memory devices */ + if (STREQ(type, "dimm")) { + virJSONValuePtr dimminfo; + const char *devalias; + + if (!(dimminfo = virJSONValueObjectGet(elem, "data"))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("query-memory-devices reply data doesn't " + "contain enum data")); + goto cleanup; + } + + if (!(devalias = virJSONValueObjectGetString(dimminfo, "id"))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("dimm memory info data is missing 'id'")); + goto cleanup; + } + + if (VIR_ALLOC(meminfo) < 0) + goto cleanup; + + if (virJSONValueObjectGetNumberUlong(dimminfo, "addr", + &meminfo->address) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("malformed/missing addr in dimm memory info")); + goto cleanup; + } + + if (virJSONValueObjectGetNumberUint(dimminfo, "slot", + &meminfo->slot) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("malformed/missing slot in dimm memory info")); + goto cleanup; + } + + if (virJSONValueObjectGetBoolean(dimminfo, "hotplugged", + &meminfo->hotplugged) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("malformed/missing hotplugged in dimm memory info")); + goto cleanup; + + } + + if (virJSONValueObjectGetBoolean(dimminfo, "hotpluggable", + &meminfo->hotpluggable) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("malformed/missing hotpluggable in dimm memory info")); + goto cleanup; + + } + + if (virHashAddEntry(info, devalias, meminfo) < 0) + goto cleanup; + + meminfo = NULL; + } + } + + ret = 0; + + cleanup: + VIR_FREE(meminfo); + virJSONValueFree(cmd); + virJSONValueFree(reply); + return ret; +} diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h index 1da1a00..eb51a60 100644 --- a/src/qemu/qemu_monitor_json.h +++ b/src/qemu/qemu_monitor_json.h @@ -475,4 +475,9 @@ int qemuMonitorJSONRTCResetReinjection(qemuMonitorPtr mon); int qemuMonitorJSONGetIOThreads(qemuMonitorPtr mon, qemuMonitorIOThreadsInfoPtr **iothreads) ATTRIBUTE_NONNULL(2); + +int qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitorPtr mon, + virHashTablePtr info) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2); + #endif /* QEMU_MONITOR_JSON_H */ diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index b0f7b1c..ba7c8e2 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -4871,6 +4871,10 @@ int qemuProcessStart(virConnectPtr conn, if (qemuDomainUpdateDeviceList(driver, vm, asyncJob) < 0) goto cleanup; + VIR_DEBUG("Updating info of memory devices"); + if (qemuDomainUpdateMemoryDeviceInfo(driver, vm, asyncJob) < 0) + goto cleanup; + /* Technically, qemuProcessStart can be called from inside * QEMU_ASYNC_JOB_MIGRATION_IN, but we are okay treating this like * a sync job since no other job can call into the domain until -- 2.2.2

On Fri, Jan 30, 2015 at 02:21:03PM +0100, Peter Krempa wrote:
When using 'acpi-dimm' memory devices with qemu, some of the information like the slot number and base address need to be reloaded from qemu after process start so that it reflects the actual state. The state then allows to use memory devices across migrations. --- src/qemu/qemu_domain.c | 43 ++++++++++++++++ src/qemu/qemu_domain.h | 4 ++ src/qemu/qemu_monitor.c | 45 +++++++++++++++++ src/qemu/qemu_monitor.h | 14 ++++++ src/qemu/qemu_monitor_json.c | 116 +++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_monitor_json.h | 5 ++ src/qemu/qemu_process.c | 4 ++ 7 files changed, 231 insertions(+)
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index da5c14d..333a69a 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -6569,3 +6569,119 @@ qemuMonitorJSONGetIOThreads(qemuMonitorPtr mon, virJSONValueFree(reply); return ret; } + + +int +qemuMonitorJSONGetMemoryDeviceInfo(qemuMonitorPtr mon, + virHashTablePtr info) +{ + int ret = -1; + virJSONValuePtr cmd; + virJSONValuePtr reply = NULL; + virJSONValuePtr data = NULL; + qemuMonitorMemoryDeviceInfoPtr meminfo = NULL; + ssize_t n; + size_t i; + + if (!(cmd = qemuMonitorJSONMakeCommand("query-memory-devices", NULL))) + return -1; + + ret = qemuMonitorJSONCommand(mon, cmd, &reply); + + if (ret == 0) + ret = qemuMonitorJSONCheckError(cmd, reply);
Missing check for a missing command.
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index b0f7b1c..ba7c8e2 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -4871,6 +4871,10 @@ int qemuProcessStart(virConnectPtr conn, if (qemuDomainUpdateDeviceList(driver, vm, asyncJob) < 0) goto cleanup;
+ VIR_DEBUG("Updating info of memory devices"); + if (qemuDomainUpdateMemoryDeviceInfo(driver, vm, asyncJob) < 0) + goto cleanup;
Return value of -2 should not be fatal. Jan
+ /* Technically, qemuProcessStart can be called from inside * QEMU_ASYNC_JOB_MIGRATION_IN, but we are okay treating this like * a sync job since no other job can call into the domain until -- 2.2.2
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 01/30/2015 08:21 AM, Peter Krempa wrote:
When using 'acpi-dimm' memory devices with qemu, some of the information like the slot number and base address need to be reloaded from qemu after process start so that it reflects the actual state. The state then allows to use memory devices across migrations. --- src/qemu/qemu_domain.c | 43 ++++++++++++++++ src/qemu/qemu_domain.h | 4 ++ src/qemu/qemu_monitor.c | 45 +++++++++++++++++ src/qemu/qemu_monitor.h | 14 ++++++ src/qemu/qemu_monitor_json.c | 116 +++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_monitor_json.h | 5 ++ src/qemu/qemu_process.c | 4 ++ 7 files changed, 231 insertions(+)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 25540c4..df912a6 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -2773,6 +2773,49 @@ qemuDomainUpdateDeviceList(virQEMUDriverPtr driver, return 0; }
+ +int +qemuDomainUpdateMemoryDeviceInfo(virQEMUDriverPtr driver, + virDomainObjPtr vm, + int asyncJob) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virHashTablePtr meminfo = NULL; + size_t i; + + if (vm->def->nmems == 0) + return 0; + + if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) + return -1; + if (qemuMonitorGetMemoryDeviceInfo(priv->mon, &meminfo) < 0) { + ignore_value(qemuDomainObjExitMonitor(driver, vm)); + return -1; + } + + if (qemuDomainObjExitMonitor(driver, vm) < 0)
This would leak meminfo
+ return -1; +
I know we don't have a "norm" for how this sequence goes, but along with Jan's point about -2 being checked back in the caller... The "norm" seems to be to: ret = qemuMonitor*(); if (*ExitMonitor) < 0) ret = -1; goto cleanup or error; if (ret < 0) goto cleanup; ... Caller would probably want to make sure the VM is still active or not.
+ for (i = 0; i < vm->def->nmems; i++) { + virDomainMemoryDefPtr mem = vm->def->mems[i]; + qemuMonitorMemoryDeviceInfoPtr dimm; + + if (!mem->info.alias) + continue; + + if (!(dimm = virHashLookup(meminfo, mem->info.alias))) + continue; + + mem->info.type = VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ACPI_DIMM; + mem->info.addr.acpiDimm.slot = dimm->slot; + mem->info.addr.acpiDimm.base = dimm->address; + } +
cleanup:
+ virHashFree(meminfo); + return ret; +} + + bool qemuDomainDefCheckABIStability(virQEMUDriverPtr driver, virDomainDefPtr src,
<...snip...>
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index b0f7b1c..ba7c8e2 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -4871,6 +4871,10 @@ int qemuProcessStart(virConnectPtr conn, if (qemuDomainUpdateDeviceList(driver, vm, asyncJob) < 0) goto cleanup;
+ VIR_DEBUG("Updating info of memory devices"); + if (qemuDomainUpdateMemoryDeviceInfo(driver, vm, asyncJob) < 0) + goto cleanup; +
Haven't looked forward yet, but is this something that needs to be done at qemuProcessReconnec ? John
/* Technically, qemuProcessStart can be called from inside * QEMU_ASYNC_JOB_MIGRATION_IN, but we are okay treating this like * a sync job since no other job can call into the domain until

Make sure that libvirt has all vital information needed to reliably represent configuration of guest's memory devices in case of a migration. This patch forbids migration in case the required slot number and module base address are not present (failed to be loaded from qemu via monitor). --- src/qemu/qemu_migration.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 8a8fa63..a1cc6fa 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2018,6 +2018,20 @@ qemuMigrationIsAllowed(virQEMUDriverPtr driver, virDomainObjPtr vm, } } + /* Verify that memory device config can be transferred reliably */ + for (i = 0; i < def->nmems; i++) { + virDomainMemoryDefPtr mem = def->mems[i]; + + if (mem->model == VIR_DOMAIN_MEMORY_MODEL_ACPI_DIMM && + mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ACPI_DIMM) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("domain's acpi-dimm info lacks slot ID " + "or base address")); + + return false; + } + } + return true; } -- 2.2.2

On Fri, Jan 30, 2015 at 02:21:04PM +0100, Peter Krempa wrote:
Make sure that libvirt has all vital information needed to reliably represent configuration of guest's memory devices in case of a migration.
This patch forbids migration in case the required slot number and module base address are not present (failed to be loaded from qemu via monitor).
What would cause us to fail to load them from QEMU ? Is there a way we can assign the address upfront like we do for other device types, so we can avoid the need to query QEMU & thus avoid failure scenario too ? Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Mon, Feb 09, 2015 at 10:24:04 +0000, Daniel Berrange wrote:
On Fri, Jan 30, 2015 at 02:21:04PM +0100, Peter Krempa wrote:
Make sure that libvirt has all vital information needed to reliably represent configuration of guest's memory devices in case of a migration.
This patch forbids migration in case the required slot number and module base address are not present (failed to be loaded from qemu via monitor).
What would cause us to fail to load them from QEMU ? Is there a way we
Only a fatal monitor error or the command not being available would prevent getting the info. I think it shouldn't ever be a problem for anybody, but I wanted to be safe.
can assign the address upfront like we do for other device types, so we can avoid the need to query QEMU & thus avoid failure scenario too ?
Qemu is aligning the memory regions to round numbers. I wanted to avoid having to duplicate the logic used to do this in qemu and possibly have to mirror it in cases it changes. I don't really think users would gnerally care about the placement of the module so the info is only required in case of miration. Peter

Add support to start qemu instance with 'pc-dimm' device. Thanks to the refactors we are able to reuse the existing function to determine the parameters. --- src/qemu/qemu_command.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_domain.c | 18 ++++++-- src/qemu/qemu_domain.h | 2 + tests/qemuxml2xmltest.c | 1 + 4 files changed, 132 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 5820fb5..7c31723 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1156,6 +1156,10 @@ qemuAssignDeviceAliases(virDomainDefPtr def, virQEMUCapsPtr qemuCaps) if (virAsprintf(&def->tpm->info.alias, "tpm%d", 0) < 0) return -1; } + for (i = 0; i < def->nmems; i++) { + if (virAsprintf(&def->mems[i]->info.alias, "dimm%zu", i) < 0) + return -1; + } return 0; } @@ -4748,6 +4752,97 @@ qemuBuildMemoryCellBackendStr(virDomainDefPtr def, } +static char * +qemuBuildMemoryDimmBackendStr(virDomainMemoryDefPtr mem, + virDomainDefPtr def, + virQEMUCapsPtr qemuCaps, + virQEMUDriverConfigPtr cfg) +{ + virJSONValuePtr props = NULL; + char *alias = NULL; + const char *backendType; + char *ret = NULL; + + if (!mem->info.alias) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("memory device alias is not assigned")); + return NULL; + } + + if (virAsprintf(&alias, "mem%s", mem->info.alias) < 0) + goto cleanup; + + qemuDomainMemoryDeviceAlignSize(mem); + + if (qemuBuildMemoryBackendStr(mem->size, mem->pagesize, + mem->targetNode, mem->sourceNodes, NULL, + def, qemuCaps, cfg, + &backendType, &props, true) < 0) + goto cleanup; + + ret = qemuBuildObjectCommandlineFromJSON(backendType, alias, props); + + cleanup: + VIR_FREE(alias); + virJSONValueFree(props); + + return ret; +} + + +static char * +qemuBuildMemoryDeviceStr(virDomainMemoryDefPtr mem, + virQEMUCapsPtr qemuCaps) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + + if (!mem->info.alias) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("missing alias for memory device")); + return NULL; + } + + switch ((virDomainMemoryModel) mem->model) { + case VIR_DOMAIN_MEMORY_MODEL_ACPI_DIMM: + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_PC_DIMM)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("this qemu doesn't support the pc-dimm device")); + return NULL; + } + + if (mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ACPI_DIMM && + mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("only 'acpi-dimm' addresses are supported for the " + "pc-dimm device")); + return NULL; + } + + virBufferAsprintf(&buf, "pc-dimm,node=%d,memdev=mem%s,id=%s", + mem->targetNode, mem->info.alias, mem->info.alias); + + if (mem->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ACPI_DIMM) { + virBufferAsprintf(&buf, ",slot=%d", mem->info.addr.acpiDimm.slot); + virBufferAsprintf(&buf, ",base=%llu", mem->info.addr.acpiDimm.base); + } + + break; + + case VIR_DOMAIN_MEMORY_MODEL_NONE: + case VIR_DOMAIN_MEMORY_MODEL_LAST: + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("invalid memory device type")); + break; + + } + + if (virBufferCheckError(&buf) < 0) + return NULL; + + return virBufferContentAndReset(&buf); +} + + char * qemuBuildNicStr(virDomainNetDefPtr net, const char *prefix, @@ -8351,6 +8446,25 @@ qemuBuildCommandLine(virConnectPtr conn, if (qemuBuildNumaArgStr(cfg, def, cmd, qemuCaps, nodeset) < 0) goto error; + for (i = 0; i < def->nmems; i++) { + char *backStr; + char *dimmStr; + + if (!(backStr = qemuBuildMemoryDimmBackendStr(def->mems[i], def, + qemuCaps, cfg))) + goto error; + + if (!(dimmStr = qemuBuildMemoryDeviceStr(def->mems[i], qemuCaps))) { + VIR_FREE(backStr); + goto error; + } + + virCommandAddArgList(cmd, "-object", backStr, "-device", dimmStr, NULL); + + VIR_FREE(backStr); + VIR_FREE(dimmStr); + } + if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_UUID)) virCommandAddArgList(cmd, "-uuid", uuid, NULL); if (def->virtType == VIR_DOMAIN_VIRT_XEN || diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index df912a6..ff03bdc 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1182,9 +1182,6 @@ qemuDomainDeviceDefPostParse(virDomainDeviceDefPtr dev, } } - if (virDomainDeviceDefCheckUnsupportedMemoryDevice(dev) < 0) - goto cleanup; - ret = 0; cleanup: @@ -2878,3 +2875,18 @@ qemuDomObjEndAPI(virDomainObjPtr *vm) virObjectUnref(*vm); *vm = NULL; } + + +/** + * qemuDomainMemoryDeviceAlignSize: + * @mem: memory device definition object + * + * Aligns the size of the memory module as qemu enforces it. The size is updated + * inplace. Default rounding is now to 1 MiB (qemu requires rouding to page, + * size so this should be safe). + */ +void +qemuDomainMemoryDeviceAlignSize(virDomainMemoryDefPtr mem) +{ + mem->size = VIR_DIV_UP(mem->size, 1024) * 1024; +} diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 8caec2a..f56aec7 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -418,4 +418,6 @@ int qemuDomainJobInfoToParams(qemuDomainJobInfoPtr jobInfo, void qemuDomObjEndAPI(virDomainObjPtr *vm); +void qemuDomainMemoryDeviceAlignSize(virDomainMemoryDefPtr mem); + #endif /* __QEMU_DOMAIN_H__ */ diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index e840d63..f9e8010 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -420,6 +420,7 @@ mymain(void) DO_TEST("memory-hotplug"); DO_TEST("memory-hotplug-nonuma"); + DO_TEST("memory-hotplug-dimm"); virObjectUnref(driver.caps); virObjectUnref(driver.xmlopt); -- 2.2.2

On 01/30/2015 08:21 AM, Peter Krempa wrote:
Add support to start qemu instance with 'pc-dimm' device. Thanks to the refactors we are able to reuse the existing function to determine the parameters. --- src/qemu/qemu_command.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_domain.c | 18 ++++++-- src/qemu/qemu_domain.h | 2 + tests/qemuxml2xmltest.c | 1 + 4 files changed, 132 insertions(+), 3 deletions(-)
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 5820fb5..7c31723 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1156,6 +1156,10 @@ qemuAssignDeviceAliases(virDomainDefPtr def, virQEMUCapsPtr qemuCaps) if (virAsprintf(&def->tpm->info.alias, "tpm%d", 0) < 0) return -1; } + for (i = 0; i < def->nmems; i++) { + if (virAsprintf(&def->mems[i]->info.alias, "dimm%zu", i) < 0) + return -1; + }
return 0; } @@ -4748,6 +4752,97 @@ qemuBuildMemoryCellBackendStr(virDomainDefPtr def, }
+static char * +qemuBuildMemoryDimmBackendStr(virDomainMemoryDefPtr mem, + virDomainDefPtr def, + virQEMUCapsPtr qemuCaps, + virQEMUDriverConfigPtr cfg) +{ + virJSONValuePtr props = NULL; + char *alias = NULL; + const char *backendType; + char *ret = NULL; + + if (!mem->info.alias) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("memory device alias is not assigned")); + return NULL; + } + + if (virAsprintf(&alias, "mem%s", mem->info.alias) < 0) + goto cleanup; + + qemuDomainMemoryDeviceAlignSize(mem); + + if (qemuBuildMemoryBackendStr(mem->size, mem->pagesize, + mem->targetNode, mem->sourceNodes, NULL, + def, qemuCaps, cfg, + &backendType, &props, true) < 0) + goto cleanup; + + ret = qemuBuildObjectCommandlineFromJSON(backendType, alias, props); + + cleanup: + VIR_FREE(alias); + virJSONValueFree(props); + + return ret; +} + + +static char * +qemuBuildMemoryDeviceStr(virDomainMemoryDefPtr mem, + virQEMUCapsPtr qemuCaps) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + + if (!mem->info.alias) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("missing alias for memory device")); + return NULL; + } + + switch ((virDomainMemoryModel) mem->model) { + case VIR_DOMAIN_MEMORY_MODEL_ACPI_DIMM: + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_PC_DIMM)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("this qemu doesn't support the pc-dimm device")); + return NULL; + } + + if (mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ACPI_DIMM && + mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("only 'acpi-dimm' addresses are supported for the " + "pc-dimm device")); + return NULL; + } + + virBufferAsprintf(&buf, "pc-dimm,node=%d,memdev=mem%s,id=%s", + mem->targetNode, mem->info.alias, mem->info.alias); + + if (mem->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ACPI_DIMM) { + virBufferAsprintf(&buf, ",slot=%d", mem->info.addr.acpiDimm.slot); + virBufferAsprintf(&buf, ",base=%llu", mem->info.addr.acpiDimm.base); + } + + break; + + case VIR_DOMAIN_MEMORY_MODEL_NONE: + case VIR_DOMAIN_MEMORY_MODEL_LAST: + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("invalid memory device type")); + break; + + } + + if (virBufferCheckError(&buf) < 0) + return NULL; + + return virBufferContentAndReset(&buf); +} + + char * qemuBuildNicStr(virDomainNetDefPtr net, const char *prefix, @@ -8351,6 +8446,25 @@ qemuBuildCommandLine(virConnectPtr conn, if (qemuBuildNumaArgStr(cfg, def, cmd, qemuCaps, nodeset) < 0) goto error;
Coverity has a FORWARD_NULL complaint... Right above this code there's: 8445 if (def->cpu && def->cpu->ncells) 8446 if (qemuBuildNumaArgStr(cfg, def, cmd, qemuCaps, nodeset) < 0) 8447 goto error; 8448 So there's a chance "def->cpu == NULL"
+ for (i = 0; i < def->nmems; i++) { + char *backStr; + char *dimmStr; + + if (!(backStr = qemuBuildMemoryDimmBackendStr(def->mems[i], def, + qemuCaps, cfg)))
Because def->cpu is NULL above, Coverity points out that qemuBuildMemoryDimmBackendStr will call qemuBuildMemoryBackendStr which deref's def->cpu->cells[guestNode].memAccess John
+ goto error; + + if (!(dimmStr = qemuBuildMemoryDeviceStr(def->mems[i], qemuCaps))) { + VIR_FREE(backStr); + goto error; + } + + virCommandAddArgList(cmd, "-object", backStr, "-device", dimmStr, NULL); + + VIR_FREE(backStr); + VIR_FREE(dimmStr); + } + if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_UUID)) virCommandAddArgList(cmd, "-uuid", uuid, NULL); if (def->virtType == VIR_DOMAIN_VIRT_XEN ||

On Wed, Feb 04, 2015 at 17:28:00 -0500, John Ferlan wrote:
On 01/30/2015 08:21 AM, Peter Krempa wrote:
Add support to start qemu instance with 'pc-dimm' device. Thanks to the refactors we are able to reuse the existing function to determine the parameters. --- src/qemu/qemu_command.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_domain.c | 18 ++++++-- src/qemu/qemu_domain.h | 2 + tests/qemuxml2xmltest.c | 1 + 4 files changed, 132 insertions(+), 3 deletions(-)
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 5820fb5..7c31723 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c
...
+} + + char * qemuBuildNicStr(virDomainNetDefPtr net, const char *prefix, @@ -8351,6 +8446,25 @@ qemuBuildCommandLine(virConnectPtr conn, if (qemuBuildNumaArgStr(cfg, def, cmd, qemuCaps, nodeset) < 0) goto error;
Coverity has a FORWARD_NULL complaint... Right above this code there's:
8445 if (def->cpu && def->cpu->ncells) 8446 if (qemuBuildNumaArgStr(cfg, def, cmd, qemuCaps, nodeset) < 0) 8447 goto error; 8448
So there's a chance "def->cpu == NULL"
+ for (i = 0; i < def->nmems; i++) { + char *backStr; + char *dimmStr; + + if (!(backStr = qemuBuildMemoryDimmBackendStr(def->mems[i], def, + qemuCaps, cfg)))
Because def->cpu is NULL above, Coverity points out that qemuBuildMemoryDimmBackendStr will call qemuBuildMemoryBackendStr which deref's def->cpu->cells[guestNode].memAccess
Actually, def->cpu won't be NULL at this point as there is code that makes sure that NUMA is enabled before even letting the VM to start with memory devices so the warning should be a false positive due to a complex structure. There's a different problem though. The code is not range-checking that the memory device's guest NUMA node (passed as 'guestNode' in the code above) is actually in range of valid guest nodes. I'm going to add a check that will solve both of the problems including coverity's false positive. Thanks for the catch :). Peter

When adding devices to the definition it's useful to check whether the devices don't reside on a conflicting address. This patch adds a helper that iterates all device info and comapres the addresses with the given info. --- src/conf/domain_conf.c | 104 +++++++++++++++++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 4 ++ 2 files changed, 108 insertions(+) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 298b574..6b9c15e 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2783,6 +2783,96 @@ virDomainDeviceInfoNeedsFormat(virDomainDeviceInfoPtr info, unsigned int flags) return false; } +static bool +virDomainDeviceInfoAddressIsEqual(const virDomainDeviceInfo *a, + const virDomainDeviceInfo *b) +{ + + if (a->type != b->type) + return false; + + switch ((virDomainDeviceAddressType) a->type) { + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_LAST: + /* address types below don't have any specific data */ + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO: + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390: + break; + + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI: + /* the 'multi' field shouldn't be checked */ + if (a->addr.pci.domain != b->addr.pci.domain || + a->addr.pci.bus != b->addr.pci.bus || + a->addr.pci.slot != b->addr.pci.slot || + a->addr.pci.function != b->addr.pci.function) + return false; + break; + + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_DRIVE: + if (memcmp(&a->addr.drive, &b->addr.drive, sizeof(a->addr.drive))) + return false; + break; + + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_SERIAL: + if (memcmp(&a->addr.vioserial, &b->addr.vioserial, sizeof(a->addr.vioserial))) + return false; + break; + + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCID: + if (memcmp(&a->addr.ccid, &b->addr.ccid, sizeof(a->addr.ccid))) + return false; + break; + + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_USB: + if (memcmp(&a->addr.usb, &b->addr.usb, sizeof(a->addr.usb))) + return false; + break; + + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_SPAPRVIO: + if (memcmp(&a->addr.spaprvio, &b->addr.spaprvio, sizeof(a->addr.spaprvio))) + return false; + break; + + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW: + /* the 'assigned' field denotes that the address was generated */ + if (a->addr.ccw.cssid != b->addr.ccw.cssid || + a->addr.ccw.ssid != b->addr.ccw.ssid || + a->addr.ccw.devno != b->addr.ccw.devno) + return false; + break; + + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ISA: + if (memcmp(&a->addr.isa, &b->addr.isa, sizeof(a->addr.isa))) + return false; + break; + + case VIR_DOMAIN_DEVICE_ADDRESS_TYPE_ACPI_DIMM: + if (memcmp(&a->addr.acpiDimm, &b->addr.acpiDimm, sizeof(a->addr.acpiDimm))) + return false; + break; + + } + + return true; +} + + +static int +virDomainDeviceInfoFindByAddressIterator(virDomainDefPtr def ATTRIBUTE_UNUSED, + virDomainDeviceDefPtr dev ATTRIBUTE_UNUSED, + virDomainDeviceInfoPtr info, + void *opaque) +{ + virDomainDeviceInfoPtr needle = opaque; + + /* break iteration if the info was found */ + if (virDomainDeviceInfoAddressIsEqual(info, needle)) + return -1; + + return 0; +} + + int virDomainDeviceInfoCopy(virDomainDeviceInfoPtr dst, virDomainDeviceInfoPtr src) @@ -3043,6 +3133,20 @@ virDomainDeviceInfoIterate(virDomainDefPtr def, } +bool +virDomainDeviceInfoHasAddress(virDomainDefPtr def, + virDomainDeviceInfoPtr info) +{ + if (virDomainDeviceInfoIterateInternal(def, + virDomainDeviceInfoFindByAddressIterator, + true, + info) < 0) + return true; + + return false; +} + + static int virDomainDefRejectDuplicateControllers(virDomainDefPtr def) { diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 18cbe45..0d2b8d8 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2429,6 +2429,10 @@ int virDomainDeviceInfoIterate(virDomainDefPtr def, virDomainDeviceInfoCallback cb, void *opaque); +bool virDomainDeviceInfoHasAddress(virDomainDefPtr def, + virDomainDeviceInfoPtr info) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_RETURN_CHECK; + void virDomainDefFree(virDomainDefPtr vm); virDomainChrDefPtr virDomainChrDefNew(void); -- 2.2.2

On Fri, Jan 30, 2015 at 02:21:06PM +0100, Peter Krempa wrote:
When adding devices to the definition it's useful to check whether the devices don't reside on a conflicting address. This patch adds a helper that iterates all device info and comapres the addresses with the given info. --- src/conf/domain_conf.c | 104 +++++++++++++++++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 4 ++ 2 files changed, 108 insertions(+)
ACK Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Add a few helpers that allow to operate with memory device definitions on the domain config and use them to implement memory device coldplug in the qemu driver. --- src/conf/domain_conf.c | 101 +++++++++++++++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 10 +++++ src/libvirt_private.syms | 4 ++ src/qemu/qemu_driver.c | 15 ++++++- 4 files changed, 128 insertions(+), 2 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 6b9c15e..3c33f5a 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -12416,6 +12416,107 @@ virDomainChrRemove(virDomainDefPtr vmdef, return ret; } + +static int +virDomainMemoryFindByDefInternal(virDomainDefPtr def, + virDomainMemoryDefPtr mem, + bool allowAddressFallback) +{ + size_t i; + + for (i = 0; i < def->nmems; i++) { + virDomainMemoryDefPtr tmp = def->mems[i]; + + /* address, if present */ + if (allowAddressFallback) { + if (tmp->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) + continue; + } else { + if (mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && + !virDomainDeviceInfoAddressIsEqual(&tmp->info, &mem->info)) + continue; + } + + /* alias, if present */ + if (mem->info.alias && + STRNEQ_NULLABLE(tmp->info.alias, mem->info.alias)) + continue; + + /* target info -> always present */ + if (tmp->model != mem->model || + tmp->targetNode != mem->targetNode || + tmp->size != mem->size) + continue; + + /* source stuff -> match with device */ + if (tmp->pagesize != mem->pagesize) + continue; + + if (!virBitmapEqual(tmp->sourceNodes, mem->sourceNodes)) + continue; + + break; + } + + if (i == def->nmems) + return -1; + + return i; +} + + +int +virDomainMemoryFindByDef(virDomainDefPtr def, + virDomainMemoryDefPtr mem) +{ + return virDomainMemoryFindByDefInternal(def, mem, false); +} + + +int +virDomainMemoryFindInactiveByDef(virDomainDefPtr def, + virDomainMemoryDefPtr mem) +{ + int ret; + + if ((ret = virDomainMemoryFindByDefInternal(def, mem, false)) < 0) + ret = virDomainMemoryFindByDefInternal(def, mem, true); + + return ret; +} + + +int +virDomainMemoryInsert(virDomainDefPtr def, + virDomainMemoryDefPtr mem) +{ + int id = def->nmems; + + if (mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE && + virDomainDeviceInfoHasAddress(def, &mem->info)) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("Domain already contains a device with the same " + "address")); + return -1; + } + + if (VIR_APPEND_ELEMENT(def->mems, def->nmems, mem) < 0) + return -1; + + return id; +} + + +virDomainMemoryDefPtr +virDomainMemoryRemove(virDomainDefPtr def, + int idx) +{ + virDomainMemoryDefPtr ret = def->mems[idx]; + VIR_DELETE_ELEMENT(def->mems, idx, def->nmems); + return ret; +} + + char * virDomainDefGetDefaultEmulator(virDomainDefPtr def, virCapsPtr caps) diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 0d2b8d8..924fa4e 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2824,6 +2824,16 @@ virDomainChrDefGetSecurityLabelDef(virDomainChrDefPtr def, const char *model); typedef const char* (*virEventActionToStringFunc)(int type); typedef int (*virEventActionFromStringFunc)(const char *type); +int virDomainMemoryInsert(virDomainDefPtr def, virDomainMemoryDefPtr mem) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_RETURN_CHECK; +virDomainMemoryDefPtr virDomainMemoryRemove(virDomainDefPtr def, int idx) + ATTRIBUTE_NONNULL(1); +int virDomainMemoryFindByDef(virDomainDefPtr def, virDomainMemoryDefPtr mem) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_RETURN_CHECK; +int virDomainMemoryFindInactiveByDef(virDomainDefPtr def, + virDomainMemoryDefPtr mem) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_RETURN_CHECK; + VIR_ENUM_DECL(virDomainTaint) VIR_ENUM_DECL(virDomainVirt) VIR_ENUM_DECL(virDomainBoot) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index a3d1815..9723878 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -327,6 +327,10 @@ virDomainLockFailureTypeToString; virDomainMemballoonModelTypeFromString; virDomainMemballoonModelTypeToString; virDomainMemoryDefFree; +virDomainMemoryFindByDef; +virDomainMemoryFindInactiveByDef; +virDomainMemoryInsert; +virDomainMemoryRemove; virDomainNetAppendIpAddress; virDomainNetDefFormat; virDomainNetDefFree; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index df23aaa..48b852a 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -7345,7 +7345,10 @@ qemuDomainAttachDeviceConfig(virQEMUCapsPtr qemuCaps, break; case VIR_DOMAIN_DEVICE_MEMORY: - /* TODO: implement later */ + if (virDomainMemoryInsert(vmdef, dev->data.memory) < 0) + return -1; + dev->data.memory = NULL; + break; case VIR_DOMAIN_DEVICE_INPUT: case VIR_DOMAIN_DEVICE_SOUND: @@ -7464,7 +7467,15 @@ qemuDomainDetachDeviceConfig(virDomainDefPtr vmdef, break; case VIR_DOMAIN_DEVICE_MEMORY: - /* TODO: implement later */ + if ((idx = virDomainMemoryFindInactiveByDef(vmdef, + dev->data.memory)) < 0) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + _("matching memory device was not found")); + return -1; + } + + virDomainMemoryDefFree(virDomainMemoryRemove(vmdef, idx)); + break; case VIR_DOMAIN_DEVICE_INPUT: case VIR_DOMAIN_DEVICE_SOUND: -- 2.2.2

Add code to hot-add memory devices to running qemu instances. --- src/qemu/qemu_command.c | 4 +-- src/qemu/qemu_command.h | 15 +++++++++ src/qemu/qemu_driver.c | 5 ++- src/qemu/qemu_hotplug.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_hotplug.h | 3 ++ 5 files changed, 110 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 7c31723..bb55deb 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -4528,7 +4528,7 @@ qemuBuildControllerDevStr(virDomainDefPtr domainDef, * other configuration was used (to detect legacy configurations). Returns * -1 in case of an error. */ -static int +int qemuBuildMemoryBackendStr(unsigned long long size, unsigned long long pagesize, int guestNode, @@ -4790,7 +4790,7 @@ qemuBuildMemoryDimmBackendStr(virDomainMemoryDefPtr mem, } -static char * +char * qemuBuildMemoryDeviceStr(virDomainMemoryDefPtr mem, virQEMUCapsPtr qemuCaps) { diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index ae36bd8..f77c338 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -162,6 +162,21 @@ char *qemuBuildSoundDevStr(virDomainDefPtr domainDef, virDomainSoundDefPtr sound, virQEMUCapsPtr qemuCaps); +int qemuBuildMemoryBackendStr(unsigned long long size, + unsigned long long pagesize, + int guestNode, + virBitmapPtr userNodeset, + virBitmapPtr autoNodeset, + virDomainDefPtr def, + virQEMUCapsPtr qemuCaps, + virQEMUDriverConfigPtr cfg, + const char **backendType, + virJSONValuePtr *backendProps, + bool force); + +char *qemuBuildMemoryDeviceStr(virDomainMemoryDefPtr mem, + virQEMUCapsPtr qemuCaps); + /* Legacy, pre device support */ char *qemuBuildPCIHostdevPCIDevStr(virDomainHostdevDefPtr dev, virQEMUCapsPtr qemuCaps); diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 48b852a..92e4b68 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -7020,8 +7020,11 @@ qemuDomainAttachDeviceLive(virDomainObjPtr vm, dev->data.chr = NULL; break; - /*TODO: implement later */ case VIR_DOMAIN_DEVICE_MEMORY: + ret = qemuDomainAttachMemory(driver, vm, + dev->data.memory); + dev->data.memory = NULL; + break; case VIR_DOMAIN_DEVICE_NONE: case VIR_DOMAIN_DEVICE_FS: diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 3a15e29..f8b427d 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -1569,6 +1569,92 @@ int qemuDomainAttachChrDevice(virQEMUDriverPtr driver, return ret; } + +int +qemuDomainAttachMemory(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainMemoryDefPtr mem) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); + char *devstr = NULL; + char *objalias = NULL; + const char *backendType; + virJSONValuePtr props = NULL; + int id; + int ret = -1; + + if (virAsprintf(&mem->info.alias, "dimm%zu", vm->def->nmems) < 0) + goto cleanup; + + if (virAsprintf(&objalias, "mem%s", mem->info.alias) < 0) + goto cleanup; + + if (!(devstr = qemuBuildMemoryDeviceStr(mem, priv->qemuCaps))) + goto cleanup; + + qemuDomainMemoryDeviceAlignSize(mem); + + if (qemuBuildMemoryBackendStr(mem->size, mem->pagesize, + mem->targetNode, mem->sourceNodes, NULL, + vm->def, priv->qemuCaps, cfg, + &backendType, &props, true) < 0) + goto cleanup; + + if (virDomainMemoryInsert(vm->def, mem) < 0) { + virJSONValueFree(props); + goto cleanup; + } + + qemuDomainObjEnterMonitor(driver, vm); + if (qemuMonitorAddObject(priv->mon, backendType, objalias, props) < 0) + goto removedef; + + if (qemuMonitorAddDevice(priv->mon, devstr) < 0) { + virErrorPtr err = virSaveLastError(); + ignore_value(qemuMonitorDelObject(priv->mon, objalias)); + virSetError(err); + virFreeError(err); + goto removedef; + } + + if (qemuDomainObjExitMonitor(driver, vm) < 0) { + /* we shouldn't touch mem now, as the def might be freed */ + mem = NULL; + goto cleanup; + } + + /* mem is consumed by vm->def */ + mem = NULL; + + /* this step is best effort, removing the device would be so much trouble */ + ignore_value(qemuDomainUpdateMemoryDeviceInfo(driver, vm, + QEMU_ASYNC_JOB_NONE)); + + ret = 0; + + cleanup: + virObjectUnref(cfg); + VIR_FREE(devstr); + VIR_FREE(objalias); + virDomainMemoryDefFree(mem); + return ret; + + removedef: + if (qemuDomainObjExitMonitor(driver, vm) < 0) { + mem = NULL; + goto cleanup; + } + + if ((id = virDomainMemoryFindByDef(vm->def, mem)) >= 0) + mem = virDomainMemoryRemove(vm->def, id); + else + mem = NULL; + + goto cleanup; +} + + static int qemuDomainAttachHostUSBDevice(virQEMUDriverPtr driver, virDomainObjPtr vm, diff --git a/src/qemu/qemu_hotplug.h b/src/qemu/qemu_hotplug.h index a30788d..c14fa0b 100644 --- a/src/qemu/qemu_hotplug.h +++ b/src/qemu/qemu_hotplug.h @@ -57,6 +57,9 @@ int qemuDomainAttachHostDevice(virConnectPtr conn, virDomainHostdevDefPtr hostdev); int qemuDomainFindGraphicsIndex(virDomainDefPtr def, virDomainGraphicsDefPtr dev); +int qemuDomainAttachMemory(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainMemoryDefPtr mem); int qemuDomainChangeGraphics(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainGraphicsDefPtr dev); -- 2.2.2

Add code to hot-remove memory devices from qemu. Unfortunately QEMU doesn't support this right now, so this is just for completenes. --- src/qemu/qemu_driver.c | 4 ++- src/qemu/qemu_hotplug.c | 91 ++++++++++++++++++++++++++++++++++++++++++++++++- src/qemu/qemu_hotplug.h | 3 ++ 3 files changed, 96 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 92e4b68..3a83323 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -7102,7 +7102,9 @@ qemuDomainDetachDeviceLive(virDomainObjPtr vm, ret = qemuDomainDetachChrDevice(driver, vm, dev->data.chr); break; case VIR_DOMAIN_DEVICE_MEMORY: - /* TODO: Implement later */ + ret = qemuDomainDetachMemoryDevice(driver, vm, dev->data.memory); + break; + case VIR_DOMAIN_DEVICE_FS: case VIR_DOMAIN_DEVICE_INPUT: case VIR_DOMAIN_DEVICE_SOUND: diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index f8b427d..c9e74b8 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -2715,6 +2715,44 @@ qemuDomainRemoveControllerDevice(virQEMUDriverPtr driver, } +static int +qemuDomainRemoveMemoryDevice(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainMemoryDefPtr mem) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virObjectEventPtr event; + char *backendAlias = NULL; + int rc; + int idx; + + VIR_DEBUG("Removing memory device %s from domain %p %s", + mem->info.alias, vm, vm->def->name); + + if ((event = virDomainEventDeviceRemovedNewFromObj(vm, mem->info.alias))) + qemuDomainEventQueue(driver, event); + + if (virAsprintf(&backendAlias, "mem%s", mem->info.alias) < 0) + goto error; + + qemuDomainObjEnterMonitor(driver, vm); + rc = qemuMonitorDelObject(priv->mon, backendAlias); + if (qemuDomainObjExitMonitor(driver, vm) < 0 || rc < 0) + goto error; + + if ((idx = virDomainMemoryFindByDef(vm->def, mem)) >= 0) + virDomainMemoryRemove(vm->def, idx); + + virDomainMemoryDefFree(mem); + VIR_FREE(backendAlias); + return 0; + + error: + VIR_FREE(backendAlias); + return -1; +} + + static void qemuDomainRemovePCIHostDevice(virQEMUDriverPtr driver, virDomainObjPtr vm, @@ -2997,8 +3035,9 @@ qemuDomainRemoveDevice(virQEMUDriverPtr driver, ret = qemuDomainRemoveChrDevice(driver, vm, dev->data.chr); break; - /* TODO: implement later */ case VIR_DOMAIN_DEVICE_MEMORY: + ret = qemuDomainRemoveMemoryDevice(driver, vm, dev->data.memory); + break; case VIR_DOMAIN_DEVICE_NONE: case VIR_DOMAIN_DEVICE_LEASE: @@ -3900,3 +3939,53 @@ int qemuDomainDetachChrDevice(virQEMUDriverPtr driver, VIR_FREE(devstr); return ret; } + + +int +qemuDomainDetachMemoryDevice(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainMemoryDefPtr memdef) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + virDomainMemoryDefPtr mem; + int idx; + int rc; + int ret = -1; + + if ((idx = virDomainMemoryFindByDef(vm->def, memdef)) < 0) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("device not present in domain configuration")); + return -1; + } + + mem = vm->def->mems[idx]; + + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DEVICE)) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("qemu does not support -device")); + return -1; + } + + if (!mem->info.alias) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("alias for the memory device was not found")); + return -1; + } + + qemuDomainMarkDeviceForRemoval(vm, &mem->info); + + qemuDomainObjEnterMonitor(driver, vm); + rc = qemuMonitorDelDevice(priv->mon, mem->info.alias); + if (qemuDomainObjExitMonitor(driver, vm) < 0 || rc < 0) + goto cleanup; + + rc = qemuDomainWaitForDeviceRemoval(vm); + if (rc == 0 || rc == 1) + ret = qemuDomainRemoveMemoryDevice(driver, vm, mem); + else + ret = 0; + + cleanup: + qemuDomainResetDeviceRemoval(vm); + return ret; +} diff --git a/src/qemu/qemu_hotplug.h b/src/qemu/qemu_hotplug.h index c14fa0b..34a0793 100644 --- a/src/qemu/qemu_hotplug.h +++ b/src/qemu/qemu_hotplug.h @@ -60,6 +60,9 @@ int qemuDomainFindGraphicsIndex(virDomainDefPtr def, int qemuDomainAttachMemory(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainMemoryDefPtr mem); +int qemuDomainDetachMemoryDevice(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainMemoryDefPtr memdef); int qemuDomainChangeGraphics(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainGraphicsDefPtr dev); -- 2.2.2

On Fri, Jan 30, 2015 at 02:20:57PM +0100, Peter Krempa wrote:
!! this series applies on top of the cleanup series posted earlier !!
Hi, this is my try to implement memory hotplug in libvirt (for the qemu) driver. This series is almost code-complete but is lacking tests and docs as I wanted to agree on design first before attempting to write the documentation.
Additionally this series is also lacking code that would fix memballoon handling but I'm waiting on a clousure how it's supposed to work from qemu's side as it appears to be broken there too.
The basic XML used to add a memory device: <memory model='acpi-dimm'> <target> <size unit='KiB'>524287</size> <node>0</node> </target> </memory>
The <target> subelement is mandatory, whereas the <source> subelement (that contains source numa nodes, source page size ) is optional and is inferred from the NUMA tuning for given target numa node.
Please note that at least one guest numa node has to be configured for the guest for this to work (limitation of qemu).
What's missing in this series: - tests - docs - commit message touch-up - code to audit the memory size changes - code to make memory balloning working with correct size
Looks good to me, but you might want another opinion on the XML format. Jan

On Fri, Jan 30, 2015 at 02:20:57PM +0100, Peter Krempa wrote:
!! this series applies on top of the cleanup series posted earlier !!
Hi, this is my try to implement memory hotplug in libvirt (for the qemu) driver. This series is almost code-complete but is lacking tests and docs as I wanted to agree on design first before attempting to write the documentation.
Additionally this series is also lacking code that would fix memballoon handling but I'm waiting on a clousure how it's supposed to work from qemu's side as it appears to be broken there too.
The basic XML used to add a memory device: <memory model='acpi-dimm'> <target> <size unit='KiB'>524287</size> <node>0</node> </target> </memory>
The <target> subelement is mandatory, whereas the <source> subelement (that contains source numa nodes, source page size ) is optional and is inferred from the NUMA tuning for given target numa node.
Our XML sure is getting verbose for complex guests using all features :-)
From my cursory review, I think the design is basically right. The only real questions I'll have are around the semantics of existing virDomain* APIs that report memory information.
Please note that at least one guest numa node has to be configured for the guest for this to work (limitation of qemu).
I think that's a perfectly reasonable requirement. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 01/30/2015 08:20 AM, Peter Krempa wrote:
!! this series applies on top of the cleanup series posted earlier !!
Hi, this is my try to implement memory hotplug in libvirt (for the qemu) driver. This series is almost code-complete but is lacking tests and docs as I wanted to agree on design first before attempting to write the documentation.
Additionally this series is also lacking code that would fix memballoon handling but I'm waiting on a clousure how it's supposed to work from qemu's side as it appears to be broken there too.
The basic XML used to add a memory device: <memory model='acpi-dimm'> <target> <size unit='KiB'>524287</size> <node>0</node> </target> </memory>
The <target> subelement is mandatory, whereas the <source> subelement (that contains source numa nodes, source page size ) is optional and is inferred from the NUMA tuning for given target numa node.
Please note that at least one guest numa node has to be configured for the guest for this to work (limitation of qemu).
Been kind of back-burner thinking about this series overall for the last couple of days. After reading Daniel's review - I think he touched on a few things I was pondering. Since I haven't followed the upstream QEMU work on this - I assume it's following that work very closely. I also assume other hypervisor's would/could be interested in providing similar functionality (hot plug memory) - I'm left wondering - is the model chosen flexible enough for that? You certainly will need a "healthy dose" of documentation to describe how to configure values properly. Perhaps even an extra utility to output what's available to be provided for the XML. The one area that seems to be very QEMU implementation specific is the "<address type='acpi-dimm' slot='#' base='0xHEX'/>" XML. Maybe that becomes a "<driver name='qemu' slot='#' base='0xHEX'/>" instead leaving "<address type='acpi-dimm'/>" or even empty? Allowing users to provide "base" implies either extra validation to ensure it's 'correct' (format & usage/duplication) or documentation to describe what/how to provide the data. Perhaps even an extra utility to provide "valid values". If QEMU is expecting specific values - maybe it'd be in our best interests not to expect the values. I suppose I'm having a difficult time visualizing the values and what they express. Also what if someone attempts to "poor man's clone" a live guest (dumpxml > file, edit file to change name, remove/change uuid, change disk source data, etc., then start). If they don't change those addresses - what will happen? I would think it would be "expected of" libvirt to ensure what's passed to QEMU is "uniquely expressed" anyway, right? What's still not clear to me is the relationship between 'acpi-dimm' and NUMA slots/nodes. FWIW: As an aside, after patch8, it would be beneficial to update qemuxml2argv-memory-hotplug.args in order to include the expected output of the "pc-dimm,node=%d,memdev=mem%s,id=%s" and "-object xxx -device xxxx"... A tangent... When I see 'acpi-' in the XML - I think firmware or some external memory device which firmware needs to find/recognize... Long ago I worked on a project that used a memory device as a cluster interconnect (memory channel). A "practical use" in a virtual world could be as an interconnect for migrations. This is what I thought of when I saw a <memory> "device" with 'acpi-...' as opposed to a memory device that's providing host NUMA nodes/slots to the guest as extra available hot-plug memory. John
What's missing in this series: - tests - docs - commit message touch-up - code to audit the memory size changes - code to make memory balloning working with correct size
Peter Krempa (12): qemu: caps: Add capability bit for the "pc-dimm" device conf: Add support for parsing and formatting max memory and slot count qemu: Implement setup of memory hotplug parameters conf: Add device address type for dimm devices conf: Add interface to parse and format memory device information qemu: memdev: Add infrastructure to load memory device information qemu: migration: Forbid migration with memory modules lacking info qemu: add support for memory devices conf: Introduce helper to find duplicate device address qemu: conf: Add support for memory device cold(un)plug qemu: Implement memory device hotplug qemu: Implement memory device hotunplug
docs/formatdomain.html.in | 19 + docs/schemas/domaincommon.rng | 76 +++ src/bhyve/bhyve_domain.c | 9 +- src/conf/domain_conf.c | 655 ++++++++++++++++++++- src/conf/domain_conf.h | 63 ++ src/libvirt_private.syms | 7 + src/libxl/libxl_domain.c | 8 + src/lxc/lxc_domain.c | 8 + src/openvz/openvz_driver.c | 14 +- src/parallels/parallels_driver.c | 6 +- src/phyp/phyp_driver.c | 6 +- src/qemu/qemu_capabilities.c | 2 + src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 157 ++++- src/qemu/qemu_command.h | 15 + src/qemu/qemu_domain.c | 58 ++ src/qemu/qemu_domain.h | 6 + src/qemu/qemu_driver.c | 29 + src/qemu/qemu_hotplug.c | 178 ++++++ src/qemu/qemu_hotplug.h | 6 + src/qemu/qemu_migration.c | 14 + src/qemu/qemu_monitor.c | 45 ++ src/qemu/qemu_monitor.h | 14 + src/qemu/qemu_monitor_json.c | 116 ++++ src/qemu/qemu_monitor_json.h | 5 + src/qemu/qemu_process.c | 4 + src/uml/uml_driver.c | 9 +- src/vbox/vbox_common.c | 6 +- src/vmware/vmware_driver.c | 6 +- src/vmx/vmx.c | 6 +- src/xen/xen_driver.c | 7 + src/xenapi/xenapi_driver.c | 9 +- tests/domainschemadata/maxMemory.xml | 19 + tests/qemucapabilitiesdata/caps_2.1.1-1.caps | 1 + .../qemuxml2argv-memory-hotplug-dimm.xml | 47 ++ .../qemuxml2argv-memory-hotplug-nonuma.xml | 22 + .../qemuxml2argv-memory-hotplug.args | 6 + .../qemuxml2argv-memory-hotplug.xml | 34 ++ tests/qemuxml2argvtest.c | 4 + tests/qemuxml2xmltest.c | 4 + 40 files changed, 1682 insertions(+), 19 deletions(-) create mode 100644 tests/domainschemadata/maxMemory.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-dimm.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nonuma.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug.xml

On Mon, Feb 09, 2015 at 08:38:52 -0500, John Ferlan wrote:
On 01/30/2015 08:20 AM, Peter Krempa wrote:
!! this series applies on top of the cleanup series posted earlier !!
Hi, this is my try to implement memory hotplug in libvirt (for the qemu) driver. This series is almost code-complete but is lacking tests and docs as I wanted to agree on design first before attempting to write the documentation.
Additionally this series is also lacking code that would fix memballoon handling but I'm waiting on a clousure how it's supposed to work from qemu's side as it appears to be broken there too.
The basic XML used to add a memory device: <memory model='acpi-dimm'> <target> <size unit='KiB'>524287</size> <node>0</node> </target> </memory>
The <target> subelement is mandatory, whereas the <source> subelement (that contains source numa nodes, source page size ) is optional and is inferred from the NUMA tuning for given target numa node.
Please note that at least one guest numa node has to be configured for the guest for this to work (limitation of qemu).
Been kind of back-burner thinking about this series overall for the last couple of days. After reading Daniel's review - I think he touched on a few things I was pondering. Since I haven't followed the upstream QEMU work on this - I assume it's following that work very closely. I also assume other hypervisor's would/could be interested in providing similar functionality (hot plug memory) - I'm left wondering - is the model chosen flexible enough for that? You certainly will need a "healthy dose" of documentation to describe how to configure values properly. Perhaps even an extra utility to output what's available to be provided for the XML.
The one area that seems to be very QEMU implementation specific is the "<address type='acpi-dimm' slot='#' base='0xHEX'/>" XML. Maybe that becomes a "<driver name='qemu' slot='#' base='0xHEX'/>" instead leaving "<address type='acpi-dimm'/>" or even empty? Allowing users to provide "base" implies either extra validation to ensure it's 'correct' (format & usage/duplication) or documentation to describe what/how to provide the data. Perhaps even an extra utility to provide "valid values". If
Firstly, the address is by default generated automatically by qemu. We represent it only for migration purposes so that the address doesn't change in that case. The user ideally shouldn't provide any address and let the hypervisor assign the module itself. That is the main reason I chose to do it via the <address> element. Users usually don't touch those while having the slot and base in the device itself would likely incline the users to provide some values. We can also make the address type qemu specific, but I still think that the right place for the slot and base data is the address element. Secondly, for validating of the actual data, we use qemu. The data is passed through to the emulator and if it is incorrect it will either be tweaked to the actual value used or rejected. We could (and I probably will add this check in the next version) check the slot ID as it has to be less than the maximum count configured at startup. For the address qemu is doing alignment and other checks that I don't want to represent in libvirt as it would be mostly black magic and require us to mirror qemu. Basically the user shouldn't touch the base address ... ever.
QEMU is expecting specific values - maybe it'd be in our best interests not to expect the values. I suppose I'm having a difficult time visualizing the values and what they express. Also what if someone attempts to "poor man's clone" a live guest (dumpxml > file, edit file to change name, remove/change uuid, change disk source data, etc., then start). If they don't change those addresses - what will happen? I
That is actually exactly what they should do. The address represents the start address in the physical address space in the guest where the memory module is placed at thus it can be duplicated.
would think it would be "expected of" libvirt to ensure what's passed to QEMU is "uniquely expressed" anyway, right? What's still not clear to me is the relationship between 'acpi-dimm' and NUMA slots/nodes.
FWIW: As an aside, after patch8, it would be beneficial to update qemuxml2argv-memory-hotplug.args in order to include the expected output of the "pc-dimm,node=%d,memdev=mem%s,id=%s" and "-object xxx -device xxxx"...
Tests are still missing for this series. I'll be tweaking them before the non-RFC posting.
A tangent... When I see 'acpi-' in the XML - I think firmware or some external memory device which firmware needs to find/recognize... Long
Well it actually is this case. If a piece of the unused physicall address space would become additional memory, then without any interaction between the hardware (in this case qemu) and the guest operating system the kernel wouldn't be able to guess that some memory appeared and is free to use. ACPI in this case is used to notify the guest kernel of the new memory that appeared.
ago I worked on a project that used a memory device as a cluster interconnect (memory channel). A "practical use" in a virtual world could be as an interconnect for migrations. This is what I thought of when I saw a <memory> "device" with 'acpi-...' as opposed to a memory device that's providing host NUMA nodes/slots to the guest as extra available hot-plug memory.
That sounds more like RDMA-migration to me.
John
Peter
participants (5)
-
Daniel P. Berrange
-
Eric Blake
-
John Ferlan
-
Ján Tomko
-
Peter Krempa