On 03/07/2017 05:01 PM, John Ferlan wrote:
On 02/27/2017 08:19 AM, Michal Privoznik wrote:
> NVDIMM is new type of memory introduced into QEMU 2.6. The idea
> is that we have a Non-Volatile memory module that keeps the data
> persistent across domain reboots.
>
> At the domain XML level, we already have some representation of
> 'dimm' modules. Long story short, we have <memory/> element that
> lives under <devices/>. Now, the element even has @model
> attribute which we can use to introduce new memory type:
Starting with "Long story short..."
how about instead:
NVDIMM will utilize the existing <memory/> element that lives under
<devices/> by adding a new attribute 'nvdimm' to the existing @model and
introduce a new <path/> element for <source/> while reusing other
fields. The resulting XML would appear as:
>
> <memory model='nvdimm'>
> <source>
> <path>/tmp/nvdimm</path>
> </source>
> <target>
> <size unit='KiB'>523264</size>
> <node>0</node>
> </target>
> <address type='dimm' slot='0'/>
> </memory>
>
> So far, this is just a XML parser/formatter extension. QEMU
> driver implementation is in the next commit.
>
> For more info on NVDIMM visit the following web page:
>
>
http://pmem.io/
Are there "limitations" to the number of nvdimm's that could be
supported in a guest? Mostly curious, but it's one of those I thought I
read something type, but cannot remember where type things...
Not at this level. I mean, there are no limitations in the NVDIM
specification. There are some on qemu level, i.e. the minimum for
label-size of NVDIMM is 128KiB, and the remaining size (total - label)
should be aligned to 4k. But if we are gonna check this we should do so
at qemu level, not here.
https://www.redhat.com/archives/libvir-list/2016-August/msg00650.html
>
> Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
> ---
> docs/formatdomain.html.in | 55 ++++++++----
> docs/schemas/domaincommon.rng | 32 ++++---
> src/conf/domain_conf.c | 97 ++++++++++++++++------
> src/conf/domain_conf.h | 2 +
> src/qemu/qemu_command.c | 6 ++
> src/qemu/qemu_domain.c | 5 ++
> .../qemuxml2argv-memory-hotplug-nvdimm.xml | 56 +++++++++++++
> .../qemuxml2xmlout-memory-hotplug-nvdimm.xml | 1 +
> tests/qemuxml2xmltest.c | 1 +
> 9 files changed, 204 insertions(+), 51 deletions(-)
> create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nvdimm.xml
> create mode 120000
tests/qemuxml2xmloutdata/qemuxml2xmlout-memory-hotplug-nvdimm.xml
>
> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> index 02ce7924c..b76905cdc 100644
> --- a/docs/formatdomain.html.in
> +++ b/docs/formatdomain.html.in
> @@ -7007,7 +7007,6 @@ qemu-kvm -net nic,model=? /dev/null
> guests' memory resource needs.
>
> Some hypervisors may require NUMA configured for the guest.
> - <span class="since">Since 1.2.14</span>
> </p>
>
> <p>
> @@ -7032,6 +7031,15 @@ qemu-kvm -net nic,model=? /dev/null
> <node>1</node>
> </target>
> </memory>
> + <memory model='nvdimm'>
> + <source>
> + <path>/tmp/nvdimm</path>
> + </source>
> + <target>
> + <size unit='KiB'>524288</size>
> + <node>1</node>
> + </target>
> + </memory>
> </devices>
> ...
> </pre>
> @@ -7039,28 +7047,47 @@ qemu-kvm -net nic,model=? /dev/null
> <dt><code>model</code></dt>
> <dd>
> <p>
> - Currently only the <code>dimm</code> model is supported in
order to
> - add a virtual DIMM module to the guest.
> + Provide <code>dimm</code> to add a virtual DIMM module to the
guest.
> + <span class="since">Since 1.2.14</span>
> + Provide <code>nvdimm</code> model adds a Non-Volatile DIMM
> + module. <span class="since">Since 3.1.0</span>
will be at least 3.2.0
> </p>
> </dd>
>
> <dt><code>source</code></dt>
> <dd>
> <p>
> - The optional source element allows to fine tune the source of the
> - memory used for the given memory device. If the element is not
> - provided defaults configured via <code>numatune</code> are
used.
> + For model <code>dimm</code> this element is optional and
allows to
> + fine tune the source of the memory used for the given memory device.
> + If the element is not provided defaults configured via
> + <code>numatune</code> are used. If
<code>dimm</code> is provided,
> + then the following optional elements can be provided as well:
> </p>
> - <p>
> - <code>pagesize</code> can optionally be used to override the
default
> - host page size used for backing the memory device.
>
> - The configured value must correspond to a page size supported by the
> - host.
> - </p>
> + <dl>
> + <dt><code>pagesize</code></dt>
> + <dd>
> + <p>
> + This element can optionally be used to override the default
I think at this point optionally is redundant since the above says "the
following optional elements":
> + host page size used for backing the memory device.
> + The configured value must correspond to a page size supported by the
> + host.
> + </p>
> + </dd>
> +
> + <dt><code>nodemask</code></dt>
> + <dd>
> + <p>
> + This element can optionally be used to override the default
same w/r/t optionally
> + set of NUMA nodes where the memory would be allocated.
> + </p>
> + </dd>
> + </dl>
> +
> <p>
> - <code>nodemask</code> can optionally be used to override the
default
> - set of NUMA nodes where the memory would be allocated.
> + For model <code>nvdimm</code> this element is mandatory and
has a
> + single child element <code>path</code> that represents a path
> + in the host that backs the nvdimm module in the guest.
> </p>
> </dd>
>
> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
> index c64544ac4..fafd3e982 100644
> --- a/docs/schemas/domaincommon.rng
> +++ b/docs/schemas/domaincommon.rng
> @@ -4737,6 +4737,7 @@
> <attribute name="model">
> <choice>
> <value>dimm</value>
> + <value>nvdimm</value>
> </choice>
> </attribute>
> <interleave>
> @@ -4756,18 +4757,27 @@
>
> <define name="memorydev-source">
> <element name="source">
> - <interleave>
> - <optional>
> - <element name="pagesize">
> - <ref name="scaledInteger"/>
> + <choice>
> + <group>
> + <interleave>
> + <optional>
> + <element name="pagesize">
> + <ref name="scaledInteger"/>
> + </element>
> + </optional>
> + <optional>
> + <element name="nodemask">
> + <ref name="cpuset"/>
> + </element>
> + </optional>
> + </interleave>
> + </group>
> + <group>
> + <element name="path">
> + <text/>
I would think this should be:
<ref name="absFilePath"/>
> </element>
> - </optional>
> - <optional>
> - <element name="nodemask">
> - <ref name="cpuset"/>
> - </element>
> - </optional>
> - </interleave>
> + </group>
> + </choice>
> </element>
> </define>
>
> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> index 97d42fe99..4ffca7dc8 100644
> --- a/src/conf/domain_conf.c
> +++ b/src/conf/domain_conf.c
> @@ -858,8 +858,11 @@ VIR_ENUM_DECL(virDomainBlockJob)
> VIR_ENUM_IMPL(virDomainBlockJob, VIR_DOMAIN_BLOCK_JOB_TYPE_LAST,
> "", "", "copy", "",
"active-commit")
>
> -VIR_ENUM_IMPL(virDomainMemoryModel, VIR_DOMAIN_MEMORY_MODEL_LAST,
> - "", "dimm")
> +VIR_ENUM_IMPL(virDomainMemoryModel,
> + VIR_DOMAIN_MEMORY_MODEL_LAST,
> + "",
> + "dimm",
> + "nvdimm")
>
> VIR_ENUM_IMPL(virDomainShmemModel, VIR_DOMAIN_SHMEM_MODEL_LAST,
> "ivshmem",
> @@ -2418,6 +2421,7 @@ void virDomainMemoryDefFree(virDomainMemoryDefPtr def)
> if (!def)
> return;
>
> + VIR_FREE(def->path);
> virBitmapFree(def->sourceNodes);
> virDomainDeviceInfoClear(&def->info);
> VIR_FREE(def);
> @@ -13755,20 +13759,36 @@ virDomainMemorySourceDefParseXML(xmlNodePtr node,
> xmlNodePtr save = ctxt->node;
> ctxt->node = node;
>
> - if (virDomainParseMemory("./pagesize", "./pagesize/@unit",
ctxt,
> - &def->pagesize, false, false) < 0)
> - goto cleanup;
> -
> - if ((nodemask = virXPathString("string(./nodemask)", ctxt))) {
> - if (virBitmapParse(nodemask, &def->sourceNodes,
> - VIR_DOMAIN_CPUMASK_LEN) < 0)
> + switch ((virDomainMemoryModel) def->model) {
> + case VIR_DOMAIN_MEMORY_MODEL_DIMM:
> + if (virDomainParseMemory("./pagesize",
"./pagesize/@unit", ctxt,
> + &def->pagesize, false, false) < 0)
> goto cleanup;
>
> - if (virBitmapIsAllClear(def->sourceNodes)) {
> - virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> - _("Invalid value of 'nodemask': %s"),
nodemask);
> + if ((nodemask = virXPathString("string(./nodemask)", ctxt))) {
> + if (virBitmapParse(nodemask, &def->sourceNodes,
> + VIR_DOMAIN_CPUMASK_LEN) < 0)
> + goto cleanup;
> +
> + if (virBitmapIsAllClear(def->sourceNodes)) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> + _("Invalid value of 'nodemask':
%s"), nodemask);
> + goto cleanup;
> + }
> + }
> + break;
> +
> + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
> + if (!(def->path = virXPathString("string(./path)", ctxt))) {
> + virReportError(VIR_ERR_XML_DETAIL, "%s",
> + _("path is required for model nvdimm'"));
> goto cleanup;
> }
> + break;
> +
> + case VIR_DOMAIN_MEMORY_MODEL_NONE:
> + case VIR_DOMAIN_MEMORY_MODEL_LAST:
> + break;
> }
>
> ret = 0;
> @@ -15173,12 +15193,25 @@ virDomainMemoryFindByDefInternal(virDomainDefPtr def,
> tmp->size != mem->size)
> continue;
>
> - /* source stuff -> match with device */
> - if (tmp->pagesize != mem->pagesize)
> - continue;
> + switch ((virDomainMemoryModel) mem->model) {
> + case VIR_DOMAIN_MEMORY_MODEL_DIMM:
> + /* source stuff -> match with device */
> + if (tmp->pagesize != mem->pagesize)
> + continue;
>
> - if (!virBitmapEqual(tmp->sourceNodes, mem->sourceNodes))
> - continue;
> + if (!virBitmapEqual(tmp->sourceNodes, mem->sourceNodes))
> + continue;
> + break;
> +
> + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
> + if (STRNEQ(tmp->path, mem->path))
> + continue;
> + break;
> +
> + case VIR_DOMAIN_MEMORY_MODEL_NONE:
> + case VIR_DOMAIN_MEMORY_MODEL_LAST:
> + break;
> + }
>
> break;
> }
> @@ -22571,23 +22604,35 @@ virDomainMemorySourceDefFormat(virBufferPtr buf,
> char *bitmap = NULL;
> int ret = -1;
>
> - if (!def->pagesize && !def->sourceNodes)
> + if (!def->pagesize && !def->sourceNodes && !def->path)
> return 0;
>
> virBufferAddLit(buf, "<source>\n");
> virBufferAdjustIndent(buf, 2);
>
> - if (def->sourceNodes) {
> - if (!(bitmap = virBitmapFormat(def->sourceNodes)))
> - goto cleanup;
> + switch ((virDomainMemoryModel) def->model) {
> + case VIR_DOMAIN_MEMORY_MODEL_DIMM:
> + if (def->sourceNodes) {
> + if (!(bitmap = virBitmapFormat(def->sourceNodes)))
> + goto cleanup;
>
> - virBufferAsprintf(buf, "<nodemask>%s</nodemask>\n",
bitmap);
> + virBufferAsprintf(buf,
"<nodemask>%s</nodemask>\n", bitmap);
> + }
> +
> + if (def->pagesize)
> + virBufferAsprintf(buf, "<pagesize
unit='KiB'>%llu</pagesize>\n",
> + def->pagesize);
> + break;
> +
> + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
> + virBufferAsprintf(buf, "<path>%s</path>\n",
def->path);
> + break;
This will need to be escaped properly (any weirdness with ,, too?).
FWIW: I'm using 'romfile' (the nvram property) as my reference.
> +
> + case VIR_DOMAIN_MEMORY_MODEL_NONE:
> + case VIR_DOMAIN_MEMORY_MODEL_LAST:
> + break;
> }
>
> - if (def->pagesize)
> - virBufferAsprintf(buf, "<pagesize
unit='KiB'>%llu</pagesize>\n",
> - def->pagesize);
> -
> virBufferAdjustIndent(buf, -2);
> virBufferAddLit(buf, "</source>\n");
>
So no need
> diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
> index 1e53cc328..dc949d3c9 100644
> --- a/src/conf/domain_conf.h
> +++ b/src/conf/domain_conf.h
> @@ -1996,6 +1996,7 @@ struct _virDomainRNGDef {
> typedef enum {
> VIR_DOMAIN_MEMORY_MODEL_NONE,
> VIR_DOMAIN_MEMORY_MODEL_DIMM, /* dimm hotpluggable memory device */
> + VIR_DOMAIN_MEMORY_MODEL_NVDIMM, /* nvdimm memory device */
>
> VIR_DOMAIN_MEMORY_MODEL_LAST
> } virDomainMemoryModel;
> @@ -2004,6 +2005,7 @@ struct _virDomainMemoryDef {
> /* source */
> virBitmapPtr sourceNodes;
> unsigned long long pagesize; /* kibibytes */
> + char *path;
Since it's "hard" to find path in sources in a simple (hah) search, can
this be nvdimm_path or something more specific?
Okay, nvdimmPath it is.
I would also think there'd need to be some Mem ABI Stability check added
in virDomainMemoryDefCheckABIStability that the src/dst path's are the same.
No. This is not needed. The contents of NVDIMM should be synced upon
migration. Just like NVRAM for UEFI is. Therefore, on migration the
backing file for NVDIMM can change and qemu should cope with it just fine.
Searching on 'nmems':
1. Will virDomainDefPostParseMemory need any adjustment to not account
for this type of memory or should it be included?
It should be included. This is one of the nice benefits of this feature
- it is a DIMM module after all. Therefore you have to have enough
memory slots available, the size of NVDIMM module is accounted in the
overal memory size of the guest, and so on.
2. Similar for virDomainDefGetMemoryInitial
Again. We want to account NVDIMM here. This is correct.
3. Will alignment be needed? qemuDomainAlignMemorySizes
Yes. That's why the code is dimm type agnostic.
>
> /* target */
> int model; /* virDomainMemoryModel */
> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
> index 7c52712d1..f628a9929 100644
> --- a/src/qemu/qemu_command.c
> +++ b/src/qemu/qemu_command.c
> @@ -3421,6 +3421,12 @@ qemuBuildMemoryDeviceStr(virDomainMemoryDefPtr mem)
>
> break;
>
> + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
> + virReportError(VIR_ERR_NO_SUPPORT, "%s",
> + _("nvdimm not supported yet"));
> + return NULL;
> + break;
> +
> case VIR_DOMAIN_MEMORY_MODEL_NONE:
> case VIR_DOMAIN_MEMORY_MODEL_LAST:
> break;
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index c187214dc..5ec610564 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -5910,6 +5910,11 @@ qemuDomainDefValidateMemoryHotplugDevice(const
virDomainMemoryDef *mem,
> }
> break;
>
> + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM:
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("nvdimm hotplug not supported yet"));
> + return -1;
> +
> case VIR_DOMAIN_MEMORY_MODEL_NONE:
> case VIR_DOMAIN_MEMORY_MODEL_LAST:
> return -1;
What about migration support? That would assume the file exists in both
places or is somehow passed from source to target, right?
It should be enough that the file exists on the destination (regardless
of its initial content) and qemu overwrites its content during
migration. Libvirt does not pre-create the file on migration, just like
it doesn't pre-create it on domain startup.
Michal