On 08/11/2016 09:26 AM, Michal Privoznik wrote:
Now that NVDIMM has found its way into libvirt, users might want
to fine tune some settings for each module separately. One such
setting is 'share=on|off' for the memory-backend-file object.
This setting - just like its name suggest already - enables
sharing the nvdimm module with other applications. Under the hood
it controls whether qemu mmaps() the file as MAP_PRIVATE or
MAP_SHARED.
Yet again, we have such config knob in domain XML, but it's just
an attribute to numa <cell/>. This does not give fine enough
tuning on per-memdevice basis so we need to have the attribute
for each device too.
Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com>
---
docs/formatdomain.html.in | 15 ++++++-
docs/schemas/domaincommon.rng | 19 ++++++---
src/conf/domain_conf.c | 15 ++++++-
src/conf/domain_conf.h | 2 +
src/libvirt_private.syms | 2 +
...emuxml2argv-memory-hotplug-nvdimm-memAccess.xml | 49 ++++++++++++++++++++++
6 files changed, 93 insertions(+), 9 deletions(-)
create mode 100644
tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nvdimm-memAccess.xml
After reading this patch and the next one, I'm not sure I understand the
need. For a 'dimm' it already takes a value from whatever numa has for
the node. Adding this would seemingly allow someone to overwrite that
value. So what would happen if the numa node was private and the *dimm
node shared? The setting and usage does not restrict to nvdimm only.
The rest of this is my thoughts upon first read... I'm still hung up on
whether dimm and nvdimm should share nmems, but still figured I'd look
at more patches to provide more thoughts.
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 657df8f..06fe50d 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -1356,7 +1356,7 @@
<span class='since'>Since 1.2.9</span> the optional attribute
<code>memAccess</code> can control whether the memory is to be
mapped as "shared" or "private". This is valid only for
- hugepages-backed memory.
+ hugepages-backed memory and nvdimm modules.
</p>
<p>
@@ -6724,7 +6724,7 @@ qemu-kvm -net nic,model=? /dev/null
<pre>
...
<devices>
- <memory model='dimm'>
+ <memory model='dimm' memAccess='private'>
'dimm'?? Shouldn't this be on the 'nvdimm'?
The way I read the commit msg it's for nvdimm only. It would seem this
would allow the dimm
I would think there'd be concerns over nefarious uses of this hole in
the guest...
<target>
<size unit='KiB'>524287</size>
<node>0</node>
@@ -6762,6 +6762,17 @@ qemu-kvm -net nic,model=? /dev/null
</p>
</dd>
+ <dt><code>memAccess</code></dt>
+ <dd>
+ <p>
+ Then there is optional attribute <code>memAccess</code>
+ (<span class="since">Since 2.2.0</span>) that allows
2.3.0 at the earliest.
+ uses to fine tune mapping of the memory on per module
+ basis. Values are the same as for numa <cell/>:
+ <code>shared</code> and <code>private</code>.
+ </p>
+ </dd>
+
Placement. This is an attribute of <memory...> but currently "tied to"
nvdimm. Making this a <dt> at the same level of <source> gives me the
wrong impression. I think the text of the message belongs in the
previous paragraph. E.g. "... a Non-Volatile DIMM module. For NVDIMM
modules, an optional attribute <code>memAccess</code> can be provided.
This allows users to fine tune mapping of the memory on a per module
basis. Values are the same as..."
BTW: It took me a few searches to find the shared/private NUMA
discussion in the "#elementsCPU"... Makes me wonder if we should create
another label "#numaTopology" that allows us to link directly to that
discussion from right here.
<dt><code>source</code></dt>
<dd>
<p>
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index e6ad452..3e9d342 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -4467,6 +4467,15 @@
</element>
</define>
+ <define name="memAccess">
+ <attribute name="memAccess">
+ <choice>
+ <value>shared</value>
+ <value>private</value>
+ </choice>
+ </attribute>
+ </define>
+
<define name="numaCell">
<element name="cell">
<optional>
@@ -4486,12 +4495,7 @@
</attribute>
</optional>
<optional>
- <attribute name="memAccess">
- <choice>
- <value>shared</value>
- <value>private</value>
- </choice>
- </attribute>
+ <ref name="memAccess"/>
</optional>
</element>
</define>
@@ -4725,6 +4729,9 @@
<value>nvdimm</value>
</choice>
</attribute>
+ <optional>
+ <ref name="memAccess"/>
+ </optional>
<interleave>
<optional>
<ref name="memorydev-source"/>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index cb5a053..84f76dd 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -13245,6 +13245,15 @@ virDomainMemoryDefParseXML(xmlNodePtr memdevNode,
}
VIR_FREE(tmp);
+ tmp = virXMLPropString(memdevNode, "memAccess");
+ if (tmp &&
+ (def->memAccess = virNumaMemAccessTypeFromString(tmp)) <= 0) {
+ virReportError(VIR_ERR_XML_ERROR,
+ _("invalid memAccess mode '%s'"), tmp);
+ goto error;
+ }
+ VIR_FREE(tmp);
+
An no check for NVDIMM usage only... What if someone does add this to
their DIMM? In the next patch, we'll take whatever this is set to and
change the memAccess based on that.
I would also think that if we did support this that we'd *have* to have
some ABI check to make sure src and dst used the same memAccess mode...
/* source */
if ((node = virXPathNode("./source", ctxt)) &&
virDomainMemorySourceDefParseXML(node, ctxt, def) < 0)
@@ -21800,7 +21809,11 @@ virDomainMemoryDefFormat(virBufferPtr buf,
{
const char *model = virDomainMemoryModelTypeToString(def->model);
- virBufferAsprintf(buf, "<memory model='%s'>\n", model);
+ virBufferAsprintf(buf, "<memory model='%s'", model);
+ if (def->memAccess)
+ virBufferAsprintf(buf, " memAccess='%s'",
+ virNumaMemAccessTypeToString(def->memAccess));
+ virBufferAddLit(buf, ">\n");
virBufferAdjustIndent(buf, 2);
if (virDomainMemorySourceDefFormat(buf, def) < 0)
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 4e6a9bf..213768d 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -1945,6 +1945,8 @@ typedef enum {
} virDomainMemoryModel;
struct _virDomainMemoryDef {
+ virNumaMemAccess memAccess;
+
/* source */
virBitmapPtr sourceNodes;
unsigned long long pagesize; /* kibibytes */
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 419c33d..024c3e6 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2058,6 +2058,8 @@ virNumaGetNodeMemory;
virNumaGetPageInfo;
virNumaGetPages;
virNumaIsAvailable;
+virNumaMemAccessTypeFromString;
+virNumaMemAccessTypeToString;
virNumaNodeIsAvailable;
virNumaNodesetIsAvailable;
virNumaSetPagePoolSize;
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nvdimm-memAccess.xml
b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nvdimm-memAccess.xml
new file mode 100644
index 0000000..4ebea01
--- /dev/null
+++ b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nvdimm-memAccess.xml
@@ -0,0 +1,49 @@
+<domain type='qemu'>
+ <name>QEMUGuest1</name>
+ <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+ <maxMemory slots='16'
unit='KiB'>1099511627776</maxMemory>
+ <memory unit='KiB'>1267710</memory>
+ <currentMemory unit='KiB'>1267710</currentMemory>
+ <vcpu placement='static' cpuset='0-1'>2</vcpu>
+ <os>
+ <type arch='i686' machine='pc'>hvm</type>
+ <boot dev='hd'/>
+ </os>
+ <idmap>
+ <uid start='0' target='1000' count='10'/>
+ <gid start='0' target='1000' count='10'/>
+ </idmap>
+ <cpu>
+ <topology sockets='2' cores='1' threads='1'/>
+ <numa>
+ <cell id='0' cpus='0-1' memory='219136'
unit='KiB'/>
+ </numa>
+ </cpu>
+ <clock offset='utc'/>
+ <on_poweroff>destroy</on_poweroff>
+ <on_reboot>restart</on_reboot>
+ <on_crash>destroy</on_crash>
+ <devices>
+ <emulator>/usr/bin/qemu</emulator>
+ <disk type='block' device='disk'>
+ <source dev='/dev/HostVG/QEMUGuest1'/>
+ <target dev='hda' bus='ide'/>
+ <address type='drive' controller='0' bus='0'
target='0' unit='0'/>
+ </disk>
+ <controller type='ide' index='0'/>
+ <controller type='usb' index='0'/>
+ <controller type='pci' index='0' model='pci-root'/>
+ <input type='mouse' bus='ps2'/>
+ <input type='keyboard' bus='ps2'/>
+ <memballoon model='virtio'/>
+ <memory model='nvdimm' memAccess='private'>
+ <source>
+ <path>/tmp/nvdimm</path>
+ </source>
+ <target>
+ <size unit='KiB'>523264</size>
+ <node>0</node>
+ </target>
+ </memory>
+ </devices>
+</domain>
And again no xml2xml tests...
So in this case, we overwrite whatever the numa default is. If the numa
default were private, how would it be possible to have the nvdimm allow
shared?
John