[libvirt] [PATCH 0/3] numatune: Prefer old approach

Consider the following part of domain XML: <numatune> <memory mode='static' nodeset="0,2"/> </numatune> <cpu> <numa> <cell id='0' cpus='0' memory='65536' unit='KiB'/> </numa> </cpu> Yes, this have a great potential of breaking things. Especially, this will break migration between previous two or three upstream releases and current release we are working on, because libvirt started domains in more complicated way (even if not needed). After these patches, domains will be started in simpler way which is incompatible. On the other hand, we get backward compatibility with much more releases than we are about to break. Michal Privoznik (3): numatune_conf: Expose virDomainNumatuneNodeSpecified qemuxml2argvtest: Fake response from numad qemuBuildMemoryBackendStr: Report backend requirement more appropriately src/conf/numatune_conf.c | 2 +- src/conf/numatune_conf.h | 3 +++ src/libvirt_private.syms | 1 + src/qemu/qemu_command.c | 6 ++++-- tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.args | 5 +++++ tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.xml | 2 +- .../qemuxml2argvdata/qemuxml2argv-numatune-auto-prefer.args | 5 +++++ tests/qemuxml2argvtest.c | 12 +++++++++++- 8 files changed, 31 insertions(+), 5 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-numatune-auto-prefer.args -- 2.0.5

This function is going to be needed in the near future. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/conf/numatune_conf.c | 2 +- src/conf/numatune_conf.h | 3 +++ src/libvirt_private.syms | 1 + 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/src/conf/numatune_conf.c b/src/conf/numatune_conf.c index 7f322ea..d6cedaa 100644 --- a/src/conf/numatune_conf.c +++ b/src/conf/numatune_conf.c @@ -64,7 +64,7 @@ struct _virDomainNumatune { }; -static inline bool +inline bool virDomainNumatuneNodeSpecified(virDomainNumatunePtr numatune, int cellid) { diff --git a/src/conf/numatune_conf.h b/src/conf/numatune_conf.h index 28c4ce2..c4b3f6e 100644 --- a/src/conf/numatune_conf.h +++ b/src/conf/numatune_conf.h @@ -111,4 +111,7 @@ int virDomainNumatuneSpecifiedMaxNode(virDomainNumatunePtr numatune); bool virDomainNumatuneNodesetIsAvailable(virDomainNumatunePtr numatune, virBitmapPtr auto_nodeset); + +bool virDomainNumatuneNodeSpecified(virDomainNumatunePtr numatune, + int cellid); #endif /* __NUMATUNE_CONF_H__ */ diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 645aef1..5bdd075 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -638,6 +638,7 @@ virDomainNumatuneMaybeGetNodeset; virDomainNumatuneMemModeTypeFromString; virDomainNumatuneMemModeTypeToString; virDomainNumatuneNodesetIsAvailable; +virDomainNumatuneNodeSpecified; virDomainNumatuneParseXML; virDomainNumatunePlacementTypeFromString; virDomainNumatunePlacementTypeToString; -- 2.0.5

Well, we can pretend that we've asked numad for its suggestion and let qemu command line be built with that respect. Again, this alone has no big value, but see later commits which build on the top of this. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- tests/qemuxml2argvtest.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 316f479..2de92eb 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -265,12 +265,16 @@ static int testCompareXMLToArgvFiles(const char *xml, size_t i; size_t nnicindexes = 0; int *nicindexes = NULL; + virBitmapPtr nodeset = NULL; if (!(conn = virGetConnect())) goto out; conn->secretDriver = &fakeSecretDriver; conn->storageDriver = &fakeStorageDriver; + if (virBitmapParse("0-3", '\0', &nodeset, 4) < 0) + goto out; + if (!(vmdef = virDomainDefParseFile(xml, driver.caps, driver.xmlopt, QEMU_EXPECTED_VIRT_TYPES, VIR_DOMAIN_DEF_PARSE_INACTIVE))) { @@ -349,7 +353,7 @@ static int testCompareXMLToArgvFiles(const char *xml, VIR_NETDEV_VPORT_PROFILE_OP_NO_OP, &testCallbacks, false, (flags & FLAG_FIPS), - NULL, &nnicindexes, &nicindexes))) { + nodeset, &nnicindexes, &nicindexes))) { if (!virtTestOOMActive() && (flags & FLAG_EXPECT_FAILURE)) { ret = 0; @@ -403,6 +407,7 @@ static int testCompareXMLToArgvFiles(const char *xml, virCommandFree(cmd); virDomainDefFree(vmdef); virObjectUnref(conn); + virBitmapFree(nodeset); return ret; } -- 2.0.5

So, when building the '-numa' command line, the qemuBuildMemoryBackendStr() function does quite a lot of checks to chose the best backend, or to check if one is in fact needed. However, it returned that backend is needed even for this little fella: <numatune> <memory mode="strict" nodeset="0,2" /> </numatune> This can be guaranteed via CGroups entirely, there's no need to use memory-backend-ram to let qemu know where to get memory from. Well, as long as there's no <memnode/> element, which explicitly requires the backend. Long story short, we wouldn't have to care, as qemu works either way. However, the problem is migration (as always). Previously, libvirt would have started qemu with: -numa node,memory=X in this case and restricted memory placement in CGroups. Today, libvirt creates more complicated command line: -object memory-backend-ram,id=ram-node0,size=X -numa node,memdev=ram-node0 Again, one wouldn't find anything wrong with these two approaches. Both work just fine. Unless you try to migrated from the older libvirt into the newer one. These two approaches are, unfortunately, not compatible. My suggestion is, in order to allow users to migrate, lets use the older approach for as long as the newer one is not needed. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_command.c | 6 ++++-- tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.args | 5 +++++ tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.xml | 2 +- tests/qemuxml2argvdata/qemuxml2argv-numatune-auto-prefer.args | 5 +++++ tests/qemuxml2argvtest.c | 5 +++++ 5 files changed, 20 insertions(+), 3 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-numatune-auto-prefer.args diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 9c25788..cbe105e 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -4699,7 +4699,9 @@ qemuBuildMemoryBackendStr(unsigned long long size, props = NULL; if (!hugepage) { - if ((nodemask || force) && + bool nodeSpecified = virDomainNumatuneNodeSpecified(def->numatune, guestNode); + + if ((userNodeset || nodeSpecified || force) && !virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_RAM)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("this qemu doesn't support the " @@ -4709,7 +4711,7 @@ qemuBuildMemoryBackendStr(unsigned long long size, /* report back that using the new backend is not necessary to achieve * the desired configuration */ - if (!nodemask) { + if (!userNodeset && !nodeSpecified) { ret = 1; goto cleanup; } diff --git a/tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.args b/tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.args new file mode 100644 index 0000000..481f72f --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.args @@ -0,0 +1,5 @@ +LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-x86_64 -S -M pc-q35-2.3 -m 128 \ +-smp 2,maxcpus=6,sockets=6,cores=1,threads=1 \ +-nographic -monitor unix:/tmp/test-monitor,server,nowait -no-acpi \ +-boot c -net none -serial none -parallel none diff --git a/tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.xml b/tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.xml index 01bbb3d..fe7c465 100644 --- a/tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.xml +++ b/tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.xml @@ -1,4 +1,4 @@ -<domain type='kvm'> +<domain type='qemu'> <name>dummy2</name> <uuid>4d92ec27-9ebf-400b-ae91-20c71c647c19</uuid> <memory unit='KiB'>131072</memory> diff --git a/tests/qemuxml2argvdata/qemuxml2argv-numatune-auto-prefer.args b/tests/qemuxml2argvdata/qemuxml2argv-numatune-auto-prefer.args new file mode 100644 index 0000000..0b1b0f5 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-numatune-auto-prefer.args @@ -0,0 +1,5 @@ +LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \ +/usr/bin/kvm -S -M pc -m 64 -smp 1 \ +-numa node,nodeid=0,cpus=0,mem=64 \ +-nographic -monitor unix:/tmp/test-monitor,server,nowait \ +-no-acpi -boot c -usb -net none -serial none -parallel none diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 2de92eb..07a581f 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1260,6 +1260,9 @@ mymain(void) DO_TEST("cputune-zero-shares", QEMU_CAPS_NAME); DO_TEST_PARSE_ERROR("cputune-iothreadsched-toomuch", QEMU_CAPS_NAME); DO_TEST_PARSE_ERROR("cputune-vcpusched-overlap", QEMU_CAPS_NAME); + DO_TEST("cputune-numatune", QEMU_CAPS_SMP_TOPOLOGY, + QEMU_CAPS_OBJECT_MEMORY_RAM, + QEMU_CAPS_OBJECT_MEMORY_FILE); DO_TEST("numatune-memory", NONE); DO_TEST_PARSE_ERROR("numatune-memory-invalid-nodeset", NONE); @@ -1272,6 +1275,8 @@ mymain(void) DO_TEST_FAILURE("numatune-memnode-no-memory", NONE); DO_TEST("numatune-auto-nodeset-invalid", NONE); + DO_TEST("numatune-auto-prefer", QEMU_CAPS_OBJECT_MEMORY_RAM, + QEMU_CAPS_OBJECT_MEMORY_FILE); DO_TEST_FAILURE("numatune-static-nodeset-exceed-hostnode", QEMU_CAPS_OBJECT_MEMORY_RAM); DO_TEST_PARSE_ERROR("numatune-memnode-nocpu", NONE); -- 2.0.5

On Thu, Feb 12, 2015 at 18:03:53 +0100, Michal Privoznik wrote:
So, when building the '-numa' command line, the qemuBuildMemoryBackendStr() function does quite a lot of checks to chose the best backend, or to check if one is in fact needed. However, it returned that backend is needed even for this little fella:
<numatune> <memory mode="strict" nodeset="0,2" /> </numatune>
This can be guaranteed via CGroups entirely, there's no need to use memory-backend-ram to let qemu know where to get memory from. Well, as long as there's no <memnode/> element, which explicitly requires the backend. Long story short, we wouldn't have to care, as qemu works either way. However, the problem is migration (as always). Previously, libvirt would have started qemu with:
-numa node,memory=X
in this case and restricted memory placement in CGroups. Today, libvirt creates more complicated command line:
-object memory-backend-ram,id=ram-node0,size=X -numa node,memdev=ram-node0
Again, one wouldn't find anything wrong with these two approaches. Both work just fine. Unless you try to migrated from the older libvirt
s/migrated/migrate/
into the newer one. These two approaches are, unfortunately, not compatible. My suggestion is, in order to allow users to migrate, lets use the older approach for as long as the newer one is not needed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> ... diff --git a/tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.xml b/tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.xml index 01bbb3d..fe7c465 100644 --- a/tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.xml +++ b/tests/qemuxml2argvdata/qemuxml2argv-cputune-numatune.xml @@ -1,4 +1,4 @@ -<domain type='kvm'> +<domain type='qemu'> <name>dummy2</name> <uuid>4d92ec27-9ebf-400b-ae91-20c71c647c19</uuid> <memory unit='KiB'>131072</memory>
Looks like a useless change. Jirka

On Thu, Feb 12, 2015 at 06:03:50PM +0100, Michal Privoznik wrote:
Consider the following part of domain XML:
<numatune> <memory mode='static' nodeset="0,2"/> </numatune> <cpu> <numa> <cell id='0' cpus='0' memory='65536' unit='KiB'/> </numa> </cpu>
Yes, this have a great potential of breaking things. Especially, this will break migration between previous two or three upstream releases and current release we are working on, because libvirt started domains in more complicated way (even if not needed). After these patches, domains will be started in simpler way which is incompatible.
On the other hand, we get backward compatibility with much more releases than we are about to break.
So with this patch we get guaranteed migration compat for any configs which were valid under old libvirt. The new syntax will only ever be used if it is absolutely required, in which case migration wouldn't be supportable regardless. It is a shame we didn't think of doing tihs before, but I think I agree this is the right thing todo now. We're doomed either way, but this way feels like it minimises the doom to a smaller set of people. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Thu, Feb 12, 2015 at 18:03:50 +0100, Michal Privoznik wrote:
Consider the following part of domain XML:
<numatune> <memory mode='static' nodeset="0,2"/> </numatune> <cpu> <numa> <cell id='0' cpus='0' memory='65536' unit='KiB'/> </numa> </cpu>
Yes, this have a great potential of breaking things. Especially, this will break migration between previous two or three upstream releases and current release we are working on, because libvirt started domains in more complicated way (even if not needed). After these patches, domains will be started in simpler way which is incompatible.
On the other hand, we get backward compatibility with much more releases than we are about to break.
Michal Privoznik (3): numatune_conf: Expose virDomainNumatuneNodeSpecified qemuxml2argvtest: Fake response from numad qemuBuildMemoryBackendStr: Report backend requirement more appropriately
ACK, but see the two nits I found in 3/3. Jirka
participants (3)
-
Daniel P. Berrange
-
Jiri Denemark
-
Michal Privoznik