
On Wed, Jul 11, 2018 at 05:05:05PM +0200, Michal Privoznik wrote:
On 07/11/2018 10:22 AM, Pavel Hrdina wrote:
Previously we were ignoring the "nodeset" attribute of hugepage <page> elements if there was no guest NUMA topology configured in the domain XML. Commit <fa6bdf6afa878b8d7c5ed71664ee72be8967cdc5> partially fixed that issue, but it introduced a regression for a configuration that was somewhat valid.
If there was no guest NUMA topology configured and the "nodeset" attribute was set to "0", the configuration was accepted and worked properly, even though it was not completely valid XML.
This patch introduces a workaround: it ignores nodeset="0" only when there is no guest NUMA topology, so that such a configuration does not hit the validation error.
After this commit the following XML configuration is valid:
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>
but this configuration remains invalid:
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
      <page size='1048576' unit='KiB'/>
    </hugepages>
  </memoryBacking>
The issue with the second configuration is that it originally worked, however changing the order of the <page> elements resulted in a different page size being used for the guest. The code is written in a way that expects only one page to be configured and always uses only the first page if there is no guest NUMA topology configured. See the qemuBuildMemPathStr() function for details.
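For illustration, the "first page wins" behaviour described above can be summarised by a sketch like this (hypothetical helper name and simplified logic, not the actual qemuBuildMemPathStr() code):

    /* Sketch only: mirrors the behaviour described above. With no guest
     * NUMA topology, any additional <page> elements are effectively
     * ignored and the first one decides the hugepage size used. */
    static unsigned long long
    pickHugepageSizeKiB(const virDomainDef *def)
    {
        if (virDomainNumaGetNodeCount(def->numa) == 0 &&
            def->mem.nhugepages > 0)
            return def->mem.hugepages[0].size;

        return 0; /* no hugepages requested */
    }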
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1591235
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
---
 src/conf/domain_conf.c                         | 27 +++++++++++++++++
 tests/qemuxml2argvdata/hugepages-pages10.args  | 26 ++++++++++++++++
 tests/qemuxml2argvtest.c                       |  2 +-
 .../qemuxml2xmloutdata/hugepages-pages10.xml   | 30 +++++++++++++++++++
 tests/qemuxml2xmltest.c                        |  1 +
 5 files changed, 85 insertions(+), 1 deletion(-)
 create mode 100644 tests/qemuxml2argvdata/hugepages-pages10.args
 create mode 100644 tests/qemuxml2xmloutdata/hugepages-pages10.xml
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 5249f59d1a..bf5000f7a2 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -4085,6 +4085,31 @@ virDomainDefPostParseMemory(virDomainDefPtr def,
 }
+static void
+virDomainDefPostParseMemtune(virDomainDefPtr def)
+{
+    size_t i;
+
+    if (virDomainNumaGetNodeCount(def->numa) == 0) {
+        /* If guest NUMA is not configured and any hugepage page has nodemask
+         * set to "0", free and clear that nodemask, otherwise we would raise
+         * an error that there is no guest NUMA node configured. */
+        for (i = 0; i < def->mem.nhugepages; i++) {
+            ssize_t nextBit;
+
+            if (!def->mem.hugepages[i].nodemask)
+                continue;
+
+            nextBit = virBitmapNextSetBit(def->mem.hugepages[i].nodemask, 0);
+            if (nextBit < 0) {
+                virBitmapFree(def->mem.hugepages[i].nodemask);
+                def->mem.hugepages[i].nodemask = NULL;
+            }
+        }
+    }
+}
+
+
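Not shown in the quoted hunk, but implied by the diffstat (27 insertions in domain_conf.c versus the 25 added lines above), is a call site for the new helper somewhere in the post-parse path. A minimal sketch of what that could look like, assuming it is invoked right after the existing memory post-parse step (the second parameter name is a guess):

    /* Sketch only: assumed call site in the post-parse path; the exact
     * location and surrounding code are not part of the quoted hunk. */
    if (virDomainDefPostParseMemory(def, parseFlags) < 0)
        return -1;

    virDomainDefPostParseMemtune(def);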
The problem is not that there is no guest NUMA node. The real problem is that there is no guest NUMA node left for the default <page/>. Consider the following example:
It's not that simple. For a non-NUMA guest

  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
      <page size='1048576' unit='KiB'/>
    </hugepages>
  </memoryBacking>

will start a guest with 2M pages, but

  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB'/>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>

will start a guest with 1G pages. That's wrong, and it should not depend on the order.

This patch fixes this XML to work again:

  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>

by changing the parsed data into:

  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB'/>
    </hugepages>
  </memoryBacking>

which will pass the validation. As a side-effect it will also fix the previous case by changing it into:

  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB'/>
      <page size='2048' unit='KiB'/>
    </hugepages>
  </memoryBacking>

which will fail the validation as there are two default pages. All of this applies only to non-NUMA guests.
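For reference, the kind of check that makes the reordered variant fail could look roughly like this (a minimal sketch with a hypothetical helper name, not the actual libvirt validation code):

    /* Sketch only: reject more than one <page> element that has no nodeset,
     * i.e. more than one "default" hugepage size. */
    static int
    checkDefaultHugepages(const virDomainDef *def)
    {
        size_t i;
        bool seenDefault = false;

        for (i = 0; i < def->mem.nhugepages; i++) {
            if (def->mem.hugepages[i].nodemask)
                continue;

            if (seenDefault)
                return -1; /* two default pages -> validation error */

            seenDefault = true;
        }

        return 0;
    }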
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0-1'/>
      <page size='1048576' unit='KiB'/>
    </hugepages>
  </memoryBacking>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='4' threads='2'/>
    <numa>
      <cell id='0' cpus='0-3' memory='4194304' unit='KiB'/>
      <cell id='1' cpus='4-7' memory='4194304' unit='KiB'/>
    </numa>
  </cpu>
Do you consider this configuration valid?
That's a completely different issue that this patch is not trying to solve, but we can handle that as well.

Pavel