On Wed, Jul 11, 2018 at 05:05:05PM +0200, Michal Privoznik wrote:
On 07/11/2018 10:22 AM, Pavel Hrdina wrote:
> Previously we were ignoring the "nodeset" attribute of hugepage
> <page> elements if there was no guest NUMA topology configured in
> the domain XML.
> Commit <fa6bdf6afa878b8d7c5ed71664ee72be8967cdc5> partially fixed
> that issue, but it also introduced a regression for a configuration
> that, while not strictly valid, used to work.
>
> In the case where there is no guest NUMA topology configured and the
> "nodeset" attribute is set to "0", the configuration was accepted and
> worked properly, even though it was not completely valid XML.
>
> This patch introduces a workaround: nodeset="0" is ignored, but only
> when there is no guest NUMA topology, in order not to hit the
> validation error.
>
> After this commit the following XML configuration is valid:
>
> <memoryBacking>
> <hugepages>
> <page size='2048' unit='KiB' nodeset='0'/>
> </hugepages>
> </memoryBacking>
>
> but this configuration remains invalid:
>
> <memoryBacking>
> <hugepages>
> <page size='2048' unit='KiB' nodeset='0'/>
> <page size='1048576' unit='KiB'/>
> </hugepages>
> </memoryBacking>
>
> The issue with the second configuration is that it originally worked,
> but changing the order of the <page> elements resulted in a different
> page size being used for the guest. The code is written in a way that
> expects only one page to be configured, and it always uses the first
> page when there is no guest NUMA topology configured.
> See qemuBuildMemPathStr() for details.
>
> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1591235
>
> Signed-off-by: Pavel Hrdina <phrdina(a)redhat.com>
> ---
> src/conf/domain_conf.c | 27 +++++++++++++++++
> tests/qemuxml2argvdata/hugepages-pages10.args | 26 ++++++++++++++++
> tests/qemuxml2argvtest.c | 2 +-
> .../qemuxml2xmloutdata/hugepages-pages10.xml | 30 +++++++++++++++++++
> tests/qemuxml2xmltest.c | 1 +
> 5 files changed, 85 insertions(+), 1 deletion(-)
> create mode 100644 tests/qemuxml2argvdata/hugepages-pages10.args
> create mode 100644 tests/qemuxml2xmloutdata/hugepages-pages10.xml
>
> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> index 5249f59d1a..bf5000f7a2 100644
> --- a/src/conf/domain_conf.c
> +++ b/src/conf/domain_conf.c
> @@ -4085,6 +4085,31 @@ virDomainDefPostParseMemory(virDomainDefPtr def,
> }
>
>
> +static void
> +virDomainDefPostParseMemtune(virDomainDefPtr def)
> +{
> + size_t i;
> +
> + if (virDomainNumaGetNodeCount(def->numa) == 0) {
> + /* If guest NUMA is not configured and any hugepage <page> has its
> + * nodemask set to "0", free and clear that nodemask; otherwise we
> + * would raise an error that there is no guest NUMA node configured. */
> + for (i = 0; i < def->mem.nhugepages; i++) {
> + ssize_t nextBit;
> +
> + if (!def->mem.hugepages[i].nodemask)
> + continue;
> +
> + nextBit = virBitmapNextSetBit(def->mem.hugepages[i].nodemask, 0);
> + if (nextBit < 0) {
> + virBitmapFree(def->mem.hugepages[i].nodemask);
> + def->mem.hugepages[i].nodemask = NULL;
> + }
> + }
> + }
> +}
> +
> +
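To make the check above easier to follow, here is a standalone sketch (not libvirt code; `next_set_bit_after()` is a hypothetical stand-in for virBitmapNextSetBit()): the nodemask is dropped exactly when no bit above bit 0 is set, i.e. when the nodeset can only refer to NUMA node 0.

```c
#include <assert.h>
#include <limits.h>

/* Stand-in for virBitmapNextSetBit(bitmap, pos): returns the position
 * of the first set bit strictly after @pos, or -1 if there is none. */
static int
next_set_bit_after(unsigned long mask, int pos)
{
    for (int i = pos + 1; i < (int) (sizeof(mask) * CHAR_BIT); i++) {
        if (mask & (1UL << i))
            return i;
    }
    return -1;
}

/* Mirrors the patch: the nodemask is freed exactly when there is no
 * set bit after bit 0, i.e. the nodeset names at most NUMA node 0. */
static int
nodemask_should_be_dropped(unsigned long nodemask)
{
    return next_set_bit_after(nodemask, 0) < 0;
}
```

Note that a nodeset like "1" keeps its mask and therefore still fails validation, which is the intended behaviour.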
> The problem is not that there is no guest NUMA node. The real problem
> is that there is no guest NUMA node left for the default <page/>.
> Consider the following example:
It's not that simple. For a non-NUMA guest
<memoryBacking>
<hugepages>
<page size='2048' unit='KiB' nodeset='0'/>
<page size='1048576' unit='KiB'/>
</hugepages>
</memoryBacking>
will start a guest with 2M pages but
<memoryBacking>
<hugepages>
<page size='1048576' unit='KiB'/>
<page size='2048' unit='KiB' nodeset='0'/>
</hugepages>
</memoryBacking>
will start a guest with 1G pages. That's wrong; the result should not
depend on the order of the elements.
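A minimal sketch of that order dependence, using hypothetical types (the real logic lives in qemuBuildMemPathStr(), which for a non-NUMA guest only looks at the first hugepage entry):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of qemuBuildMemPathStr()'s behaviour for a
 * non-NUMA guest: only the first configured <page> entry is used,
 * so the effective page size depends on element order. */
struct hugepage {
    unsigned long long size_kib;
};

static unsigned long long
effective_page_size(const struct hugepage *pages, size_t npages)
{
    if (npages == 0)
        return 0;                 /* no hugepages configured */
    return pages[0].size_kib;     /* first entry wins; the rest are ignored */
}
```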
This patch makes the following XML work again:
<memoryBacking>
<hugepages>
<page size='2048' unit='KiB' nodeset='0'/>
</hugepages>
</memoryBacking>
by changing the parsed data into:
<memoryBacking>
<hugepages>
<page size='2048' unit='KiB'/>
</hugepages>
</memoryBacking>
which will pass the validation.
As a side effect, it also fixes the previous case by changing it into:
<memoryBacking>
<hugepages>
<page size='1048576' unit='KiB'/>
<page size='2048' unit='KiB'/>
</hugepages>
</memoryBacking>
which will fail validation, as there are now two default pages.
All of this applies only to non-NUMA guests.
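In other words, once the workaround has stripped a nodeset='0' mask, validation for a non-NUMA guest boils down to allowing at most one default <page>. A sketch with hypothetical types, not libvirt code:

```c
#include <assert.h>
#include <stddef.h>

/* After nodeset='0' masks have been stripped, at most one <page>
 * without a nodemask (a "default" page) may remain for the guest. */
struct page_cfg {
    unsigned long long size_kib;
    int has_nodemask;   /* 0 once the workaround stripped nodeset='0' */
};

static int
hugepages_valid_for_non_numa(const struct page_cfg *pages, size_t n)
{
    size_t defaults = 0;
    for (size_t i = 0; i < n; i++) {
        if (!pages[i].has_nodemask)
            defaults++;
    }
    return defaults <= 1;
}
```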
> <memoryBacking>
> <hugepages>
> <page size='2048' unit='KiB' nodeset='0-1'/>
> <page size='1048576' unit='KiB'/>
> </hugepages>
> </memoryBacking>
> <cpu mode='host-passthrough' check='none'>
> <topology sockets='1' cores='4' threads='2'/>
> <numa>
> <cell id='0' cpus='0-3' memory='4194304' unit='KiB'/>
> <cell id='1' cpus='4-7' memory='4194304' unit='KiB'/>
> </numa>
> </cpu>
> Do you consider this configuration valid?
That is a completely different issue that this patch is not trying to
solve, but we can handle it as well.
Pavel