[PATCH 0/3] qemu: Explicitly forbid changing nodeset for strict numatune

See 2/3 for explanation. Michal Prívozník (3): qemu: Allow VIR_DOMAIN_NUMATUNE_MEM_RESTRICTIVE in qemuDomainSetNumaParamsLive() qemu: Explicitly forbid changing nodeset for strict numatune qemu_command: do use host-nodes for system memory src/qemu/qemu_command.c | 3 +- src/qemu/qemu_driver.c | 37 +++++++++++++------ .../numatune-system-memory.x86_64-latest.args | 2 +- 3 files changed, 27 insertions(+), 15 deletions(-) -- 2.32.0

The whole idea of VIR_DOMAIN_NUMATUNE_MEM_RESTRICTIVE is that the memory location is restricted only via CGroups and thus can be changed on the fly (which is exactly what qemuDomainSetNumaParamsLive() does. Allow this mode there then. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_driver.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index bae8b7c39b..e884dde721 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8777,10 +8777,10 @@ qemuDomainSetNumaParamsLive(virDomainObj *vm, size_t i = 0; if (virDomainNumatuneGetMode(vm->def->numa, -1, &mode) == 0 && - mode != VIR_DOMAIN_NUMATUNE_MEM_STRICT) { + mode != VIR_DOMAIN_NUMATUNE_MEM_STRICT && + mode != VIR_DOMAIN_NUMATUNE_MEM_RESTRICTIVE) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", - _("change of nodeset for running domain " - "requires strict numa mode")); + _("change of nodeset for running domain requires strict or restrictive numa mode")); return -1; } -- 2.32.0

Let's imagine a guest that's configured with strict numatune: <numatune> <memory mode='strict' nodeset='0'/> </numatune> For guests with NUMA: Depending on machine type used (see commit v6.4.0-rc1~75) we generate either: 1) -object '{"qom-type":"memory-backend-ram","id":"ram-node0",\ "size":20971520,"host-nodes":[0],"policy":"preferred"}' \ -numa node,nodeid=0,cpus=0,memdev=ram-node0 or 2) -numa node,nodeid=0,cpus=0,mem=20480 Later, when QEMU boots up and cpuset CGroup controller is available we further restrict QEMU there too. But there's a behaviour difference hidden: while in case 1) QEMU is restricted from beginning, in case 2) it is not and thus it may happen that it will allocate memory from different NUMA node and even though CGroup will try to migrate it, it may fail to do so (e.g. because memory is locked). Therefore, one can argue that case 2) is broken. NB, case 2) is exactly what mode 'restrictive' is for. However, in case 1) we are unable to update QEMU with new host-nodes, simply because it's lacking a command to do so. For guests without NUMA: It's very close to case 2) from above. We have commit v7.10.0-rc1~163 that prevents us from outputting host-nodes when generating memory-backend-* for system memory, but that simply allows QEMU to allocate memory anywhere and then relies on CGroups to move it to desired location. Due to all of this, there is no reliable way to change nodeset for mode 'strict'. Let's forbid it. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_driver.c | 35 ++++++++++++++++++++++++----------- 1 file changed, 24 insertions(+), 11 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index e884dde721..0354e1474c 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8777,10 +8777,9 @@ qemuDomainSetNumaParamsLive(virDomainObj *vm, size_t i = 0; if (virDomainNumatuneGetMode(vm->def->numa, -1, &mode) == 0 && - mode != VIR_DOMAIN_NUMATUNE_MEM_STRICT && mode != VIR_DOMAIN_NUMATUNE_MEM_RESTRICTIVE) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", - _("change of nodeset for running domain requires strict or restrictive numa mode")); + _("change of nodeset for running domain requires restrictive numa mode")); return -1; } @@ -8913,17 +8912,31 @@ qemuDomainSetNumaParameters(virDomainPtr dom, goto endjob; } - if (nodeset && - qemuDomainSetNumaParamsLive(vm, nodeset) < 0) - goto endjob; + if (mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT) { + virBitmap *config_nodeset = NULL; - if (virDomainNumatuneSet(def->numa, - def->placement_mode == - VIR_DOMAIN_CPU_PLACEMENT_MODE_STATIC, - -1, mode, nodeset) < 0) - goto endjob; + if (virDomainNumatuneMaybeGetNodeset(def->numa, priv->autoNodeset, + &config_nodeset, -1) < 0) + goto endjob; - qemuDomainSaveStatus(vm); + if (!virBitmapEqual(nodeset, config_nodeset)) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("can't change nodeset for strict mode for running domain")); + goto endjob; + } + } else { + if (nodeset && + qemuDomainSetNumaParamsLive(vm, nodeset) < 0) + goto endjob; + + if (virDomainNumatuneSet(def->numa, + def->placement_mode == + VIR_DOMAIN_CPU_PLACEMENT_MODE_STATIC, + -1, mode, nodeset) < 0) + goto endjob; + + qemuDomainSaveStatus(vm); + } } if (persistentDef) { -- 2.32.0

After previous commit, it's no longer possible to change nodeset for strict numatune. Therefore, we can start generating host-nodes onto command line again. This partially reverts d73265af6ec41104c20633b5c0a23688a62105e6. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_command.c | 3 +-- .../qemuxml2argvdata/numatune-system-memory.x86_64-latest.args | 2 +- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 6d00105b24..b554f4f025 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3906,8 +3906,7 @@ qemuBuildMemoryBackendProps(virJSONValue **backendProps, /* If mode is "restrictive", we should only use cgroups setting allowed memory * nodes, and skip passing the host-nodes and policy parameters to QEMU command * line which means we will use system default memory policy. */ - if (!systemMemory && nodemask && - mode != VIR_DOMAIN_NUMATUNE_MEM_RESTRICTIVE) { + if (nodemask && mode != VIR_DOMAIN_NUMATUNE_MEM_RESTRICTIVE) { if (!virNumaNodesetIsAvailable(nodemask)) return -1; if (virJSONValueObjectAdd(&props, diff --git a/tests/qemuxml2argvdata/numatune-system-memory.x86_64-latest.args b/tests/qemuxml2argvdata/numatune-system-memory.x86_64-latest.args index 2c1180b77e..fd93abe3eb 100644 --- a/tests/qemuxml2argvdata/numatune-system-memory.x86_64-latest.args +++ b/tests/qemuxml2argvdata/numatune-system-memory.x86_64-latest.args @@ -14,7 +14,7 @@ XDG_CONFIG_HOME=/tmp/lib/domain--1-QEMUGuest1/.config \ -accel tcg \ -cpu qemu64 \ -m 214 \ --object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":224395264}' \ +-object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":224395264,"host-nodes":[0],"policy":"bind"}' \ -overcommit mem-lock=off \ -smp 2,sockets=2,cores=1,threads=1 \ -uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ -- 2.32.0

On Wed, Dec 15, 2021 at 04:52:11PM +0100, Michal Privoznik wrote:
See 2/3 for explanation.
Michal Prívozník (3): qemu: Allow VIR_DOMAIN_NUMATUNE_MEM_RESTRICTIVE in qemuDomainSetNumaParamsLive() qemu: Explicitly forbid changing nodeset for strict numatune qemu_command: do use host-nodes for system memory
Looks good but I think this change should be documented. Previously we only allowed live changes with strict mode and now we allow live changes only with restricted mode. Pavel

On 12/16/21 14:49, Pavel Hrdina wrote:
On Wed, Dec 15, 2021 at 04:52:11PM +0100, Michal Privoznik wrote:
See 2/3 for explanation.
Michal Prívozník (3): qemu: Allow VIR_DOMAIN_NUMATUNE_MEM_RESTRICTIVE in qemuDomainSetNumaParamsLive() qemu: Explicitly forbid changing nodeset for strict numatune qemu_command: do use host-nodes for system memory
Looks good but I think this change should be documented. Previously we only allowed live changes with strict mode and now we allow live changes only with restricted mode.
Good call. Let me fix that in v2. Michal
participants (3)
-
Michal Privoznik
-
Michal Prívozník
-
Pavel Hrdina