On 13/05/13 14:46, Hu Tao wrote:
On Thu, May 09, 2013 at 06:22:17PM +0800, Osier Yang wrote:
> When the numatune memory mode is not "strict", the cpuset.mems
> inherits the parent's setting, which causes problems like:
>
> % virsh dumpxml rhel6_local | grep interleave -2
> <vcpu placement='static'>2</vcpu>
> <numatune>
> <memory mode='interleave' nodeset='1-2'/>
> </numatune>
> <os>
>
> % cat /proc/3713/status | grep Mems_allowed_list
> Mems_allowed_list: 0-3
>
> % virsh numatune rhel6_local
> numa_mode : interleave
> numa_nodeset : 0-3
Yes, the information is misleading.
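
To check the effective binding programmatically rather than with cat and
grep, the Mems_allowed_list field can be read from /proc/<pid>/status.
A minimal sketch, assuming the usual Linux procfs layout;
read_mems_allowed_list() is a hypothetical helper written for
illustration, not part of libvirt:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not libvirt API): copy the value of the
 * "Mems_allowed_list:" field from a /proc/<pid>/status-style file
 * into buf.
 * Returns 0 on success, -1 if the file cannot be opened, the field
 * is missing, or buf is too small for the value.
 */
int read_mems_allowed_list(const char *path, char *buf, size_t buflen)
{
    static const char key[] = "Mems_allowed_list:";
    char line[512];
    FILE *fp;
    int ret = -1;

    assert(path != NULL && buf != NULL && buflen > 0);

    fp = fopen(path, "r");
    if (!fp)
        return -1;

    while (fgets(line, sizeof(line), fp)) {
        const char *val;
        size_t len;

        if (strncmp(line, key, sizeof(key) - 1) != 0)
            continue;

        /* Skip the whitespace separating the key from its value. */
        val = line + sizeof(key) - 1;
        val += strspn(val, " \t");

        /* Copy the value without the trailing newline. */
        len = strcspn(val, "\r\n");
        if (len < buflen) {
            memcpy(buf, val, len);
            buf[len] = '\0';
            ret = 0;
        }
        break;
    }

    fclose(fp);
    return ret;
}
```

Called with "/proc/3713/status" on the host above, this would be
expected to fill buf with "0-3", the inherited value rather than the
configured "1-2".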
> Though the domain process's memory binding is set with libnuma
> after the cgroup setting.
>
> The reason for only allowing "strict" mode in the current code is that
> cpuset.mems doesn't understand the memory policy modes (interleave,
> preferred, strict); it actually behaves like "strict" mode ("strict"
> means the allocation will fail if the memory cannot be allocated on
> the target node. Default operation is to fall back to other nodes.
Default is localalloc.
> From man numa(3)). However, writing to cpuset.mems even if the
> numatune memory mode is not strict should be better than the blind
> inheritance anyway.
Interleave mode is OK when combined with the cpuset.memory_spread_* flags:
- cpuset.memory_spread_page flag: if set, spread page cache evenly on
  allowed nodes
- cpuset.memory_spread_slab flag: if set, spread slab cache evenly on
  allowed nodes
Looks reasonable.
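
The interleave case could be sketched roughly as below. This is only an
illustration, assuming a cgroup v1 cpuset directory that is writable by
the caller; write_cgroup_file() and setup_interleave_cpuset() are
hypothetical names, not libvirt functions. Note that the spread flags
only cover page cache and slab cache, as listed above.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write value to <cgroup_dir>/<file>.
 * Returns 0 on success, -1 on any error.
 */
static int write_cgroup_file(const char *cgroup_dir, const char *file,
                             const char *value)
{
    char path[4096];
    FILE *fp;
    int n;

    n = snprintf(path, sizeof(path), "%s/%s", cgroup_dir, file);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    fp = fopen(path, "w");
    if (!fp)
        return -1;

    if (fputs(value, fp) == EOF) {
        fclose(fp);
        return -1;
    }

    /* stdio buffers the write, so a rejected value shows up at fclose(). */
    return fclose(fp) == 0 ? 0 : -1;
}

/*
 * Hypothetical sketch: restrict the group to nodeset and ask the kernel
 * to spread page cache and slab cache evenly over those nodes.
 * Returns 0 on success, -1 on the first failing write.
 */
int setup_interleave_cpuset(const char *cgroup_dir, const char *nodeset)
{
    assert(cgroup_dir != NULL && nodeset != NULL);

    if (write_cgroup_file(cgroup_dir, "cpuset.mems", nodeset) < 0)
        return -1;
    if (write_cgroup_file(cgroup_dir, "cpuset.memory_spread_page", "1") < 0)
        return -1;
    return write_cgroup_file(cgroup_dir, "cpuset.memory_spread_slab", "1");
}
```

For the domain shown earlier, the call would be something like
setup_interleave_cpuset(dir, "1-2"), where dir is the domain's cpuset
cgroup directory.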
But what about preferred mode? Comparing:
strict: Strict means the allocation will fail if the memory cannot be
        allocated on the target node.
preferred: The system will attempt to allocate memory from the
           preferred node, but will fall back to other nodes if no
           memory is available on the preferred node.
For "preferred" mode I have no idea; there are no related cgroup files
like memory_spread_*. If we set cpuset.mems to the nodeset, memory
allocation will behave like 'strict', which is not what is expected.
> ---
> However, I'm not comfortable with the solution, since anyway the
> modes except "strict" are not meaningful for cpuset.mems.
>
> Another problem I'm not sure about is whether the cpuset.cpus will
> affect the libnuma setting. Assume that, without this patch, the domain
> process's cpuset.mems is set to '0-7' (8 NUMA nodes, each with 8
> CPUs), the numatune memory mode is "interleave", and libnuma sets
> the memory binding to "1-2". Even with this patch applied, setting
> cpuset.mems to "1-2", is there any potential problem?
>
> So this patch is mainly to raise the problem and to see whether
> anyone has opinions. @hutao, since this code is from you, any
> opinions/ideas? Thanks.
> ---
> src/qemu/qemu_cgroup.c | 18 +++++++++++++-----
> 1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
> index 33eebd7..22fe25b 100644
> --- a/src/qemu/qemu_cgroup.c
> +++ b/src/qemu/qemu_cgroup.c
> @@ -597,11 +597,9 @@ qemuSetupCpusetCgroup(virDomainObjPtr vm,
> if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET))
> return 0;
>
> - if ((vm->def->numatune.memory.nodemask ||
> - (vm->def->numatune.memory.placement_mode ==
> - VIR_NUMA_TUNE_MEM_PLACEMENT_MODE_AUTO)) &&
> - vm->def->numatune.memory.mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT) {
> -
> + if (vm->def->numatune.memory.nodemask ||
> + (vm->def->numatune.memory.placement_mode ==
> + VIR_NUMA_TUNE_MEM_PLACEMENT_MODE_AUTO)) {
> if (vm->def->numatune.memory.placement_mode ==
> VIR_NUMA_TUNE_MEM_PLACEMENT_MODE_AUTO)
> mem_mask = virBitmapFormat(nodemask);
> @@ -614,6 +612,16 @@ qemuSetupCpusetCgroup(virDomainObjPtr vm,
> goto cleanup;
> }
>
> + if (vm->def->numatune.memory.mode ==
> + VIR_DOMAIN_NUMATUNE_MEM_PREFERRED &&
> + strlen(mem_mask) != 1) {
> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> + _("NUMA memory tuning in 'preferred' mode "
> + "only supports single node"));
> + goto cleanup;
> +
> + }
> +
> rc = virCgroupSetCpusetMems(priv->cgroup, mem_mask);
>
> if (rc != 0) {
> --
> 1.8.1.4
>
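
One note on the 'preferred' check in the hunk above: strlen(mem_mask) != 1
also rejects a single node whose number has two digits, such as "10".
Counting the nodes in the formatted string would avoid that. A rough
sketch follows, assuming the mask is a comma-separated list of node
numbers and non-overlapping ranges such as "0-3,7" (no "^" exclusions);
count_nodes_in_nodeset() and the 1023 upper bound are made up for the
example and are not libvirt API:

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Illustrative upper bound on a node number; an assumption for the sketch. */
#define SKETCH_MAX_NODE 1023

/*
 * Hypothetical helper: count the nodes named by a string such as
 * "1", "10", "1-2" or "0-3,7".
 * Ranges are assumed not to overlap; overlapping ranges would be
 * counted twice.
 * Returns the node count, or -1 if the string is malformed.
 */
int count_nodes_in_nodeset(const char *s)
{
    int count = 0;

    assert(s != NULL);

    for (;;) {
        char *end;
        long lo, hi;

        /* Every element must start with a digit. */
        if (!isdigit((unsigned char)*s))
            return -1;
        lo = strtol(s, &end, 10);
        hi = lo;
        s = end;

        /* Optional "-<hi>" part of a range. */
        if (*s == '-') {
            s++;
            if (!isdigit((unsigned char)*s))
                return -1;
            hi = strtol(s, &end, 10);
            s = end;
        }

        if (hi < lo || hi > SKETCH_MAX_NODE)
            return -1;
        count += (int)(hi - lo + 1);

        if (*s == '\0')
            return count;
        if (*s != ',')
            return -1;
        s++;
    }
}
```

With something like this, the condition could become
count_nodes_in_nodeset(mem_mask) != 1, so "10" is accepted while "1-2"
is still refused.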
> --
> libvir-list mailing list
> libvir-list(a)redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list