On 4/10/19 12:35 AM, Daniel Henrique Barboza wrote:
Hi,
On 4/9/19 11:10 AM, Michal Privoznik wrote:
> If there's a domain configured as:
>
> <currentMemory unit='MiB'>4096</currentMemory>
> <numatune>
> <memory mode='strict' nodeset='1'/>
> </numatune>
>
> but there is not enough memory on NUMA node 1, the domain will start
> successfully because we allow it to allocate memory from other nodes.
> This is a result of a previous fix (v1.2.7-rc1~91). However, I've
> tested my fix successfully on a NUMA machine with F29 and a recent
> kernel, so the kernel bug I mention in 4/4 has probably been fixed
> since and we can drop the workaround.
Out of curiosity, I've tested your patch set on a Power8 system to see
if I could spot a difference, but in my case it didn't change the
behavior.
I've tried a guest with the following configuration:
<memory unit='KiB'>67108864</memory>
<currentMemory unit='KiB'>67108864</currentMemory>
<vcpu placement='static' current='4'>16</vcpu>
<numatune>
<memory mode='strict' nodeset='0'/>
</numatune>
This is the numa setup of the host:
$ numactl -H
available: 4 nodes (0-1,16-17)
node 0 cpus: 0 8 16 24 32 40
node 0 size: 32606 MB
node 0 free: 24125 MB
node 1 cpus: 48 56 64 72 80 88
node 1 size: 32704 MB
node 1 free: 27657 MB
node 16 cpus: 96 104 112 120 128 136
node 16 size: 32704 MB
node 16 free: 25455 MB
node 17 cpus: 144 152 160 168 176 184
node 17 size: 32565 MB
node 17 free: 30030 MB
node distances:
node   0   1  16  17
  0:  10  20  40  40
  1:  20  10  40  40
 16:  40  40  10  20
 17:  40  40  20  10
If I understood it right, the patches remove the ability to allocate
memory from other NUMA nodes under the 'strict' setting, making the
guest fail to launch if the NUMA node does not have enough memory.
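For reference, my expectation is that the emulator cgroup ends up
restricted to that node; on my cgroup v1 host that would look roughly
like this (the scope name below is illustrative):

$ cat /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d1\x2dguest.scope/emulator/cpuset.mems
0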
Unless I am getting something wrong, this guest shouldn't launch after
applying this patch (node 0 does not have 64 GiB available). But the
guest is launching as if nothing changed.
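A quick way to double-check where the guest memory actually lands is
numastat (a sketch; the process name depends on your qemu binary):

$ numastat -p $(pidof qemu-system-ppc64)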
I'll dig into it further if I have the chance. I'm just curious whether
this is something that works differently with pseries guests.
Hey,
firstly thanks for testing this! And yes, it shows a flaw in my
patches. The thing is, my patches set up emulator/cpuset.mems, but at
the time qemu is doing its allocation it is still living under the
top-level CGroup (which is left untouched). Will post v2! Thanks.
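FWIW, you can see it if you catch qemu during startup: the process is
still in the machine scope itself rather than in the emulator child
group (illustrative cgroup v1 paths, the hierarchy number will differ):

$ grep cpuset /proc/$(pidof qemu-system-ppc64)/cgroup
4:cpuset:/machine.slice/machine-qemu\x2d1\x2dguest.scope

and the scope's own cpuset.mems still allows all nodes.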
Michal