On 4/11/19 11:56 AM, Michal Privoznik wrote:
On 4/11/19 4:23 PM, Daniel Henrique Barboza wrote:
> Hi,
>
> I've tested these patches again, twice, in similar setups like I tested
> the first version (first in a Power8, then in a Power9 server).
>
> Same results, though. Libvirt does not prevent a pseries guest with
> numatune mode='strict' from launching, even if the NUMA node does not
> have enough available RAM. If I stress test the memory of the guest to
> force the allocation, QEMU exits with an error as soon as the memory of
> the host NUMA node is exhausted.
Yes, this is expected. By default QEMU doesn't allocate the guest's memory
fully up front. You'd have to force it:

  <memoryBacking>
    <allocation mode='immediate'/>
  </memoryBacking>
I tried with this extra setting, but still no good. The domain still boots,
even when the NUMA node I am pinning it to does not have enough memory to
hold all of its RAM. For reference, this is the top of the guest XML:
  <name>vm1</name>
  <uuid>f48e9e35-8406-4784-875f-5185cb4d47d7</uuid>
  <memory unit='KiB'>314572800</memory>
  <currentMemory unit='KiB'>314572800</currentMemory>
  <memoryBacking>
    <allocation mode='immediate'/>
  </memoryBacking>
  <vcpu placement='static'>16</vcpu>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
  <os>
    <type arch='ppc64' machine='pseries'>hvm</type>
    <boot dev='hd'/>
  </os>
  <clock offset='utc'/>
While doing this test, I recalled that some of my IBM peers recently
mentioned that they were unable to pre-allocate the RAM of a pseries
guest using Libvirt, but were able to do it by running QEMU directly
(using -realtime mlock=on). In fact, I just tried it with a QEMU command
line and the guest allocated all of its memory at boot.
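For what it's worth, the libvirt XML that is supposed to map to the
-realtime mlock=on behavior is the <locked/> element under
<memoryBacking> (I haven't confirmed yet whether it behaves any
differently on pseries):

```xml
<!-- Sketch: requests that the guest's memory pages be locked in host
     RAM; for the QEMU driver this is what translates to mlock'ing the
     guest memory. Untested on pseries in my setup. -->
<memoryBacking>
  <locked/>
</memoryBacking>
```

It might be worth comparing this against <allocation mode='immediate'/>,
since locking implies the pages are faulted in and pinned, while
immediate allocation alone does not pin them.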
This means that a pseries guest is able to do memory pre-allocation. I'd
say that there is something missing somewhere (XML, host setup, libvirt
config ...) or perhaps even a bug that is preventing Libvirt from doing
this pre-alloc. This also explains why I can't verify this patch series.
I'll dig into it further to understand why, when I have the time.
Thanks,
DHB
>
> If I change the numatune mode to 'preferred' and repeat the test, QEMU
> doesn't exit with an error - the process starts to take memory from
> other NUMA nodes. This indicates that the NUMA policy is apparently
> being enforced on the QEMU process - however, it is not enforced at VM
> boot.
>
> I've debugged it a little and haven't found anything wrong that jumps
> out. All the functions that run after qemuSetupCpusetMems exit with
> ret = 0. Unfortunately, I don't have access to an x86 server with more
> than one NUMA node to compare results.
>
> Since I can't say for sure whether what I'm seeing is pseries-specific
> behavior, I see no problem in pushing this series upstream
> if it makes sense for x86. We can debug/fix the Power side later.
I bet that if you force the allocation then the domain will be unable
to boot.
Thanks for the testing!
Michal