On Fri, Nov 07, 2014 at 05:36:43PM +0800, Wang Rui wrote:
> On 2014/11/5 16:07, Martin Kletzander wrote:
> [...]
>>>> diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
>>>> index b5bdb36..8685d6f 100644
>>>> --- a/src/qemu/qemu_cgroup.c
>>>> +++ b/src/qemu/qemu_cgroup.c
>>>> @@ -618,6 +618,11 @@ qemuSetupCpusetMems(virDomainObjPtr vm,
>>>>      if (!virCgroupHasController(priv->cgroup,
>>>>                                  VIR_CGROUP_CONTROLLER_CPUSET))
>>>> return 0;
>>>>
>>>> + if (virDomainNumatuneGetMode(vm->def->numatune, -1) !=
>>>> + VIR_DOMAIN_NUMATUNE_MEM_STRICT) {
>>>> + return 0;
>>>> + }
>>>> +
>>>
>>> One question: is it a problem only for 'preferred' or for
>>> 'interleaved' as well? Because if it's a problem only for 'preferred',
>>> then the check is wrong. If it's a problem for 'interleaved' as well,
>>> then the commit message is wrong.
>>>
>> 'interleave' with a single node (such as nodeset='0') will cause the
>> same error. But 'interleave' mode should not be used with a single
>> node. So maybe there's another bugfix needed to check for 'interleave'
>> with a single node.
>>
>
> Well, I'd be OK with just changing the commit message to mention that.
> This fix is still a valid one and will fix both issues, won't it?
>
>> If configured with 'interleave' and multiple nodes (such as nodeset='0-1'),
>> VM can be started successfully. And cpuset.mems is set to the same nodeset.
>> So I'll revise my patch.
>>
>> I'll send patches V2. Conclusion:
>>
>> 1/3 : add check for 'interleave' mode with single numa node
>> 2/3 : fix this problem in qemu
>> 3/3 : fix this problem in lxc
>>
>> Is it OK?
>>
>>> Anyway, after either one is fixed, I can push this.
>>>
> I tested this problem again and found that this error occurred with
> every memory mode. It was broken by commit
> 411cea638f6ec8503b7142a31e58b1cd85dbeaba, which was produced by me:
>     qemu: move setting emulatorpin ahead of monitor showing up
> I'm sorry for that.
> That patch moved qemuSetupCgroupForEmulator before qemuSetupCgroupPostInit.
> I have some ideas to fix that.
> 1. Move qemuSetupCgroupPostInit ahead of monitor showing up, too.
>    Of course it would be before qemuSetupCgroupForEmulator.
>    This would fix the bug introduced by me.
>    (RFC)
That cannot be done, IIRC, because we need the monitor to get the
vCPU <-> thread mapping from it.
> 2. Anyway, once the first problem is fixed, there is the second
>    problem, which is the one I wanted to fix originally. If the memory
>    mode is 'preferred' with one node (such as nodeset='0'), the
>    domain's memory is not necessarily in node 0. Assuming node 0
>    doesn't have enough memory, memory can be allocated on node 1. Then
>    if we set cpuset.mems to '0', it may cause an OOM.
>    The solution is checking the memory mode in (lxc)qemuSetupCpusetMems
>    as in my patch on Tuesday. Such as:
>    +    if (virDomainNumatuneGetMode(vm->def->numatune, -1) !=
>    +        VIR_DOMAIN_NUMATUNE_MEM_PREFERRED) {
Either this (as it makes sense to restrict qemu even for 'interleave')
or the previous check is fine too (just because that was what we did
before; I just rewrote it with a few problems).
> BTW:
> 3. After the first problem has been fixed, we can start domains with
>    this XML:
>    <numatune>
>      <memory mode='interleave' nodeset='0'/>
>    </numatune>
>    Is a single node '0' valid for 'interleave'? I take 'interleave'
>    as 'at least two nodes'.
Well, interleave of 1 node is effectively 'strict', isn't it? What
errors do you get if you try that? (my kernel stopped accepting
numa=fake=2 as a cmdline parameter :( )
Anyway, I think the best way would be mimicking the old behaviour by
just adding your first proposed fix, "if (mode != STRICT) return 0",
with the fixed-up commit message.
Martin