On Mon, Nov 28, 2016 at 09:45:15AM +0100, Viktor Mihajlovski wrote:
On 25.11.2016 17:14, Daniel P. Berrange wrote:
> On Fri, Nov 25, 2016 at 05:12:41PM +0100, Martin Kletzander wrote:
>> On Fri, Nov 25, 2016 at 03:39:17PM +0000, Daniel P. Berrange wrote:
>>> On Fri, Nov 25, 2016 at 04:35:14PM +0100, Martin Kletzander wrote:
>>>> On Fri, Nov 25, 2016 at 03:19:54PM +0100, Viktor Mihajlovski wrote:
>>>>> On 18.11.2016 17:44, Viktor Mihajlovski wrote:
>>>>>> With kernel 3.18 (since commit
3e32cb2e0a12b6915056ff04601cf1bb9b44f967) the
>>>>>> "unlimited" value for cgroup memory limits has changed
once again as its byte
>>>>>> value is now computed from a page counter.
>>>>>> The new "unlimited" value reported by the cgroup fs is
therefore 2**51-1 pages
>>>>>> which is (VIR_DOMAIN_MEMORY_PARAM_UNLIMITED - 3072). This results
e.g. in virsh
>>>>>> memtune displaying 9007199254740988 instead of unlimited for the
limits.
>>>>>>
>>>>
>>>> Ah, I wonder hoe many times we'll have to deal with this.
>>>>
>>>> Anyway, this means it is not enough to shift by 2 if the system hugepage
>>>> is more than 4k (e.g. 2M). I remember we had to change something due to
>>>> such hosts. virGetSystemPageSize(KB) should help with that.
>>>>
>>>> We could also make it 2^50 - pagesize just to make sure it will work for
>>>> a while, but some might not like it.
>>>
>>> I guess I'm not understanding how this code copes with the fact that
>>> we now have 3 different "unlimited" values to deal with.
>>>
>>> Could we not simply record the unlimited values and pick the right
>>> one based on kernel version we detect ?
>>>
>>
>> We have one, which we return for Get APIs that means that the setting
>> was not restricted. We would need to have a cgroup which we set to
>> some huge value and then read it, store it and compare it all the time.
>> It _should_ work as far as I can tell.
>
> No, I meant can we use some constants
>
> #define UNLIMTED_VALUE_KERNEL_A_B_C 2343243245325
> #define UNLIMTED_VALUE_KERNEL_D_E_F 3253253253253253
> #define UNLIMTED_VALUE_KERNEL_G_H_I 325325325325232
>
>
> if kver == A_B_C && value == UNLIMTED_VALUE_KERNEL_A_B_C
> value = VIR_DOMAIN_MEMORY_PARAM_UNLIMITED
> else if kver == D_E_F && value == UNLIMTED_VALUE_KERNEL_D_E_F
> value = VIR_DOMAIN_MEMORY_PARAM_UNLIMITED
> else if kver == G_H_I && value == UNLIMTED_VALUE_KERNEL_G_H_I
> value = VIR_DOMAIN_MEMORY_PARAM_UNLIMITED
>
>
> Regards,
> Daniel
>
Probably easier to read, but relying on the reported kernel version
alone is not safe as it is not uncommon to find backports introducing
functionality from later kernels.
Similarly to QEMU (and others, I believe), I wouldn't rely on the
versions.
We could instead try to probe the max value. IIUC, the root memory
limit
cannot be modified from userspace, so this might be the right value to
look up. I can craft something in this direction, if you agree with the
approach.
That's the other idea I had, so I'd agree with it. But I still like the
original idea the best. Linux is just decreasing the value, so if we
have the smallest one possible and check that (if it is bigger, then
just return unlimited), ten we don't need any conditions. Yes, we will
need to update the value again in the future, but we'd need to do that
either way with Dan's approach. And this is the only downside it would
have compared to probing. The only other difference is the fact that
with older kernels you would see "unlimited" instead of the value you've
set, but I don't consider that downside since all the values this
applies to are really unlimited (and will be for few years to come, I
reckon).
--
Mit freundlichen Grüßen/Kind Regards
Viktor Mihajlovski
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martina Köderitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294