On Mon, Nov 28, 2016 at 06:03:36PM +0100, Viktor Mihajlovski wrote:
With kernel 3.18 (since commit
3e32cb2e0a12b6915056ff04601cf1bb9b44f967)
the "unlimited" value for cgroup memory limits has changed once again as
its byte value is now computed from a page counter.
The new "unlimited" value reported by the cgroup fs is therefore 2**51-1
pages which is (VIR_DOMAIN_MEMORY_PARAM_UNLIMITED - 3072). This results
e.g. in virsh memtune displaying 9007199254740988 instead of unlimited
for the limits.
This patch uses the value of memory.limit_in_bytes from the cgroup
memory root which is the system's "real" unlimited value for comparison.
See also libvirt commit 231656bbeb9e4d3bedc44362784c35eee21cf0f4 for the
history for kernel 3.12 and before.
I've tested this on F24 with the following configurations:
- no memory cgroup controller mounted
- memory cgroup controller mounted but not configured for libvirt
- memory cgroup controller mounted and configured
The first two fail as expected (and as before), the third case
works as expected.
Testing on other kernel versions highly welcome!
Not perfect yet in that we still provide a fallback to the old value.
We might consider failing right away if we can't get the system
value. I'd be inclined to do that, since we're probably facing
principal cgroup issues in this case.
Since the code is called only after reading another value worked, it
_should not_ fail =) But I'm OK with both failing and falling back to
the old value. Mostly because I don't think it will make any
(significant) difference.
Further, it's not the most efficient implementation. Obviously,
the
unlimited value can be read once and cached. However, I'd like to see
the question above resolved first.
But I would really like to cache the value in a global variable. You
can use VIR_ONCE_GLOBAL_INIT for that, but maybe it's too much,
especially if you init the value before any other thread could access
it.
Signed-off-by: Viktor Mihajlovski <mihajlov(a)linux.vnet.ibm.com>
---
src/util/vircgroup.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 55 insertions(+), 6 deletions(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c
index f151193..969dca5 100644
--- a/src/util/vircgroup.c
+++ b/src/util/vircgroup.c
@@ -2452,6 +2452,40 @@ virCgroupGetBlkioDeviceWeight(virCgroupPtr group,
}
+/*
+ * Retrieve the "memory.limit_in_bytes" value from the memory controller
+ * root dir. This value cannot be modified by userspace and therefore
+ * is the maximum limit value supported by cgroups on the local system.
+ */
+static int
+virCgroupGetMemoryUnlimited(unsigned long long int * mem_unlimited)
+{
+ int ret = -1;
+ virCgroupPtr group;
+
Also, all this ↓
+ if (VIR_ALLOC(group))
+ goto cleanup;
+
+ if (virCgroupDetectMounts(group))
+ goto cleanup;
+
+ if (!group->controllers[VIR_CGROUP_CONTROLLER_MEMORY].mountPoint)
+ goto cleanup;
+
+ if (VIR_STRDUP(group->controllers[VIR_CGROUP_CONTROLLER_MEMORY].placement,
+ "/.") < 0)
+ goto cleanup;
+
↑ would be cleaner this way:
if (virCgroupNew(-1, "/", NULL, NULL, &group) < 0)
return -1;
if (!virCgroupHasController(cgroup, VIR_CGROUP_CONTROLLER_MEMORY))
goto cleanup;
I'm not passing VIR_CGROUP_CONTROLLER_MEMORY to virCgroupNew() so that
it doesn't fail with a cryptic message.
+ ret = virCgroupGetValueU64(group,
+ VIR_CGROUP_CONTROLLER_MEMORY,
+ "memory.limit_in_bytes",
+ mem_unlimited);
+ cleanup:
+ virCgroupFree(&group);
+ return ret;
+}
+
+
/**
* virCgroupSetMemory:
*
@@ -2534,6 +2568,7 @@ int
virCgroupGetMemoryHardLimit(virCgroupPtr group, unsigned long long *kb)
{
long long unsigned int limit_in_bytes;
+ long long unsigned int unlimited_in_bytes;
int ret = -1;
if (virCgroupGetValueU64(group,
@@ -2541,9 +2576,13 @@ virCgroupGetMemoryHardLimit(virCgroupPtr group, unsigned long long
*kb)
"memory.limit_in_bytes", &limit_in_bytes) <
0)
goto cleanup;
- *kb = limit_in_bytes >> 10;
- if (*kb > VIR_DOMAIN_MEMORY_PARAM_UNLIMITED)
+ if (virCgroupGetMemoryUnlimited(&unlimited_in_bytes) < 0)
+ unlimited_in_bytes = VIR_DOMAIN_MEMORY_PARAM_UNLIMITED << 10;
+
+ if (limit_in_bytes == unlimited_in_bytes)
I don't know why, but I would feel more comfortable with '>=' there,
although it doesn't make any difference (or sense).
*kb = VIR_DOMAIN_MEMORY_PARAM_UNLIMITED;
+ else
+ *kb = limit_in_bytes >> 10;
ret = 0;
cleanup:
As a nit, helper function for these would be nice.
Otherwise, I like it.