[libvirt] [PATCH 00/53] implement cgroup v2 support

For more information check the kernel documentation [1]. This series implements cgroup v2 support into libvirt without devices, for that there will be separate series because it uses BPF and it's not that trivial. You can get it from my git as well [2]. [1] <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/cgroup-v2.rst> [2] <https://github.com/Antique/libvirt/tree/cgroupv2> Pavel Hrdina (53): util: introduce cgroup v2 files vircgroup: introduce virCgroupV2Available vircgroup: introduce virCgroupV2ValidateMachineGroup vircgroup: introduce virCgroupV2CopyMounts vircgroup: introduce virCgroupV2CopyPlacement vircgroup: introduce virCgroupV2DetectMounts vircgroup: introduce virCgroupV2DetectPlacement vircgroup: introduce virCgroupV2ValidatePlacement vircgroup: introduce virCgroupV2StealPlacement vircgroup: introduce virCgroupV2DetectControllers vircgroup: introduce virCgroupV2HasController vircgroup: introduce virCgroupV2GetAnyController vircgroup: introduce virCgroupV2PathOfController vircgroup: introduce virCgroupV2MakeGroup vircgroup: introduce virCgroupV2Remove vircgroup: introduce virCgroupV2AddTask vircgroup: introduce virCgroupV2HasEmptyTasks vircgroup: introduce virCgroupV2BindMount vircgroup: introduce virCgroupV2SetOwner vircgroup: register cgroup v2 backend vircgroup: introduce virCgroupV2SetBlkioWeight vircgroup: introduce virCgroupV2GetBlkioIoServiced vircgroup: introduce virCgroupV2GetBlkioIoDeviceServiced vircgroup: introduce virCgroupV2SetBlkioDeviceWeight vircgroup: introduce virCgroupV2SetBlkioDeviceReadIops vircgroup: introduce virCgroupV2SetBlkioDeviceWriteIops vircgroup: introduce virCgroupV2SetBlkioDeviceReadBps vircgroup: introduce virCgroupV2SetBlkioDeviceWriteBps vircgroup: introduce virCgroupV2SetMemory vircgroup: introduce virCgroupV2GetMemoryStat vircgroup: introduce virCgroupV2GetMemoryUsage vircgroup: introduce virCgroupV2SetMemoryHardLimit vircgroup: introduce virCgroupV2SetMemorySoftLimit vircgroup: introduce virCgroupV2SetMemSwapHardLimit vircgroup: introduce virCgroupV2GetMemSwapUsage vircgroup: introduce virCgroupV2SetCpuShares vircgroup: introduce virCgroupV2SetCpuCfsPeriod vircgroup: introduce virCgroupV2SetCpuCfsQuota vircgroup: introduce virCgroupV2SupportsCpuBW vircgroup: introduce virCgroupV2GetCpuacctUsage vircgroup: introduce virCgroupV2GetCpuacctStat vircgroup: add support for hybrid configuration vircgroupmock: change cgroup prefix vircgroupmock: add support to test cgroup v2 vircgrouptest: introduce initFakeFS and cleanupFakeFS helpers vircgrouptest: prepare testCgroupDetectMounts for cgroup v2 vircgrouptest: add detect mounts test for cgroup v2 vircgrouptest: add detect mounts test for hybrid cgroups vircgrouptest: prepare validateCgroup for cgroupv2 vircgrouptest: add cgroup v2 tests vircgrouptest: add hybrid tests virt-host-validate: rewrite cgroup detection to use util/vircgroup virt-host-validate: require freezer for LXC src/Makefile.am | 1 + src/libvirt_private.syms | 3 + src/util/Makefile.inc.am | 2 + src/util/vircgroup.c | 356 +++-- src/util/vircgroupbackend.c | 23 + src/util/vircgroupbackend.h | 20 +- src/util/vircgrouppriv.h | 11 +- src/util/vircgroupv2.c | 1615 +++++++++++++++++++++ src/util/vircgroupv2.h | 27 + tests/vircgroupdata/all-in-one.parsed | 1 + tests/vircgroupdata/cgroups1.parsed | 1 + tests/vircgroupdata/cgroups2.parsed | 1 + tests/vircgroupdata/cgroups3.parsed | 1 + tests/vircgroupdata/fedora-18.parsed | 1 + tests/vircgroupdata/fedora-21.parsed | 1 + tests/vircgroupdata/hybrid.cgroups | 12 + tests/vircgroupdata/hybrid.mounts | 23 + tests/vircgroupdata/hybrid.parsed | 11 + tests/vircgroupdata/hybrid.self.cgroup | 9 + tests/vircgroupdata/kubevirt.parsed | 1 + tests/vircgroupdata/ovirt-node-6.6.parsed | 1 + tests/vircgroupdata/ovirt-node-7.1.parsed | 1 + tests/vircgroupdata/rhel-7.1.parsed | 1 + tests/vircgroupdata/unified.cgroups | 13 + tests/vircgroupdata/unified.mounts | 20 + tests/vircgroupdata/unified.parsed | 11 + tests/vircgroupdata/unified.self.cgroup | 1 + tests/vircgroupmock.c | 169 ++- tests/vircgrouptest.c | 191 ++- tools/virt-host-validate-common.c | 162 +-- tools/virt-host-validate-common.h | 7 +- tools/virt-host-validate-lxc.c | 41 +- tools/virt-host-validate-qemu.c | 40 +- 33 files changed, 2436 insertions(+), 342 deletions(-) create mode 100644 src/util/vircgroupv2.c create mode 100644 src/util/vircgroupv2.h create mode 100644 tests/vircgroupdata/hybrid.cgroups create mode 100644 tests/vircgroupdata/hybrid.mounts create mode 100644 tests/vircgroupdata/hybrid.parsed create mode 100644 tests/vircgroupdata/hybrid.self.cgroup create mode 100644 tests/vircgroupdata/unified.cgroups create mode 100644 tests/vircgroupdata/unified.mounts create mode 100644 tests/vircgroupdata/unified.parsed create mode 100644 tests/vircgroupdata/unified.self.cgroup -- 2.17.1

Place cgroup v2 backend type before cgroup v1 to make it obvious that cgroup v2 is preferred implementation. Following patches will introduce support for hybrid configuration which will allow us to use both at the same time, but we should prefer cgroup v2 regardless. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/Makefile.am | 1 + src/libvirt_private.syms | 3 ++ src/util/Makefile.inc.am | 2 ++ src/util/vircgroup.c | 3 ++ src/util/vircgroupbackend.c | 2 ++ src/util/vircgroupbackend.h | 3 +- src/util/vircgrouppriv.h | 9 ++++++ src/util/vircgroupv2.c | 63 +++++++++++++++++++++++++++++++++++++ src/util/vircgroupv2.h | 27 ++++++++++++++++ 9 files changed, 112 insertions(+), 1 deletion(-) create mode 100644 src/util/vircgroupv2.c create mode 100644 src/util/vircgroupv2.h diff --git a/src/Makefile.am b/src/Makefile.am index f515569fd5..33ff280d78 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -675,6 +675,7 @@ libvirt_setuid_rpc_client_la_SOURCES = \ util/vircgroup.c \ util/vircgroupbackend.c \ util/vircgroupv1.c \ + util/vircgroupv2.c \ util/vircommand.c \ util/virconf.c \ util/virdbus.c \ diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 92363913e3..335210c31d 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1583,6 +1583,9 @@ virCgroupBackendRegister; # util/vircgroupv1.h virCgroupV1Register; +# util/vircgroupv2.h +virCgroupV2Register; + # util/virclosecallbacks.h virCloseCallbacksGet; virCloseCallbacksGetConn; diff --git a/src/util/Makefile.inc.am b/src/util/Makefile.inc.am index ad3be91867..cffbb357bc 100644 --- a/src/util/Makefile.inc.am +++ b/src/util/Makefile.inc.am @@ -27,6 +27,8 @@ UTIL_SOURCES = \ util/vircgroupbackend.h \ util/vircgroupv1.c \ util/vircgroupv1.h \ + util/vircgroupv2.c \ + util/vircgroupv2.h \ util/virclosecallbacks.c \ util/virclosecallbacks.h \ util/vircommand.c \ diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 548c873da8..1097b1f998 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -1236,6 +1236,9 @@ virCgroupFree(virCgroupPtr *group) VIR_FREE((*group)->legacy[i].placement); } + VIR_FREE((*group)->unified.mountPoint); + VIR_FREE((*group)->unified.placement); + VIR_FREE((*group)->path); VIR_FREE(*group); } diff --git a/src/util/vircgroupbackend.c b/src/util/vircgroupbackend.c index d854c9711d..79fe6cb73d 100644 --- a/src/util/vircgroupbackend.c +++ b/src/util/vircgroupbackend.c @@ -21,6 +21,7 @@ #include "vircgroupbackend.h" #include "vircgroupv1.h" +#include "vircgroupv2.h" #include "virerror.h" #include "virthread.h" @@ -28,6 +29,7 @@ VIR_ENUM_DECL(virCgroupBackend); VIR_ENUM_IMPL(virCgroupBackend, VIR_CGROUP_BACKEND_TYPE_LAST, + "cgroup V2", "cgroup V1"); static virOnceControl virCgroupBackendOnce = VIR_ONCE_CONTROL_INITIALIZER; diff --git a/src/util/vircgroupbackend.h b/src/util/vircgroupbackend.h index 1c5744ef76..b1f19233e4 100644 --- a/src/util/vircgroupbackend.h +++ b/src/util/vircgroupbackend.h @@ -50,7 +50,8 @@ typedef enum { } virCgroupBackendTaskFlags; typedef enum { - VIR_CGROUP_BACKEND_TYPE_V1 = 0, + VIR_CGROUP_BACKEND_TYPE_V2 = 0, + VIR_CGROUP_BACKEND_TYPE_V1, VIR_CGROUP_BACKEND_TYPE_LAST, } virCgroupBackendType; diff --git a/src/util/vircgrouppriv.h b/src/util/vircgrouppriv.h index c50a25f195..4a0d75ddbc 100644 --- a/src/util/vircgrouppriv.h +++ b/src/util/vircgrouppriv.h @@ -45,12 +45,21 @@ struct _virCgroupV1Controller { typedef struct _virCgroupV1Controller virCgroupV1Controller; typedef virCgroupV1Controller *virCgroupV1ControllerPtr; +struct _virCgroupV2Controller { + int controllers; + char *mountPoint; + char *placement; +}; +typedef struct _virCgroupV2Controller virCgroupV2Controller; +typedef virCgroupV2Controller *virCgroupV2ControllerPtr; + struct _virCgroup { char *path; virCgroupBackendPtr backend; virCgroupV1Controller legacy[VIR_CGROUP_CONTROLLER_LAST]; + virCgroupV2Controller unified; }; int virCgroupSetValueStr(virCgroupPtr group, diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c new file mode 100644 index 0000000000..23bf81dae2 --- /dev/null +++ b/src/util/vircgroupv2.c @@ -0,0 +1,63 @@ +/* + * vircgroupv2.c: methods for cgroups v2 backend + * + * Copyright (C) 2018 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ +#include <config.h> + +#include "internal.h" + +#define __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ +#include "vircgrouppriv.h" +#undef __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ + +#include "vircgroup.h" +#include "vircgroupbackend.h" +#include "vircgroupv2.h" +#include "virlog.h" + +VIR_LOG_INIT("util.cgroup"); + +#define VIR_FROM_THIS VIR_FROM_CGROUP + +VIR_ENUM_DECL(virCgroupV2Controller); +VIR_ENUM_IMPL(virCgroupV2Controller, VIR_CGROUP_CONTROLLER_LAST, + "cpu", "cpuacct", "cpuset", "memory", "devices", + "freezer", "io", "net_cls", "perf_event", "name=systemd"); + +#ifdef __linux__ + +virCgroupBackend virCgroupV2Backend = { + .type = VIR_CGROUP_BACKEND_TYPE_V2, +}; + + +void +virCgroupV2Register(void) +{ + virCgroupBackendRegister(&virCgroupV2Backend); +} + +#else /* !__linux__ */ + +void +virCgroupV2Register(void) +{ + VIR_INFO("Control groups not supported on this platform"); +} + +#endif /* !__linux__ */ diff --git a/src/util/vircgroupv2.h b/src/util/vircgroupv2.h new file mode 100644 index 0000000000..a5d0bd0978 --- /dev/null +++ b/src/util/vircgroupv2.h @@ -0,0 +1,27 @@ +/* + * vircgroupv2.h: methods for cgroups v2 backend + * + * Copyright (C) 2018 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#ifndef __VIR_CGROUP_V2_H__ +# define __VIR_CGROUP_V2_H__ + +void +virCgroupV2Register(void); + +#endif /* __VIR_CGROUP_V2_H__ */ -- 2.17.1

On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
Place cgroup v2 backend type before cgroup v1 to make it obvious that cgroup v2 is preferred implementation.
Following patches will introduce support for hybrid configuration which will allow us to use both at the same time, but we should prefer cgroup v2 regardless.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/Makefile.am | 1 + src/libvirt_private.syms | 3 ++ src/util/Makefile.inc.am | 2 ++ src/util/vircgroup.c | 3 ++ src/util/vircgroupbackend.c | 2 ++ src/util/vircgroupbackend.h | 3 +- src/util/vircgrouppriv.h | 9 ++++++ src/util/vircgroupv2.c | 63 +++++++++++++++++++++++++++++++++++++ src/util/vircgroupv2.h | 27 ++++++++++++++++ 9 files changed, 112 insertions(+), 1 deletion(-) create mode 100644 src/util/vircgroupv2.c create mode 100644 src/util/vircgroupv2.h
ACK Michal

We cannot detect only mount points to figure out whether cgroup v2 is available because systemd uses cgroup v2 for process tracking and all controllers are mounted as cgroup v1 controllers. To make sure that this is no the situation we need to check 'cgroup.controllers' file if it's not empty to make sure that cgroup v2 is not mounted only for process tracking. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 52 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 23bf81dae2..78497bd31d 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -19,6 +19,10 @@ */ #include <config.h> +#ifdef __linux__ +# include <mntent.h> +#endif /* __linux__ */ + #include "internal.h" #define __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ @@ -28,7 +32,9 @@ #include "vircgroup.h" #include "vircgroupbackend.h" #include "vircgroupv2.h" +#include "virfile.h" #include "virlog.h" +#include "virstring.h" VIR_LOG_INIT("util.cgroup"); @@ -41,8 +47,54 @@ VIR_ENUM_IMPL(virCgroupV2Controller, VIR_CGROUP_CONTROLLER_LAST, #ifdef __linux__ +/* We're looking for one 'cgroup2' fs mount which has some + * controllers enabled. */ +static bool +virCgroupV2Available(void) +{ + bool ret = false; + FILE *mounts = NULL; + struct mntent entry; + char buf[CGROUP_MAX_VAL]; + + if (!(mounts = fopen("/proc/mounts", "r"))) + return false; + + while (getmntent_r(mounts, &entry, buf, sizeof(buf)) != NULL) { + if (STREQ(entry.mnt_type, "cgroup2")) { + ret = true; + break; + } + } + + /* Systemd uses cgroup v2 for process tracking but no controller is + * available. We should consider this configuration as cgroup v2 is + * not available. */ + if (ret) { + int rc; + VIR_AUTOFREE(char *) contFile = NULL; + VIR_AUTOFREE(char *) contStr = NULL; + + if (virAsprintf(&contFile, "%s/cgroup.controllers", entry.mnt_dir) < 0) + return false; + + rc = virFileReadAll(contFile, 1024 * 1024, &contStr); + if (rc < 0) + return false; + + if (STREQ(contStr, "")) + return false; + } + + VIR_FORCE_FCLOSE(mounts); + return ret; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, + + .available = virCgroupV2Available, }; -- 2.17.1

On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
We cannot detect only mount points to figure out whether cgroup v2 is available because systemd uses cgroup v2 for process tracking and all controllers are mounted as cgroup v1 controllers.
To make sure that this is no the situation we need to check 'cgroup.controllers' file if it's not empty to make sure that cgroup v2 is not mounted only for process tracking.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 52 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 23bf81dae2..78497bd31d 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -19,6 +19,10 @@ */ #include <config.h>
+#ifdef __linux__ +# include <mntent.h> +#endif /* __linux__ */ + #include "internal.h"
#define __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ @@ -28,7 +32,9 @@ #include "vircgroup.h" #include "vircgroupbackend.h" #include "vircgroupv2.h" +#include "virfile.h" #include "virlog.h" +#include "virstring.h"
VIR_LOG_INIT("util.cgroup");
@@ -41,8 +47,54 @@ VIR_ENUM_IMPL(virCgroupV2Controller, VIR_CGROUP_CONTROLLER_LAST,
#ifdef __linux__
+/* We're looking for one 'cgroup2' fs mount which has some + * controllers enabled. */ +static bool +virCgroupV2Available(void) +{ + bool ret = false; + FILE *mounts = NULL; + struct mntent entry; + char buf[CGROUP_MAX_VAL]; + + if (!(mounts = fopen("/proc/mounts", "r"))) + return false; + + while (getmntent_r(mounts, &entry, buf, sizeof(buf)) != NULL) { + if (STREQ(entry.mnt_type, "cgroup2")) { + ret = true; + break; + } + } + + /* Systemd uses cgroup v2 for process tracking but no controller is + * available. We should consider this configuration as cgroup v2 is + * not available. */ + if (ret) { + int rc; + VIR_AUTOFREE(char *) contFile = NULL; + VIR_AUTOFREE(char *) contStr = NULL; + + if (virAsprintf(&contFile, "%s/cgroup.controllers", entry.mnt_dir) < 0) + return false; + + rc = virFileReadAll(contFile, 1024 * 1024, &contStr); + if (rc < 0) + return false; + + if (STREQ(contStr, "")) + return false; + } + + VIR_FORCE_FCLOSE(mounts); + return ret;
This is wrong on two levels. Firstly, if any of those conditions under 'if (ret)' fail, then @mounts is leaked. Secondly, @ret is set but then it's ignored and false is returned regardless. Michal

On Wed, Oct 03, 2018 at 04:49:35PM +0200, Michal Privoznik wrote:
On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
We cannot detect only mount points to figure out whether cgroup v2 is available because systemd uses cgroup v2 for process tracking and all controllers are mounted as cgroup v1 controllers.
To make sure that this is no the situation we need to check 'cgroup.controllers' file if it's not empty to make sure that cgroup v2 is not mounted only for process tracking.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 52 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 23bf81dae2..78497bd31d 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -19,6 +19,10 @@ */ #include <config.h>
+#ifdef __linux__ +# include <mntent.h> +#endif /* __linux__ */ + #include "internal.h"
#define __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ @@ -28,7 +32,9 @@ #include "vircgroup.h" #include "vircgroupbackend.h" #include "vircgroupv2.h" +#include "virfile.h" #include "virlog.h" +#include "virstring.h"
VIR_LOG_INIT("util.cgroup");
@@ -41,8 +47,54 @@ VIR_ENUM_IMPL(virCgroupV2Controller, VIR_CGROUP_CONTROLLER_LAST,
#ifdef __linux__
+/* We're looking for one 'cgroup2' fs mount which has some + * controllers enabled. */ +static bool +virCgroupV2Available(void) +{ + bool ret = false; + FILE *mounts = NULL; + struct mntent entry; + char buf[CGROUP_MAX_VAL]; + + if (!(mounts = fopen("/proc/mounts", "r"))) + return false; + + while (getmntent_r(mounts, &entry, buf, sizeof(buf)) != NULL) { + if (STREQ(entry.mnt_type, "cgroup2")) { + ret = true; + break; + } + } + + /* Systemd uses cgroup v2 for process tracking but no controller is + * available. We should consider this configuration as cgroup v2 is + * not available. */ + if (ret) { + int rc; + VIR_AUTOFREE(char *) contFile = NULL; + VIR_AUTOFREE(char *) contStr = NULL; + + if (virAsprintf(&contFile, "%s/cgroup.controllers", entry.mnt_dir) < 0) + return false; + + rc = virFileReadAll(contFile, 1024 * 1024, &contStr); + if (rc < 0) + return false; + + if (STREQ(contStr, "")) + return false; + } + + VIR_FORCE_FCLOSE(mounts); + return ret;
This is wrong on two levels. Firstly, if any of those conditions under 'if (ret)' fail, then @mounts is leaked. Secondly, @ret is set but then it's ignored and false is returned regardless.
/me hides under table Right, I totally missed that with all the rebases and code movement. I'll fix that. Thanks Pavel

When reconnecting to a domain we are validating the cgroup name. In case of cgroup v2 we need to validate only the new format for host without systemd '{machinename}.libvirt-{drivername}' or scope name generated by systemd. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 43 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 78497bd31d..17096e52fa 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -35,6 +35,7 @@ #include "virfile.h" #include "virlog.h" #include "virstring.h" +#include "virsystemd.h" VIR_LOG_INIT("util.cgroup"); @@ -91,10 +92,52 @@ virCgroupV2Available(void) } +static bool +virCgroupV2ValidateMachineGroup(virCgroupPtr group, + const char *name ATTRIBUTE_UNUSED, + const char *drivername, + const char *machinename) +{ + VIR_AUTOFREE(char *) partmachinename = NULL; + VIR_AUTOFREE(char *) scopename = NULL; + char *tmp; + + if (virAsprintf(&partmachinename, "%s.libvirt-%s", machinename, + drivername) < 0) { + return false; + } + + if (virCgroupPartitionEscape(&partmachinename) < 0) + return false; + + if (!(scopename = virSystemdMakeScopeName(machinename, drivername, + false))) { + return false; + } + + if (virCgroupPartitionEscape(&scopename) < 0) + return false; + + if (!(tmp = strrchr(group->unified.placement, '/'))) + return false; + tmp++; + + if (STRNEQ(tmp, partmachinename) && + STRNEQ(tmp, scopename)) { + VIR_DEBUG("Name '%s' for unified does not match '%s' or '%s'", + tmp, partmachinename, scopename); + return false; + } + + return true; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, .available = virCgroupV2Available, + .validateMachineGroup = virCgroupV2ValidateMachineGroup, }; -- 2.17.1

On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
When reconnecting to a domain we are validating the cgroup name. In case of cgroup v2 we need to validate only the new format for host without systemd '{machinename}.libvirt-{drivername}' or scope name generated by systemd.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 43 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 17096e52fa..d3f72b9006 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -133,11 +133,20 @@ virCgroupV2ValidateMachineGroup(virCgroupPtr group, } +static int +virCgroupV2CopyMounts(virCgroupPtr group, + virCgroupPtr parent) +{ + return VIR_STRDUP(group->unified.mountPoint, parent->unified.mountPoint); +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, .available = virCgroupV2Available, .validateMachineGroup = virCgroupV2ValidateMachineGroup, + .copyMounts = virCgroupV2CopyMounts, }; -- 2.17.1

On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 9 +++++++++ 1 file changed, 9 insertions(+)
ACK Michal P.S. This is where I stop my review for today. I'll resume tomorrow.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index d3f72b9006..11d9876d36 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -141,12 +141,39 @@ virCgroupV2CopyMounts(virCgroupPtr group, } +static int +virCgroupV2CopyPlacement(virCgroupPtr group, + const char *path, + virCgroupPtr parent) +{ + if (path[0] == '/') { + if (VIR_STRDUP(group->unified.placement, path) < 0) + return -1; + } else { + /* + * parent == "/" + path="" => "/" + * parent == "/libvirt.service" + path == "" => "/libvirt.service" + * parent == "/libvirt.service" + path == "foo" => "/libvirt.service/foo" + */ + if (virAsprintf(&group->unified.placement, "%s%s%s", + parent->unified.placement, + (STREQ(parent->unified.placement, "/") || + STREQ(path, "") ? "" : "/"), + path) < 0) + return -1; + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, .available = virCgroupV2Available, .validateMachineGroup = virCgroupV2ValidateMachineGroup, .copyMounts = virCgroupV2CopyMounts, + .copyPlacement = virCgroupV2CopyPlacement, }; -- 2.17.1

On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index d3f72b9006..11d9876d36 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -141,12 +141,39 @@ virCgroupV2CopyMounts(virCgroupPtr group, }
+static int +virCgroupV2CopyPlacement(virCgroupPtr group, + const char *path, + virCgroupPtr parent) +{ + if (path[0] == '/') { + if (VIR_STRDUP(group->unified.placement, path) < 0) + return -1; + } else {
Maybe it's the lack of morning coffee, but I had some difficulties parsing this.
+ /* + * parent == "/" + path="" => "/" + * parent == "/libvirt.service" + path == "" => "/libvirt.service" + * parent == "/libvirt.service" + path == "foo" => "/libvirt.service/foo"
s/\+/&&/ so that it looks like a C condition.
+ */ + if (virAsprintf(&group->unified.placement, "%s%s%s", + parent->unified.placement, + (STREQ(parent->unified.placement, "/") || + STREQ(path, "") ? "" : "/"), + path) < 0)
Maybe if you got rid of the ternary operator it would be more readable. But now that I finally understood this, I don't care that much :-)
+ return -1; + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
.available = virCgroupV2Available, .validateMachineGroup = virCgroupV2ValidateMachineGroup, .copyMounts = virCgroupV2CopyMounts, + .copyPlacement = virCgroupV2CopyPlacement, };
ACK Michal

On Thu, Oct 04, 2018 at 01:18:27PM +0200, Michal Privoznik wrote:
On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index d3f72b9006..11d9876d36 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -141,12 +141,39 @@ virCgroupV2CopyMounts(virCgroupPtr group, }
+static int +virCgroupV2CopyPlacement(virCgroupPtr group, + const char *path, + virCgroupPtr parent) +{ + if (path[0] == '/') { + if (VIR_STRDUP(group->unified.placement, path) < 0) + return -1; + } else {
Maybe it's the lack of morning coffee, but I had some difficulties parsing this.
+ /* + * parent == "/" + path="" => "/" + * parent == "/libvirt.service" + path == "" => "/libvirt.service" + * parent == "/libvirt.service" + path == "foo" => "/libvirt.service/foo"
s/\+/&&/ so that it looks like a C condition.
+ */ + if (virAsprintf(&group->unified.placement, "%s%s%s", + parent->unified.placement, + (STREQ(parent->unified.placement, "/") || + STREQ(path, "") ? "" : "/"), + path) < 0)
Maybe if you got rid of the ternary operator it would be more readable. But now that I finally understood this, I don't care that much :-)
Yes, this bit is kind of weird and should be improved, it's basically copy&paste from cgroupv1 backend code. As a followup we can export it into a function and reuse it in both backends.
+ return -1; + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
.available = virCgroupV2Available, .validateMachineGroup = virCgroupV2ValidateMachineGroup, .copyMounts = virCgroupV2CopyMounts, + .copyPlacement = virCgroupV2CopyPlacement, };
ACK
Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 11d9876d36..eaf07397d5 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -167,6 +167,21 @@ virCgroupV2CopyPlacement(virCgroupPtr group, } +static int +virCgroupV2DetectMounts(virCgroupPtr group, + const char *mntType, + const char *mntOpts ATTRIBUTE_UNUSED, + const char *mntDir) +{ + if (STRNEQ(mntType, "cgroup2")) + return 0; + + VIR_FREE(group->unified.mountPoint); + + return VIR_STRDUP(group->unified.mountPoint, mntDir); +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -174,6 +189,7 @@ virCgroupBackend virCgroupV2Backend = { .validateMachineGroup = virCgroupV2ValidateMachineGroup, .copyMounts = virCgroupV2CopyMounts, .copyPlacement = virCgroupV2CopyPlacement, + .detectMounts = virCgroupV2DetectMounts, }; -- 2.17.1

On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 11d9876d36..eaf07397d5 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -167,6 +167,21 @@ virCgroupV2CopyPlacement(virCgroupPtr group, }
+static int +virCgroupV2DetectMounts(virCgroupPtr group, + const char *mntType, + const char *mntOpts ATTRIBUTE_UNUSED, + const char *mntDir) +{ + if (STRNEQ(mntType, "cgroup2")) + return 0; + + VIR_FREE(group->unified.mountPoint); + + return VIR_STRDUP(group->unified.mountPoint, mntDir);
Looking at virCgroupDetectMounts() maybe we could have a new return value to stop going through mount table once we've found what we were looking for? E.g. -1 = an error, 0 not found but continue, 1 found and break the loop. Because the way this is written now - this will always pick up the last cgroup2 mount. I don't think that is a problem right now. Also, probably if you have more than one mount of cgroup2 then you're in a bigger trouble anyway. If you decide to work my suggestion in, then it can be saved as a follow up patch.
+} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -174,6 +189,7 @@ virCgroupBackend virCgroupV2Backend = { .validateMachineGroup = virCgroupV2ValidateMachineGroup, .copyMounts = virCgroupV2CopyMounts, .copyPlacement = virCgroupV2CopyPlacement, + .detectMounts = virCgroupV2DetectMounts, };
ACK Michal

On Thu, Oct 04, 2018 at 01:18:21PM +0200, Michal Privoznik wrote:
On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 11d9876d36..eaf07397d5 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -167,6 +167,21 @@ virCgroupV2CopyPlacement(virCgroupPtr group, }
+static int +virCgroupV2DetectMounts(virCgroupPtr group, + const char *mntType, + const char *mntOpts ATTRIBUTE_UNUSED, + const char *mntDir) +{ + if (STRNEQ(mntType, "cgroup2")) + return 0; + + VIR_FREE(group->unified.mountPoint); + + return VIR_STRDUP(group->unified.mountPoint, mntDir);
Looking at virCgroupDetectMounts() maybe we could have a new return value to stop going through mount table once we've found what we were looking for? E.g. -1 = an error, 0 not found but continue, 1 found and break the loop.
Because the way this is written now - this will always pick up the last cgroup2 mount. I don't think that is a problem right now. Also, probably if you have more than one mount of cgroup2 then you're in a bigger trouble anyway. If you decide to work my suggestion in, then it can be saved as a follow up patch.
So I should have probably used the same comment from cgroup v1 backend. We do it there as well because of bind mounts and we are taking into account only the last entry from /proc/mounts. See commit <dacd160d747> for more information. I don't know whether it can/will happen for cgroup v2.
+} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -174,6 +189,7 @@ virCgroupBackend virCgroupV2Backend = { .validateMachineGroup = virCgroupV2ValidateMachineGroup, .copyMounts = virCgroupV2CopyMounts, .copyPlacement = virCgroupV2CopyPlacement, + .detectMounts = virCgroupV2DetectMounts, };
ACK
Michal

If the placement was copied from parent or set to absolute path there is nothing to do, otherwise set the placement based on process placement from /proc/self/cgroup or /proc/{pid}/cgroup. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index eaf07397d5..ed94d5da17 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -182,6 +182,31 @@ virCgroupV2DetectMounts(virCgroupPtr group, } +static int +virCgroupV2DetectPlacement(virCgroupPtr group, + const char *path, + const char *controllers ATTRIBUTE_UNUSED, + const char *selfpath) +{ + if (group->unified.placement) + return 0; + + /* + * selfpath == "/" + path="" -> "/" + * selfpath == "/libvirt.service" + path == "" -> "/libvirt.service" + * selfpath == "/libvirt.service" + path == "foo" -> "/libvirt.service/foo" + */ + if (virAsprintf(&group->unified.placement, + "%s%s%s", selfpath, + (STREQ(selfpath, "/") || + STREQ(path, "") ? "" : "/"), + path) < 0) + return -1; + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -190,6 +215,7 @@ virCgroupBackend virCgroupV2Backend = { .copyMounts = virCgroupV2CopyMounts, .copyPlacement = virCgroupV2CopyPlacement, .detectMounts = virCgroupV2DetectMounts, + .detectPlacement = virCgroupV2DetectPlacement, }; -- 2.17.1

On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
If the placement was copied from parent or set to absolute path there is nothing to do, otherwise set the placement based on process placement from /proc/self/cgroup or /proc/{pid}/cgroup.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index eaf07397d5..ed94d5da17 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -182,6 +182,31 @@ virCgroupV2DetectMounts(virCgroupPtr group, }
+static int +virCgroupV2DetectPlacement(virCgroupPtr group, + const char *path, + const char *controllers ATTRIBUTE_UNUSED, + const char *selfpath) +{ + if (group->unified.placement) + return 0; + + /* + * selfpath == "/" + path="" -> "/" + * selfpath == "/libvirt.service" + path == "" -> "/libvirt.service" + * selfpath == "/libvirt.service" + path == "foo" -> "/libvirt.service/foo" + */ + if (virAsprintf(&group->unified.placement, + "%s%s%s", selfpath, + (STREQ(selfpath, "/") || + STREQ(path, "") ? "" : "/"), + path) < 0) + return -1;
Same comment here as in 05/53.
+ + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -190,6 +215,7 @@ virCgroupBackend virCgroupV2Backend = { .copyMounts = virCgroupV2CopyMounts, .copyPlacement = virCgroupV2CopyPlacement, .detectMounts = virCgroupV2DetectMounts, + .detectPlacement = virCgroupV2DetectPlacement, };
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index ed94d5da17..23443abfbb 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -207,6 +207,20 @@ virCgroupV2DetectPlacement(virCgroupPtr group, } +static int +virCgroupV2ValidatePlacement(virCgroupPtr group, + pid_t pid ATTRIBUTE_UNUSED) +{ + if (!group->unified.placement) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not find placement for v2 controller")); + return -1; + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -216,6 +230,7 @@ virCgroupBackend virCgroupV2Backend = { .copyPlacement = virCgroupV2CopyPlacement, .detectMounts = virCgroupV2DetectMounts, .detectPlacement = virCgroupV2DetectPlacement, + .validatePlacement = virCgroupV2ValidatePlacement, }; -- 2.17.1

On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 23443abfbb..d4e0366780 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -221,6 +221,17 @@ virCgroupV2ValidatePlacement(virCgroupPtr group, } +static char * +virCgroupV2StealPlacement(virCgroupPtr group) +{ + char *ret; + + VIR_STEAL_PTR(ret, group->unified.placement); + + return ret; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -231,6 +242,7 @@ virCgroupBackend virCgroupV2Backend = { .detectMounts = virCgroupV2DetectMounts, .detectPlacement = virCgroupV2DetectPlacement, .validatePlacement = virCgroupV2ValidatePlacement, + .stealPlacement = virCgroupV2StealPlacement, }; -- 2.17.1

Cgroup v2 has only single mount point for all controllers. The list of controllers is stored in cgroup.controllers file, name of controllers are separated by space. In cgroup v2 there is no cpuacct controller, the cpu.stat file always exists with usage stats. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 67 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 67 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index d4e0366780..3200338fe3 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -32,6 +32,7 @@ #include "vircgroup.h" #include "vircgroupbackend.h" #include "vircgroupv2.h" +#include "virerror.h" #include "virfile.h" #include "virlog.h" #include "virstring.h" @@ -232,6 +233,71 @@ virCgroupV2StealPlacement(virCgroupPtr group) } +static int +virCgroupV2ParseControllersFile(virCgroupPtr group) +{ + int rc; + VIR_AUTOFREE(char *) contStr = NULL; + VIR_AUTOFREE(char *) contFile = NULL; + char **contList = NULL; + char **tmp; + + if (virAsprintf(&contFile, "%s/cgroup.controllers", + group->unified.mountPoint) < 0) + return -1; + + rc = virFileReadAll(contFile, 1024 * 1024, &contStr); + if (rc < 0) { + virReportSystemError(errno, _("Unable to read from '%s'"), contFile); + return -1; + } + + virTrimSpaces(contStr, NULL); + + contList = virStringSplit(contStr, " ", 20); + if (!contList) + return -1; + + tmp = contList; + + while (*tmp) { + int type = virCgroupV2ControllerTypeFromString(*tmp); + + if (type >= 0) + group->unified.controllers |= 1 << type; + + tmp++; + } + + virStringListFree(contList); + + return 0; +} + + +static int +virCgroupV2DetectControllers(virCgroupPtr group, + int controllers) +{ + size_t i; + + if (virCgroupV2ParseControllersFile(group) < 0) + return -1; + + group->unified.controllers |= 1 << VIR_CGROUP_CONTROLLER_CPUACCT; + + for (i = 0; i < VIR_CGROUP_CONTROLLER_LAST; i++) + VIR_DEBUG("Controller '%s' present=%s", + virCgroupV2ControllerTypeToString(i), + (group->unified.controllers & 1 << i) ? "yes" : "no"); + + if (controllers >= 0) + return controllers & group->unified.controllers; + else + return group->unified.controllers; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -243,6 +309,7 @@ virCgroupBackend virCgroupV2Backend = { .detectPlacement = virCgroupV2DetectPlacement, .validatePlacement = virCgroupV2ValidatePlacement, .stealPlacement = virCgroupV2StealPlacement, + .detectControllers = virCgroupV2DetectControllers, }; -- 2.17.1

On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
Cgroup v2 has only single mount point for all controllers. The list of controllers is stored in cgroup.controllers file, name of controllers are separated by space.
In cgroup v2 there is no cpuacct controller, the cpu.stat file always exists with usage stats.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 67 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 67 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index d4e0366780..3200338fe3 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -32,6 +32,7 @@ #include "vircgroup.h" #include "vircgroupbackend.h" #include "vircgroupv2.h" +#include "virerror.h" #include "virfile.h" #include "virlog.h" #include "virstring.h" @@ -232,6 +233,71 @@ virCgroupV2StealPlacement(virCgroupPtr group) }
+static int +virCgroupV2ParseControllersFile(virCgroupPtr group) +{ + int rc; + VIR_AUTOFREE(char *) contStr = NULL; + VIR_AUTOFREE(char *) contFile = NULL; + char **contList = NULL; + char **tmp; + + if (virAsprintf(&contFile, "%s/cgroup.controllers", + group->unified.mountPoint) < 0) + return -1; + + rc = virFileReadAll(contFile, 1024 * 1024, &contStr); + if (rc < 0) { + virReportSystemError(errno, _("Unable to read from '%s'"), contFile); + return -1; + } + + virTrimSpaces(contStr, NULL); + + contList = virStringSplit(contStr, " ", 20); + if (!contList) + return -1; + + tmp = contList; + + while (*tmp) { + int type = virCgroupV2ControllerTypeFromString(*tmp); + + if (type >= 0) + group->unified.controllers |= 1 << type; + + tmp++; + } + + virStringListFree(contList); + + return 0; +} + + +static int +virCgroupV2DetectControllers(virCgroupPtr group, + int controllers) +{ + size_t i; + + if (virCgroupV2ParseControllersFile(group) < 0) + return -1; + + group->unified.controllers |= 1 << VIR_CGROUP_CONTROLLER_CPUACCT;
I'd put a comment here describing what you do in the commit message. Sure I can go to commit message if I want to, but a comment is more handy.
+ + for (i = 0; i < VIR_CGROUP_CONTROLLER_LAST; i++) + VIR_DEBUG("Controller '%s' present=%s", + virCgroupV2ControllerTypeToString(i), + (group->unified.controllers & 1 << i) ? "yes" : "no"); + + if (controllers >= 0) + return controllers & group->unified.controllers; + else + return group->unified.controllers; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -243,6 +309,7 @@ virCgroupBackend virCgroupV2Backend = { .detectPlacement = virCgroupV2DetectPlacement, .validatePlacement = virCgroupV2ValidatePlacement, .stealPlacement = virCgroupV2StealPlacement, + .detectControllers = virCgroupV2DetectControllers, };
ACK Michal

On Thu, Oct 04, 2018 at 01:18:16PM +0200, Michal Privoznik wrote:
On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
Cgroup v2 has only single mount point for all controllers. The list of controllers is stored in cgroup.controllers file, name of controllers are separated by space.
In cgroup v2 there is no cpuacct controller, the cpu.stat file always exists with usage stats.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 67 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 67 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index d4e0366780..3200338fe3 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -32,6 +32,7 @@ #include "vircgroup.h" #include "vircgroupbackend.h" #include "vircgroupv2.h" +#include "virerror.h" #include "virfile.h" #include "virlog.h" #include "virstring.h" @@ -232,6 +233,71 @@ virCgroupV2StealPlacement(virCgroupPtr group) }
+static int +virCgroupV2ParseControllersFile(virCgroupPtr group) +{ + int rc; + VIR_AUTOFREE(char *) contStr = NULL; + VIR_AUTOFREE(char *) contFile = NULL; + char **contList = NULL; + char **tmp; + + if (virAsprintf(&contFile, "%s/cgroup.controllers", + group->unified.mountPoint) < 0) + return -1; + + rc = virFileReadAll(contFile, 1024 * 1024, &contStr); + if (rc < 0) { + virReportSystemError(errno, _("Unable to read from '%s'"), contFile); + return -1; + } + + virTrimSpaces(contStr, NULL); + + contList = virStringSplit(contStr, " ", 20); + if (!contList) + return -1; + + tmp = contList; + + while (*tmp) { + int type = virCgroupV2ControllerTypeFromString(*tmp); + + if (type >= 0) + group->unified.controllers |= 1 << type; + + tmp++; + } + + virStringListFree(contList); + + return 0; +} + + +static int +virCgroupV2DetectControllers(virCgroupPtr group, + int controllers) +{ + size_t i; + + if (virCgroupV2ParseControllersFile(group) < 0) + return -1; + + group->unified.controllers |= 1 << VIR_CGROUP_CONTROLLER_CPUACCT;
I'd put a comment here describing what you do in the commit message. Sure I can go to commit message if I want to, but a comment is more handy.
Since I was considering it I'll fix it before pushing, thanks.
+ + for (i = 0; i < VIR_CGROUP_CONTROLLER_LAST; i++) + VIR_DEBUG("Controller '%s' present=%s", + virCgroupV2ControllerTypeToString(i), + (group->unified.controllers & 1 << i) ? "yes" : "no"); + + if (controllers >= 0) + return controllers & group->unified.controllers; + else + return group->unified.controllers; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -243,6 +309,7 @@ virCgroupBackend virCgroupV2Backend = { .detectPlacement = virCgroupV2DetectPlacement, .validatePlacement = virCgroupV2ValidatePlacement, .stealPlacement = virCgroupV2StealPlacement, + .detectControllers = virCgroupV2DetectControllers, };
ACK
Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 3200338fe3..5cd8e86b97 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -298,6 +298,14 @@ virCgroupV2DetectControllers(virCgroupPtr group, } +static bool +virCgroupV2HasController(virCgroupPtr group, + int controller) +{ + return group->unified.controllers & (1 << controller); +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -310,6 +318,7 @@ virCgroupBackend virCgroupV2Backend = { .validatePlacement = virCgroupV2ValidatePlacement, .stealPlacement = virCgroupV2StealPlacement, .detectControllers = virCgroupV2DetectControllers, + .hasController = virCgroupV2HasController, }; -- 2.17.1

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 5cd8e86b97..57be0b92c9 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -306,6 +306,13 @@ virCgroupV2HasController(virCgroupPtr group, } +static int +virCgroupV2GetAnyController(virCgroupPtr group) +{ + return ffs(group->unified.controllers) - 1; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -319,6 +326,7 @@ virCgroupBackend virCgroupV2Backend = { .stealPlacement = virCgroupV2StealPlacement, .detectControllers = virCgroupV2DetectControllers, .hasController = virCgroupV2HasController, + .getAnyController = virCgroupV2GetAnyController, }; -- 2.17.1

On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 5cd8e86b97..57be0b92c9 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -306,6 +306,13 @@ virCgroupV2HasController(virCgroupPtr group, }
+static int +virCgroupV2GetAnyController(virCgroupPtr group) +{ + return ffs(group->unified.controllers) - 1;
Ah, for ffs() bits are numbered from 1. Okay, it just took me a while to realize that. Because in C we don't usually index things from 1, do we?
+} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -319,6 +326,7 @@ virCgroupBackend virCgroupV2Backend = { .stealPlacement = virCgroupV2StealPlacement, .detectControllers = virCgroupV2DetectControllers, .hasController = virCgroupV2HasController, + .getAnyController = virCgroupV2GetAnyController, };
ACK Michal

On Thu, Oct 04, 2018 at 01:18:14PM +0200, Michal Privoznik wrote:
On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 5cd8e86b97..57be0b92c9 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -306,6 +306,13 @@ virCgroupV2HasController(virCgroupPtr group, }
+static int +virCgroupV2GetAnyController(virCgroupPtr group) +{ + return ffs(group->unified.controllers) - 1;
Ah, for ffs() bits are numbered from 1. Okay, it just took me a while to realize that. Because in C we don't usually index things from 1, do we?
I figured that out the hard way :) going through debug logs and investigating why the code is not working how it should. Reading man page would probably help me to realize that sooner.
+} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -319,6 +326,7 @@ virCgroupBackend virCgroupV2Backend = { .stealPlacement = virCgroupV2StealPlacement, .detectControllers = virCgroupV2DetectControllers, .hasController = virCgroupV2HasController, + .getAnyController = virCgroupV2GetAnyController, };
ACK
Michal

On Thu, 2018-10-04 at 13:41 +0200, Pavel Hrdina wrote:
On Thu, Oct 04, 2018 at 01:18:14PM +0200, Michal Privoznik wrote:
On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 5cd8e86b97..57be0b92c9 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -306,6 +306,13 @@ virCgroupV2HasController(virCgroupPtr group, }
+static int +virCgroupV2GetAnyController(virCgroupPtr group) +{ + return ffs(group->unified.controllers) - 1;
Ah, for ffs() bits are numbered from 1. Okay, it just took me a while to realize that. Because in C we don't usually index things from 1, do we?
I figured that out the hard way :) going through debug logs and investigating why the code is not working how it should. Reading man page would probably help me to realize that sooner.
If you feel like doing so, please, add a comment here because I'm pretty sure someone else in the future will go through this code and spend some time trying to understand that " The least significant bit is position 1 and the most significant position is, for example, 32 or 64." (taken from their manpage).
+} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -319,6 +326,7 @@ virCgroupBackend virCgroupV2Backend = { .stealPlacement = virCgroupV2StealPlacement, .detectControllers = virCgroupV2DetectControllers, .hasController = virCgroupV2HasController, + .getAnyController = virCgroupV2GetAnyController, };
ACK
Michal
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 57be0b92c9..3ca463e4c2 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -313,6 +313,29 @@ virCgroupV2GetAnyController(virCgroupPtr group) } +static int +virCgroupV2PathOfController(virCgroupPtr group, + int controller, + const char *key, + char **path) +{ + if (!virCgroupV2HasController(group, controller)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("v2 controller '%s' is not available"), + virCgroupV2ControllerTypeToString(controller)); + return -1; + } + + if (virAsprintf(path, "%s%s/%s", + group->unified.mountPoint, + group->unified.placement, + key ? key : "") < 0) + return -1; + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -327,6 +350,7 @@ virCgroupBackend virCgroupV2Backend = { .detectControllers = virCgroupV2DetectControllers, .hasController = virCgroupV2HasController, .getAnyController = virCgroupV2GetAnyController, + .pathOfController = virCgroupV2PathOfController, }; -- 2.17.1

On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 57be0b92c9..3ca463e4c2 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -313,6 +313,29 @@ virCgroupV2GetAnyController(virCgroupPtr group) }
+static int +virCgroupV2PathOfController(virCgroupPtr group, + int controller, + const char *key, + char **path) +{ + if (!virCgroupV2HasController(group, controller)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("v2 controller '%s' is not available"), + virCgroupV2ControllerTypeToString(controller)); + return -1; + }
Looks like this check ^^ can be moved to the caller: virCgroupPathOfController() { if (!virCgroupHasController()) error(); group->backend->pathOfController(); } But feel free to save that for a follow up patch.
+ + if (virAsprintf(path, "%s%s/%s", + group->unified.mountPoint, + group->unified.placement, + key ? key : "") < 0) + return -1; + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -327,6 +350,7 @@ virCgroupBackend virCgroupV2Backend = { .detectControllers = virCgroupV2DetectControllers, .hasController = virCgroupV2HasController, .getAnyController = virCgroupV2GetAnyController, + .pathOfController = virCgroupV2PathOfController, };
ACK Michal

On Thu, Oct 04, 2018 at 01:18:12PM +0200, Michal Privoznik wrote:
On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 57be0b92c9..3ca463e4c2 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -313,6 +313,29 @@ virCgroupV2GetAnyController(virCgroupPtr group) }
+static int +virCgroupV2PathOfController(virCgroupPtr group, + int controller, + const char *key, + char **path) +{ + if (!virCgroupV2HasController(group, controller)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("v2 controller '%s' is not available"), + virCgroupV2ControllerTypeToString(controller)); + return -1; + }
Looks like this check ^^ can be moved to the caller:
virCgroupPathOfController() { if (!virCgroupHasController()) error();
group->backend->pathOfController(); }
But feel free to save that for a follow up patch.
Technically yes, but once hybrid support is enabled using the backend specific function in each backend will save some cycles to figure out which backend to use. Originally I had it like that but I realized that moving it to backend I can use the backend specific *HasController.
+ + if (virAsprintf(path, "%s%s/%s", + group->unified.mountPoint, + group->unified.placement, + key ? key : "") < 0) + return -1; + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -327,6 +350,7 @@ virCgroupBackend virCgroupV2Backend = { .detectControllers = virCgroupV2DetectControllers, .hasController = virCgroupV2HasController, .getAnyController = virCgroupV2GetAnyController, + .pathOfController = virCgroupV2PathOfController, };
ACK
Michal

When creating cgroup hierarchy we need to enable controllers in the parent cgroup in order to be usable. That means writing "+{controller}" into cgroup.subtree_control file. We can enable only controllers that are enabled for parent cgroup, that means we need to do that for the whole cgroup tree. Cgroups for threads needs to be handled differently in cgroup v2. There are two types of controllers: - domain controllers: these cannot be enabled for threads - threaded controllers: these can be enabled for threads In addition there are multiple types of cgroups: - domain: normal cgroup - domain threaded: a domain cgroup that serves as root for threaded cgroups - domain invalid: invalid cgroup, can be changed into threaded, this is the default state if you create subgroup inside domain threaded group or threaded group - threaded: threaded cgroup which can have domain threaded or threaded as parent group In order to create threaded cgroup it's sufficient to write "threaded" into cgroup.type file, it will automatically make parent cgroup "domain threaded" if it was only "domain". In case the parent cgroup is already "domain threaded" or "threaded" it will modify only the type of current cgroup. After that we can enable threaded controllers. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroup.c | 2 +- src/util/vircgroupbackend.h | 1 + src/util/vircgroupv2.c | 78 +++++++++++++++++++++++++++++++++++++ 3 files changed, 80 insertions(+), 1 deletion(-) diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 1097b1f998..dc249bfe33 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -938,7 +938,7 @@ virCgroupNewThread(virCgroupPtr domain, if (virCgroupNew(-1, name, domain, controllers, group) < 0) return -1; - if (virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE) < 0) { + if (virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_THREAD) < 0) { virCgroupFree(group); return -1; } diff --git a/src/util/vircgroupbackend.h b/src/util/vircgroupbackend.h index b1f19233e4..86d1539e07 100644 --- a/src/util/vircgroupbackend.h +++ b/src/util/vircgroupbackend.h @@ -33,6 +33,7 @@ typedef enum { * before creating subcgroups and * attaching tasks */ + VIR_CGROUP_THREAD = 1 << 1, /* cgroup v2 handles threads differently */ } virCgroupBackendFlags; typedef enum { diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 3ca463e4c2..8fb9ace474 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -336,6 +336,83 @@ virCgroupV2PathOfController(virCgroupPtr group, } +static int +virCgroupV2EnableController(virCgroupPtr parent, + int controller) +{ + VIR_AUTOFREE(char *) val = NULL; + + if (virAsprintf(&val, "+%s", + virCgroupV2ControllerTypeToString(controller)) < 0) { + return -1; + } + + if (virCgroupSetValueStr(parent, controller, + "cgroup.subtree_control", val) < 0) { + return -1; + } + + return 0; +} + + +static int +virCgroupV2MakeGroup(virCgroupPtr parent ATTRIBUTE_UNUSED, + virCgroupPtr group, + bool create, + unsigned int flags) +{ + VIR_AUTOFREE(char *) path = NULL; + int controller; + + VIR_DEBUG("Make group %s", group->path); + + controller = virCgroupV2GetAnyController(group); + if (virCgroupV2PathOfController(group, controller, "", &path) < 0) + return -1; + + VIR_DEBUG("Make controller %s", path); + + if (!virFileExists(path)) { + if (!create || mkdir(path, 0755 || errno == EEXIST) < 0) { + virReportSystemError(errno, + _("Failed to create v2 cgroup '%s'"), + group->path); + return -1; + } + } + + if (create) { + if (flags & VIR_CGROUP_THREAD) { + if (virCgroupSetValueStr(group, VIR_CGROUP_CONTROLLER_CPU, + "cgroup.type", "threaded") < 0) { + return -1; + } + + if (virCgroupV2EnableController(parent, + VIR_CGROUP_CONTROLLER_CPU) < 0) { + return -1; + } + } else { + size_t i; + for (i = 0; i < VIR_CGROUP_CONTROLLER_LAST; i++) { + if (!virCgroupV2HasController(parent, i)) + continue; + + /* Controllers that are implicitly enabled if available. */ + if (i == VIR_CGROUP_CONTROLLER_CPUACCT) + continue; + + if (virCgroupV2EnableController(parent, i) < 0) + return -1; + } + } + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -351,6 +428,7 @@ virCgroupBackend virCgroupV2Backend = { .hasController = virCgroupV2HasController, .getAnyController = virCgroupV2GetAnyController, .pathOfController = virCgroupV2PathOfController, + .makeGroup = virCgroupV2MakeGroup, }; -- 2.17.1

On Tue, 2018-10-02 at 10:43 +0200, Pavel Hrdina wrote:
When creating cgroup hierarchy we need to enable controllers in the parent cgroup in order to be usable. That means writing "+{controller}" into cgroup.subtree_control file. We can enable only controllers that are enabled for parent cgroup, that means we need to do that for the whole cgroup tree.
Cgroups for threads needs to be handled differently in cgroup v2. There are two types of controllers:
- domain controllers: these cannot be enabled for threads - threaded controllers: these can be enabled for threads
In addition there are multiple types of cgroups:
- domain: normal cgroup - domain threaded: a domain cgroup that serves as root for threaded cgroups - domain invalid: invalid cgroup, can be changed into threaded, this is the default state if you create subgroup inside domain threaded group or threaded group - threaded: threaded cgroup which can have domain threaded or threaded as parent group
In order to create threaded cgroup it's sufficient to write "threaded" into cgroup.type file, it will automatically make parent cgroup "domain threaded" if it was only "domain". In case the parent cgroup is already "domain threaded" or "threaded" it will modify only the type of current cgroup. After that we can enable threaded controllers.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroup.c | 2 +- src/util/vircgroupbackend.h | 1 + src/util/vircgroupv2.c | 78 +++++++++++++++++++++++++++++++++++++ 3 files changed, 80 insertions(+), 1 deletion(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 1097b1f998..dc249bfe33 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -938,7 +938,7 @@ virCgroupNewThread(virCgroupPtr domain, if (virCgroupNew(-1, name, domain, controllers, group) < 0) return -1;
- if (virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE) < 0) { + if (virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_THREAD) < 0) { virCgroupFree(group); return -1; } diff --git a/src/util/vircgroupbackend.h b/src/util/vircgroupbackend.h index b1f19233e4..86d1539e07 100644 --- a/src/util/vircgroupbackend.h +++ b/src/util/vircgroupbackend.h @@ -33,6 +33,7 @@ typedef enum { * before creating subcgroups and * attaching tasks */ + VIR_CGROUP_THREAD = 1 << 1, /* cgroup v2 handles threads differently */ } virCgroupBackendFlags;
typedef enum { diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 3ca463e4c2..8fb9ace474 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -336,6 +336,83 @@ virCgroupV2PathOfController(virCgroupPtr group, }
+static int +virCgroupV2EnableController(virCgroupPtr parent, + int controller) +{ + VIR_AUTOFREE(char *) val = NULL; + + if (virAsprintf(&val, "+%s", + virCgroupV2ControllerTypeToString(controller)) < 0) { + return -1; + } + + if (virCgroupSetValueStr(parent, controller, + "cgroup.subtree_control", val) < 0) { + return -1; + } + + return 0; +} + + +static int +virCgroupV2MakeGroup(virCgroupPtr parent ATTRIBUTE_UNUSED, + virCgroupPtr group, + bool create, + unsigned int flags) +{ + VIR_AUTOFREE(char *) path = NULL; + int controller; + + VIR_DEBUG("Make group %s", group->path); + + controller = virCgroupV2GetAnyController(group); + if (virCgroupV2PathOfController(group, controller, "", &path) < 0) + return -1; + + VIR_DEBUG("Make controller %s", path); + + if (!virFileExists(path)) { + if (!create || mkdir(path, 0755 || errno == EEXIST) < 0) {
The `mdkir(path, 0755 || errno == EEXIST) < 0` part of the check doesn't look right.

On Thu, Oct 04, 2018 at 02:43:18PM +0200, Fabiano Fidêncio wrote:
On Tue, 2018-10-02 at 10:43 +0200, Pavel Hrdina wrote:
When creating cgroup hierarchy we need to enable controllers in the parent cgroup in order to be usable. That means writing "+{controller}" into cgroup.subtree_control file. We can enable only controllers that are enabled for parent cgroup, that means we need to do that for the whole cgroup tree.
Cgroups for threads needs to be handled differently in cgroup v2. There are two types of controllers:
- domain controllers: these cannot be enabled for threads - threaded controllers: these can be enabled for threads
In addition there are multiple types of cgroups:
- domain: normal cgroup - domain threaded: a domain cgroup that serves as root for threaded cgroups - domain invalid: invalid cgroup, can be changed into threaded, this is the default state if you create subgroup inside domain threaded group or threaded group - threaded: threaded cgroup which can have domain threaded or threaded as parent group
In order to create threaded cgroup it's sufficient to write "threaded" into cgroup.type file, it will automatically make parent cgroup "domain threaded" if it was only "domain". In case the parent cgroup is already "domain threaded" or "threaded" it will modify only the type of current cgroup. After that we can enable threaded controllers.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroup.c | 2 +- src/util/vircgroupbackend.h | 1 + src/util/vircgroupv2.c | 78 +++++++++++++++++++++++++++++++++++++ 3 files changed, 80 insertions(+), 1 deletion(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 1097b1f998..dc249bfe33 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -938,7 +938,7 @@ virCgroupNewThread(virCgroupPtr domain, if (virCgroupNew(-1, name, domain, controllers, group) < 0) return -1;
- if (virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE) < 0) { + if (virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_THREAD) < 0) { virCgroupFree(group); return -1; } diff --git a/src/util/vircgroupbackend.h b/src/util/vircgroupbackend.h index b1f19233e4..86d1539e07 100644 --- a/src/util/vircgroupbackend.h +++ b/src/util/vircgroupbackend.h @@ -33,6 +33,7 @@ typedef enum { * before creating subcgroups and * attaching tasks */ + VIR_CGROUP_THREAD = 1 << 1, /* cgroup v2 handles threads differently */ } virCgroupBackendFlags;
typedef enum { diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 3ca463e4c2..8fb9ace474 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -336,6 +336,83 @@ virCgroupV2PathOfController(virCgroupPtr group, }
+static int +virCgroupV2EnableController(virCgroupPtr parent, + int controller) +{ + VIR_AUTOFREE(char *) val = NULL; + + if (virAsprintf(&val, "+%s", + virCgroupV2ControllerTypeToString(controller)) < 0) { + return -1; + } + + if (virCgroupSetValueStr(parent, controller, + "cgroup.subtree_control", val) < 0) { + return -1; + } + + return 0; +} + + +static int +virCgroupV2MakeGroup(virCgroupPtr parent ATTRIBUTE_UNUSED, + virCgroupPtr group, + bool create, + unsigned int flags) +{ + VIR_AUTOFREE(char *) path = NULL; + int controller; + + VIR_DEBUG("Make group %s", group->path); + + controller = virCgroupV2GetAnyController(group); + if (virCgroupV2PathOfController(group, controller, "", &path) < 0) + return -1; + + VIR_DEBUG("Make controller %s", path); + + if (!virFileExists(path)) { + if (!create || mkdir(path, 0755 || errno == EEXIST) < 0) {
The `mdkir(path, 0755 || errno == EEXIST) < 0` part of the check doesn't look right.
Nice catch, I'll fix that, thanks. Pavel

On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
When creating cgroup hierarchy we need to enable controllers in the parent cgroup in order to be usable. That means writing "+{controller}" into cgroup.subtree_control file. We can enable only controllers that are enabled for parent cgroup, that means we need to do that for the whole cgroup tree.
Cgroups for threads needs to be handled differently in cgroup v2. There are two types of controllers:
- domain controllers: these cannot be enabled for threads - threaded controllers: these can be enabled for threads
In addition there are multiple types of cgroups:
- domain: normal cgroup - domain threaded: a domain cgroup that serves as root for threaded cgroups - domain invalid: invalid cgroup, can be changed into threaded, this is the default state if you create subgroup inside domain threaded group or threaded group - threaded: threaded cgroup which can have domain threaded or threaded as parent group
In order to create threaded cgroup it's sufficient to write "threaded" into cgroup.type file, it will automatically make parent cgroup "domain threaded" if it was only "domain". In case the parent cgroup is already "domain threaded" or "threaded" it will modify only the type of current cgroup. After that we can enable threaded controllers.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroup.c | 2 +- src/util/vircgroupbackend.h | 1 + src/util/vircgroupv2.c | 78 +++++++++++++++++++++++++++++++++++++ 3 files changed, 80 insertions(+), 1 deletion(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 1097b1f998..dc249bfe33 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -938,7 +938,7 @@ virCgroupNewThread(virCgroupPtr domain, if (virCgroupNew(-1, name, domain, controllers, group) < 0) return -1;
- if (virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE) < 0) { + if (virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_THREAD) < 0) {
So unsupported flags are ignored?
virCgroupFree(group); return -1; } diff --git a/src/util/vircgroupbackend.h b/src/util/vircgroupbackend.h index b1f19233e4..86d1539e07 100644 --- a/src/util/vircgroupbackend.h +++ b/src/util/vircgroupbackend.h @@ -33,6 +33,7 @@ typedef enum { * before creating subcgroups and * attaching tasks */ + VIR_CGROUP_THREAD = 1 << 1, /* cgroup v2 handles threads differently */ } virCgroupBackendFlags;
typedef enum { diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 3ca463e4c2..8fb9ace474 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -336,6 +336,83 @@ virCgroupV2PathOfController(virCgroupPtr group, }
+static int +virCgroupV2EnableController(virCgroupPtr parent, + int controller) +{ + VIR_AUTOFREE(char *) val = NULL; + + if (virAsprintf(&val, "+%s", + virCgroupV2ControllerTypeToString(controller)) < 0) { + return -1; + } + + if (virCgroupSetValueStr(parent, controller, + "cgroup.subtree_control", val) < 0) { + return -1; + } + + return 0; +} + + +static int +virCgroupV2MakeGroup(virCgroupPtr parent ATTRIBUTE_UNUSED, + virCgroupPtr group, + bool create, + unsigned int flags) +{ + VIR_AUTOFREE(char *) path = NULL; + int controller; + + VIR_DEBUG("Make group %s", group->path); + + controller = virCgroupV2GetAnyController(group); + if (virCgroupV2PathOfController(group, controller, "", &path) < 0) + return -1; + + VIR_DEBUG("Make controller %s", path); + + if (!virFileExists(path)) {
This should have been virFileIsDir() because EEXIST doesn't mean the @path is a directory. It could be a regular file.
+ if (!create || mkdir(path, 0755 || errno == EEXIST) < 0) {
Ooops, misplaced ')'. mkdir() || errno ;-) This is why I tend to write each condition on a separate line. But if you want to ignore EEXIST then you need to rewrite this check as follows: if (!create || (mkdir() < 0 && errno != EEXIST)) However, I don't think you want to ignore any errno. Also, any reason these two if()-s should not be merged into one?
+ virReportSystemError(errno, + _("Failed to create v2 cgroup '%s'"), + group->path); + return -1; + } + } + + if (create) { + if (flags & VIR_CGROUP_THREAD) { + if (virCgroupSetValueStr(group, VIR_CGROUP_CONTROLLER_CPU, + "cgroup.type", "threaded") < 0) { + return -1; + } + + if (virCgroupV2EnableController(parent, + VIR_CGROUP_CONTROLLER_CPU) < 0) { + return -1; + } + } else { + size_t i; + for (i = 0; i < VIR_CGROUP_CONTROLLER_LAST; i++) { + if (!virCgroupV2HasController(parent, i)) + continue; + + /* Controllers that are implicitly enabled if available. */ + if (i == VIR_CGROUP_CONTROLLER_CPUACCT) + continue; + + if (virCgroupV2EnableController(parent, i) < 0) + return -1; + } + } + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -351,6 +428,7 @@ virCgroupBackend virCgroupV2Backend = { .hasController = virCgroupV2HasController, .getAnyController = virCgroupV2GetAnyController, .pathOfController = virCgroupV2PathOfController, + .makeGroup = virCgroupV2MakeGroup, };
Michal

On Thu, Oct 04, 2018 at 02:55:25PM +0200, Michal Privoznik wrote:
On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
When creating cgroup hierarchy we need to enable controllers in the parent cgroup in order to be usable. That means writing "+{controller}" into cgroup.subtree_control file. We can enable only controllers that are enabled for parent cgroup, that means we need to do that for the whole cgroup tree.
Cgroups for threads needs to be handled differently in cgroup v2. There are two types of controllers:
- domain controllers: these cannot be enabled for threads - threaded controllers: these can be enabled for threads
In addition there are multiple types of cgroups:
- domain: normal cgroup - domain threaded: a domain cgroup that serves as root for threaded cgroups - domain invalid: invalid cgroup, can be changed into threaded, this is the default state if you create subgroup inside domain threaded group or threaded group - threaded: threaded cgroup which can have domain threaded or threaded as parent group
In order to create threaded cgroup it's sufficient to write "threaded" into cgroup.type file, it will automatically make parent cgroup "domain threaded" if it was only "domain". In case the parent cgroup is already "domain threaded" or "threaded" it will modify only the type of current cgroup. After that we can enable threaded controllers.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroup.c | 2 +- src/util/vircgroupbackend.h | 1 + src/util/vircgroupv2.c | 78 +++++++++++++++++++++++++++++++++++++ 3 files changed, 80 insertions(+), 1 deletion(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 1097b1f998..dc249bfe33 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -938,7 +938,7 @@ virCgroupNewThread(virCgroupPtr domain, if (virCgroupNew(-1, name, domain, controllers, group) < 0) return -1;
- if (virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE) < 0) { + if (virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_THREAD) < 0) {
So unsupported flags are ignored?
Correct, each backend will create the directories differently and there are different rules applies, so the flag can be ignored if there is no need to handle it.
virCgroupFree(group); return -1; } diff --git a/src/util/vircgroupbackend.h b/src/util/vircgroupbackend.h index b1f19233e4..86d1539e07 100644 --- a/src/util/vircgroupbackend.h +++ b/src/util/vircgroupbackend.h @@ -33,6 +33,7 @@ typedef enum { * before creating subcgroups and * attaching tasks */ + VIR_CGROUP_THREAD = 1 << 1, /* cgroup v2 handles threads differently */ } virCgroupBackendFlags;
typedef enum { diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 3ca463e4c2..8fb9ace474 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -336,6 +336,83 @@ virCgroupV2PathOfController(virCgroupPtr group, }
+static int +virCgroupV2EnableController(virCgroupPtr parent, + int controller) +{ + VIR_AUTOFREE(char *) val = NULL; + + if (virAsprintf(&val, "+%s", + virCgroupV2ControllerTypeToString(controller)) < 0) { + return -1; + } + + if (virCgroupSetValueStr(parent, controller, + "cgroup.subtree_control", val) < 0) { + return -1; + } + + return 0; +} + + +static int +virCgroupV2MakeGroup(virCgroupPtr parent ATTRIBUTE_UNUSED, + virCgroupPtr group, + bool create, + unsigned int flags) +{ + VIR_AUTOFREE(char *) path = NULL; + int controller; + + VIR_DEBUG("Make group %s", group->path); + + controller = virCgroupV2GetAnyController(group); + if (virCgroupV2PathOfController(group, controller, "", &path) < 0) + return -1; + + VIR_DEBUG("Make controller %s", path); + + if (!virFileExists(path)) {
This should have been virFileIsDir() because EEXIST doesn't mean the @path is a directory. It could be a regular file.
Good point, I'll fix that and as a followup we should fix it for cgroup v1 backend as well.
+ if (!create || mkdir(path, 0755 || errno == EEXIST) < 0) {
Ooops, misplaced ')'. mkdir() || errno ;-) This is why I tend to write each condition on a separate line. But if you want to ignore EEXIST then you need to rewrite this check as follows:
if (!create || (mkdir() < 0 && errno != EEXIST))
However, I don't think you want to ignore any errno.
The point of EEXIST was the there can be race condition when file creation so in case mkdir() fails with EEXIST we are ignoring that error. The race condition may happen if you start multiple guests on non-systemd OS where libvirt is actually creating the whole hierarchy and you need to create the /machines partition directory itself (my guess, I did not verify it).
Also, any reason these two if()-s should not be merged into one?
No reason, they can be merged.
+ virReportSystemError(errno, + _("Failed to create v2 cgroup '%s'"), + group->path); + return -1; + } + } + + if (create) { + if (flags & VIR_CGROUP_THREAD) { + if (virCgroupSetValueStr(group, VIR_CGROUP_CONTROLLER_CPU, + "cgroup.type", "threaded") < 0) { + return -1; + } + + if (virCgroupV2EnableController(parent, + VIR_CGROUP_CONTROLLER_CPU) < 0) { + return -1; + } + } else { + size_t i; + for (i = 0; i < VIR_CGROUP_CONTROLLER_LAST; i++) { + if (!virCgroupV2HasController(parent, i)) + continue; + + /* Controllers that are implicitly enabled if available. */ + if (i == VIR_CGROUP_CONTROLLER_CPUACCT) + continue; + + if (virCgroupV2EnableController(parent, i) < 0) + return -1; + } + } + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -351,6 +428,7 @@ virCgroupBackend virCgroupV2Backend = { .hasController = virCgroupV2HasController, .getAnyController = virCgroupV2GetAnyController, .pathOfController = virCgroupV2PathOfController, + .makeGroup = virCgroupV2MakeGroup, };
Michal

On Thu, Oct 04, 2018 at 03:20:18PM +0200, Pavel Hrdina wrote:
On Thu, Oct 04, 2018 at 02:55:25PM +0200, Michal Privoznik wrote:
On 10/02/2018 10:43 AM, Pavel Hrdina wrote:
When creating cgroup hierarchy we need to enable controllers in the parent cgroup in order to be usable. That means writing "+{controller}" into cgroup.subtree_control file. We can enable only controllers that are enabled for parent cgroup, that means we need to do that for the whole cgroup tree.
Cgroups for threads needs to be handled differently in cgroup v2. There are two types of controllers:
- domain controllers: these cannot be enabled for threads - threaded controllers: these can be enabled for threads
In addition there are multiple types of cgroups:
- domain: normal cgroup - domain threaded: a domain cgroup that serves as root for threaded cgroups - domain invalid: invalid cgroup, can be changed into threaded, this is the default state if you create subgroup inside domain threaded group or threaded group - threaded: threaded cgroup which can have domain threaded or threaded as parent group
In order to create threaded cgroup it's sufficient to write "threaded" into cgroup.type file, it will automatically make parent cgroup "domain threaded" if it was only "domain". In case the parent cgroup is already "domain threaded" or "threaded" it will modify only the type of current cgroup. After that we can enable threaded controllers.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroup.c | 2 +- src/util/vircgroupbackend.h | 1 + src/util/vircgroupv2.c | 78 +++++++++++++++++++++++++++++++++++++ 3 files changed, 80 insertions(+), 1 deletion(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 1097b1f998..dc249bfe33 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -938,7 +938,7 @@ virCgroupNewThread(virCgroupPtr domain, if (virCgroupNew(-1, name, domain, controllers, group) < 0) return -1;
- if (virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_NONE) < 0) { + if (virCgroupMakeGroup(domain, *group, create, VIR_CGROUP_THREAD) < 0) {
So unsupported flags are ignored?
Correct, each backend will create the directories differently and there are different rules applies, so the flag can be ignored if there is no need to handle it.
virCgroupFree(group); return -1; } diff --git a/src/util/vircgroupbackend.h b/src/util/vircgroupbackend.h index b1f19233e4..86d1539e07 100644 --- a/src/util/vircgroupbackend.h +++ b/src/util/vircgroupbackend.h @@ -33,6 +33,7 @@ typedef enum { * before creating subcgroups and * attaching tasks */ + VIR_CGROUP_THREAD = 1 << 1, /* cgroup v2 handles threads differently */ } virCgroupBackendFlags;
typedef enum { diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 3ca463e4c2..8fb9ace474 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -336,6 +336,83 @@ virCgroupV2PathOfController(virCgroupPtr group, }
+static int +virCgroupV2EnableController(virCgroupPtr parent, + int controller) +{ + VIR_AUTOFREE(char *) val = NULL; + + if (virAsprintf(&val, "+%s", + virCgroupV2ControllerTypeToString(controller)) < 0) { + return -1; + } + + if (virCgroupSetValueStr(parent, controller, + "cgroup.subtree_control", val) < 0) { + return -1; + } + + return 0; +} + + +static int +virCgroupV2MakeGroup(virCgroupPtr parent ATTRIBUTE_UNUSED, + virCgroupPtr group, + bool create, + unsigned int flags) +{ + VIR_AUTOFREE(char *) path = NULL; + int controller; + + VIR_DEBUG("Make group %s", group->path); + + controller = virCgroupV2GetAnyController(group); + if (virCgroupV2PathOfController(group, controller, "", &path) < 0) + return -1; + + VIR_DEBUG("Make controller %s", path); + + if (!virFileExists(path)) {
This should have been virFileIsDir() because EEXIST doesn't mean the @path is a directory. It could be a regular file.
Good point, I'll fix that and as a followup we should fix it for cgroup v1 backend as well.
Actually it is guaranteed that the path will be only DIR, cgroup fs doesn't allow to create regular files and when creating partitions we check for collisions with cgroup interface files so virFileExists is good enough. Pavel

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 8fb9ace474..103f37fe91 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -413,6 +413,25 @@ virCgroupV2MakeGroup(virCgroupPtr parent ATTRIBUTE_UNUSED, } +static int +virCgroupV2Remove(virCgroupPtr group) +{ + VIR_AUTOFREE(char *) grppath = NULL; + int controller; + + /* Don't delete the root group, if we accidentally + ended up in it for some reason */ + if (STREQ(group->unified.placement, "/")) + return 0; + + controller = virCgroupV2GetAnyController(group); + if (virCgroupV2PathOfController(group, controller, "", &grppath) < 0) + return 0; + + return virCgroupRemoveRecursively(grppath); +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -429,6 +448,7 @@ virCgroupBackend virCgroupV2Backend = { .getAnyController = virCgroupV2GetAnyController, .pathOfController = virCgroupV2PathOfController, .makeGroup = virCgroupV2MakeGroup, + .remove = virCgroupV2Remove, }; -- 2.17.1

In cgroups v2 we need to handle threads and processes differently. If you need to move a process you need to write its pid into cgrou.procs file and it will move the process with all its threads as well. The whole process will be moved if you use tid of any thread. In order to move only threads at first we need to create threaded group and after that we can write the relevant thread tids into cgroup.threads file. Threads can be moved only into cgroups that are children of cgroup of its process. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 103f37fe91..e7dbace42b 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -432,6 +432,20 @@ virCgroupV2Remove(virCgroupPtr group) } +static int +virCgroupV2AddTask(virCgroupPtr group, + pid_t pid, + unsigned int flags) +{ + int controller = virCgroupV2GetAnyController(group); + + if (flags & VIR_CGROUP_TASK_THREAD) + return virCgroupSetValueI64(group, controller, "cgroup.threads", pid); + else + return virCgroupSetValueI64(group, controller, "cgroup.procs", pid); +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -449,6 +463,7 @@ virCgroupBackend virCgroupV2Backend = { .pathOfController = virCgroupV2PathOfController, .makeGroup = virCgroupV2MakeGroup, .remove = virCgroupV2Remove, + .addTask = virCgroupV2AddTask, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
In cgroups v2 we need to handle threads and processes differently. If you need to move a process you need to write its pid into cgrou.procs file and it will move the process with all its threads as well. The whole process will be moved if you use tid of any thread.
In order to move only threads at first we need to create threaded group and after that we can write the relevant thread tids into cgroup.threads file. Threads can be moved only into cgroups that are children of cgroup of its process.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index e7dbace42b..c6a9e0a7df 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -446,6 +446,22 @@ virCgroupV2AddTask(virCgroupPtr group, } +static int +virCgroupV2HasEmptyTasks(virCgroupPtr cgroup, + int controller) +{ + int ret = -1; + VIR_AUTOFREE(char *) content = NULL; + + ret = virCgroupGetValueStr(cgroup, controller, "cgroup.procs", &content); + + if (ret == 0 && content[0] == '\0') + ret = 1; + + return ret; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -464,6 +480,7 @@ virCgroupBackend virCgroupV2Backend = { .makeGroup = virCgroupV2MakeGroup, .remove = virCgroupV2Remove, .addTask = virCgroupV2AddTask, + .hasEmptyTasks = virCgroupV2HasEmptyTasks, }; -- 2.17.1

On Tue, 2018-10-02 at 10:44 +0200, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index e7dbace42b..c6a9e0a7df 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -446,6 +446,22 @@ virCgroupV2AddTask(virCgroupPtr group, }
+static int +virCgroupV2HasEmptyTasks(virCgroupPtr cgroup, + int controller) +{ + int ret = -1; + VIR_AUTOFREE(char *) content = NULL; +
Just a nitpick here. I've noticed that in the v1 you're actually checking whether cgroup is NULL or not. Shouldn't you consider doing the same here as well?

On Thu, Oct 04, 2018 at 02:55:30PM +0200, Fabiano Fidêncio wrote:
On Tue, 2018-10-02 at 10:44 +0200, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index e7dbace42b..c6a9e0a7df 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -446,6 +446,22 @@ virCgroupV2AddTask(virCgroupPtr group, }
+static int +virCgroupV2HasEmptyTasks(virCgroupPtr cgroup, + int controller) +{ + int ret = -1; + VIR_AUTOFREE(char *) content = NULL; +
Just a nitpick here. I've noticed that in the v1 you're actually checking whether cgroup is NULL or not.
Shouldn't you consider doing the same here as well?
Sure, good point :) but if cgroup is NULL it will segfault in virCgroupHasEmptyTasks because there we call cgroup->backend->hasEmptyTasks(cgroup, controller); The proper fix would be remove the NULL check from cgroup v1 backend. Another point to this topic, the code in libvirt using cgroups currently relies on cgroup to be allocated (LXC) or cgroup to be available and allocated (QEMU). So technically it should not never happen unless there is programming error. Pavel

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index c6a9e0a7df..e30f438015 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -21,6 +21,7 @@ #ifdef __linux__ # include <mntent.h> +# include <sys/mount.h> #endif /* __linux__ */ #include "internal.h" @@ -462,6 +463,38 @@ virCgroupV2HasEmptyTasks(virCgroupPtr cgroup, } +static int +virCgroupV2BindMount(virCgroupPtr group, + const char *oldroot, + const char *mountopts) +{ + VIR_AUTOFREE(char *) opts = NULL; + VIR_AUTOFREE(char *) src = NULL; + + VIR_DEBUG("Mounting cgroups at '%s'", group->unified.mountPoint); + + if (virFileMakePath(group->unified.mountPoint) < 0) { + virReportSystemError(errno, _("Unable to create directory %s"), + group->unified.mountPoint); + return -1; + } + + if (virAsprintf(&opts, "mode=755,size=65536%s", mountopts) < 0) + return -1; + + if (virAsprintf(&src, "%s%s", oldroot, group->unified.mountPoint) < 0) + return -1; + + if (mount(src, group->unified.mountPoint, "none", MS_BIND, NULL) < 0) { + virReportSystemError(errno, _("Failed to bind cgroup '%s' on '%s'"), + src, group->unified.mountPoint); + return -1; + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -481,6 +514,7 @@ virCgroupBackend virCgroupV2Backend = { .remove = virCgroupV2Remove, .addTask = virCgroupV2AddTask, .hasEmptyTasks = virCgroupV2HasEmptyTasks, + .bindMount = virCgroupV2BindMount, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 52 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index e30f438015..fec5d28835 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -495,6 +495,57 @@ virCgroupV2BindMount(virCgroupPtr group, } +static int +virCgroupV2SetOwner(virCgroupPtr cgroup, + uid_t uid, + gid_t gid, + int controllers ATTRIBUTE_UNUSED) +{ + int ret = -1; + VIR_AUTOFREE(char *) base = NULL; + DIR *dh = NULL; + int direrr; + + struct dirent *de; + + if (virAsprintf(&base, "%s%s", cgroup->unified.mountPoint, + cgroup->unified.placement) < 0) { + goto cleanup; + } + + if (virDirOpen(&dh, base) < 0) + goto cleanup; + + while ((direrr = virDirRead(dh, &de, base)) > 0) { + VIR_AUTOFREE(char *) entry = NULL; + + if (virAsprintf(&entry, "%s/%s", base, de->d_name) < 0) + goto cleanup; + + if (chown(entry, uid, gid) < 0) { + virReportSystemError(errno, _("cannot chown '%s' to (%u, %u)"), + entry, uid, gid); + goto cleanup; + } + } + + if (direrr < 0) + goto cleanup; + + if (chown(base, uid, gid) < 0) { + virReportSystemError(errno, _("cannot chown '%s' to (%u, %u)"), + base, uid, gid); + goto cleanup; + } + + ret = 0; + + cleanup: + VIR_DIR_CLOSE(dh); + return ret; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -515,6 +566,7 @@ virCgroupBackend virCgroupV2Backend = { .addTask = virCgroupV2AddTask, .hasEmptyTasks = virCgroupV2HasEmptyTasks, .bindMount = virCgroupV2BindMount, + .setOwner = virCgroupV2SetOwner, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 52 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index e30f438015..fec5d28835 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -495,6 +495,57 @@ virCgroupV2BindMount(virCgroupPtr group, }
+static int +virCgroupV2SetOwner(virCgroupPtr cgroup, + uid_t uid, + gid_t gid, + int controllers ATTRIBUTE_UNUSED) +{ + int ret = -1; + VIR_AUTOFREE(char *) base = NULL; + DIR *dh = NULL; + int direrr; + + struct dirent *de;
Useless empty line;
+ + if (virAsprintf(&base, "%s%s", cgroup->unified.mountPoint, + cgroup->unified.placement) < 0) { + goto cleanup; + } + + if (virDirOpen(&dh, base) < 0) + goto cleanup; + + while ((direrr = virDirRead(dh, &de, base)) > 0) { + VIR_AUTOFREE(char *) entry = NULL; + + if (virAsprintf(&entry, "%s/%s", base, de->d_name) < 0) + goto cleanup; + + if (chown(entry, uid, gid) < 0) { + virReportSystemError(errno, _("cannot chown '%s' to (%u, %u)"), + entry, uid, gid); + goto cleanup; + } + }
Or virFileChownFiles().
+ + if (direrr < 0) + goto cleanup; + + if (chown(base, uid, gid) < 0) { + virReportSystemError(errno, _("cannot chown '%s' to (%u, %u)"), + base, uid, gid); + goto cleanup; + } + + ret = 0; + + cleanup: + VIR_DIR_CLOSE(dh); + return ret; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -515,6 +566,7 @@ virCgroupBackend virCgroupV2Backend = { .addTask = virCgroupV2AddTask, .hasEmptyTasks = virCgroupV2HasEmptyTasks, .bindMount = virCgroupV2BindMount, + .setOwner = virCgroupV2SetOwner, };
ACK Michal

All mandatory callbacks are implemented for cgroup v2 backend so we can register it now. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupbackend.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/util/vircgroupbackend.c b/src/util/vircgroupbackend.c index 79fe6cb73d..7ee39ac8ca 100644 --- a/src/util/vircgroupbackend.c +++ b/src/util/vircgroupbackend.c @@ -52,6 +52,7 @@ virCgroupBackendRegister(virCgroupBackendPtr backend) static void virCgroupBackendOnceInit(void) { + virCgroupV2Register(); virCgroupV1Register(); } -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
All mandatory callbacks are implemented for cgroup v2 backend so we can register it now.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupbackend.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/util/vircgroupbackend.c b/src/util/vircgroupbackend.c index 79fe6cb73d..7ee39ac8ca 100644 --- a/src/util/vircgroupbackend.c +++ b/src/util/vircgroupbackend.c @@ -52,6 +52,7 @@ virCgroupBackendRegister(virCgroupBackendPtr backend) static void virCgroupBackendOnceInit(void) { + virCgroupV2Register(); virCgroupV1Register(); }
Well, the callbacks are implemented, but that's about it. No API that involves reading/setting a value from CGroup will work at this point, oui? For instance: virsh blkiotune fedora error: Operation not supported: operation 'getBlkioWeight' not supported ACK if you move this to be the very last patch (or somewhere before the tests - I haven't gotten that far yet, so I don't know how tests are implemented, whether they need the v2 backend to be registered or not). Michal

On Thu, Oct 04, 2018 at 05:26:32PM +0200, Michal Privoznik wrote:
On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
All mandatory callbacks are implemented for cgroup v2 backend so we can register it now.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupbackend.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/util/vircgroupbackend.c b/src/util/vircgroupbackend.c index 79fe6cb73d..7ee39ac8ca 100644 --- a/src/util/vircgroupbackend.c +++ b/src/util/vircgroupbackend.c @@ -52,6 +52,7 @@ virCgroupBackendRegister(virCgroupBackendPtr backend) static void virCgroupBackendOnceInit(void) { + virCgroupV2Register(); virCgroupV1Register(); }
Well, the callbacks are implemented, but that's about it. No API that involves reading/setting a value from CGroup will work at this point, oui?
For instance:
virsh blkiotune fedora error: Operation not supported: operation 'getBlkioWeight' not supported
ACK if you move this to be the very last patch (or somewhere before the tests - I haven't gotten that far yet, so I don't know how tests are implemented, whether they need the v2 backend to be registered or not).
This was intentional to register the backend right after all mandatory callbacks are implemented. Controller specific callbacks are optional because some of the controllers that are available in cgroup v1 will never exist in cgroup v2. I can move this patch after the all controllers in this series are implemented, but there is still devices controller missing and is not even implemented in kernel and most likely will never exist in v2. So technically there is no point of moving it. Pavel

On 10/04/2018 06:10 PM, Pavel Hrdina wrote:
On Thu, Oct 04, 2018 at 05:26:32PM +0200, Michal Privoznik wrote:
On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
All mandatory callbacks are implemented for cgroup v2 backend so we can register it now.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupbackend.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/util/vircgroupbackend.c b/src/util/vircgroupbackend.c index 79fe6cb73d..7ee39ac8ca 100644 --- a/src/util/vircgroupbackend.c +++ b/src/util/vircgroupbackend.c @@ -52,6 +52,7 @@ virCgroupBackendRegister(virCgroupBackendPtr backend) static void virCgroupBackendOnceInit(void) { + virCgroupV2Register(); virCgroupV1Register(); }
Well, the callbacks are implemented, but that's about it. No API that involves reading/setting a value from CGroup will work at this point, oui?
For instance:
virsh blkiotune fedora error: Operation not supported: operation 'getBlkioWeight' not supported
ACK if you move this to be the very last patch (or somewhere before the tests - I haven't gotten that far yet, so I don't know how tests are implemented, whether they need the v2 backend to be registered or not).
This was intentional to register the backend right after all mandatory callbacks are implemented. Controller specific callbacks are optional because some of the controllers that are available in cgroup v1 will never exist in cgroup v2.
Sure, but some of them will. Just like I demonstrated above.
I can move this patch after the all controllers in this series are implemented, but there is still devices controller missing and is not even implemented in kernel and most likely will never exist in v2.
Hold on, so how are we supposed to limit what devices qemu has access to if there is no devices cgroup? If not adding new layers of security we should not remove them at least.
So technically there is no point of moving it.
I think there is. Because right now, around these patches none of the APIs that involve CGroup work. It's okay to add code for new feature and enable it only at the very last moment. I think it's even in our coding guidelines. Michal

On Fri, Oct 05, 2018 at 09:50:57AM +0200, Michal Privoznik wrote:
On 10/04/2018 06:10 PM, Pavel Hrdina wrote:
On Thu, Oct 04, 2018 at 05:26:32PM +0200, Michal Privoznik wrote:
On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
All mandatory callbacks are implemented for cgroup v2 backend so we can register it now.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupbackend.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/util/vircgroupbackend.c b/src/util/vircgroupbackend.c index 79fe6cb73d..7ee39ac8ca 100644 --- a/src/util/vircgroupbackend.c +++ b/src/util/vircgroupbackend.c @@ -52,6 +52,7 @@ virCgroupBackendRegister(virCgroupBackendPtr backend) static void virCgroupBackendOnceInit(void) { + virCgroupV2Register(); virCgroupV1Register(); }
Well, the callbacks are implemented, but that's about it. No API that involves reading/setting a value from CGroup will work at this point, oui?
For instance:
virsh blkiotune fedora error: Operation not supported: operation 'getBlkioWeight' not supported
ACK if you move this to be the very last patch (or somewhere before the tests - I haven't gotten that far yet, so I don't know how tests are implemented, whether they need the v2 backend to be registered or not).
This was intentional to register the backend right after all mandatory callbacks are implemented. Controller specific callbacks are optional because some of the controllers that are available in cgroup v1 will never exist in cgroup v2.
Sure, but some of them will. Just like I demonstrated above.
I can move this patch after the all controllers in this series are implemented, but there is still devices controller missing and is not even implemented in kernel and most likely will never exist in v2.
Hold on, so how are we supposed to limit what devices qemu has access to if there is no devices cgroup? If not adding new layers of security we should not remove them at least.
This will be a followup series. There is no devices controller in cgroup v2, there is BPF filter instead.
So technically there is no point of moving it.
I think there is. Because right now, around these patches none of the APIs that involve CGroup work. It's okay to add code for new feature and enable it only at the very last moment. I think it's even in our coding guidelines.
OK, I'll move it, I personally don't care. Pavel

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 49 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index fec5d28835..3e2cd16335 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -546,6 +546,52 @@ virCgroupV2SetOwner(virCgroupPtr cgroup, } +static int +virCgroupV2SetBlkioWeight(virCgroupPtr group, + unsigned int weight) +{ + VIR_AUTOFREE(char *) value = NULL; + + if (virAsprintf(&value, "default %u", weight) < 0) + return -1; + + return virCgroupSetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.weight", + value); +} + + +static int +virCgroupV2GetBlkioWeight(virCgroupPtr group, + unsigned int *weight) +{ + VIR_AUTOFREE(char *) value = NULL; + char *tmp; + + if (virCgroupGetValueStr(group, VIR_CGROUP_CONTROLLER_BLKIO, + "io.weight", &value) < 0) { + return -1; + } + + if (!(tmp = strstr(value, "default "))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot find default io weight.")); + return -1; + } + tmp += strlen("default "); + + if (virStrToLong_ui(tmp, NULL, 10, weight) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to parse '%s' as an integer"), + tmp); + return -1; + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -567,6 +613,9 @@ virCgroupBackend virCgroupV2Backend = { .hasEmptyTasks = virCgroupV2HasEmptyTasks, .bindMount = virCgroupV2BindMount, .setOwner = virCgroupV2SetOwner, + + .setBlkioWeight = virCgroupV2SetBlkioWeight, + .getBlkioWeight = virCgroupV2GetBlkioWeight, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 49 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index fec5d28835..3e2cd16335 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -546,6 +546,52 @@ virCgroupV2SetOwner(virCgroupPtr cgroup, }
+static int +virCgroupV2SetBlkioWeight(virCgroupPtr group, + unsigned int weight) +{ + VIR_AUTOFREE(char *) value = NULL; + + if (virAsprintf(&value, "default %u", weight) < 0) + return -1; + + return virCgroupSetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.weight", + value); +} + + +static int +virCgroupV2GetBlkioWeight(virCgroupPtr group, + unsigned int *weight) +{ + VIR_AUTOFREE(char *) value = NULL; + char *tmp; + + if (virCgroupGetValueStr(group, VIR_CGROUP_CONTROLLER_BLKIO, + "io.weight", &value) < 0) { + return -1; + } + + if (!(tmp = strstr(value, "default "))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot find default io weight.")); + return -1; + } + tmp += strlen("default ");
Alternatively, to avoid having to change this at two places, you can just use #define BLKIO_WIGHT_VAL_DESC "default " or something similar.
+ + if (virStrToLong_ui(tmp, NULL, 10, weight) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to parse '%s' as an integer"), + tmp); + return -1; + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -567,6 +613,9 @@ virCgroupBackend virCgroupV2Backend = { .hasEmptyTasks = virCgroupV2HasEmptyTasks, .bindMount = virCgroupV2BindMount, .setOwner = virCgroupV2SetOwner, + + .setBlkioWeight = virCgroupV2SetBlkioWeight, + .getBlkioWeight = virCgroupV2GetBlkioWeight, };
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 66 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 66 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 3e2cd16335..30c400f129 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -592,6 +592,71 @@ virCgroupV2GetBlkioWeight(virCgroupPtr group, } +static int +virCgroupV2GetBlkioIoServiced(virCgroupPtr group, + long long *bytes_read, + long long *bytes_write, + long long *requests_read, + long long *requests_write) +{ + long long stats_val; + VIR_AUTOFREE(char *) str1 = NULL; + char *p1; + size_t i; + + const char *value_names[] = { + "rbytes=", + "wbytes=", + "rios=", + "wios=", + }; + long long *value_ptrs[] = { + bytes_read, + bytes_write, + requests_read, + requests_write + }; + + *bytes_read = 0; + *bytes_write = 0; + *requests_read = 0; + *requests_write = 0; + + if (virCgroupGetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.stat", &str1) < 0) { + return -1; + } + + /* sum up all entries of the same kind, from all devices */ + for (i = 0; i < ARRAY_CARDINALITY(value_names); i++) { + p1 = str1; + + while ((p1 = strstr(p1, value_names[i]))) { + p1 += strlen(value_names[i]); + if (virStrToLong_ll(p1, &p1, 10, &stats_val) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot parse byte %sstat '%s'"), + value_names[i], + p1); + return -1; + } + + if (stats_val < 0 || + (stats_val > 0 && *value_ptrs[i] > (LLONG_MAX - stats_val))) { + virReportError(VIR_ERR_OVERFLOW, + _("Sum of byte %sstat overflows"), + value_names[i]); + return -1; + } + *value_ptrs[i] += stats_val; + } + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -616,6 +681,7 @@ virCgroupBackend virCgroupV2Backend = { .setBlkioWeight = virCgroupV2SetBlkioWeight, .getBlkioWeight = virCgroupV2GetBlkioWeight, + .getBlkioIoServiced = virCgroupV2GetBlkioIoServiced, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 66 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 66 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 3e2cd16335..30c400f129 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -592,6 +592,71 @@ virCgroupV2GetBlkioWeight(virCgroupPtr group, }
+static int +virCgroupV2GetBlkioIoServiced(virCgroupPtr group, + long long *bytes_read, + long long *bytes_write, + long long *requests_read, + long long *requests_write) +{ + long long stats_val; + VIR_AUTOFREE(char *) str1 = NULL; + char *p1; + size_t i; + + const char *value_names[] = { + "rbytes=", + "wbytes=", + "rios=", + "wios=", + }; + long long *value_ptrs[] = { + bytes_read, + bytes_write, + requests_read, + requests_write + }; + + *bytes_read = 0; + *bytes_write = 0; + *requests_read = 0; + *requests_write = 0; + + if (virCgroupGetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.stat", &str1) < 0) { + return -1; + } + + /* sum up all entries of the same kind, from all devices */ + for (i = 0; i < ARRAY_CARDINALITY(value_names); i++) { + p1 = str1; + + while ((p1 = strstr(p1, value_names[i]))) { + p1 += strlen(value_names[i]); + if (virStrToLong_ll(p1, &p1, 10, &stats_val) < 0) {
Rather unusual use, we tend to use two different pointers. Perhaps s/p1/p/?
+ virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot parse byte %sstat '%s'"), + value_names[i], + p1);
What do you expect to be printed out here? Because IIUC this will print the incorrect string + the rest of the file. Also, "%sstat" will print "rwbytes=stat". I'd put a space after %s at least.
+ return -1; + } + + if (stats_val < 0 || + (stats_val > 0 && *value_ptrs[i] > (LLONG_MAX - stats_val))) { + virReportError(VIR_ERR_OVERFLOW, + _("Sum of byte %sstat overflows"), + value_names[i]); + return -1; + } + *value_ptrs[i] += stats_val; + } + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -616,6 +681,7 @@ virCgroupBackend virCgroupV2Backend = {
.setBlkioWeight = virCgroupV2SetBlkioWeight, .getBlkioWeight = virCgroupV2GetBlkioWeight, + .getBlkioIoServiced = virCgroupV2GetBlkioIoServiced, };
Michal

On Thu, Oct 04, 2018 at 05:26:30PM +0200, Michal Privoznik wrote:
On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 66 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 66 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 3e2cd16335..30c400f129 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -592,6 +592,71 @@ virCgroupV2GetBlkioWeight(virCgroupPtr group, }
+static int +virCgroupV2GetBlkioIoServiced(virCgroupPtr group, + long long *bytes_read, + long long *bytes_write, + long long *requests_read, + long long *requests_write) +{ + long long stats_val; + VIR_AUTOFREE(char *) str1 = NULL; + char *p1; + size_t i; + + const char *value_names[] = { + "rbytes=", + "wbytes=", + "rios=", + "wios=", + }; + long long *value_ptrs[] = { + bytes_read, + bytes_write, + requests_read, + requests_write + }; + + *bytes_read = 0; + *bytes_write = 0; + *requests_read = 0; + *requests_write = 0; + + if (virCgroupGetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.stat", &str1) < 0) { + return -1; + } + + /* sum up all entries of the same kind, from all devices */ + for (i = 0; i < ARRAY_CARDINALITY(value_names); i++) { + p1 = str1; + + while ((p1 = strstr(p1, value_names[i]))) { + p1 += strlen(value_names[i]); + if (virStrToLong_ll(p1, &p1, 10, &stats_val) < 0) {
Rather unusual use, we tend to use two different pointers. Perhaps s/p1/p/?
Actually there is a lot of other uses like this throughout the code so I would not say it's unusual. Yes, we can use different pointer for the endptr and assign to the startptr, which would end up with the same result. With two pointer it would look like this: p1 += strlen(value_names[i]); if (virStrToLong_ll(p1, &p, 10, &stats_val) < 0) { ... } p1 = p;
+ virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot parse byte %sstat '%s'"), + value_names[i], + p1);
What do you expect to be printed out here? Because IIUC this will print the incorrect string + the rest of the file. Also, "%sstat" will print "rwbytes=stat". I'd put a space after %s at least.
That's a good question :) ideally we should print only the part that we cannot parse as number, however, that might be tricky to do with this approach and it's not probably worth of handling for the error case which will most likely never happen. I'll fix the format strings. Pavel

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 64 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 30c400f129..eff410258c 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -657,6 +657,69 @@ virCgroupV2GetBlkioIoServiced(virCgroupPtr group, } +static int +virCgroupV2GetBlkioIoDeviceServiced(virCgroupPtr group, + const char *path, + long long *bytes_read, + long long *bytes_write, + long long *requests_read, + long long *requests_write) +{ + VIR_AUTOFREE(char *) str1 = NULL; + VIR_AUTOFREE(char *) str2 = NULL; + char *p1; + size_t i; + + const char *value_names[] = { + "rbytes=", + "wbytes=", + "rios=", + "wios=", + }; + long long *value_ptrs[] = { + bytes_read, + bytes_write, + requests_read, + requests_write + }; + + if (virCgroupGetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.stat", &str1) < 0) { + return -1; + } + + if (!(str2 = virCgroupGetBlockDevString(path))) + return -1; + + if (!(p1 = strstr(str1, str2))) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot find byte stats for block device '%s'"), + str2); + return -1; + } + + for (i = 0; i < ARRAY_CARDINALITY(value_names); i++) { + if (!(p1 = strstr(p1, value_names[i]))) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot find byte %sstats for block device '%s'"), + value_names[i], str2); + return -1; + } + + if (virStrToLong_ll(p1 + strlen(value_names[i]), &p1, + 10, value_ptrs[i]) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot parse %sstat '%s'"), + value_names[i], p1 + strlen(value_names[i])); + return -1; + } + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -682,6 +745,7 @@ virCgroupBackend virCgroupV2Backend = { .setBlkioWeight = virCgroupV2SetBlkioWeight, .getBlkioWeight = virCgroupV2GetBlkioWeight, .getBlkioIoServiced = virCgroupV2GetBlkioIoServiced, + .getBlkioIoDeviceServiced = virCgroupV2GetBlkioIoDeviceServiced, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 64 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 30c400f129..eff410258c 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -657,6 +657,69 @@ virCgroupV2GetBlkioIoServiced(virCgroupPtr group, }
+static int +virCgroupV2GetBlkioIoDeviceServiced(virCgroupPtr group, + const char *path, + long long *bytes_read, + long long *bytes_write, + long long *requests_read, + long long *requests_write) +{ + VIR_AUTOFREE(char *) str1 = NULL; + VIR_AUTOFREE(char *) str2 = NULL; + char *p1; + size_t i; + + const char *value_names[] = { + "rbytes=", + "wbytes=", + "rios=", + "wios=", + }; + long long *value_ptrs[] = { + bytes_read, + bytes_write, + requests_read, + requests_write + }; + + if (virCgroupGetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.stat", &str1) < 0) { + return -1; + } + + if (!(str2 = virCgroupGetBlockDevString(path))) + return -1; + + if (!(p1 = strstr(str1, str2))) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot find byte stats for block device '%s'"), + str2); + return -1; + } + + for (i = 0; i < ARRAY_CARDINALITY(value_names); i++) { + if (!(p1 = strstr(p1, value_names[i]))) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot find byte %sstats for block device '%s'"), + value_names[i], str2); + return -1; + } + + if (virStrToLong_ll(p1 + strlen(value_names[i]), &p1, + 10, value_ptrs[i]) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot parse %sstat '%s'"), + value_names[i], p1 + strlen(value_names[i]));
Same comment here as in previous patch.
+ return -1; + } + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -682,6 +745,7 @@ virCgroupBackend virCgroupV2Backend = { .setBlkioWeight = virCgroupV2SetBlkioWeight, .getBlkioWeight = virCgroupV2GetBlkioWeight, .getBlkioIoServiced = virCgroupV2GetBlkioIoServiced, + .getBlkioIoDeviceServiced = virCgroupV2GetBlkioIoDeviceServiced, };
Otherwise looking good. Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 51 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index eff410258c..b7e4db9f41 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -720,6 +720,55 @@ virCgroupV2GetBlkioIoDeviceServiced(virCgroupPtr group, } +static int +virCgroupV2SetBlkioDeviceWeight(virCgroupPtr group, + const char *path, + unsigned int weight) +{ + VIR_AUTOFREE(char *) str = NULL; + VIR_AUTOFREE(char *) blkstr = NULL; + + if (!(blkstr = virCgroupGetBlockDevString(path))) + return -1; + + if (virAsprintf(&str, "%s%d", blkstr, weight) < 0) + return -1; + + return virCgroupSetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.weight", + str); +} + + +static int +virCgroupV2GetBlkioDeviceWeight(virCgroupPtr group, + const char *path, + unsigned int *weight) +{ + VIR_AUTOFREE(char *) str = NULL; + + if (virCgroupGetValueForBlkDev(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.weight", + path, + &str) < 0) { + return -1; + } + + if (!str) { + *weight = 0; + } else if (virStrToLong_ui(str, NULL, 10, weight) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to parse '%s' as an integer"), + str); + return -1; + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -746,6 +795,8 @@ virCgroupBackend virCgroupV2Backend = { .getBlkioWeight = virCgroupV2GetBlkioWeight, .getBlkioIoServiced = virCgroupV2GetBlkioIoServiced, .getBlkioIoDeviceServiced = virCgroupV2GetBlkioIoDeviceServiced, + .setBlkioDeviceWeight = virCgroupV2SetBlkioDeviceWeight, + .getBlkioDeviceWeight = virCgroupV2GetBlkioDeviceWeight, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 51 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+)
ACK Michal P.S.: This is where I stop the review and call it the day. Will resume tomorrow.

On Tue, 2018-10-02 at 10:44 +0200, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
I'd just adjust the commit log to show that you're adding the Get/Set instead of just the Set function.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 64 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index b7e4db9f41..408f7e3eeb 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -769,6 +769,68 @@ virCgroupV2GetBlkioDeviceWeight(virCgroupPtr group, } +static int +virCgroupV2SetBlkioDeviceReadIops(virCgroupPtr group, + const char *path, + unsigned int riops) +{ + VIR_AUTOFREE(char *) str = NULL; + VIR_AUTOFREE(char *) blkstr = NULL; + + if (!(blkstr = virCgroupGetBlockDevString(path))) + return -1; + + if (riops == 0) { + if (virAsprintf(&str, "%sriops=max", blkstr) < 0) + return -1; + } else { + if (virAsprintf(&str, "%sriops=%u", blkstr, riops) < 0) + return -1; + } + + return virCgroupSetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.max", + str); +} + + +static int +virCgroupV2GetBlkioDeviceReadIops(virCgroupPtr group, + const char *path, + unsigned int *riops) +{ + VIR_AUTOFREE(char *) str = NULL; + char *tmp; + + if (virCgroupGetValueForBlkDev(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.max", + path, + &str) < 0) { + return -1; + } + + if (!str) { + *riops = 0; + } else { + tmp = strstr(str, "riops="); + tmp += strlen("riops="); + + if (STREQLEN(tmp, "max", 3)) { + *riops = 0; + } else if (virStrToLong_ui(tmp, NULL, 10, riops) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to parse '%s' as an integer"), + str); + return -1; + } + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -797,6 +859,8 @@ virCgroupBackend virCgroupV2Backend = { .getBlkioIoDeviceServiced = virCgroupV2GetBlkioIoDeviceServiced, .setBlkioDeviceWeight = virCgroupV2SetBlkioDeviceWeight, .getBlkioDeviceWeight = virCgroupV2GetBlkioDeviceWeight, + .setBlkioDeviceReadIops = virCgroupV2SetBlkioDeviceReadIops, + .getBlkioDeviceReadIops = virCgroupV2GetBlkioDeviceReadIops, }; -- 2.17.1

On Tue, 2018-10-02 at 10:44 +0200, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Here, again, I'd just adjust the commit log to show that you're adding the Get/Set instead of just the Set function.

On Fri, 2018-10-05 at 09:54 +0200, Fabiano Fidêncio wrote:
On Tue, 2018-10-02 at 10:44 +0200, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Here, again, I'd just adjust the commit log to show that you're adding the Get/Set instead of just the Set function.
And the same comment applies for a few other patches after this one.

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 64 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index b7e4db9f41..408f7e3eeb 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -769,6 +769,68 @@ virCgroupV2GetBlkioDeviceWeight(virCgroupPtr group, }
+static int +virCgroupV2SetBlkioDeviceReadIops(virCgroupPtr group, + const char *path, + unsigned int riops) +{ + VIR_AUTOFREE(char *) str = NULL; + VIR_AUTOFREE(char *) blkstr = NULL; + + if (!(blkstr = virCgroupGetBlockDevString(path))) + return -1; + + if (riops == 0) { + if (virAsprintf(&str, "%sriops=max", blkstr) < 0) + return -1; + } else { + if (virAsprintf(&str, "%sriops=%u", blkstr, riops) < 0) + return -1; + } + + return virCgroupSetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.max", + str); +} + + +static int +virCgroupV2GetBlkioDeviceReadIops(virCgroupPtr group, + const char *path, + unsigned int *riops) +{ + VIR_AUTOFREE(char *) str = NULL; + char *tmp; + + if (virCgroupGetValueForBlkDev(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.max", + path, + &str) < 0) { + return -1; + } + + if (!str) { + *riops = 0; + } else { + tmp = strstr(str, "riops="); + tmp += strlen("riops=");
I bet coverity won't be happy about this. Please check the retval of strstr().
+ + if (STREQLEN(tmp, "max", 3)) { + *riops = 0; + } else if (virStrToLong_ui(tmp, NULL, 10, riops) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to parse '%s' as an integer"), + str); + return -1; + } + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -797,6 +859,8 @@ virCgroupBackend virCgroupV2Backend = { .getBlkioIoDeviceServiced = virCgroupV2GetBlkioIoDeviceServiced, .setBlkioDeviceWeight = virCgroupV2SetBlkioDeviceWeight, .getBlkioDeviceWeight = virCgroupV2GetBlkioDeviceWeight, + .setBlkioDeviceReadIops = virCgroupV2SetBlkioDeviceReadIops, + .getBlkioDeviceReadIops = virCgroupV2GetBlkioDeviceReadIops, };
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 64 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 408f7e3eeb..6279072aab 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -831,6 +831,68 @@ virCgroupV2GetBlkioDeviceReadIops(virCgroupPtr group, } +static int +virCgroupV2SetBlkioDeviceWriteIops(virCgroupPtr group, + const char *path, + unsigned int wiops) +{ + VIR_AUTOFREE(char *) str = NULL; + VIR_AUTOFREE(char *) blkstr = NULL; + + if (!(blkstr = virCgroupGetBlockDevString(path))) + return -1; + + if (wiops == 0) { + if (virAsprintf(&str, "%swiops=max", blkstr) < 0) + return -1; + } else { + if (virAsprintf(&str, "%swiops=%u", blkstr, wiops) < 0) + return -1; + } + + return virCgroupSetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.max", + str); +} + + +static int +virCgroupV2GetBlkioDeviceWriteIops(virCgroupPtr group, + const char *path, + unsigned int *wiops) +{ + VIR_AUTOFREE(char *) str = NULL; + char *tmp; + + if (virCgroupGetValueForBlkDev(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.max", + path, + &str) < 0) { + return -1; + } + + if (!str) { + *wiops = 0; + } else { + tmp = strstr(str, "wiops="); + tmp += strlen("wiops="); + + if (STREQLEN(tmp, "max", 3)) { + *wiops = 0; + } else if (virStrToLong_ui(tmp, NULL, 10, wiops) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to parse '%s' as an integer"), + str); + return -1; + } + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -861,6 +923,8 @@ virCgroupBackend virCgroupV2Backend = { .getBlkioDeviceWeight = virCgroupV2GetBlkioDeviceWeight, .setBlkioDeviceReadIops = virCgroupV2SetBlkioDeviceReadIops, .getBlkioDeviceReadIops = virCgroupV2GetBlkioDeviceReadIops, + .setBlkioDeviceWriteIops = virCgroupV2SetBlkioDeviceWriteIops, + .getBlkioDeviceWriteIops = virCgroupV2GetBlkioDeviceWriteIops, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 64 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 408f7e3eeb..6279072aab 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -831,6 +831,68 @@ virCgroupV2GetBlkioDeviceReadIops(virCgroupPtr group, }
+static int +virCgroupV2SetBlkioDeviceWriteIops(virCgroupPtr group, + const char *path, + unsigned int wiops) +{ + VIR_AUTOFREE(char *) str = NULL; + VIR_AUTOFREE(char *) blkstr = NULL; + + if (!(blkstr = virCgroupGetBlockDevString(path))) + return -1; + + if (wiops == 0) { + if (virAsprintf(&str, "%swiops=max", blkstr) < 0) + return -1; + } else { + if (virAsprintf(&str, "%swiops=%u", blkstr, wiops) < 0) + return -1; + } + + return virCgroupSetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.max", + str); +} + + +static int +virCgroupV2GetBlkioDeviceWriteIops(virCgroupPtr group, + const char *path, + unsigned int *wiops) +{ + VIR_AUTOFREE(char *) str = NULL; + char *tmp; + + if (virCgroupGetValueForBlkDev(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.max", + path, + &str) < 0) { + return -1; + } + + if (!str) { + *wiops = 0; + } else { + tmp = strstr(str, "wiops="); + tmp += strlen("wiops=");
And again. I am stopping to report this. You get the idea.
+ + if (STREQLEN(tmp, "max", 3)) { + *wiops = 0; + } else if (virStrToLong_ui(tmp, NULL, 10, wiops) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to parse '%s' as an integer"), + str); + return -1; + } + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -861,6 +923,8 @@ virCgroupBackend virCgroupV2Backend = { .getBlkioDeviceWeight = virCgroupV2GetBlkioDeviceWeight, .setBlkioDeviceReadIops = virCgroupV2SetBlkioDeviceReadIops, .getBlkioDeviceReadIops = virCgroupV2GetBlkioDeviceReadIops, + .setBlkioDeviceWriteIops = virCgroupV2SetBlkioDeviceWriteIops, + .getBlkioDeviceWriteIops = virCgroupV2GetBlkioDeviceWriteIops, };
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 64 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 6279072aab..88bd9fb37c 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -893,6 +893,68 @@ virCgroupV2GetBlkioDeviceWriteIops(virCgroupPtr group, } +static int +virCgroupV2SetBlkioDeviceReadBps(virCgroupPtr group, + const char *path, + unsigned long long rbps) +{ + VIR_AUTOFREE(char *) str = NULL; + VIR_AUTOFREE(char *) blkstr = NULL; + + if (!(blkstr = virCgroupGetBlockDevString(path))) + return -1; + + if (rbps == 0) { + if (virAsprintf(&str, "%srbps=max", blkstr) < 0) + return -1; + } else { + if (virAsprintf(&str, "%srbps=%llu", blkstr, rbps) < 0) + return -1; + } + + return virCgroupSetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.max", + str); +} + + +static int +virCgroupV2GetBlkioDeviceReadBps(virCgroupPtr group, + const char *path, + unsigned long long *rbps) +{ + VIR_AUTOFREE(char *) str = NULL; + char *tmp; + + if (virCgroupGetValueForBlkDev(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.max", + path, + &str) < 0) { + return -1; + } + + if (!str) { + *rbps = 0; + } else { + tmp = strstr(str, "rbps="); + tmp += strlen("rbps="); + + if (STREQLEN(tmp, "max", 3)) { + *rbps = 0; + } else if (virStrToLong_ull(tmp, NULL, 10, rbps) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to parse '%s' as an integer"), + str); + return -1; + } + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -925,6 +987,8 @@ virCgroupBackend virCgroupV2Backend = { .getBlkioDeviceReadIops = virCgroupV2GetBlkioDeviceReadIops, .setBlkioDeviceWriteIops = virCgroupV2SetBlkioDeviceWriteIops, .getBlkioDeviceWriteIops = virCgroupV2GetBlkioDeviceWriteIops, + .setBlkioDeviceReadBps = virCgroupV2SetBlkioDeviceReadBps, + .getBlkioDeviceReadBps = virCgroupV2GetBlkioDeviceReadBps, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 64 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 64 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 88bd9fb37c..d48e9db559 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -955,6 +955,68 @@ virCgroupV2GetBlkioDeviceReadBps(virCgroupPtr group, } +static int +virCgroupV2SetBlkioDeviceWriteBps(virCgroupPtr group, + const char *path, + unsigned long long wbps) +{ + VIR_AUTOFREE(char *) str = NULL; + VIR_AUTOFREE(char *) blkstr = NULL; + + if (!(blkstr = virCgroupGetBlockDevString(path))) + return -1; + + if (wbps == 0) { + if (virAsprintf(&str, "%swbps=max", blkstr) < 0) + return -1; + } else { + if (virAsprintf(&str, "%swbps=%llu", blkstr, wbps) < 0) + return -1; + } + + return virCgroupSetValueStr(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.max", + str); +} + + +static int +virCgroupV2GetBlkioDeviceWriteBps(virCgroupPtr group, + const char *path, + unsigned long long *wbps) +{ + VIR_AUTOFREE(char *) str = NULL; + char *tmp; + + if (virCgroupGetValueForBlkDev(group, + VIR_CGROUP_CONTROLLER_BLKIO, + "io.max", + path, + &str) < 0) { + return -1; + } + + if (!str) { + *wbps = 0; + } else { + tmp = strstr(str, "wbps="); + tmp += strlen("wbps="); + + if (STREQLEN(tmp, "max", 3)) { + *wbps = 0; + } else if (virStrToLong_ull(tmp, NULL, 10, wbps) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to parse '%s' as an integer"), + str); + return -1; + } + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -989,6 +1051,8 @@ virCgroupBackend virCgroupV2Backend = { .getBlkioDeviceWriteIops = virCgroupV2GetBlkioDeviceWriteIops, .setBlkioDeviceReadBps = virCgroupV2SetBlkioDeviceReadBps, .getBlkioDeviceReadBps = virCgroupV2GetBlkioDeviceReadBps, + .setBlkioDeviceWriteBps = virCgroupV2SetBlkioDeviceWriteBps, + .getBlkioDeviceWriteBps = virCgroupV2GetBlkioDeviceWriteBps, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 64 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index d48e9db559..32d20f2ff6 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1017,6 +1017,33 @@ virCgroupV2GetBlkioDeviceWriteBps(virCgroupPtr group, } +static int +virCgroupV2SetMemory(virCgroupPtr group, + unsigned long long kb) +{ + unsigned long long maxkb = VIR_DOMAIN_MEMORY_PARAM_UNLIMITED; + + if (kb > maxkb) { + virReportError(VIR_ERR_INVALID_ARG, + _("Memory '%llu' must be less than %llu"), + kb, maxkb); + return -1; + } + + if (kb == maxkb) { + return virCgroupSetValueStr(group, + VIR_CGROUP_CONTROLLER_MEMORY, + "memory.max", + "max"); + } else { + return virCgroupSetValueU64(group, + VIR_CGROUP_CONTROLLER_MEMORY, + "memory.max", + kb << 10); + } +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -1053,6 +1080,8 @@ virCgroupBackend virCgroupV2Backend = { .getBlkioDeviceReadBps = virCgroupV2GetBlkioDeviceReadBps, .setBlkioDeviceWriteBps = virCgroupV2SetBlkioDeviceWriteBps, .getBlkioDeviceWriteBps = virCgroupV2GetBlkioDeviceWriteBps, + + .setMemory = virCgroupV2SetMemory, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 75 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 75 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 32d20f2ff6..da3b3a984c 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1044,6 +1044,80 @@ virCgroupV2SetMemory(virCgroupPtr group, } +static int +virCgroupV2GetMemoryStat(virCgroupPtr group, + unsigned long long *cache, + unsigned long long *activeAnon, + unsigned long long *inactiveAnon, + unsigned long long *activeFile, + unsigned long long *inactiveFile, + unsigned long long *unevictable) +{ + VIR_AUTOFREE(char *) stat = NULL; + char *line = NULL; + unsigned long long cacheVal = 0; + unsigned long long activeAnonVal = 0; + unsigned long long inactiveAnonVal = 0; + unsigned long long activeFileVal = 0; + unsigned long long inactiveFileVal = 0; + unsigned long long unevictableVal = 0; + + if (virCgroupGetValueStr(group, + VIR_CGROUP_CONTROLLER_MEMORY, + "memory.stat", + &stat) < 0) { + return -1; + } + + line = stat; + + while (line) { + char *newLine = strchr(line, '\n'); + char *valueStr = strchr(line, ' '); + unsigned long long value; + + if (newLine) + *newLine = '\0'; + + if (!valueStr) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot parse 'memory.stat' cgroup file.")); + return -1; + } + *valueStr = '\0'; + + if (virStrToLong_ull(valueStr + 1, NULL, 10, &value) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to parse '%s' as an integer"), + valueStr + 1); + return -1; + } + + if (STREQ(line, "file")) + cacheVal = value >> 10; + else if (STREQ(line, "active_anon")) + activeAnonVal = value >> 10; + else if (STREQ(line, "inactive_anon")) + inactiveAnonVal = value >> 10; + else if (STREQ(line, "active_file")) + activeFileVal = value >> 10; + else if (STREQ(line, "inactive_file")) + inactiveFileVal = value >> 10; + else if (STREQ(line, "unevictable")) + unevictableVal = value >> 10; + } + + *cache = cacheVal; + *activeAnon = activeAnonVal; + *inactiveAnon = inactiveAnonVal; + *activeFile = activeFileVal; + *inactiveFile = inactiveFileVal; + *unevictable = unevictableVal; + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -1082,6 +1156,7 @@ virCgroupBackend virCgroupV2Backend = { .getBlkioDeviceWriteBps = virCgroupV2GetBlkioDeviceWriteBps, .setMemory = virCgroupV2SetMemory, + .getMemoryStat = virCgroupV2GetMemoryStat, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 75 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 75 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 32d20f2ff6..da3b3a984c 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1044,6 +1044,80 @@ virCgroupV2SetMemory(virCgroupPtr group, }
+static int +virCgroupV2GetMemoryStat(virCgroupPtr group, + unsigned long long *cache, + unsigned long long *activeAnon, + unsigned long long *inactiveAnon, + unsigned long long *activeFile, + unsigned long long *inactiveFile, + unsigned long long *unevictable) +{ + VIR_AUTOFREE(char *) stat = NULL; + char *line = NULL; + unsigned long long cacheVal = 0; + unsigned long long activeAnonVal = 0; + unsigned long long inactiveAnonVal = 0; + unsigned long long activeFileVal = 0; + unsigned long long inactiveFileVal = 0; + unsigned long long unevictableVal = 0; + + if (virCgroupGetValueStr(group, + VIR_CGROUP_CONTROLLER_MEMORY, + "memory.stat", + &stat) < 0) { + return -1; + } + + line = stat; + + while (line) { + char *newLine = strchr(line, '\n'); + char *valueStr = strchr(line, ' '); + unsigned long long value; + + if (newLine) + *newLine = '\0'; + + if (!valueStr) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot parse 'memory.stat' cgroup file.")); + return -1; + } + *valueStr = '\0'; + + if (virStrToLong_ull(valueStr + 1, NULL, 10, &value) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to parse '%s' as an integer"), + valueStr + 1); + return -1; + } + + if (STREQ(line, "file")) + cacheVal = value >> 10; + else if (STREQ(line, "active_anon")) + activeAnonVal = value >> 10; + else if (STREQ(line, "inactive_anon")) + inactiveAnonVal = value >> 10; + else if (STREQ(line, "active_file")) + activeFileVal = value >> 10; + else if (STREQ(line, "inactive_file")) + inactiveFileVal = value >> 10; + else if (STREQ(line, "unevictable")) + unevictableVal = value >> 10;
Funny, in 22/53 you've created an array so that you don't need to do this. I couldn't care less TBH.
+ } + + *cache = cacheVal; + *activeAnon = activeAnonVal; + *inactiveAnon = inactiveAnonVal; + *activeFile = activeFileVal; + *inactiveFile = inactiveFileVal; + *unevictable = unevictableVal; + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -1082,6 +1156,7 @@ virCgroupBackend virCgroupV2Backend = { .getBlkioDeviceWriteBps = virCgroupV2GetBlkioDeviceWriteBps,
.setMemory = virCgroupV2SetMemory, + .getMemoryStat = virCgroupV2GetMemoryStat, };
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index da3b3a984c..9f37ff5be5 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1118,6 +1118,20 @@ virCgroupV2GetMemoryStat(virCgroupPtr group, } +static int +virCgroupV2GetMemoryUsage(virCgroupPtr group, + unsigned long *kb) +{ + long long unsigned int usage_in_bytes; + int ret = virCgroupGetValueU64(group, + VIR_CGROUP_CONTROLLER_MEMORY, + "memory.current", &usage_in_bytes); + if (ret == 0) + *kb = (unsigned long) usage_in_bytes >> 10; + return ret; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -1157,6 +1171,7 @@ virCgroupBackend virCgroupV2Backend = { .setMemory = virCgroupV2SetMemory, .getMemoryStat = virCgroupV2GetMemoryStat, + .getMemoryUsage = virCgroupV2GetMemoryUsage, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index da3b3a984c..9f37ff5be5 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1118,6 +1118,20 @@ virCgroupV2GetMemoryStat(virCgroupPtr group, }
+static int +virCgroupV2GetMemoryUsage(virCgroupPtr group, + unsigned long *kb) +{ + long long unsigned int usage_in_bytes;
or unsigned long long usage_in_bytes;
+ int ret = virCgroupGetValueU64(group, + VIR_CGROUP_CONTROLLER_MEMORY, + "memory.current", &usage_in_bytes); + if (ret == 0) + *kb = (unsigned long) usage_in_bytes >> 10;
I wonder if this is safe. Does ULONG_MAX << 10 = ULLONG_MAX? On the other hand, on 32bit machine you won't have more than 4GiB of RAM anyway and on 64bit UL and ULL are going to be probably the same.
+ return ret; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -1157,6 +1171,7 @@ virCgroupBackend virCgroupV2Backend = {
.setMemory = virCgroupV2SetMemory, .getMemoryStat = virCgroupV2GetMemoryStat, + .getMemoryUsage = virCgroupV2GetMemoryUsage, };
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 43 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 9f37ff5be5..06522498f2 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1132,6 +1132,47 @@ virCgroupV2GetMemoryUsage(virCgroupPtr group, } +static int +virCgroupV2SetMemoryHardLimit(virCgroupPtr group, + unsigned long long kb) +{ + return virCgroupV2SetMemory(group, kb); +} + + +static int +virCgroupV2GetMemoryHardLimit(virCgroupPtr group, + unsigned long long *kb) +{ + VIR_AUTOFREE(char *) value = NULL; + unsigned long long max; + + if (virCgroupGetValueStr(group, + VIR_CGROUP_CONTROLLER_MEMORY, + "memory.max", &value) < 0) { + return -1; + } + + if (STREQ(value, "max")) { + *kb = VIR_DOMAIN_MEMORY_PARAM_UNLIMITED; + return 0; + } + + if (virStrToLong_ull(value, NULL, 10, &max) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to parse value '%s' as number."), + value); + return -1; + } + + *kb = max >> 10; + if (*kb >= VIR_DOMAIN_MEMORY_PARAM_UNLIMITED) + *kb = VIR_DOMAIN_MEMORY_PARAM_UNLIMITED; + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -1172,6 +1213,8 @@ virCgroupBackend virCgroupV2Backend = { .setMemory = virCgroupV2SetMemory, .getMemoryStat = virCgroupV2GetMemoryStat, .getMemoryUsage = virCgroupV2GetMemoryUsage, + .setMemoryHardLimit = virCgroupV2SetMemoryHardLimit, + .getMemoryHardLimit = virCgroupV2GetMemoryHardLimit, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 43 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 61 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 61 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 06522498f2..b5cc5f8f2e 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1173,6 +1173,65 @@ virCgroupV2GetMemoryHardLimit(virCgroupPtr group, } +static int +virCgroupV2SetMemorySoftLimit(virCgroupPtr group, + unsigned long long kb) +{ + unsigned long long maxkb = VIR_DOMAIN_MEMORY_PARAM_UNLIMITED; + + if (kb > maxkb) { + virReportError(VIR_ERR_INVALID_ARG, + _("Memory '%llu' must be less than %llu"), + kb, maxkb); + return -1; + } + + if (kb == maxkb) { + return virCgroupSetValueStr(group, + VIR_CGROUP_CONTROLLER_MEMORY, + "memory.high", + "max"); + } else { + return virCgroupSetValueU64(group, + VIR_CGROUP_CONTROLLER_MEMORY, + "memory.high", + kb << 10); + } +} + + +static int +virCgroupV2GetMemorySoftLimit(virCgroupPtr group, + unsigned long long *kb) +{ + VIR_AUTOFREE(char *) value = NULL; + unsigned long long high; + + if (virCgroupGetValueStr(group, + VIR_CGROUP_CONTROLLER_MEMORY, + "memory.high", &value) < 0) + return -1; + + if (STREQ(value, "max")) { + *kb = VIR_DOMAIN_MEMORY_PARAM_UNLIMITED; + return 0; + } + + if (virStrToLong_ull(value, NULL, 10, &high) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to parse value '%s' as number."), + value); + return -1; + } + + *kb = high >> 10; + if (*kb >= VIR_DOMAIN_MEMORY_PARAM_UNLIMITED) + *kb = VIR_DOMAIN_MEMORY_PARAM_UNLIMITED; + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -1215,6 +1274,8 @@ virCgroupBackend virCgroupV2Backend = { .getMemoryUsage = virCgroupV2GetMemoryUsage, .setMemoryHardLimit = virCgroupV2SetMemoryHardLimit, .getMemoryHardLimit = virCgroupV2GetMemoryHardLimit, + .setMemorySoftLimit = virCgroupV2SetMemorySoftLimit, + .getMemorySoftLimit = virCgroupV2GetMemorySoftLimit, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 61 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 61 insertions(+)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 62 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 62 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index b5cc5f8f2e..ae352f0212 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1232,6 +1232,66 @@ virCgroupV2GetMemorySoftLimit(virCgroupPtr group, } +static int +virCgroupV2SetMemSwapHardLimit(virCgroupPtr group, + unsigned long long kb) +{ + unsigned long long maxkb = VIR_DOMAIN_MEMORY_PARAM_UNLIMITED; + + if (kb > maxkb) { + virReportError(VIR_ERR_INVALID_ARG, + _("Memory '%llu' must be less than %llu"), + kb, maxkb); + return -1; + } + + if (kb == maxkb) { + return virCgroupSetValueStr(group, + VIR_CGROUP_CONTROLLER_MEMORY, + "memory.swap.max", + "max"); + } else { + return virCgroupSetValueU64(group, + VIR_CGROUP_CONTROLLER_MEMORY, + "memory.swap.max", + kb << 10); + } +} + + +static int +virCgroupV2GetMemSwapHardLimit(virCgroupPtr group, + unsigned long long *kb) +{ + VIR_AUTOFREE(char *) value = NULL; + unsigned long long max; + + if (virCgroupGetValueStr(group, + VIR_CGROUP_CONTROLLER_MEMORY, + "memory.swap.max", &value) < 0) { + return -1; + } + + if (STREQ(value, "max")) { + *kb = VIR_DOMAIN_MEMORY_PARAM_UNLIMITED; + return 0; + } + + if (virStrToLong_ull(value, NULL, 10, &max) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to parse value '%s' as number."), + value); + return -1; + } + + *kb = max >> 10; + if (*kb >= VIR_DOMAIN_MEMORY_PARAM_UNLIMITED) + *kb = VIR_DOMAIN_MEMORY_PARAM_UNLIMITED; + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -1276,6 +1336,8 @@ virCgroupBackend virCgroupV2Backend = { .getMemoryHardLimit = virCgroupV2GetMemoryHardLimit, .setMemorySoftLimit = virCgroupV2SetMemorySoftLimit, .getMemorySoftLimit = virCgroupV2GetMemorySoftLimit, + .setMemSwapHardLimit = virCgroupV2SetMemSwapHardLimit, + .getMemSwapHardLimit = virCgroupV2GetMemSwapHardLimit, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 62 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 62 insertions(+)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index ae352f0212..3d1d68ded5 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1292,6 +1292,21 @@ virCgroupV2GetMemSwapHardLimit(virCgroupPtr group, } +static int +virCgroupV2GetMemSwapUsage(virCgroupPtr group, + unsigned long long *kb) +{ + long long unsigned int usage_in_bytes; + int ret; + ret = virCgroupGetValueU64(group, + VIR_CGROUP_CONTROLLER_MEMORY, + "memory.swap.current", &usage_in_bytes); + if (ret == 0) + *kb = (unsigned long) usage_in_bytes >> 10; + return ret; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -1338,6 +1353,7 @@ virCgroupBackend virCgroupV2Backend = { .getMemorySoftLimit = virCgroupV2GetMemorySoftLimit, .setMemSwapHardLimit = virCgroupV2SetMemSwapHardLimit, .getMemSwapHardLimit = virCgroupV2GetMemSwapHardLimit, + .getMemSwapUsage = virCgroupV2GetMemSwapUsage, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index ae352f0212..3d1d68ded5 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1292,6 +1292,21 @@ virCgroupV2GetMemSwapHardLimit(virCgroupPtr group, }
+static int +virCgroupV2GetMemSwapUsage(virCgroupPtr group, + unsigned long long *kb) +{ + long long unsigned int usage_in_bytes;
Looks like a copy & paste & paste & paste & paste ... error :-)
+ int ret; + ret = virCgroupGetValueU64(group, + VIR_CGROUP_CONTROLLER_MEMORY, + "memory.swap.current", &usage_in_bytes); + if (ret == 0) + *kb = (unsigned long) usage_in_bytes >> 10; + return ret; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2,
@@ -1338,6 +1353,7 @@ virCgroupBackend virCgroupV2Backend = { .getMemorySoftLimit = virCgroupV2GetMemorySoftLimit, .setMemSwapHardLimit = virCgroupV2SetMemSwapHardLimit, .getMemSwapHardLimit = virCgroupV2GetMemSwapHardLimit, + .getMemSwapUsage = virCgroupV2GetMemSwapUsage, };
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 3d1d68ded5..6cecfb9741 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1307,6 +1307,26 @@ virCgroupV2GetMemSwapUsage(virCgroupPtr group, } +static int +virCgroupV2SetCpuShares(virCgroupPtr group, + unsigned long long shares) +{ + return virCgroupSetValueU64(group, + VIR_CGROUP_CONTROLLER_CPU, + "cpu.weight", shares); +} + + +static int +virCgroupV2GetCpuShares(virCgroupPtr group, + unsigned long long *shares) +{ + return virCgroupGetValueU64(group, + VIR_CGROUP_CONTROLLER_CPU, + "cpu.weight", shares); +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -1354,6 +1374,9 @@ virCgroupBackend virCgroupV2Backend = { .setMemSwapHardLimit = virCgroupV2SetMemSwapHardLimit, .getMemSwapHardLimit = virCgroupV2GetMemSwapHardLimit, .getMemSwapUsage = virCgroupV2GetMemSwapUsage, + + .setCpuShares = virCgroupV2SetCpuShares, + .getCpuShares = virCgroupV2GetCpuShares, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+)
ACK Michal

In order to set CPU cfs period using cgroup v2 'cpu.max' interface we need to load the current value of CPU cfs quota first because format of 'cpu.max' interface is '$quota $period' and in order to change 'period' we need to write 'quota' as well. Writing only one number changes only 'quota'. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 68 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 68 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 6cecfb9741..92fe825795 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1327,6 +1327,72 @@ virCgroupV2GetCpuShares(virCgroupPtr group, } +static int +virCgroupV2SetCpuCfsPeriod(virCgroupPtr group, + unsigned long long cfs_period) +{ + VIR_AUTOFREE(char *) value = NULL; + VIR_AUTOFREE(char *) str = NULL; + char *tmp; + + /* The cfs_period should be greater or equal than 1ms, and less or equal + * than 1s. + */ + if (cfs_period < 1000 || cfs_period > 1000000) { + virReportError(VIR_ERR_INVALID_ARG, + _("cfs_period '%llu' must be in range (1000, 1000000)"), + cfs_period); + return -1; + } + + if (virCgroupGetValueStr(group, VIR_CGROUP_CONTROLLER_CPU, + "cpu.max", &str) < 0) { + return -1; + } + + if (!(tmp = strchr(str, ' '))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid 'cpu.max' data.")); + return -1; + } + *tmp = '\n'; + + if (virAsprintf(&value, "%s %llu", str, cfs_period) < 0) + return -1; + + return virCgroupSetValueStr(group, VIR_CGROUP_CONTROLLER_CPU, + "cpu.max", value); +} + + +static int +virCgroupV2GetCpuCfsPeriod(virCgroupPtr group, + unsigned long long *cfs_period) +{ + VIR_AUTOFREE(char *) str = NULL; + char *tmp; + + if (virCgroupGetValueStr(group, VIR_CGROUP_CONTROLLER_CPU, + "cpu.max", &str) < 0) { + return -1; + } + + if (!(tmp = strchr(str, ' '))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid 'cpu.max' data.")); + return -1; + } + + if (virStrToLong_ull(tmp, NULL, 10, cfs_period) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to parse value '%s' from cpu.max."), str); + return -1; + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -1377,6 +1443,8 @@ virCgroupBackend virCgroupV2Backend = { .setCpuShares = virCgroupV2SetCpuShares, .getCpuShares = virCgroupV2GetCpuShares, + .setCpuCfsPeriod = virCgroupV2SetCpuCfsPeriod, + .getCpuCfsPeriod = virCgroupV2GetCpuCfsPeriod, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
In order to set CPU cfs period using cgroup v2 'cpu.max' interface we need to load the current value of CPU cfs quota first because format of 'cpu.max' interface is '$quota $period' and in order to change 'period' we need to write 'quota' as well. Writing only one number changes only 'quota'.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 68 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 68 insertions(+)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 52 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 92fe825795..d52e065685 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1393,6 +1393,56 @@ virCgroupV2GetCpuCfsPeriod(virCgroupPtr group, } +static int +virCgroupV2SetCpuCfsQuota(virCgroupPtr group, + long long cfs_quota) +{ + /* The cfs_quota should be greater or equal than 1ms */ + if (cfs_quota >= 0 && + (cfs_quota < 1000 || + cfs_quota > ULLONG_MAX / 1000)) { + virReportError(VIR_ERR_INVALID_ARG, + _("cfs_quota '%lld' must be in range (1000, %llu)"), + cfs_quota, ULLONG_MAX / 1000); + return -1; + } + + if (cfs_quota == ULLONG_MAX / 1000) { + return virCgroupSetValueStr(group, + VIR_CGROUP_CONTROLLER_CPU, + "cpu.max", "max"); + } + + return virCgroupSetValueI64(group, + VIR_CGROUP_CONTROLLER_CPU, + "cpu.max", cfs_quota); +} + + +static int +virCgroupV2GetCpuCfsQuota(virCgroupPtr group, + long long *cfs_quota) +{ + VIR_AUTOFREE(char *) str = NULL; + + if (virCgroupGetValueStr(group, VIR_CGROUP_CONTROLLER_CPU, + "cpu.max", &str) < 0) { + return -1; + } + + if (STREQLEN(str, "max", 3)) + *cfs_quota = ULLONG_MAX / 1000; + + if (virStrToLong_ll(str, NULL, 10, cfs_quota) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to parse value '%s' from cpu.max."), str); + return -1; + } + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -1445,6 +1495,8 @@ virCgroupBackend virCgroupV2Backend = { .getCpuShares = virCgroupV2GetCpuShares, .setCpuCfsPeriod = virCgroupV2SetCpuCfsPeriod, .getCpuCfsPeriod = virCgroupV2GetCpuCfsPeriod, + .setCpuCfsQuota = virCgroupV2SetCpuCfsQuota, + .getCpuCfsQuota = virCgroupV2GetCpuCfsQuota, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 52 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index d52e065685..61c881f7e2 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1443,6 +1443,21 @@ virCgroupV2GetCpuCfsQuota(virCgroupPtr group, } +static bool +virCgroupV2SupportsCpuBW(virCgroupPtr cgroup) +{ + VIR_AUTOFREE(char *) path = NULL; + + if (virCgroupV2PathOfController(cgroup, VIR_CGROUP_CONTROLLER_CPU, + "cpu.max", &path) < 0) { + virResetLastError(); + return false; + } + + return virFileExists(path); +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -1497,6 +1512,7 @@ virCgroupBackend virCgroupV2Backend = { .getCpuCfsPeriod = virCgroupV2GetCpuCfsPeriod, .setCpuCfsQuota = virCgroupV2SetCpuCfsQuota, .getCpuCfsQuota = virCgroupV2GetCpuCfsQuota, + .supportsCpuBW = virCgroupV2SupportsCpuBW, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index 61c881f7e2..c6e66189bb 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1458,6 +1458,37 @@ virCgroupV2SupportsCpuBW(virCgroupPtr cgroup) } +static int +virCgroupV2GetCpuacctUsage(virCgroupPtr group, + unsigned long long *usage) +{ + VIR_AUTOFREE(char *) str = NULL; + char *tmp; + + if (virCgroupGetValueStr(group, VIR_CGROUP_CONTROLLER_CPUACCT, + "cpu.stat", &str) < 0) { + return -1; + } + + if (!(tmp = strstr(str, "usage_usec "))) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("cannot parse cpu usage stat '%s'"), str); + return -1; + } + tmp += strlen("usage_usec "); + + if (virStrToLong_ull(tmp, &tmp, 10, usage) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to parse value '%s' as number."), tmp); + return -1; + } + + *usage *= 1000; + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -1513,6 +1544,8 @@ virCgroupBackend virCgroupV2Backend = { .setCpuCfsQuota = virCgroupV2SetCpuCfsQuota, .getCpuCfsQuota = virCgroupV2GetCpuCfsQuota, .supportsCpuBW = virCgroupV2SupportsCpuBW, + + .getCpuacctUsage = virCgroupV2GetCpuacctUsage, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 49 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c index c6e66189bb..c7248c1145 100644 --- a/src/util/vircgroupv2.c +++ b/src/util/vircgroupv2.c @@ -1489,6 +1489,54 @@ virCgroupV2GetCpuacctUsage(virCgroupPtr group, } +static int +virCgroupV2GetCpuacctStat(virCgroupPtr group, + unsigned long long *user, + unsigned long long *sys) +{ + VIR_AUTOFREE(char *) str = NULL; + char *tmp; + unsigned long long userVal = 0; + unsigned long long sysVal = 0; + + if (virCgroupGetValueStr(group, VIR_CGROUP_CONTROLLER_CPUACCT, + "cpu.stat", &str) < 0) { + return -1; + } + + if (!(tmp = strstr(str, "user_usec "))) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("cannot parse cpu user stat '%s'"), str); + return -1; + } + tmp += strlen("user_usec "); + + if (virStrToLong_ull(tmp, &tmp, 10, &userVal) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to parse value '%s' as number."), tmp); + return -1; + } + + if (!(tmp = strstr(str, "system_usec "))) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("cannot parse cpu sys stat '%s'"), str); + return -1; + } + tmp += strlen("system_usec "); + + if (virStrToLong_ull(tmp, &tmp, 10, &sysVal) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to parse value '%s' as number."), tmp); + return -1; + } + + *user = userVal * 1000; + *sys = sysVal * 1000; + + return 0; +} + + virCgroupBackend virCgroupV2Backend = { .type = VIR_CGROUP_BACKEND_TYPE_V2, @@ -1546,6 +1594,7 @@ virCgroupBackend virCgroupV2Backend = { .supportsCpuBW = virCgroupV2SupportsCpuBW, .getCpuacctUsage = virCgroupV2GetCpuacctUsage, + .getCpuacctStat = virCgroupV2GetCpuacctStat, }; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroupv2.c | 49 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+)
ACK Michal

This enables to use both cgroup v1 and v2 at the same time together with libvirt. It is supported by kernel and there is valid use-case, not all controllers are implemented in cgroup v2 so there might be configurations where administrator would enable these missing controllers in cgroup v1. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroup.c | 351 ++++++++++++++++++++++++++---------- src/util/vircgroupbackend.c | 20 ++ src/util/vircgroupbackend.h | 16 +- src/util/vircgrouppriv.h | 2 +- 4 files changed, 291 insertions(+), 98 deletions(-) diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index dc249bfe33..4aec5f1bcf 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -228,6 +228,7 @@ virCgroupDetectMounts(virCgroupPtr group) struct mntent entry; char buf[CGROUP_MAX_VAL]; int ret = -1; + size_t i; mounts = fopen("/proc/mounts", "r"); if (mounts == NULL) { @@ -236,11 +237,14 @@ virCgroupDetectMounts(virCgroupPtr group) } while (getmntent_r(mounts, &entry, buf, sizeof(buf)) != NULL) { - if (group->backend->detectMounts(group, - entry.mnt_type, - entry.mnt_opts, - entry.mnt_dir) < 0) { - goto cleanup; + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->detectMounts(group, + entry.mnt_type, + entry.mnt_opts, + entry.mnt_dir) < 0) { + goto cleanup; + } } } @@ -303,6 +307,7 @@ virCgroupDetectPlacement(virCgroupPtr group, } while (fgets(line, sizeof(line), mapping) != NULL) { + size_t i; char *controllers = strchr(line, ':'); char *selfpath = controllers ? strchr(controllers + 1, ':') : NULL; char *nl = selfpath ? strchr(selfpath, '\n') : NULL; @@ -317,9 +322,12 @@ virCgroupDetectPlacement(virCgroupPtr group, controllers++; selfpath++; - if (group->backend->detectPlacement(group, path, controllers, - selfpath) < 0) { - goto cleanup; + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->detectPlacement(group, path, controllers, + selfpath) < 0) { + goto cleanup; + } } } @@ -338,8 +346,9 @@ virCgroupDetect(virCgroupPtr group, const char *path, virCgroupPtr parent) { - int rc; size_t i; + bool backendAvailable = false; + int controllersAvailable = 0; virCgroupBackendPtr *backends = virCgroupBackendGetAll(); VIR_DEBUG("group=%p controllers=%d path=%s parent=%p", @@ -350,31 +359,40 @@ virCgroupDetect(virCgroupPtr group, for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { if (backends[i] && backends[i]->available()) { - group->backend = backends[i]; - break; + group->backends[i] = backends[i]; + backendAvailable = true; } } - if (!group->backend) { + if (!backendAvailable) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("no cgroup backend available")); return -1; } if (parent) { - if (group->backend->copyMounts(group, parent) < 0) - return -1; + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->copyMounts(group, parent) < 0) { + return -1; + } + } } else { if (virCgroupDetectMounts(group) < 0) return -1; } - rc = group->backend->detectControllers(group, controllers); - if (rc < 0) - return -1; + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i]) { + int rc = group->backends[i]->detectControllers(group, controllers); + if (rc < 0) + return -1; + controllersAvailable |= rc; + } + } /* Check that at least 1 controller is available */ - if (rc == 0) { + if (controllersAvailable == 0) { virReportSystemError(ENXIO, "%s", _("At least one cgroup controller is required")); return -1; @@ -383,17 +401,26 @@ virCgroupDetect(virCgroupPtr group, /* In some cases we can copy part of the placement info * based on the parent cgroup... */ - if ((parent || path[0] == '/') && - group->backend->copyPlacement(group, path, parent) < 0) - return -1; + if (parent || path[0] == '/') { + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->copyPlacement(group, path, parent) < 0) { + return -1; + } + } + } /* ... but use /proc/cgroups to fill in the rest */ if (virCgroupDetectPlacement(group, pid, path) < 0) return -1; /* Check that for every mounted controller, we found our placement */ - if (group->backend->validatePlacement(group, pid) < 0) - return -1; + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->validatePlacement(group, pid) < 0) { + return -1; + } + } return 0; } @@ -599,9 +626,14 @@ virCgroupMakeGroup(virCgroupPtr parent, bool create, unsigned int flags) { - if (group->backend->makeGroup(parent, group, create, flags) < 0) { - virCgroupRemove(group); - return -1; + size_t i; + + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->makeGroup(parent, group, create, flags) < 0) { + virCgroupRemove(group); + return -1; + } } return 0; @@ -662,6 +694,24 @@ virCgroupNew(pid_t pid, } +static int +virCgroupAddTaskInternal(virCgroupPtr group, + pid_t pid, + unsigned int flags) +{ + size_t i; + + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->addTask(group, pid, flags) < 0) { + return -1; + } + } + + return 0; +} + + /** * virCgroupAddProcess: * @@ -676,7 +726,7 @@ virCgroupNew(pid_t pid, int virCgroupAddProcess(virCgroupPtr group, pid_t pid) { - return group->backend->addTask(group, pid, VIR_CGROUP_TASK_PROCESS); + return virCgroupAddTaskInternal(group, pid, VIR_CGROUP_TASK_PROCESS); } /** @@ -693,9 +743,9 @@ virCgroupAddProcess(virCgroupPtr group, pid_t pid) int virCgroupAddMachineProcess(virCgroupPtr group, pid_t pid) { - return group->backend->addTask(group, pid, - VIR_CGROUP_TASK_PROCESS | - VIR_CGROUP_TASK_SYSTEMD); + return virCgroupAddTaskInternal(group, pid, + VIR_CGROUP_TASK_PROCESS | + VIR_CGROUP_TASK_SYSTEMD); } /** @@ -713,7 +763,7 @@ int virCgroupAddThread(virCgroupPtr group, pid_t pid) { - return group->backend->addTask(group, pid, VIR_CGROUP_TASK_THREAD); + return virCgroupAddTaskInternal(group, pid, VIR_CGROUP_TASK_THREAD); } @@ -967,17 +1017,24 @@ virCgroupNewDetectMachine(const char *name, char *machinename, virCgroupPtr *group) { + size_t i; + if (virCgroupNewDetect(pid, controllers, group) < 0) { if (virCgroupNewIgnoreError()) return 0; return -1; } - if (!(*group)->backend->validateMachineGroup(*group, name, drivername, machinename)) { - VIR_DEBUG("Failed to validate machine name for '%s' driver '%s'", - name, drivername); - virCgroupFree(group); - return 0; + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if ((*group)->backends[i] && + !(*group)->backends[i]->validateMachineGroup(*group, name, + drivername, + machinename)) { + VIR_DEBUG("Failed to validate machine name for '%s' driver '%s'", + name, drivername); + virCgroupFree(group); + return 0; + } } return 0; @@ -1055,6 +1112,7 @@ virCgroupNewMachineSystemd(const char *name, int rv; virCgroupPtr init; VIR_AUTOFREE(char *) path = NULL; + size_t i; VIR_DEBUG("Trying to setup machine '%s' via systemd", name); if ((rv = virSystemdCreateMachine(name, @@ -1077,7 +1135,12 @@ virCgroupNewMachineSystemd(const char *name, &init) < 0) return -1; - path = init->backend->stealPlacement(init); + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (init->backends[i] && + (path = init->backends[i]->stealPlacement(init))) { + break; + } + } virCgroupFree(&init); if (!path || STREQ(path, "/") || path[0] != '/') { @@ -1256,12 +1319,21 @@ virCgroupFree(virCgroupPtr *group) bool virCgroupHasController(virCgroupPtr cgroup, int controller) { + size_t i; + if (!cgroup) return false; if (controller < 0 || controller >= VIR_CGROUP_CONTROLLER_LAST) return false; - return cgroup->backend->hasController(cgroup, controller); + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (cgroup->backends[i] && + cgroup->backends[i]->hasController(cgroup, controller)) { + return true; + } + } + + return false; } @@ -1277,7 +1349,8 @@ virCgroupPathOfController(virCgroupPtr group, return -1; } - return group->backend->pathOfController(group, controller, key, path); + VIR_CGROUP_BACKEND_CALL(group, controller, pathOfController, -1, + controller, key, path); } @@ -1299,7 +1372,8 @@ virCgroupGetBlkioIoServiced(virCgroupPtr group, long long *requests_read, long long *requests_write) { - VIR_CGROUP_BACKEND_CALL(group, getBlkioIoServiced, -1, + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_BLKIO, + getBlkioIoServiced, -1, bytes_read, bytes_write, requests_read, requests_write); } @@ -1325,7 +1399,8 @@ virCgroupGetBlkioIoDeviceServiced(virCgroupPtr group, long long *requests_read, long long *requests_write) { - VIR_CGROUP_BACKEND_CALL(group, getBlkioIoDeviceServiced, -1, + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_BLKIO, + getBlkioIoDeviceServiced, -1, path, bytes_read, bytes_write, requests_read, requests_write); } @@ -1342,7 +1417,8 @@ virCgroupGetBlkioIoDeviceServiced(virCgroupPtr group, int virCgroupSetBlkioWeight(virCgroupPtr group, unsigned int weight) { - VIR_CGROUP_BACKEND_CALL(group, setBlkioWeight, -1, weight); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_BLKIO, + setBlkioWeight, -1, weight); } @@ -1357,7 +1433,8 @@ virCgroupSetBlkioWeight(virCgroupPtr group, unsigned int weight) int virCgroupGetBlkioWeight(virCgroupPtr group, unsigned int *weight) { - VIR_CGROUP_BACKEND_CALL(group, getBlkioWeight, -1, weight); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_BLKIO, + getBlkioWeight, -1, weight); } /** @@ -1373,7 +1450,8 @@ virCgroupSetBlkioDeviceReadIops(virCgroupPtr group, const char *path, unsigned int riops) { - VIR_CGROUP_BACKEND_CALL(group, setBlkioDeviceReadIops, -1, path, riops); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_BLKIO, + setBlkioDeviceReadIops, -1, path, riops); } @@ -1390,7 +1468,8 @@ virCgroupSetBlkioDeviceWriteIops(virCgroupPtr group, const char *path, unsigned int wiops) { - VIR_CGROUP_BACKEND_CALL(group, setBlkioDeviceWriteIops, -1, path, wiops); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_BLKIO, + setBlkioDeviceWriteIops, -1, path, wiops); } @@ -1407,7 +1486,8 @@ virCgroupSetBlkioDeviceReadBps(virCgroupPtr group, const char *path, unsigned long long rbps) { - VIR_CGROUP_BACKEND_CALL(group, setBlkioDeviceReadBps, -1, path, rbps); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_BLKIO, + setBlkioDeviceReadBps, -1, path, rbps); } /** @@ -1423,7 +1503,8 @@ virCgroupSetBlkioDeviceWriteBps(virCgroupPtr group, const char *path, unsigned long long wbps) { - VIR_CGROUP_BACKEND_CALL(group, setBlkioDeviceWriteBps, -1, path, wbps); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_BLKIO, + setBlkioDeviceWriteBps, -1, path, wbps); } @@ -1441,7 +1522,8 @@ virCgroupSetBlkioDeviceWeight(virCgroupPtr group, const char *path, unsigned int weight) { - VIR_CGROUP_BACKEND_CALL(group, setBlkioDeviceWeight, -1, path, weight); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_BLKIO, + setBlkioDeviceWeight, -1, path, weight); } /** @@ -1457,7 +1539,8 @@ virCgroupGetBlkioDeviceReadIops(virCgroupPtr group, const char *path, unsigned int *riops) { - VIR_CGROUP_BACKEND_CALL(group, getBlkioDeviceReadIops, -1, path, riops); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_BLKIO, + getBlkioDeviceReadIops, -1, path, riops); } /** @@ -1473,7 +1556,8 @@ virCgroupGetBlkioDeviceWriteIops(virCgroupPtr group, const char *path, unsigned int *wiops) { - VIR_CGROUP_BACKEND_CALL(group, getBlkioDeviceWriteIops, -1, path, wiops); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_BLKIO, + getBlkioDeviceWriteIops, -1, path, wiops); } /** @@ -1489,7 +1573,8 @@ virCgroupGetBlkioDeviceReadBps(virCgroupPtr group, const char *path, unsigned long long *rbps) { - VIR_CGROUP_BACKEND_CALL(group, getBlkioDeviceReadBps, -1, path, rbps); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_BLKIO, + getBlkioDeviceReadBps, -1, path, rbps); } /** @@ -1505,7 +1590,8 @@ virCgroupGetBlkioDeviceWriteBps(virCgroupPtr group, const char *path, unsigned long long *wbps) { - VIR_CGROUP_BACKEND_CALL(group, getBlkioDeviceWriteBps, -1, path, wbps); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_BLKIO, + getBlkioDeviceWriteBps, -1, path, wbps); } /** @@ -1521,7 +1607,8 @@ virCgroupGetBlkioDeviceWeight(virCgroupPtr group, const char *path, unsigned int *weight) { - VIR_CGROUP_BACKEND_CALL(group, getBlkioDeviceWeight, -1, path, weight); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_BLKIO, + getBlkioDeviceWeight, -1, path, weight); } @@ -1536,7 +1623,8 @@ virCgroupGetBlkioDeviceWeight(virCgroupPtr group, int virCgroupSetMemory(virCgroupPtr group, unsigned long long kb) { - VIR_CGROUP_BACKEND_CALL(group, setMemory, -1, kb); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_MEMORY, + setMemory, -1, kb); } @@ -1562,7 +1650,8 @@ virCgroupGetMemoryStat(virCgroupPtr group, unsigned long long *inactiveFile, unsigned long long *unevictable) { - VIR_CGROUP_BACKEND_CALL(group, getMemoryStat, -1, cache, + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_MEMORY, + getMemoryStat, -1, cache, activeAnon, inactiveAnon, activeFile, inactiveFile, unevictable); @@ -1580,7 +1669,8 @@ virCgroupGetMemoryStat(virCgroupPtr group, int virCgroupGetMemoryUsage(virCgroupPtr group, unsigned long *kb) { - VIR_CGROUP_BACKEND_CALL(group, getMemoryUsage, -1, kb); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_MEMORY, + getMemoryUsage, -1, kb); } @@ -1595,7 +1685,8 @@ virCgroupGetMemoryUsage(virCgroupPtr group, unsigned long *kb) int virCgroupSetMemoryHardLimit(virCgroupPtr group, unsigned long long kb) { - VIR_CGROUP_BACKEND_CALL(group, setMemoryHardLimit, -1, kb); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_MEMORY, + setMemoryHardLimit, -1, kb); } @@ -1610,7 +1701,8 @@ virCgroupSetMemoryHardLimit(virCgroupPtr group, unsigned long long kb) int virCgroupGetMemoryHardLimit(virCgroupPtr group, unsigned long long *kb) { - VIR_CGROUP_BACKEND_CALL(group, getMemoryHardLimit, -1, kb); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_MEMORY, + getMemoryHardLimit, -1, kb); } @@ -1625,7 +1717,8 @@ virCgroupGetMemoryHardLimit(virCgroupPtr group, unsigned long long *kb) int virCgroupSetMemorySoftLimit(virCgroupPtr group, unsigned long long kb) { - VIR_CGROUP_BACKEND_CALL(group, setMemorySoftLimit, -1, kb); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_MEMORY, + setMemorySoftLimit, -1, kb); } @@ -1640,7 +1733,8 @@ virCgroupSetMemorySoftLimit(virCgroupPtr group, unsigned long long kb) int virCgroupGetMemorySoftLimit(virCgroupPtr group, unsigned long long *kb) { - VIR_CGROUP_BACKEND_CALL(group, getMemorySoftLimit, -1, kb); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_MEMORY, + getMemorySoftLimit, -1, kb); } @@ -1655,7 +1749,8 @@ virCgroupGetMemorySoftLimit(virCgroupPtr group, unsigned long long *kb) int virCgroupSetMemSwapHardLimit(virCgroupPtr group, unsigned long long kb) { - VIR_CGROUP_BACKEND_CALL(group, setMemSwapHardLimit, -1, kb); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_MEMORY, + setMemSwapHardLimit, -1, kb); } @@ -1670,7 +1765,8 @@ virCgroupSetMemSwapHardLimit(virCgroupPtr group, unsigned long long kb) int virCgroupGetMemSwapHardLimit(virCgroupPtr group, unsigned long long *kb) { - VIR_CGROUP_BACKEND_CALL(group, getMemSwapHardLimit, -1, kb); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_MEMORY, + getMemSwapHardLimit, -1, kb); } @@ -1685,7 +1781,8 @@ virCgroupGetMemSwapHardLimit(virCgroupPtr group, unsigned long long *kb) int virCgroupGetMemSwapUsage(virCgroupPtr group, unsigned long long *kb) { - VIR_CGROUP_BACKEND_CALL(group, getMemSwapUsage, -1, kb); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_MEMORY, + getMemSwapUsage, -1, kb); } @@ -1700,7 +1797,8 @@ virCgroupGetMemSwapUsage(virCgroupPtr group, unsigned long long *kb) int virCgroupSetCpusetMems(virCgroupPtr group, const char *mems) { - VIR_CGROUP_BACKEND_CALL(group, setCpusetMems, -1, mems); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_CPUSET, + setCpusetMems, -1, mems); } @@ -1715,7 +1813,8 @@ virCgroupSetCpusetMems(virCgroupPtr group, const char *mems) int virCgroupGetCpusetMems(virCgroupPtr group, char **mems) { - VIR_CGROUP_BACKEND_CALL(group, getCpusetMems, -1, mems); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_CPUSET, + getCpusetMems, -1, mems); } @@ -1730,7 +1829,8 @@ virCgroupGetCpusetMems(virCgroupPtr group, char **mems) int virCgroupSetCpusetMemoryMigrate(virCgroupPtr group, bool migrate) { - VIR_CGROUP_BACKEND_CALL(group, setCpusetMemoryMigrate, -1, migrate); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_CPUSET, + setCpusetMemoryMigrate, -1, migrate); } @@ -1745,7 +1845,8 @@ virCgroupSetCpusetMemoryMigrate(virCgroupPtr group, bool migrate) int virCgroupGetCpusetMemoryMigrate(virCgroupPtr group, bool *migrate) { - VIR_CGROUP_BACKEND_CALL(group, getCpusetMemoryMigrate, -1, migrate); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_CPUSET, + getCpusetMemoryMigrate, -1, migrate); } @@ -1760,7 +1861,8 @@ virCgroupGetCpusetMemoryMigrate(virCgroupPtr group, bool *migrate) int virCgroupSetCpusetCpus(virCgroupPtr group, const char *cpus) { - VIR_CGROUP_BACKEND_CALL(group, setCpusetCpus, -1, cpus); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_CPUSET, + setCpusetCpus, -1, cpus); } @@ -1775,7 +1877,8 @@ virCgroupSetCpusetCpus(virCgroupPtr group, const char *cpus) int virCgroupGetCpusetCpus(virCgroupPtr group, char **cpus) { - VIR_CGROUP_BACKEND_CALL(group, getCpusetCpus, -1, cpus); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_CPUSET, + getCpusetCpus, -1, cpus); } @@ -1789,7 +1892,8 @@ virCgroupGetCpusetCpus(virCgroupPtr group, char **cpus) int virCgroupDenyAllDevices(virCgroupPtr group) { - VIR_CGROUP_BACKEND_CALL(group, denyAllDevices, -1); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_DEVICES, + denyAllDevices, -1); } /** @@ -1809,7 +1913,8 @@ virCgroupDenyAllDevices(virCgroupPtr group) int virCgroupAllowAllDevices(virCgroupPtr group, int perms) { - VIR_CGROUP_BACKEND_CALL(group, allowAllDevices, -1, perms); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_DEVICES, + allowAllDevices, -1, perms); } @@ -1828,7 +1933,8 @@ int virCgroupAllowDevice(virCgroupPtr group, char type, int major, int minor, int perms) { - VIR_CGROUP_BACKEND_CALL(group, allowDevice, -1, type, major, minor, perms); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_DEVICES, + allowDevice, -1, type, major, minor, perms); } @@ -1867,7 +1973,8 @@ virCgroupAllowDevicePath(virCgroupPtr group, if (!S_ISCHR(sb.st_mode) && !S_ISBLK(sb.st_mode)) return 1; - VIR_CGROUP_BACKEND_CALL(group, allowDevice, -1, + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_DEVICES, + allowDevice, -1, S_ISCHR(sb.st_mode) ? 'c' : 'b', major(sb.st_rdev), minor(sb.st_rdev), @@ -1890,7 +1997,8 @@ int virCgroupDenyDevice(virCgroupPtr group, char type, int major, int minor, int perms) { - VIR_CGROUP_BACKEND_CALL(group, denyDevice, -1, type, major, minor, perms); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_DEVICES, + denyDevice, -1, type, major, minor, perms); } @@ -1929,7 +2037,8 @@ virCgroupDenyDevicePath(virCgroupPtr group, if (!S_ISCHR(sb.st_mode) && !S_ISBLK(sb.st_mode)) return 1; - VIR_CGROUP_BACKEND_CALL(group, denyDevice, -1, + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_DEVICES, + denyDevice, -1, S_ISCHR(sb.st_mode) ? 'c' : 'b', major(sb.st_rdev), minor(sb.st_rdev), @@ -2172,14 +2281,16 @@ virCgroupGetDomainTotalCpuStats(virCgroupPtr group, int virCgroupSetCpuShares(virCgroupPtr group, unsigned long long shares) { - VIR_CGROUP_BACKEND_CALL(group, setCpuShares, -1, shares); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_CPU, + setCpuShares, -1, shares); } int virCgroupGetCpuShares(virCgroupPtr group, unsigned long long *shares) { - VIR_CGROUP_BACKEND_CALL(group, getCpuShares, -1, shares); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_CPU, + getCpuShares, -1, shares); } @@ -2194,7 +2305,8 @@ virCgroupGetCpuShares(virCgroupPtr group, unsigned long long *shares) int virCgroupSetCpuCfsPeriod(virCgroupPtr group, unsigned long long cfs_period) { - VIR_CGROUP_BACKEND_CALL(group, setCpuCfsPeriod, -1, cfs_period); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_CPU, + setCpuCfsPeriod, -1, cfs_period); } @@ -2209,7 +2321,8 @@ virCgroupSetCpuCfsPeriod(virCgroupPtr group, unsigned long long cfs_period) int virCgroupGetCpuCfsPeriod(virCgroupPtr group, unsigned long long *cfs_period) { - VIR_CGROUP_BACKEND_CALL(group, getCpuCfsPeriod, -1, cfs_period); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_CPU, + getCpuCfsPeriod, -1, cfs_period); } @@ -2225,14 +2338,16 @@ virCgroupGetCpuCfsPeriod(virCgroupPtr group, unsigned long long *cfs_period) int virCgroupSetCpuCfsQuota(virCgroupPtr group, long long cfs_quota) { - VIR_CGROUP_BACKEND_CALL(group, setCpuCfsQuota, -1, cfs_quota); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_CPU, + setCpuCfsQuota, -1, cfs_quota); } int virCgroupGetCpuacctPercpuUsage(virCgroupPtr group, char **usage) { - VIR_CGROUP_BACKEND_CALL(group, getCpuacctPercpuUsage, -1, usage); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_CPUACCT, + getCpuacctPercpuUsage, -1, usage); } @@ -2299,7 +2414,16 @@ virCgroupRemoveRecursively(char *grppath) int virCgroupRemove(virCgroupPtr group) { - return group->backend->remove(group); + size_t i; + + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->remove(group) < 0) { + return -1; + } + } + + return 0; } @@ -2308,11 +2432,16 @@ virCgroupPathOfAnyController(virCgroupPtr group, const char *name, char **keypath) { + size_t i; int controller; - controller = group->backend->getAnyController(group); - if (controller >= 0) - return virCgroupPathOfController(group, controller, name, keypath); + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i]) { + controller = group->backends[i]->getAnyController(group); + if (controller >= 0) + return virCgroupPathOfController(group, controller, name, keypath); + } + } virReportSystemError(ENOSYS, "%s", _("No controllers are mounted")); @@ -2548,14 +2677,16 @@ virCgroupKillPainfully(virCgroupPtr group) int virCgroupGetCpuCfsQuota(virCgroupPtr group, long long *cfs_quota) { - VIR_CGROUP_BACKEND_CALL(group, getCpuCfsQuota, -1, cfs_quota); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_CPU, + getCpuCfsQuota, -1, cfs_quota); } int virCgroupGetCpuacctUsage(virCgroupPtr group, unsigned long long *usage) { - VIR_CGROUP_BACKEND_CALL(group, getCpuacctUsage, -1, usage); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_CPUACCT, + getCpuacctUsage, -1, usage); } @@ -2563,21 +2694,24 @@ int virCgroupGetCpuacctStat(virCgroupPtr group, unsigned long long *user, unsigned long long *sys) { - VIR_CGROUP_BACKEND_CALL(group, getCpuacctStat, -1, user, sys); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_CPUACCT, + getCpuacctStat, -1, user, sys); } int virCgroupSetFreezerState(virCgroupPtr group, const char *state) { - VIR_CGROUP_BACKEND_CALL(group, setFreezerState, -1, state); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_FREEZER, + setFreezerState, -1, state); } int virCgroupGetFreezerState(virCgroupPtr group, char **state) { - VIR_CGROUP_BACKEND_CALL(group, getFreezerState, -1, state); + VIR_CGROUP_BACKEND_CALL(group, VIR_CGROUP_CONTROLLER_FREEZER, + getFreezerState, -1, state); } @@ -2585,7 +2719,16 @@ int virCgroupBindMount(virCgroupPtr group, const char *oldroot, const char *mountopts) { - return group->backend->bindMount(group, oldroot, mountopts); + size_t i; + + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->bindMount(group, oldroot, mountopts) < 0) { + return -1; + } + } + + return 0; } @@ -2594,7 +2737,16 @@ int virCgroupSetOwner(virCgroupPtr cgroup, gid_t gid, int controllers) { - return cgroup->backend->setOwner(cgroup, uid, gid, controllers); + size_t i; + + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (cgroup->backends[i] && + cgroup->backends[i]->setOwner(cgroup, uid, gid, controllers) < 0) { + return -1; + } + } + + return 0; } @@ -2608,13 +2760,24 @@ int virCgroupSetOwner(virCgroupPtr cgroup, bool virCgroupSupportsCpuBW(virCgroupPtr cgroup) { - VIR_CGROUP_BACKEND_CALL(cgroup, supportsCpuBW, false); + VIR_CGROUP_BACKEND_CALL(cgroup, VIR_CGROUP_CONTROLLER_CPU, + supportsCpuBW, false); } int virCgroupHasEmptyTasks(virCgroupPtr cgroup, int controller) { - return cgroup->backend->hasEmptyTasks(cgroup, controller); + size_t i; + + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (cgroup->backends[i]) { + int rc = cgroup->backends[i]->hasEmptyTasks(cgroup, controller); + if (rc <= 0) + return rc; + } + } + + return 1; } bool diff --git a/src/util/vircgroupbackend.c b/src/util/vircgroupbackend.c index 7ee39ac8ca..2e90781dc3 100644 --- a/src/util/vircgroupbackend.c +++ b/src/util/vircgroupbackend.c @@ -20,6 +20,9 @@ #include <config.h> #include "vircgroupbackend.h" +#define __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ +#include "vircgrouppriv.h" +#undef __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ #include "vircgroupv1.h" #include "vircgroupv2.h" #include "virerror.h" @@ -67,3 +70,20 @@ virCgroupBackendGetAll(void) } return virCgroupBackends; } + + +virCgroupBackendPtr +virCgroupBackendForController(virCgroupPtr group, + unsigned int controller) +{ + size_t i; + + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->hasController(group, controller)) { + return group->backends[i]; + } + } + + return NULL; +} diff --git a/src/util/vircgroupbackend.h b/src/util/vircgroupbackend.h index 86d1539e07..bc60b44643 100644 --- a/src/util/vircgroupbackend.h +++ b/src/util/vircgroupbackend.h @@ -436,12 +436,22 @@ virCgroupBackendRegister(virCgroupBackendPtr backend); virCgroupBackendPtr * virCgroupBackendGetAll(void); -# define VIR_CGROUP_BACKEND_CALL(group, func, ret, ...) \ - if (!group->backend->func) { \ +virCgroupBackendPtr +virCgroupBackendForController(virCgroupPtr group, + unsigned int controller); + +# define VIR_CGROUP_BACKEND_CALL(group, controller, func, ret, ...) \ + virCgroupBackendPtr backend = virCgroupBackendForController(group, controller); \ + if (!backend) { \ + virReportError(VIR_ERR_INTERNAL_ERROR, \ + _("failed to get cgroup backend for '%s'"), #func); \ + return ret; \ + } \ + if (!backend->func) { \ virReportError(VIR_ERR_OPERATION_UNSUPPORTED, \ _("operation '%s' not supported"), #func); \ return ret; \ } \ - return group->backend->func(group, ##__VA_ARGS__); + return backend->func(group, ##__VA_ARGS__); #endif /* __VIR_CGROUP_BACKEND_H__ */ diff --git a/src/util/vircgrouppriv.h b/src/util/vircgrouppriv.h index 4a0d75ddbc..8f24b0891e 100644 --- a/src/util/vircgrouppriv.h +++ b/src/util/vircgrouppriv.h @@ -56,7 +56,7 @@ typedef virCgroupV2Controller *virCgroupV2ControllerPtr; struct _virCgroup { char *path; - virCgroupBackendPtr backend; + virCgroupBackendPtr backends[VIR_CGROUP_BACKEND_TYPE_LAST]; virCgroupV1Controller legacy[VIR_CGROUP_CONTROLLER_LAST]; virCgroupV2Controller unified; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
This enables to use both cgroup v1 and v2 at the same time together with libvirt. It is supported by kernel and there is valid use-case, not all controllers are implemented in cgroup v2 so there might be configurations where administrator would enable these missing controllers in cgroup v1.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroup.c | 351 ++++++++++++++++++++++++++---------- src/util/vircgroupbackend.c | 20 ++ src/util/vircgroupbackend.h | 16 +- src/util/vircgrouppriv.h | 2 +- 4 files changed, 291 insertions(+), 98 deletions(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index dc249bfe33..4aec5f1bcf 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -228,6 +228,7 @@ virCgroupDetectMounts(virCgroupPtr group) struct mntent entry; char buf[CGROUP_MAX_VAL]; int ret = -1; + size_t i;
mounts = fopen("/proc/mounts", "r"); if (mounts == NULL) { @@ -236,11 +237,14 @@ virCgroupDetectMounts(virCgroupPtr group) }
while (getmntent_r(mounts, &entry, buf, sizeof(buf)) != NULL) { - if (group->backend->detectMounts(group, - entry.mnt_type, - entry.mnt_opts, - entry.mnt_dir) < 0) { - goto cleanup; + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->detectMounts(group, + entry.mnt_type, + entry.mnt_opts, + entry.mnt_dir) < 0) { + goto cleanup; + } } }
@@ -303,6 +307,7 @@ virCgroupDetectPlacement(virCgroupPtr group, }
while (fgets(line, sizeof(line), mapping) != NULL) { + size_t i; char *controllers = strchr(line, ':'); char *selfpath = controllers ? strchr(controllers + 1, ':') : NULL; char *nl = selfpath ? strchr(selfpath, '\n') : NULL; @@ -317,9 +322,12 @@ virCgroupDetectPlacement(virCgroupPtr group, controllers++; selfpath++;
- if (group->backend->detectPlacement(group, path, controllers, - selfpath) < 0) { - goto cleanup; + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->detectPlacement(group, path, controllers, + selfpath) < 0) { + goto cleanup; + }
So a loop like this appears all over the patch: for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { if (group->backends[i] && group->backends[i]->cb(); } I wonder if we should have wrappers around these chunks of code. But that could be saved as a follow up patch.
} }
@@ -338,8 +346,9 @@ virCgroupDetect(virCgroupPtr group, const char *path, virCgroupPtr parent) { - int rc; size_t i; + bool backendAvailable = false; + int controllersAvailable = 0; virCgroupBackendPtr *backends = virCgroupBackendGetAll();
VIR_DEBUG("group=%p controllers=%d path=%s parent=%p", @@ -350,31 +359,40 @@ virCgroupDetect(virCgroupPtr group,
for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { if (backends[i] && backends[i]->available()) { - group->backend = backends[i]; - break; + group->backends[i] = backends[i]; + backendAvailable = true;
No need to remove the 'break'.
} }
ACK Michal

On Fri, 2018-10-05 at 12:14 +0200, Michal Privoznik wrote:
On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
This enables to use both cgroup v1 and v2 at the same time together with libvirt. It is supported by kernel and there is valid use- case, not all controllers are implemented in cgroup v2 so there might be configurations where administrator would enable these missing controllers in cgroup v1.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroup.c | 351 ++++++++++++++++++++++++++---- ------ src/util/vircgroupbackend.c | 20 ++ src/util/vircgroupbackend.h | 16 +- src/util/vircgrouppriv.h | 2 +- 4 files changed, 291 insertions(+), 98 deletions(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index dc249bfe33..4aec5f1bcf 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -228,6 +228,7 @@ virCgroupDetectMounts(virCgroupPtr group) struct mntent entry; char buf[CGROUP_MAX_VAL]; int ret = -1; + size_t i;
mounts = fopen("/proc/mounts", "r"); if (mounts == NULL) { @@ -236,11 +237,14 @@ virCgroupDetectMounts(virCgroupPtr group) }
while (getmntent_r(mounts, &entry, buf, sizeof(buf)) != NULL) { - if (group->backend->detectMounts(group, - entry.mnt_type, - entry.mnt_opts, - entry.mnt_dir) < 0) { - goto cleanup; + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->detectMounts(group, + entry.mnt_type, + entry.mnt_opts, + entry.mnt_dir) < 0) { + goto cleanup; + } } }
@@ -303,6 +307,7 @@ virCgroupDetectPlacement(virCgroupPtr group, }
while (fgets(line, sizeof(line), mapping) != NULL) { + size_t i; char *controllers = strchr(line, ':'); char *selfpath = controllers ? strchr(controllers + 1, ':') : NULL; char *nl = selfpath ? strchr(selfpath, '\n') : NULL; @@ -317,9 +322,12 @@ virCgroupDetectPlacement(virCgroupPtr group, controllers++; selfpath++;
- if (group->backend->detectPlacement(group, path, controllers, - selfpath) < 0) { - goto cleanup; + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->detectPlacement(group, path, controllers, + selfpath) < 0) { + goto cleanup; + }
So a loop like this appears all over the patch:
for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { if (group->backends[i] && group->backends[i]->cb(); }
I wonder if we should have wrappers around these chunks of code. But that could be saved as a follow up patch.
We already have a macro for something similar that could be expanded and reused in all of those. And, yes, I totally agree that it could be done as a follow-up patch.
} }
@@ -338,8 +346,9 @@ virCgroupDetect(virCgroupPtr group, const char *path, virCgroupPtr parent) { - int rc; size_t i; + bool backendAvailable = false; + int controllersAvailable = 0; virCgroupBackendPtr *backends = virCgroupBackendGetAll();
VIR_DEBUG("group=%p controllers=%d path=%s parent=%p", @@ -350,31 +359,40 @@ virCgroupDetect(virCgroupPtr group,
for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { if (backends[i] && backends[i]->available()) { - group->backend = backends[i]; - break; + group->backends[i] = backends[i]; + backendAvailable = true;
No need to remove the 'break'.
} }
ACK
Michal
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Fri, Oct 05, 2018 at 12:14:55PM +0200, Michal Privoznik wrote:
On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
This enables to use both cgroup v1 and v2 at the same time together with libvirt. It is supported by kernel and there is valid use-case, not all controllers are implemented in cgroup v2 so there might be configurations where administrator would enable these missing controllers in cgroup v1.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- src/util/vircgroup.c | 351 ++++++++++++++++++++++++++---------- src/util/vircgroupbackend.c | 20 ++ src/util/vircgroupbackend.h | 16 +- src/util/vircgrouppriv.h | 2 +- 4 files changed, 291 insertions(+), 98 deletions(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index dc249bfe33..4aec5f1bcf 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -228,6 +228,7 @@ virCgroupDetectMounts(virCgroupPtr group) struct mntent entry; char buf[CGROUP_MAX_VAL]; int ret = -1; + size_t i;
mounts = fopen("/proc/mounts", "r"); if (mounts == NULL) { @@ -236,11 +237,14 @@ virCgroupDetectMounts(virCgroupPtr group) }
while (getmntent_r(mounts, &entry, buf, sizeof(buf)) != NULL) { - if (group->backend->detectMounts(group, - entry.mnt_type, - entry.mnt_opts, - entry.mnt_dir) < 0) { - goto cleanup; + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->detectMounts(group, + entry.mnt_type, + entry.mnt_opts, + entry.mnt_dir) < 0) { + goto cleanup; + } } }
@@ -303,6 +307,7 @@ virCgroupDetectPlacement(virCgroupPtr group, }
while (fgets(line, sizeof(line), mapping) != NULL) { + size_t i; char *controllers = strchr(line, ':'); char *selfpath = controllers ? strchr(controllers + 1, ':') : NULL; char *nl = selfpath ? strchr(selfpath, '\n') : NULL; @@ -317,9 +322,12 @@ virCgroupDetectPlacement(virCgroupPtr group, controllers++; selfpath++;
- if (group->backend->detectPlacement(group, path, controllers, - selfpath) < 0) { - goto cleanup; + for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { + if (group->backends[i] && + group->backends[i]->detectPlacement(group, path, controllers, + selfpath) < 0) { + goto cleanup; + }
So a loop like this appears all over the patch:
for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { if (group->backends[i] && group->backends[i]->cb(); }
I wonder if we should have wrappers around these chunks of code. But that could be saved as a follow up patch.
Right, we can create a wrapper for this use-case. The reason why I did not create one is that not all the loops are the same.
} }
@@ -338,8 +346,9 @@ virCgroupDetect(virCgroupPtr group, const char *path, virCgroupPtr parent) { - int rc; size_t i; + bool backendAvailable = false; + int controllersAvailable = 0; virCgroupBackendPtr *backends = virCgroupBackendGetAll();
VIR_DEBUG("group=%p controllers=%d path=%s parent=%p", @@ -350,31 +359,40 @@ virCgroupDetect(virCgroupPtr group,
for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) { if (backends[i] && backends[i]->available()) { - group->backend = backends[i]; - break; + group->backends[i] = backends[i]; + backendAvailable = true;
No need to remove the 'break'.
We have to remove the 'break' here, otherwise we would not detect all backends for hybrid mode. We assign the backends that are available to the cgroup so we need to go through all the backends. Pavel

Remove the trailing '/' from prefix. This change is required in order to introduce tests for unified cgroups. They are usually mounted in '/sys/fs/cgroup'. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgroupmock.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/tests/vircgroupmock.c b/tests/vircgroupmock.c index cfff1f0b7a..5c48ef012a 100644 --- a/tests/vircgroupmock.c +++ b/tests/vircgroupmock.c @@ -56,7 +56,7 @@ const char *fakedevicedir0 = FAKEDEVDIR0; const char *fakedevicedir1 = FAKEDEVDIR1; -# define SYSFS_CGROUP_PREFIX "/not/really/sys/fs/cgroup/" +# define SYSFS_CGROUP_PREFIX "/not/really/sys/fs/cgroup" # define SYSFS_CPU_PRESENT "/sys/devices/system/cpu/present" # define SYSFS_CPU_PRESENT_MOCKED "devices_system_cpu_present" @@ -354,7 +354,7 @@ int access(const char *path, int mode) if (STRPREFIX(path, SYSFS_CGROUP_PREFIX)) { init_sysfs(); char *newpath; - if (asprintf(&newpath, "%s/%s", + if (asprintf(&newpath, "%s%s", fakesysfscgroupdir, path + strlen(SYSFS_CGROUP_PREFIX)) < 0) { errno = ENOMEM; @@ -386,7 +386,7 @@ int __lxstat(int ver, const char *path, struct stat *sb) if (STRPREFIX(path, SYSFS_CGROUP_PREFIX)) { init_sysfs(); char *newpath; - if (asprintf(&newpath, "%s/%s", + if (asprintf(&newpath, "%s%s", fakesysfscgroupdir, path + strlen(SYSFS_CGROUP_PREFIX)) < 0) { errno = ENOMEM; @@ -417,7 +417,7 @@ int lstat(const char *path, struct stat *sb) if (STRPREFIX(path, SYSFS_CGROUP_PREFIX)) { init_sysfs(); char *newpath; - if (asprintf(&newpath, "%s/%s", + if (asprintf(&newpath, "%s%s", fakesysfscgroupdir, path + strlen(SYSFS_CGROUP_PREFIX)) < 0) { errno = ENOMEM; @@ -448,7 +448,7 @@ int __xstat(int ver, const char *path, struct stat *sb) if (STRPREFIX(path, SYSFS_CGROUP_PREFIX)) { init_sysfs(); char *newpath; - if (asprintf(&newpath, "%s/%s", + if (asprintf(&newpath, "%s%s", fakesysfscgroupdir, path + strlen(SYSFS_CGROUP_PREFIX)) < 0) { errno = ENOMEM; @@ -487,7 +487,7 @@ int stat(const char *path, struct stat *sb) } } else if (STRPREFIX(path, SYSFS_CGROUP_PREFIX)) { init_sysfs(); - if (asprintf(&newpath, "%s/%s", + if (asprintf(&newpath, "%s%s", fakesysfscgroupdir, path + strlen(SYSFS_CGROUP_PREFIX)) < 0) { errno = ENOMEM; @@ -519,7 +519,7 @@ int mkdir(const char *path, mode_t mode) if (STRPREFIX(path, SYSFS_CGROUP_PREFIX)) { init_sysfs(); char *newpath; - if (asprintf(&newpath, "%s/%s", + if (asprintf(&newpath, "%s%s", fakesysfscgroupdir, path + strlen(SYSFS_CGROUP_PREFIX)) < 0) { errno = ENOMEM; @@ -552,7 +552,7 @@ int open(const char *path, int flags, ...) if (STRPREFIX(path, SYSFS_CGROUP_PREFIX)) { init_sysfs(); - if (asprintf(&newpath, "%s/%s", + if (asprintf(&newpath, "%s%s", fakesysfscgroupdir, path + strlen(SYSFS_CGROUP_PREFIX)) < 0) { errno = ENOMEM; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Remove the trailing '/' from prefix. This change is required in order to introduce tests for unified cgroups. They are usually mounted in '/sys/fs/cgroup'.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgroupmock.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
ACK Michal

We need to create the cgroup v2 sysfs the same way as we do for cgroup v1. This introduces new VIR_CGROUP_MOCK_MODE env variable which will configure which cgroup mode each test requires. There are three different modes: - legacy: only cgroup v1 is available and it's the default mode - hybrid: both cgroup v1 and cgroup v2 are available and have some controllers - unified: only cgroup v2 is available Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgroupmock.c | 153 +++++++++++++++++++++++++++++++++++++----- 1 file changed, 138 insertions(+), 15 deletions(-) diff --git a/tests/vircgroupmock.c b/tests/vircgroupmock.c index 5c48ef012a..1ed00ea91c 100644 --- a/tests/vircgroupmock.c +++ b/tests/vircgroupmock.c @@ -105,7 +105,8 @@ static int make_file(const char *path, return ret; } -static int make_controller(const char *path, mode_t mode) + +static int make_controller_v1(const char *path, mode_t mode) { int ret = -1; const char *controller; @@ -231,11 +232,86 @@ static int make_controller(const char *path, mode_t mode) goto cleanup; } +# undef MAKE_FILE + ret = 0; cleanup: return ret; } + +static int make_controller_v2(const char *path, mode_t mode) +{ + if (!STRPREFIX(path, fakesysfscgroupdir)) { + errno = EINVAL; + return -1; + } + + if (real_mkdir(path, mode) < 0 && errno != EEXIST) + return -1; + +# define MAKE_FILE(name, value) \ + do { \ + if (make_file(path, name, value) < 0) \ + return -1; \ + } while (0) + + MAKE_FILE("cgroup.controllers", "cpu io memory\n"); + MAKE_FILE("cgroup.subtree_control", ""); + MAKE_FILE("cgroup.type", "domain\n"); + MAKE_FILE("cpu.max", "max 100000\n"); + MAKE_FILE("cpu.stat", + "usage_usec 0\n" + "user_usec 0\n" + "system_usec 0\n" + "nr_periods 0\n" + "nr_throttled 0\n" + "throttled_usec 0\n"); + MAKE_FILE("cpu.weight", "100\n"); + MAKE_FILE("memory.current", "1455321088\n"); + MAKE_FILE("memory.high", "max\n"); + MAKE_FILE("memory.max", "max\n"); + MAKE_FILE("memory.stat", + "anon 0\n" + "file 0\n" + "kernel_stack 0\n" + "slab 0\n" + "sock 0\n" + "shmem 0\n" + "file_mapped 0\n" + "file_dirty 0\n" + "file_writeback 0\n" + "inactive_anon 0\n" + "active_anon 0\n" + "inactive_file 0\n" + "active_file 0\n" + "unevictable 0\n" + "slab_reclaimable 0\n" + "slab_unreclaimable 0\n" + "pgfault 0\n" + "pgmajfault 0\n" + "pgrefill 0\n" + "pgscan 0\n" + "pgsteal 0\n" + "pgactivate 0\n" + "pgdeactivate 0\n" + "pglazyfree 0\n" + "pglazyfreed 0\n" + "workingset_refault 0\n" + "workingset_activate 0\n" + "workingset_nodereclaim 0\n"); + MAKE_FILE("memory.swap.current", "0\n"); + MAKE_FILE("memory.swap.max", "max\n"); + MAKE_FILE("io.stat", "8:0 rbytes=26828800 wbytes=77062144 rios=2256 wios=7849 dbytes=0 dios=0\n"); + MAKE_FILE("io.max", ""); + MAKE_FILE("io.weight", "default 100\n"); + +# undef MAKE_FILE + + return 0; +} + + static void init_syms(void) { if (real_fopen) @@ -249,16 +325,56 @@ static void init_syms(void) VIR_MOCK_REAL_INIT(open); } + +static int make_controller(const char *path, mode_t mode) +{ + const char *mock; + bool unified = false; + bool hybrid = false; + + mock = getenv("VIR_CGROUP_MOCK_MODE"); + if (mock) { + if (STREQ(mock, "unified")) + unified = true; + else if (STREQ(mock, "hybrid")) + hybrid = true; + } + + if (unified || (hybrid && strstr(path, "unified"))) { + return make_controller_v2(path, mode); + } else { + return make_controller_v1(path, mode); + } +} + + static void init_sysfs(void) { - if (fakerootdir && fakesysfscgroupdir) - return; + const char *mock; + char *newfakerootdir; + bool unified = false; + bool hybrid = false; - if (!(fakerootdir = getenv("LIBVIRT_FAKE_ROOT_DIR"))) { + if (!(newfakerootdir = getenv("LIBVIRT_FAKE_ROOT_DIR"))) { fprintf(stderr, "Missing LIBVIRT_FAKE_ROOT_DIR env variable\n"); abort(); } + if (fakerootdir && STREQ(fakerootdir, newfakerootdir)) + return; + + fakerootdir = newfakerootdir; + + mock = getenv("VIR_CGROUP_MOCK_MODE"); + if (mock) { + if (STREQ(mock, "unified")) + unified = true; + else if (STREQ(mock, "hybrid")) + hybrid = true; + } + + VIR_FREE(fakesysfscgroupdir); + if (virAsprintfQuiet(&fakesysfscgroupdir, "%s%s", fakerootdir, SYSFS_CGROUP_PREFIX) < 0) abort(); @@ -281,18 +397,25 @@ static void init_sysfs(void) free(path); \ } while (0) - MAKE_CONTROLLER("cpu"); - MAKE_CONTROLLER("cpuacct"); - MAKE_CONTROLLER("cpu,cpuacct"); - MAKE_CONTROLLER("cpu,cpuacct/system"); - MAKE_CONTROLLER("cpuset"); - MAKE_CONTROLLER("blkio"); - MAKE_CONTROLLER("memory"); - MAKE_CONTROLLER("freezer"); + if (unified) { + MAKE_CONTROLLER(""); + } else if (hybrid) { + MAKE_CONTROLLER("unified"); + MAKE_CONTROLLER("cpuset"); + MAKE_CONTROLLER("freezer"); + } else { + MAKE_CONTROLLER("cpu"); + MAKE_CONTROLLER("cpuacct"); + MAKE_CONTROLLER("cpu,cpuacct"); + MAKE_CONTROLLER("cpuset"); + MAKE_CONTROLLER("blkio"); + MAKE_CONTROLLER("memory"); + MAKE_CONTROLLER("freezer"); - if (make_file(fakesysfscgroupdir, - SYSFS_CPU_PRESENT_MOCKED, "8-23,48-159\n") < 0) - abort(); + if (make_file(fakesysfscgroupdir, + SYSFS_CPU_PRESENT_MOCKED, "8-23,48-159\n") < 0) + abort(); + } } -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
We need to create the cgroup v2 sysfs the same way as we do for cgroup v1.
This introduces new VIR_CGROUP_MOCK_MODE env variable which will configure which cgroup mode each test requires. There are three different modes:
- legacy: only cgroup v1 is available and it's the default mode - hybrid: both cgroup v1 and cgroup v2 are available and have some controllers - unified: only cgroup v2 is available
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgroupmock.c | 153 +++++++++++++++++++++++++++++++++++++----- 1 file changed, 138 insertions(+), 15 deletions(-)
diff --git a/tests/vircgroupmock.c b/tests/vircgroupmock.c index 5c48ef012a..1ed00ea91c 100644 --- a/tests/vircgroupmock.c +++ b/tests/vircgroupmock.c @@ -105,7 +105,8 @@ static int make_file(const char *path, return ret; }
-static int make_controller(const char *path, mode_t mode) + +static int make_controller_v1(const char *path, mode_t mode) { int ret = -1; const char *controller; @@ -231,11 +232,86 @@ static int make_controller(const char *path, mode_t mode) goto cleanup; }
+# undef MAKE_FILE + ret = 0; cleanup: return ret; }
+ +static int make_controller_v2(const char *path, mode_t mode) +{ + if (!STRPREFIX(path, fakesysfscgroupdir)) { + errno = EINVAL; + return -1; + } + + if (real_mkdir(path, mode) < 0 && errno != EEXIST) + return -1; + +# define MAKE_FILE(name, value) \ + do { \ + if (make_file(path, name, value) < 0) \ + return -1; \ + } while (0) + + MAKE_FILE("cgroup.controllers", "cpu io memory\n"); + MAKE_FILE("cgroup.subtree_control", ""); + MAKE_FILE("cgroup.type", "domain\n"); + MAKE_FILE("cpu.max", "max 100000\n"); + MAKE_FILE("cpu.stat", + "usage_usec 0\n" + "user_usec 0\n" + "system_usec 0\n" + "nr_periods 0\n" + "nr_throttled 0\n" + "throttled_usec 0\n"); + MAKE_FILE("cpu.weight", "100\n"); + MAKE_FILE("memory.current", "1455321088\n"); + MAKE_FILE("memory.high", "max\n"); + MAKE_FILE("memory.max", "max\n"); + MAKE_FILE("memory.stat", + "anon 0\n" + "file 0\n" + "kernel_stack 0\n" + "slab 0\n" + "sock 0\n" + "shmem 0\n" + "file_mapped 0\n" + "file_dirty 0\n" + "file_writeback 0\n" + "inactive_anon 0\n" + "active_anon 0\n" + "inactive_file 0\n" + "active_file 0\n" + "unevictable 0\n" + "slab_reclaimable 0\n" + "slab_unreclaimable 0\n" + "pgfault 0\n" + "pgmajfault 0\n" + "pgrefill 0\n" + "pgscan 0\n" + "pgsteal 0\n" + "pgactivate 0\n" + "pgdeactivate 0\n" + "pglazyfree 0\n" + "pglazyfreed 0\n" + "workingset_refault 0\n" + "workingset_activate 0\n" + "workingset_nodereclaim 0\n"); + MAKE_FILE("memory.swap.current", "0\n"); + MAKE_FILE("memory.swap.max", "max\n"); + MAKE_FILE("io.stat", "8:0 rbytes=26828800 wbytes=77062144 rios=2256 wios=7849 dbytes=0 dios=0\n"); + MAKE_FILE("io.max", ""); + MAKE_FILE("io.weight", "default 100\n"); + +# undef MAKE_FILE + + return 0; +} + + static void init_syms(void) { if (real_fopen) @@ -249,16 +325,56 @@ static void init_syms(void) VIR_MOCK_REAL_INIT(open); }
+ +static int make_controller(const char *path, mode_t mode) +{ + const char *mock; + bool unified = false; + bool hybrid = false; + + mock = getenv("VIR_CGROUP_MOCK_MODE"); + if (mock) { + if (STREQ(mock, "unified")) + unified = true; + else if (STREQ(mock, "hybrid")) + hybrid = true; + } + + if (unified || (hybrid && strstr(path, "unified"))) { + return make_controller_v2(path, mode); + } else { + return make_controller_v1(path, mode); + } +} + + static void init_sysfs(void) { - if (fakerootdir && fakesysfscgroupdir) - return; + const char *mock; + char *newfakerootdir; + bool unified = false; + bool hybrid = false;
- if (!(fakerootdir = getenv("LIBVIRT_FAKE_ROOT_DIR"))) { + if (!(newfakerootdir = getenv("LIBVIRT_FAKE_ROOT_DIR"))) { fprintf(stderr, "Missing LIBVIRT_FAKE_ROOT_DIR env variable\n"); abort(); }
+ if (fakerootdir && STREQ(fakerootdir, newfakerootdir))
or simply STREQ_NULLABLE()
+ return; + + fakerootdir = newfakerootdir; + + mock = getenv("VIR_CGROUP_MOCK_MODE"); + if (mock) { + if (STREQ(mock, "unified")) + unified = true; + else if (STREQ(mock, "hybrid")) + hybrid = true;
abort(unknown mode $mock);
+ } + + VIR_FREE(fakesysfscgroupdir); + if (virAsprintfQuiet(&fakesysfscgroupdir, "%s%s", fakerootdir, SYSFS_CGROUP_PREFIX) < 0) abort(); @@ -281,18 +397,25 @@ static void init_sysfs(void) free(path); \ } while (0)
- MAKE_CONTROLLER("cpu"); - MAKE_CONTROLLER("cpuacct"); - MAKE_CONTROLLER("cpu,cpuacct"); - MAKE_CONTROLLER("cpu,cpuacct/system"); - MAKE_CONTROLLER("cpuset"); - MAKE_CONTROLLER("blkio"); - MAKE_CONTROLLER("memory"); - MAKE_CONTROLLER("freezer"); + if (unified) { + MAKE_CONTROLLER(""); + } else if (hybrid) { + MAKE_CONTROLLER("unified"); + MAKE_CONTROLLER("cpuset"); + MAKE_CONTROLLER("freezer"); + } else { + MAKE_CONTROLLER("cpu"); + MAKE_CONTROLLER("cpuacct"); + MAKE_CONTROLLER("cpu,cpuacct"); + MAKE_CONTROLLER("cpuset"); + MAKE_CONTROLLER("blkio"); + MAKE_CONTROLLER("memory"); + MAKE_CONTROLLER("freezer");
- if (make_file(fakesysfscgroupdir, - SYSFS_CPU_PRESENT_MOCKED, "8-23,48-159\n") < 0) - abort(); + if (make_file(fakesysfscgroupdir, + SYSFS_CPU_PRESENT_MOCKED, "8-23,48-159\n") < 0) + abort(); + } }
ACK Michal

We need to configure multiple env variables for each set of tests so create helper functions to do that. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgrouptest.c | 48 ++++++++++++++++++++++++++++++++----------- 1 file changed, 36 insertions(+), 12 deletions(-) diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c index 4f62014ead..3e8793a6c3 100644 --- a/tests/vircgrouptest.c +++ b/tests/vircgrouptest.c @@ -830,10 +830,10 @@ static int testCgroupGetBlkioIoDeviceServiced(const void *args ATTRIBUTE_UNUSED) # define FAKEROOTDIRTEMPLATE abs_builddir "/fakerootdir-XXXXXX" -static int -mymain(void) +static char * +initFakeFS(const char *mode, + const char *filename) { - int ret = 0; char *fakerootdir; if (VIR_STRDUP_QUIET(fakerootdir, FAKEROOTDIRTEMPLATE) < 0) { @@ -848,6 +848,33 @@ mymain(void) setenv("LIBVIRT_FAKE_ROOT_DIR", fakerootdir, 1); + if (mode) + setenv("VIR_CGROUP_MOCK_MODE", mode, 1); + + if (filename) + setenv("VIR_CGROUP_MOCK_FILENAME", filename, 1); + + return fakerootdir; +} + +static void +cleanupFakeFS(char *fakerootdir) +{ + if (getenv("LIBVIRT_SKIP_CLEANUP") == NULL) + virFileDeleteTree(fakerootdir); + + VIR_FREE(fakerootdir); + unsetenv("LIBVIRT_FAKE_ROOT_DIR"); + unsetenv("VIR_CGROUP_MOCK_MODE"); + unsetenv("VIR_CGROUP_MOCK_FILENAME"); +} + +static int +mymain(void) +{ + int ret = 0; + char *fakerootdir; + # define DETECT_MOUNTS_FULL(file, fail) \ do { \ struct _detectMountsData data = { file, fail }; \ @@ -871,7 +898,7 @@ mymain(void) DETECT_MOUNTS_FAIL("no-cgroups"); DETECT_MOUNTS("kubevirt"); - setenv("VIR_CGROUP_MOCK_FILENAME", "systemd", 1); + fakerootdir = initFakeFS(NULL, "systemd"); if (virTestRun("New cgroup for self", testCgroupNewForSelf, NULL) < 0) ret = -1; @@ -907,26 +934,23 @@ mymain(void) if (virTestRun("virCgroupGetPercpuStats works", testCgroupGetPercpuStats, NULL) < 0) ret = -1; - unsetenv("VIR_CGROUP_MOCK_FILENAME"); + cleanupFakeFS(fakerootdir); - setenv("VIR_CGROUP_MOCK_FILENAME", "all-in-one", 1); + fakerootdir = initFakeFS(NULL, "all-in-one"); if (virTestRun("New cgroup for self (allinone)", testCgroupNewForSelfAllInOne, NULL) < 0) ret = -1; if (virTestRun("Cgroup available", testCgroupAvailable, (void*)0x1) < 0) ret = -1; - unsetenv("VIR_CGROUP_MOCK_FILENAME"); + cleanupFakeFS(fakerootdir); - setenv("VIR_CGROUP_MOCK_FILENAME", "logind", 1); + fakerootdir = initFakeFS(NULL, "logind"); if (virTestRun("New cgroup for self (logind)", testCgroupNewForSelfLogind, NULL) < 0) ret = -1; if (virTestRun("Cgroup available", testCgroupAvailable, (void*)0x0) < 0) ret = -1; - unsetenv("VIR_CGROUP_MOCK_FILENAME"); + cleanupFakeFS(fakerootdir); - if (getenv("LIBVIRT_SKIP_CLEANUP") == NULL) - virFileDeleteTree(fakerootdir); - VIR_FREE(fakerootdir); return ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE; } -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
We need to configure multiple env variables for each set of tests so create helper functions to do that.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgrouptest.c | 48 ++++++++++++++++++++++++++++++++----------- 1 file changed, 36 insertions(+), 12 deletions(-)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgroupdata/all-in-one.parsed | 1 + tests/vircgroupdata/cgroups1.parsed | 1 + tests/vircgroupdata/cgroups2.parsed | 1 + tests/vircgroupdata/cgroups3.parsed | 1 + tests/vircgroupdata/fedora-18.parsed | 1 + tests/vircgroupdata/fedora-21.parsed | 1 + tests/vircgroupdata/kubevirt.parsed | 1 + tests/vircgroupdata/ovirt-node-6.6.parsed | 1 + tests/vircgroupdata/ovirt-node-7.1.parsed | 1 + tests/vircgroupdata/rhel-7.1.parsed | 1 + tests/vircgrouptest.c | 2 ++ 11 files changed, 12 insertions(+) diff --git a/tests/vircgroupdata/all-in-one.parsed b/tests/vircgroupdata/all-in-one.parsed index d703d08fb9..97c96e8ecd 100644 --- a/tests/vircgroupdata/all-in-one.parsed +++ b/tests/vircgroupdata/all-in-one.parsed @@ -8,3 +8,4 @@ blkio /not/really/sys/fs/cgroup net_cls <null> perf_event <null> name=systemd <null> +unified <null> diff --git a/tests/vircgroupdata/cgroups1.parsed b/tests/vircgroupdata/cgroups1.parsed index b6916f17a1..16431aa006 100644 --- a/tests/vircgroupdata/cgroups1.parsed +++ b/tests/vircgroupdata/cgroups1.parsed @@ -8,3 +8,4 @@ blkio /sys/fs/cgroup/blkio net_cls /sys/fs/cgroup/net_cls perf_event <null> name=systemd <null> +unified <null> diff --git a/tests/vircgroupdata/cgroups2.parsed b/tests/vircgroupdata/cgroups2.parsed index 5eb2bc7bb2..d1bb0dcb7c 100644 --- a/tests/vircgroupdata/cgroups2.parsed +++ b/tests/vircgroupdata/cgroups2.parsed @@ -8,3 +8,4 @@ blkio /sys/fs/cgroup/blkio net_cls <null> perf_event /sys/fs/cgroup/perf_event name=systemd <null> +unified <null> diff --git a/tests/vircgroupdata/cgroups3.parsed b/tests/vircgroupdata/cgroups3.parsed index 2b1f3825c1..44e475c9d2 100644 --- a/tests/vircgroupdata/cgroups3.parsed +++ b/tests/vircgroupdata/cgroups3.parsed @@ -8,3 +8,4 @@ blkio /sys/fs/cgroup/blkio net_cls /sys/fs/cgroup/net_cls perf_event /sys/fs/cgroup/perf_event name=systemd <null> +unified <null> diff --git a/tests/vircgroupdata/fedora-18.parsed b/tests/vircgroupdata/fedora-18.parsed index 8d5ba75c7e..662a38a9e8 100644 --- a/tests/vircgroupdata/fedora-18.parsed +++ b/tests/vircgroupdata/fedora-18.parsed @@ -8,3 +8,4 @@ blkio /sys/fs/cgroup/blkio net_cls /sys/fs/cgroup/net_cls perf_event /sys/fs/cgroup/perf_event name=systemd /sys/fs/cgroup/systemd +unified <null> diff --git a/tests/vircgroupdata/fedora-21.parsed b/tests/vircgroupdata/fedora-21.parsed index 3377af0382..4e447fd7bd 100644 --- a/tests/vircgroupdata/fedora-21.parsed +++ b/tests/vircgroupdata/fedora-21.parsed @@ -8,3 +8,4 @@ blkio /sys/fs/cgroup/blkio net_cls /sys/fs/cgroup/net_cls,net_prio perf_event /sys/fs/cgroup/perf_event name=systemd /sys/fs/cgroup/systemd +unified <null> diff --git a/tests/vircgroupdata/kubevirt.parsed b/tests/vircgroupdata/kubevirt.parsed index 6948707238..bf977f8363 100644 --- a/tests/vircgroupdata/kubevirt.parsed +++ b/tests/vircgroupdata/kubevirt.parsed @@ -8,3 +8,4 @@ blkio /sys/fs/cgroup/blkio net_cls <null> perf_event /sys/fs/cgroup/perf_event name=systemd <null> +unified <null> diff --git a/tests/vircgroupdata/ovirt-node-6.6.parsed b/tests/vircgroupdata/ovirt-node-6.6.parsed index 01bf466be6..9d10813d8c 100644 --- a/tests/vircgroupdata/ovirt-node-6.6.parsed +++ b/tests/vircgroupdata/ovirt-node-6.6.parsed @@ -8,3 +8,4 @@ blkio /cgroup/blkio net_cls /cgroup/net_cls perf_event <null> name=systemd <null> +unified <null> diff --git a/tests/vircgroupdata/ovirt-node-7.1.parsed b/tests/vircgroupdata/ovirt-node-7.1.parsed index 8d5ba75c7e..662a38a9e8 100644 --- a/tests/vircgroupdata/ovirt-node-7.1.parsed +++ b/tests/vircgroupdata/ovirt-node-7.1.parsed @@ -8,3 +8,4 @@ blkio /sys/fs/cgroup/blkio net_cls /sys/fs/cgroup/net_cls perf_event /sys/fs/cgroup/perf_event name=systemd /sys/fs/cgroup/systemd +unified <null> diff --git a/tests/vircgroupdata/rhel-7.1.parsed b/tests/vircgroupdata/rhel-7.1.parsed index 8d5ba75c7e..662a38a9e8 100644 --- a/tests/vircgroupdata/rhel-7.1.parsed +++ b/tests/vircgroupdata/rhel-7.1.parsed @@ -8,3 +8,4 @@ blkio /sys/fs/cgroup/blkio net_cls /sys/fs/cgroup/net_cls perf_event /sys/fs/cgroup/perf_event name=systemd /sys/fs/cgroup/systemd +unified <null> diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c index 3e8793a6c3..75187216d9 100644 --- a/tests/vircgrouptest.c +++ b/tests/vircgrouptest.c @@ -175,6 +175,8 @@ testCgroupDetectMounts(const void *args) virCgroupControllerTypeToString(i), NULLSTR(group->legacy[i].mountPoint)); } + virBufferAsprintf(&buf, "%-12s %s\n", + "unified", NULLSTR(group->unified.mountPoint)); if (virBufferCheckError(&buf) < 0) goto cleanup; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgroupdata/all-in-one.parsed | 1 + tests/vircgroupdata/cgroups1.parsed | 1 + tests/vircgroupdata/cgroups2.parsed | 1 + tests/vircgroupdata/cgroups3.parsed | 1 + tests/vircgroupdata/fedora-18.parsed | 1 + tests/vircgroupdata/fedora-21.parsed | 1 + tests/vircgroupdata/kubevirt.parsed | 1 + tests/vircgroupdata/ovirt-node-6.6.parsed | 1 + tests/vircgroupdata/ovirt-node-7.1.parsed | 1 + tests/vircgroupdata/rhel-7.1.parsed | 1 + tests/vircgrouptest.c | 2 ++ 11 files changed, 12 insertions(+)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgroupdata/unified.cgroups | 13 +++++++++++++ tests/vircgroupdata/unified.mounts | 20 ++++++++++++++++++++ tests/vircgroupdata/unified.parsed | 11 +++++++++++ tests/vircgroupdata/unified.self.cgroup | 1 + tests/vircgrouptest.c | 3 +++ 5 files changed, 48 insertions(+) create mode 100644 tests/vircgroupdata/unified.cgroups create mode 100644 tests/vircgroupdata/unified.mounts create mode 100644 tests/vircgroupdata/unified.parsed create mode 100644 tests/vircgroupdata/unified.self.cgroup diff --git a/tests/vircgroupdata/unified.cgroups b/tests/vircgroupdata/unified.cgroups new file mode 100644 index 0000000000..e0d8a3561c --- /dev/null +++ b/tests/vircgroupdata/unified.cgroups @@ -0,0 +1,13 @@ +#subsys_name hierarchy num_cgroups enabled +cpuset 0 1 1 +cpu 0 1 1 +cpuacct 0 1 1 +blkio 0 1 1 +memory 0 1 1 +devices 0 1 1 +freezer 0 1 1 +net_cls 0 1 1 +perf_event 0 1 1 +net_prio 0 1 1 +hugetlb 0 1 1 +pids 0 1 1 diff --git a/tests/vircgroupdata/unified.mounts b/tests/vircgroupdata/unified.mounts new file mode 100644 index 0000000000..b4ab94a2c3 --- /dev/null +++ b/tests/vircgroupdata/unified.mounts @@ -0,0 +1,20 @@ +sysfs /sys sysfs rw,seclabel,nosuid,nodev,noexec,relatime 0 0 +proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0 +devtmpfs /dev devtmpfs rw,seclabel,nosuid,size=1009844k,nr_inodes=252461,mode=755 0 0 +securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0 +tmpfs /dev/shm tmpfs rw,seclabel,nosuid,nodev 0 0 +devpts /dev/pts devpts rw,seclabel,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0 +tmpfs /run tmpfs rw,seclabel,nosuid,nodev,mode=755 0 0 +cgroup2 /not/really/sys/fs/cgroup cgroup2 rw,seclabel,nosuid,nodev,noexec,relatime,nsdelegate 0 0 +pstore /sys/fs/pstore pstore rw,seclabel,nosuid,nodev,noexec,relatime 0 0 +bpf /sys/fs/bpf bpf rw,nosuid,nodev,noexec,relatime,mode=700 0 0 +configfs /sys/kernel/config configfs rw,relatime 0 0 +/dev/vda2 / ext4 rw,seclabel,relatime 0 0 +selinuxfs /sys/fs/selinux selinuxfs rw,relatime 0 0 +debugfs /sys/kernel/debug debugfs rw,seclabel,relatime 0 0 +hugetlbfs /dev/hugepages hugetlbfs rw,seclabel,relatime,pagesize=2M 0 0 +systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=40,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=14586 0 0 +mqueue /dev/mqueue mqueue rw,seclabel,relatime 0 0 +tmpfs /tmp tmpfs rw,seclabel,nosuid,nodev 0 0 +sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0 +tmpfs /run/user/0 tmpfs rw,seclabel,nosuid,nodev,relatime,size=204000k,mode=700 0 0 diff --git a/tests/vircgroupdata/unified.parsed b/tests/vircgroupdata/unified.parsed new file mode 100644 index 0000000000..3de0fc643d --- /dev/null +++ b/tests/vircgroupdata/unified.parsed @@ -0,0 +1,11 @@ +cpu <null> +cpuacct <null> +cpuset <null> +memory <null> +devices <null> +freezer <null> +blkio <null> +net_cls <null> +perf_event <null> +name=systemd <null> +unified /not/really/sys/fs/cgroup diff --git a/tests/vircgroupdata/unified.self.cgroup b/tests/vircgroupdata/unified.self.cgroup new file mode 100644 index 0000000000..1e027b2a3c --- /dev/null +++ b/tests/vircgroupdata/unified.self.cgroup @@ -0,0 +1 @@ +0::/ diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c index 75187216d9..800522e311 100644 --- a/tests/vircgrouptest.c +++ b/tests/vircgrouptest.c @@ -899,6 +899,9 @@ mymain(void) DETECT_MOUNTS("all-in-one"); DETECT_MOUNTS_FAIL("no-cgroups"); DETECT_MOUNTS("kubevirt"); + fakerootdir = initFakeFS("unified", NULL); + DETECT_MOUNTS("unified"); + cleanupFakeFS(fakerootdir); fakerootdir = initFakeFS(NULL, "systemd"); if (virTestRun("New cgroup for self", testCgroupNewForSelf, NULL) < 0) -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgroupdata/unified.cgroups | 13 +++++++++++++ tests/vircgroupdata/unified.mounts | 20 ++++++++++++++++++++ tests/vircgroupdata/unified.parsed | 11 +++++++++++ tests/vircgroupdata/unified.self.cgroup | 1 + tests/vircgrouptest.c | 3 +++ 5 files changed, 48 insertions(+) create mode 100644 tests/vircgroupdata/unified.cgroups create mode 100644 tests/vircgroupdata/unified.mounts create mode 100644 tests/vircgroupdata/unified.parsed create mode 100644 tests/vircgroupdata/unified.self.cgroup
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgroupdata/hybrid.cgroups | 12 ++++++++++++ tests/vircgroupdata/hybrid.mounts | 23 +++++++++++++++++++++++ tests/vircgroupdata/hybrid.parsed | 11 +++++++++++ tests/vircgroupdata/hybrid.self.cgroup | 9 +++++++++ tests/vircgrouptest.c | 3 +++ 5 files changed, 58 insertions(+) create mode 100644 tests/vircgroupdata/hybrid.cgroups create mode 100644 tests/vircgroupdata/hybrid.mounts create mode 100644 tests/vircgroupdata/hybrid.parsed create mode 100644 tests/vircgroupdata/hybrid.self.cgroup diff --git a/tests/vircgroupdata/hybrid.cgroups b/tests/vircgroupdata/hybrid.cgroups new file mode 100644 index 0000000000..7f3bc7b8cb --- /dev/null +++ b/tests/vircgroupdata/hybrid.cgroups @@ -0,0 +1,12 @@ +#subsys_name hierarchy num_cgroups enabled +cpuset 2 1 1 +cpu 0 1 1 +cpuacct 0 1 1 +blkio 0 1 1 +memory 0 1 1 +devices 3 1 1 +freezer 5 1 1 +net_cls 4 1 1 +perf_event 6 1 1 +hugetlb 7 1 1 +pids 8 1 1 diff --git a/tests/vircgroupdata/hybrid.mounts b/tests/vircgroupdata/hybrid.mounts new file mode 100644 index 0000000000..d6f5f82115 --- /dev/null +++ b/tests/vircgroupdata/hybrid.mounts @@ -0,0 +1,23 @@ +proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0 +sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0 +devtmpfs /dev devtmpfs rw,nosuid,size=10240k,nr_inodes=502705,mode=755 0 0 +devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0 +tmpfs /dev/shm tmpfs rw,nosuid,nodev,noexec 0 0 +tmpfs /run tmpfs rw,nosuid,nodev,noexec,mode=755 0 0 +mqueue /dev/mqueue mqueue rw,nosuid,nodev,noexec,relatime 0 0 +securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0 +debugfs /sys/kernel/debug debugfs rw,nosuid,nodev,noexec,relatime 0 0 +configfs /sys/kernel/config configfs rw,nosuid,nodev,noexec,relatime 0 0 +fusectl /sys/fs/fuse/connections fusectl rw,nosuid,nodev,noexec,relatime 0 0 +selinuxfs /sys/fs/selinux selinuxfs rw,relatime 0 0 +pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0 +cgroup_root /not/really/sys/fs/cgroup tmpfs rw,nosuid,nodev,noexec,relatime,size=10240k,mode=755 0 0 +openrc /not/really/sys/fs/cgroup/openrc cgroup rw,nosuid,nodev,noexec,relatime,release_agent=/lib64/rc/sh/cgroup-release-agent.sh,name=openrc 0 0 +none /not/really/sys/fs/cgroup/unified cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0 +cpuset /not/really/sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0 +devices /not/really/sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0 +net_cls /not/really/sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0 +freezer /not/really/sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0 +perf_event /not/really/sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0 +hugetlb /not/really/sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0 +pids /not/really/sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0 diff --git a/tests/vircgroupdata/hybrid.parsed b/tests/vircgroupdata/hybrid.parsed new file mode 100644 index 0000000000..7600de5f45 --- /dev/null +++ b/tests/vircgroupdata/hybrid.parsed @@ -0,0 +1,11 @@ +cpu <null> +cpuacct <null> +cpuset /not/really/sys/fs/cgroup/cpuset +memory <null> +devices /not/really/sys/fs/cgroup/devices +freezer /not/really/sys/fs/cgroup/freezer +blkio <null> +net_cls /not/really/sys/fs/cgroup/net_cls +perf_event /not/really/sys/fs/cgroup/perf_event +name=systemd <null> +unified /not/really/sys/fs/cgroup/unified diff --git a/tests/vircgroupdata/hybrid.self.cgroup b/tests/vircgroupdata/hybrid.self.cgroup new file mode 100644 index 0000000000..2a08905c91 --- /dev/null +++ b/tests/vircgroupdata/hybrid.self.cgroup @@ -0,0 +1,9 @@ +8:pids:/ +7:hugetlb:/ +6:perf_event:/ +5:freezer:/ +4:net_cls:/ +3:devices:/ +2:cpuset:/ +1:name=openrc:/ +0::/ diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c index 800522e311..e766ad7cbe 100644 --- a/tests/vircgrouptest.c +++ b/tests/vircgrouptest.c @@ -902,6 +902,9 @@ mymain(void) fakerootdir = initFakeFS("unified", NULL); DETECT_MOUNTS("unified"); cleanupFakeFS(fakerootdir); + fakerootdir = initFakeFS("hybrid", NULL); + DETECT_MOUNTS("hybrid"); + cleanupFakeFS(fakerootdir); fakerootdir = initFakeFS(NULL, "systemd"); if (virTestRun("New cgroup for self", testCgroupNewForSelf, NULL) < 0) -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgroupdata/hybrid.cgroups | 12 ++++++++++++ tests/vircgroupdata/hybrid.mounts | 23 +++++++++++++++++++++++ tests/vircgroupdata/hybrid.parsed | 11 +++++++++++ tests/vircgroupdata/hybrid.self.cgroup | 9 +++++++++ tests/vircgrouptest.c | 3 +++ 5 files changed, 58 insertions(+) create mode 100644 tests/vircgroupdata/hybrid.cgroups create mode 100644 tests/vircgroupdata/hybrid.mounts create mode 100644 tests/vircgroupdata/hybrid.parsed create mode 100644 tests/vircgroupdata/hybrid.self.cgroup
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgrouptest.c | 53 +++++++++++++++++++++++++++++++++++-------- 1 file changed, 44 insertions(+), 9 deletions(-) diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c index e766ad7cbe..ead5046600 100644 --- a/tests/vircgrouptest.c +++ b/tests/vircgrouptest.c @@ -43,7 +43,10 @@ static int validateCgroup(virCgroupPtr cgroup, const char *expectPath, const char **expectMountPoint, const char **expectLinkPoint, - const char **expectPlacement) + const char **expectPlacement, + const char *expectUnifiedMountPoint, + const char *expectUnifiedPlacement, + unsigned int expectUnifiedControllers) { size_t i; @@ -80,6 +83,38 @@ static int validateCgroup(virCgroupPtr cgroup, } } + if (STRNEQ_NULLABLE(expectUnifiedMountPoint, + cgroup->unified.mountPoint)) { + fprintf(stderr, "Wrong mount '%s', expected '%s' for 'unified'\n", + cgroup->unified.mountPoint, + expectUnifiedMountPoint); + return -1; + } + if (STRNEQ_NULLABLE(expectUnifiedPlacement, + cgroup->unified.placement)) { + fprintf(stderr, "Wrong placement '%s', expected '%s' for 'unified'\n", + cgroup->unified.placement, + expectUnifiedPlacement); + return -1; + } + if (expectUnifiedControllers != cgroup->unified.controllers) { + for (i = 0; i < VIR_CGROUP_CONTROLLER_LAST; i++) { + int type = 1 << i; + if ((expectUnifiedControllers & type) != (cgroup->unified.controllers & type)) { + const char *typeStr = virCgroupControllerTypeToString(i); + if (expectUnifiedControllers & type) { + fprintf(stderr, "expected controller '%s' for 'unified', " + "but it's missing\n", typeStr); + } else { + fprintf(stderr, "existing controller '%s' for 'unified', " + "but it's not expected\n", typeStr); + } + } + + } + return -1; + } + return 0; } @@ -215,7 +250,7 @@ static int testCgroupNewForSelf(const void *args ATTRIBUTE_UNUSED) goto cleanup; } - ret = validateCgroup(cgroup, "", mountsFull, links, placement); + ret = validateCgroup(cgroup, "", mountsFull, links, placement, NULL, NULL, 0); cleanup: virCgroupFree(&cgroup); @@ -294,14 +329,14 @@ static int testCgroupNewForPartition(const void *args ATTRIBUTE_UNUSED) fprintf(stderr, "Cannot create /virtualmachines cgroup: %d\n", -rv); goto cleanup; } - ret = validateCgroup(cgroup, "/virtualmachines.partition", mountsSmall, links, placementSmall); + ret = validateCgroup(cgroup, "/virtualmachines.partition", mountsSmall, links, placementSmall, NULL, NULL, 0); virCgroupFree(&cgroup); if ((rv = virCgroupNewPartition("/virtualmachines", true, -1, &cgroup)) != 0) { fprintf(stderr, "Cannot create /virtualmachines cgroup: %d\n", -rv); goto cleanup; } - ret = validateCgroup(cgroup, "/virtualmachines.partition", mountsFull, links, placementFull); + ret = validateCgroup(cgroup, "/virtualmachines.partition", mountsFull, links, placementFull, NULL, NULL, 0); cleanup: virCgroupFree(&cgroup); @@ -351,7 +386,7 @@ static int testCgroupNewForPartitionNested(const void *args ATTRIBUTE_UNUSED) } ret = validateCgroup(cgroup, "/deployment.partition/production.partition", - mountsFull, links, placementFull); + mountsFull, links, placementFull, NULL, NULL, 0); cleanup: virCgroupFree(&cgroup); @@ -407,7 +442,7 @@ static int testCgroupNewForPartitionNestedDeep(const void *args ATTRIBUTE_UNUSED } ret = validateCgroup(cgroup, "/user/berrange.user/production.partition", - mountsFull, links, placementFull); + mountsFull, links, placementFull, NULL, NULL, 0); cleanup: virCgroupFree(&cgroup); @@ -443,7 +478,7 @@ static int testCgroupNewForPartitionDomain(const void *args ATTRIBUTE_UNUSED) goto cleanup; } - ret = validateCgroup(domaincgroup, "/production.partition/foo.libvirt-lxc", mountsFull, links, placement); + ret = validateCgroup(domaincgroup, "/production.partition/foo.libvirt-lxc", mountsFull, links, placement, NULL, NULL, 0); cleanup: virCgroupFree(&partitioncgroup); @@ -494,7 +529,7 @@ static int testCgroupNewForPartitionDomainEscaped(const void *args ATTRIBUTE_UNU * since our fake /proc/cgroups pretends this controller * isn't compiled into the kernel */ - ret = validateCgroup(domaincgroup, "/_cgroup.evil/net_cls.evil/__evil.evil/_cpu.foo.libvirt-lxc", mountsFull, links, placement); + ret = validateCgroup(domaincgroup, "/_cgroup.evil/net_cls.evil/__evil.evil/_cpu.foo.libvirt-lxc", mountsFull, links, placement, NULL, NULL, 0); cleanup: virCgroupFree(&partitioncgroup3); @@ -523,7 +558,7 @@ static int testCgroupNewForSelfAllInOne(const void *args ATTRIBUTE_UNUSED) goto cleanup; } - ret = validateCgroup(cgroup, "", mountsAllInOne, linksAllInOne, placement); + ret = validateCgroup(cgroup, "", mountsAllInOne, linksAllInOne, placement, NULL, NULL, 0); cleanup: virCgroupFree(&cgroup); -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgrouptest.c | 53 +++++++++++++++++++++++++++++++++++-------- 1 file changed, 44 insertions(+), 9 deletions(-)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgrouptest.c | 34 +++++++++++++++++++++++++++++++--- 1 file changed, 31 insertions(+), 3 deletions(-) diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c index ead5046600..fe0cb9585a 100644 --- a/tests/vircgrouptest.c +++ b/tests/vircgrouptest.c @@ -567,16 +567,37 @@ static int testCgroupNewForSelfAllInOne(const void *args ATTRIBUTE_UNUSED) static int testCgroupNewForSelfLogind(const void *args ATTRIBUTE_UNUSED) +{ + virCgroupPtr cgroup = NULL; + + if (virCgroupNewSelf(&cgroup) >= 0) { + fprintf(stderr, "Expected to fail, only systemd cgroup available.\n"); + virCgroupFree(&cgroup); + return -1; + } + + return 0; +} + + +static int testCgroupNewForSelfUnified(const void *args ATTRIBUTE_UNUSED) { virCgroupPtr cgroup = NULL; int ret = -1; + const char *empty[VIR_CGROUP_CONTROLLER_LAST] = { 0 }; + unsigned int controllers = + (1 << VIR_CGROUP_CONTROLLER_CPU) | + (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | + (1 << VIR_CGROUP_CONTROLLER_MEMORY) | + (1 << VIR_CGROUP_CONTROLLER_BLKIO); - if (virCgroupNewSelf(&cgroup) == 0) { - fprintf(stderr, "Expected cgroup creation to fail.\n"); + if (virCgroupNewSelf(&cgroup) < 0) { + fprintf(stderr, "Cannot create cgroup for self\n"); goto cleanup; } - ret = 0; + ret = validateCgroup(cgroup, "", empty, empty, empty, + "/not/really/sys/fs/cgroup", "/", controllers); cleanup: virCgroupFree(&cgroup); return ret; @@ -993,7 +1014,14 @@ mymain(void) ret = -1; cleanupFakeFS(fakerootdir); + /* cgroup unified */ + fakerootdir = initFakeFS("unified", "unified"); + if (virTestRun("New cgroup for self (unified)", testCgroupNewForSelfUnified, NULL) < 0) + ret = -1; + if (virTestRun("Cgroup available (unified)", testCgroupAvailable, (void*)0x1) < 0) + ret = -1; + cleanupFakeFS(fakerootdir); return ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE; } -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgrouptest.c | 34 +++++++++++++++++++++++++++++++--- 1 file changed, 31 insertions(+), 3 deletions(-)
ACK Michal

Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgrouptest.c | 48 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+) diff --git a/tests/vircgrouptest.c b/tests/vircgrouptest.c index fe0cb9585a..8fcee21bb2 100644 --- a/tests/vircgrouptest.c +++ b/tests/vircgrouptest.c @@ -604,6 +604,45 @@ static int testCgroupNewForSelfUnified(const void *args ATTRIBUTE_UNUSED) } +static int testCgroupNewForSelfHybrid(const void *args ATTRIBUTE_UNUSED) +{ + virCgroupPtr cgroup = NULL; + int ret = -1; + const char *empty[VIR_CGROUP_CONTROLLER_LAST] = { 0 }; + const char *mounts[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPUSET] = "/not/really/sys/fs/cgroup/cpuset", + [VIR_CGROUP_CONTROLLER_DEVICES] = "/not/really/sys/fs/cgroup/devices", + [VIR_CGROUP_CONTROLLER_FREEZER] = "/not/really/sys/fs/cgroup/freezer", + [VIR_CGROUP_CONTROLLER_NET_CLS] = "/not/really/sys/fs/cgroup/net_cls", + [VIR_CGROUP_CONTROLLER_PERF_EVENT] = "/not/really/sys/fs/cgroup/perf_event", + }; + const char *placement[VIR_CGROUP_CONTROLLER_LAST] = { + [VIR_CGROUP_CONTROLLER_CPUSET] = "/", + [VIR_CGROUP_CONTROLLER_DEVICES] = "/", + [VIR_CGROUP_CONTROLLER_FREEZER] = "/", + [VIR_CGROUP_CONTROLLER_NET_CLS] = "/", + [VIR_CGROUP_CONTROLLER_PERF_EVENT] = "/", + }; + unsigned int controllers = + (1 << VIR_CGROUP_CONTROLLER_CPU) | + (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | + (1 << VIR_CGROUP_CONTROLLER_MEMORY) | + (1 << VIR_CGROUP_CONTROLLER_BLKIO); + + if (virCgroupNewSelf(&cgroup) < 0) { + fprintf(stderr, "Cannot create cgroup for self\n"); + goto cleanup; + } + + ret = validateCgroup(cgroup, "", mounts, empty, placement, + "/not/really/sys/fs/cgroup/unified", "/", controllers); + + cleanup: + virCgroupFree(&cgroup); + return ret; +} + + static int testCgroupAvailable(const void *args) { bool got = virCgroupAvailable(); @@ -1023,6 +1062,15 @@ mymain(void) ret = -1; cleanupFakeFS(fakerootdir); + /* cgroup hybrid */ + + fakerootdir = initFakeFS("hybrid", "hybrid"); + if (virTestRun("New cgroup for self (hybrid)", testCgroupNewForSelfHybrid, NULL) < 0) + ret = -1; + if (virTestRun("Cgroup available (hybrid)", testCgroupAvailable, (void*)0x1) < 0) + ret = -1; + cleanupFakeFS(fakerootdir); + return ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE; } -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tests/vircgrouptest.c | 48 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+)
ACK Michal

This removes code duplication and simplifies cgroup detection. As a drawback we will not have separate messages to enable cgroup controller in kernel or to mount it. On the other side the rewrite adds support for cgroup v2. The kernel config support was wrong because it was parsing '/proc/self/cgroup' instead of '/proc/cgroups/' file. The mount suggestion is removed as well because it will not work with cgroup v2. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tools/virt-host-validate-common.c | 162 ++++++------------------------ tools/virt-host-validate-common.h | 7 +- tools/virt-host-validate-lxc.c | 40 ++------ tools/virt-host-validate-qemu.c | 40 ++------ 4 files changed, 54 insertions(+), 195 deletions(-) diff --git a/tools/virt-host-validate-common.c b/tools/virt-host-validate-common.c index ccbd764c84..4e70fe9e9c 100644 --- a/tools/virt-host-validate-common.c +++ b/tools/virt-host-validate-common.c @@ -24,12 +24,10 @@ #include <stdarg.h> #include <unistd.h> #include <sys/utsname.h> -#ifdef HAVE_MNTENT_H -# include <mntent.h> -#endif /* HAVE_MNTENT_H */ #include <sys/stat.h> #include "viralloc.h" +#include "vircgroup.h" #include "virfile.h" #include "virt-host-validate-common.h" #include "virstring.h" @@ -288,152 +286,50 @@ int virHostValidateLinuxKernel(const char *hvname, } } - -static int virHostValidateCGroupSupport(const char *hvname, - const char *cg_name, - virHostValidateLevel level, - const char *config_name) -{ - virHostMsgCheck(hvname, "for cgroup '%s' controller support", cg_name); - FILE *fp = fopen("/proc/self/cgroup", "r"); - size_t len = 0; - char *line = NULL; - ssize_t ret; - bool matched = false; - - if (!fp) - goto error; - - while ((ret = getline(&line, &len, fp)) >= 0 && !matched) { - char **cgroups; - char *start; - char *end; - size_t ncgroups; - size_t i; - - /* Each line in this file looks like - * - * 4:cpu,cpuacct:/machine.slice/machine-qemu\x2dtest.scope/emulator - * - * Since multiple cgroups can be part of the same line and some cgroup - * names can appear as part of other cgroup names (eg. 'cpu' is a - * prefix for both 'cpuacct' and 'cpuset'), it's not enough to simply - * check whether the cgroup name is present somewhere inside the file. - * - * Moreover, there's nothing stopping the cgroup name from appearing - * in an unrelated mount point name as well */ - - /* Look for the first colon. - * The part we're interested in starts right after it */ - if (!(start = strchr(line, ':'))) - continue; - start++; - - /* Look for the second colon. - * The part we're interested in ends exactly there */ - if (!(end = strchr(start, ':'))) - continue; - *end = '\0'; - - if (!(cgroups = virStringSplitCount(start, ",", 0, &ncgroups))) - continue; - - /* Look for the matching cgroup */ - for (i = 0; i < ncgroups; i++) { - if (STREQ(cgroups[i], cg_name)) - matched = true; - } - - virStringListFreeCount(cgroups, ncgroups); - } - - VIR_FREE(line); - VIR_FORCE_FCLOSE(fp); - if (!matched) - goto error; - - virHostMsgPass(); - return 0; - - error: - VIR_FREE(line); - virHostMsgFail(level, "Enable CONFIG_%s in kernel Kconfig file", config_name); - return -1; -} - -#ifdef HAVE_MNTENT_H -static int virHostValidateCGroupMount(const char *hvname, - const char *cg_name, - virHostValidateLevel level) +#ifdef __linux__ +int virHostValidateCGroupControllers(const char *hvname, + int controllers, + virHostValidateLevel level) { - virHostMsgCheck(hvname, "for cgroup '%s' controller mount-point", cg_name); - FILE *fp = setmntent("/proc/mounts", "r"); - struct mntent ent; - char mntbuf[1024]; - bool matched = false; + virCgroupPtr group = NULL; + int ret = 0; + size_t i; - if (!fp) - goto error; + if (virCgroupNewSelf(&group) < 0) + return -1; - while (getmntent_r(fp, &ent, mntbuf, sizeof(mntbuf)) && !matched) { - char **opts; - size_t nopts; - size_t i; + for (i = 0; i < VIR_CGROUP_CONTROLLER_LAST; i++) { + int flag = 1 << i; + const char *cg_name = virCgroupControllerTypeToString(i); - /* Ignore non-cgroup mounts */ - if (STRNEQ(ent.mnt_type, "cgroup")) + if (!(controllers & flag)) continue; - if (!(opts = virStringSplitCount(ent.mnt_opts, ",", 0, &nopts))) - continue; + virHostMsgCheck(hvname, "for cgroup '%s' controller support", cg_name); - /* Look for a mount option matching the cgroup name */ - for (i = 0; i < nopts; i++) { - if (STREQ(opts[i], cg_name)) - matched = true; + if (!virCgroupHasController(group, i)) { + ret = -1; + virHostMsgFail(level, "Enable '%s' in kernel Kconfig file or " + "mount/enable cgroup controller in your system", + cg_name); + } else { + virHostMsgPass(); } - - virStringListFreeCount(opts, nopts); } - endmntent(fp); - if (!matched) - goto error; - virHostMsgPass(); - return 0; + virCgroupFree(&group); - error: - virHostMsgFail(level, "Mount '%s' cgroup controller (suggested at /sys/fs/cgroup/%s)", - cg_name, cg_name); - return -1; + return ret; } -#else /* ! HAVE_MNTENT_H */ -static int virHostValidateCGroupMount(const char *hvname, - const char *cg_name, - virHostValidateLevel level) +#else /* !__linux__ */ +int virHostValidateCGroupControllers(const char *hvname, + int controllers, + virHostValidateLevel level) { - virHostMsgCheck(hvname, "for cgroup '%s' controller mount-point", cg_name); virHostMsgFail(level, "%s", "This platform does not support cgroups"); return -1; } -#endif /* ! HAVE_MNTENT_H */ - -int virHostValidateCGroupController(const char *hvname, - const char *cg_name, - virHostValidateLevel level, - const char *config_name) -{ - if (virHostValidateCGroupSupport(hvname, - cg_name, - level, - config_name) < 0) - return -1; - if (virHostValidateCGroupMount(hvname, - cg_name, - level) < 0) - return -1; - return 0; -} +#endif /* !__linux__ */ int virHostValidateIOMMU(const char *hvname, virHostValidateLevel level) diff --git a/tools/virt-host-validate-common.h b/tools/virt-host-validate-common.h index b6fe17daa7..b23dd7cdbe 100644 --- a/tools/virt-host-validate-common.h +++ b/tools/virt-host-validate-common.h @@ -77,10 +77,9 @@ int virHostValidateNamespace(const char *hvname, virHostValidateLevel level, const char *hint); -int virHostValidateCGroupController(const char *hvname, - const char *cg_name, - virHostValidateLevel level, - const char *config_name); +int virHostValidateCGroupControllers(const char *hvname, + int controllers, + virHostValidateLevel level); int virHostValidateIOMMU(const char *hvname, virHostValidateLevel level); diff --git a/tools/virt-host-validate-lxc.c b/tools/virt-host-validate-lxc.c index 64d9279c30..3c55b1b26d 100644 --- a/tools/virt-host-validate-lxc.c +++ b/tools/virt-host-validate-lxc.c @@ -23,6 +23,7 @@ #include "virt-host-validate-lxc.h" #include "virt-host-validate-common.h" +#include "vircgroup.h" int virHostValidateLXC(void) { @@ -63,35 +64,16 @@ int virHostValidateLXC(void) _("User namespace support is recommended")) < 0) ret = -1; - if (virHostValidateCGroupController("LXC", "memory", - VIR_HOST_VALIDATE_FAIL, - "MEMCG") < 0) - ret = -1; - - if (virHostValidateCGroupController("LXC", "cpu", - VIR_HOST_VALIDATE_FAIL, - "CGROUP_CPU") < 0) - ret = -1; - - if (virHostValidateCGroupController("LXC", "cpuacct", - VIR_HOST_VALIDATE_FAIL, - "CGROUP_CPUACCT") < 0) - ret = -1; - - if (virHostValidateCGroupController("LXC", "cpuset", - VIR_HOST_VALIDATE_FAIL, - "CPUSETS") < 0) - ret = -1; - - if (virHostValidateCGroupController("LXC", "devices", - VIR_HOST_VALIDATE_FAIL, - "CGROUP_DEVICE") < 0) - ret = -1; - - if (virHostValidateCGroupController("LXC", "blkio", - VIR_HOST_VALIDATE_FAIL, - "BLK_CGROUP") < 0) - ret = -1; + if (virHostValidateCGroupControllers("LXC", + (1 << VIR_CGROUP_CONTROLLER_MEMORY) | + (1 << VIR_CGROUP_CONTROLLER_CPU) | + (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | + (1 << VIR_CGROUP_CONTROLLER_CPUSET) | + (1 << VIR_CGROUP_CONTROLLER_DEVICES) | + (1 << VIR_CGROUP_CONTROLLER_BLKIO), + VIR_HOST_VALIDATE_FAIL) < 0) { + ret = -1; + } #if WITH_FUSE if (virHostValidateDeviceExists("LXC", "/sys/fs/fuse/connections", diff --git a/tools/virt-host-validate-qemu.c b/tools/virt-host-validate-qemu.c index d7573ea8b3..ff3c1f0231 100644 --- a/tools/virt-host-validate-qemu.c +++ b/tools/virt-host-validate-qemu.c @@ -26,6 +26,7 @@ #include "virt-host-validate-common.h" #include "virarch.h" #include "virbitmap.h" +#include "vircgroup.h" int virHostValidateQEMU(void) { @@ -96,35 +97,16 @@ int virHostValidateQEMU(void) _("Load the 'tun' module to enable networking for QEMU guests")) < 0) ret = -1; - if (virHostValidateCGroupController("QEMU", "memory", - VIR_HOST_VALIDATE_WARN, - "MEMCG") < 0) - ret = -1; - - if (virHostValidateCGroupController("QEMU", "cpu", - VIR_HOST_VALIDATE_WARN, - "CGROUP_CPU") < 0) - ret = -1; - - if (virHostValidateCGroupController("QEMU", "cpuacct", - VIR_HOST_VALIDATE_WARN, - "CGROUP_CPUACCT") < 0) - ret = -1; - - if (virHostValidateCGroupController("QEMU", "cpuset", - VIR_HOST_VALIDATE_WARN, - "CPUSETS") < 0) - ret = -1; - - if (virHostValidateCGroupController("QEMU", "devices", - VIR_HOST_VALIDATE_WARN, - "CGROUP_DEVICES") < 0) - ret = -1; - - if (virHostValidateCGroupController("QEMU", "blkio", - VIR_HOST_VALIDATE_WARN, - "BLK_CGROUP") < 0) - ret = -1; + if (virHostValidateCGroupControllers("QEMU", + (1 << VIR_CGROUP_CONTROLLER_MEMORY) | + (1 << VIR_CGROUP_CONTROLLER_CPU) | + (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | + (1 << VIR_CGROUP_CONTROLLER_CPUSET) | + (1 << VIR_CGROUP_CONTROLLER_DEVICES) | + (1 << VIR_CGROUP_CONTROLLER_BLKIO), + VIR_HOST_VALIDATE_WARN) < 0) { + ret = -1; + } if (virHostValidateIOMMU("QEMU", VIR_HOST_VALIDATE_WARN) < 0) -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
This removes code duplication and simplifies cgroup detection. As a drawback we will not have separate messages to enable cgroup controller in kernel or to mount it. On the other side the rewrite adds support for cgroup v2.
The kernel config support was wrong because it was parsing '/proc/self/cgroup' instead of '/proc/cgroups/' file.
The mount suggestion is removed as well because it will not work with cgroup v2.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tools/virt-host-validate-common.c | 162 ++++++------------------------ tools/virt-host-validate-common.h | 7 +- tools/virt-host-validate-lxc.c | 40 ++------ tools/virt-host-validate-qemu.c | 40 ++------ 4 files changed, 54 insertions(+), 195 deletions(-)
ACK Michal

Cgroup freezer support for LXC was added in libvirt-0.7.2. Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tools/virt-host-validate-lxc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/virt-host-validate-lxc.c b/tools/virt-host-validate-lxc.c index 3c55b1b26d..8613f37cc7 100644 --- a/tools/virt-host-validate-lxc.c +++ b/tools/virt-host-validate-lxc.c @@ -70,6 +70,7 @@ int virHostValidateLXC(void) (1 << VIR_CGROUP_CONTROLLER_CPUACCT) | (1 << VIR_CGROUP_CONTROLLER_CPUSET) | (1 << VIR_CGROUP_CONTROLLER_DEVICES) | + (1 << VIR_CGROUP_CONTROLLER_FREEZER) | (1 << VIR_CGROUP_CONTROLLER_BLKIO), VIR_HOST_VALIDATE_FAIL) < 0) { ret = -1; -- 2.17.1

On 10/02/2018 10:44 AM, Pavel Hrdina wrote:
Cgroup freezer support for LXC was added in libvirt-0.7.2.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> --- tools/virt-host-validate-lxc.c | 1 + 1 file changed, 1 insertion(+)
ACK Michal
participants (3)
-
Fabiano Fidêncio
-
Michal Privoznik
-
Pavel Hrdina