[libvirt] [PATCH 0/7] Per domain bandwidth settings

We decided to add a global per-domain bandwidth setting, as discussed on the mailing list earlier. This patchset implements the hierarchy top-level cpu.cfs_period_us and cpu.cfs_quota_us control knobs. I've named these parameters global_period and global_quota.
Alexander Burluka (7):
  Add global period definitions
  Add global quota parameter necessary definitions
  Add error checking on global quota and period
  Add new cgroup thread type
  Rename qemuSetupCgroupVcpuBW to qemuSetupBandwidthCgroup
  Implement qemuSetupGlobalCpuCgroup
  Implement handling of per-domain bandwidth settings
 docs/schemas/domaincommon.rng    |  10 ++++
 include/libvirt/libvirt-domain.h |  32 ++++++++++
 src/conf/domain_conf.c           |  37 ++++++++++++
 src/conf/domain_conf.h           |   2 +
 src/qemu/qemu_cgroup.c           |  78 +++++++++++++++++++++---
 src/qemu/qemu_cgroup.h           |   7 ++-
 src/qemu/qemu_command.c          |   3 +-
 src/qemu/qemu_driver.c           | 125 +++++++++++++++++++++++++++++++++++++--
 src/qemu/qemu_process.c          |   4 ++
 src/util/vircgroup.c             |   4 ++
 src/util/vircgroup.h             |   1 +
 11 files changed, 287 insertions(+), 16 deletions(-)
--
1.8.3.1
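
For readers less familiar with the two CFS knobs named above: the ratio of cpu.cfs_quota_us to cpu.cfs_period_us is the amount of CPU time the whole group may consume per period, expressed in CPUs. A minimal sketch of that arithmetic follows. The helper name cfs_cpu_limit is hypothetical and not part of libvirt or of this series, and treating a zero period or a non-positive quota as "no limit" is an assumption of the sketch, not something the patches define.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of libvirt.
 * Returns how many CPUs' worth of time a cgroup may use per period,
 * or -1.0 when no limit applies (assumption: period == 0 or quota <= 0).
 */
static double
cfs_cpu_limit(unsigned long long period_us, long long quota_us)
{
    if (period_us == 0 || quota_us <= 0)
        return -1.0;

    /* A quota larger than the period allows more than one full CPU. */
    return (double)quota_us / (double)period_us;
}
```

With a period of 100000 us, a quota of 50000 us would cap the whole domain at half a CPU, while a quota of 200000 us would allow up to two CPUs.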

This parameter represents top level period cgroup that limits whole domain enforcement period for a quota Signed-off-by: Alexander Burluka <aburluka@virtuozzo.com> --- docs/schemas/domaincommon.rng | 5 +++++ include/libvirt/libvirt-domain.h | 16 ++++++++++++++++ src/conf/domain_conf.c | 18 ++++++++++++++++++ src/conf/domain_conf.h | 1 + 4 files changed, 40 insertions(+) diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 5deb17b..aa7eae9 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -670,6 +670,11 @@ </element> </optional> <optional> + <element name="global_period"> + <ref name="cpuperiod"/> + </element> + </optional> + <optional> <element name="period"> <ref name="cpuperiod"/> </element> diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index d26faa5..cb30313 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -312,6 +312,14 @@ typedef enum { # define VIR_DOMAIN_SCHEDULER_CPU_SHARES "cpu_shares" /** + * VIR_DOMAIN_SCHEDULER_GLOBAL_PERIOD: + * + * Macro represents the enforcement period for a quota, in microseconds, + * for whole domain, when using the posix scheduler, as a ullong. + */ +# define VIR_DOMAIN_SCHEDULER_GLOBAL_PERIOD "global_period" + +/** * VIR_DOMAIN_SCHEDULER_VCPU_PERIOD: * * Macro represents the enforcement period for a quota, in microseconds, @@ -3318,6 +3326,14 @@ typedef void (*virConnectDomainEventDeviceAddedCallback)(virConnectPtr conn, # define VIR_DOMAIN_TUNABLE_CPU_CPU_SHARES "cputune.cpu_shares" /** + * VIR_DOMAIN_TUNABLE_CPU_GLOBAL_PERIOD: + * + * Macro represents the enforcement period for a quota, in microseconds, + * for whole domain, when using the posix scheduler, as VIR_TYPED_PARAM_ULLONG. 
+ */ +# define VIR_DOMAIN_TUNABLE_CPU_GLOBAL_PERIOD "cputune.global_period" + +/** * VIR_DOMAIN_TUNABLE_CPU_VCPU_PERIOD: * * Macro represents the enforcement period for a quota, in microseconds, diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 1e78da1..be714ea 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -15059,6 +15059,21 @@ virDomainDefParseXML(xmlDocPtr xml, goto error; } + if (virXPathULongLong("string(./cputune/global_period[1])", ctxt, + &def->cputune.global_period) < -1) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("can't parse cputune global period value")); + goto error; + } + + if (def->cputune.global_period > 0 && + (def->cputune.global_period < 1000 || def->cputune.global_period > 1000000)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Value of cputune global period must be in range " + "[1000, 1000000]")); + goto error; + } + if (virXPathULongLong("string(./cputune/emulator_period[1])", ctxt, &def->cputune.emulator_period) < -1) { virReportError(VIR_ERR_XML_ERROR, "%s", @@ -21818,6 +21833,9 @@ virDomainDefFormatInternal(virDomainDefPtr def, if (def->cputune.quota) virBufferAsprintf(&childrenBuf, "<quota>%lld</quota>\n", def->cputune.quota); + if (def->cputune.global_period) + virBufferAsprintf(&childrenBuf, "<global_period>%llu</global_period>\n", + def->cputune.global_period); if (def->cputune.emulator_period) virBufferAsprintf(&childrenBuf, "<emulator_period>%llu" diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 0141009..7c43e7d 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2135,6 +2135,7 @@ struct _virDomainCputune { bool sharesSpecified; unsigned long long period; long long quota; + unsigned long long global_period; unsigned long long emulator_period; long long emulator_quota; size_t nvcpupin; -- 1.8.3.1

This parameter controls the maximum bandwidth to be used within a period for whole domain. Signed-off-by: Alexander Burluka <aburluka@virtuozzo.com> --- docs/schemas/domaincommon.rng | 5 +++++ include/libvirt/libvirt-domain.h | 16 ++++++++++++++++ src/conf/domain_conf.c | 19 +++++++++++++++++++ src/conf/domain_conf.h | 1 + 4 files changed, 41 insertions(+) diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index aa7eae9..17653e1 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -675,6 +675,11 @@ </element> </optional> <optional> + <element name="global_quota"> + <ref name="cpuquota"/> + </element> + </optional> + <optional> <element name="period"> <ref name="cpuperiod"/> </element> diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index cb30313..7d3d8e8 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -320,6 +320,14 @@ typedef enum { # define VIR_DOMAIN_SCHEDULER_GLOBAL_PERIOD "global_period" /** + * VIR_DOMAIN_SCHEDULER_GLOBAL_QUOTA: + * + * Macro represents the maximum bandwidth to be used within a period for + * whole domain, when using the posix scheduler, as an llong. + */ +# define VIR_DOMAIN_SCHEDULER_GLOBAL_QUOTA "global_quota" + +/** * VIR_DOMAIN_SCHEDULER_VCPU_PERIOD: * * Macro represents the enforcement period for a quota, in microseconds, @@ -3334,6 +3342,14 @@ typedef void (*virConnectDomainEventDeviceAddedCallback)(virConnectPtr conn, # define VIR_DOMAIN_TUNABLE_CPU_GLOBAL_PERIOD "cputune.global_period" /** + * VIR_DOMAIN_TUNABLE_CPU_GLOBAL_QUOTA: + * + * Macro represents the maximum bandwidth to be used within a period for + * whole domain, when using the posix scheduler, as VIR_TYPED_PARAM_LLONG. 
+ */ +# define VIR_DOMAIN_TUNABLE_CPU_GLOBAL_QUOTA "cputune.global_quota" + +/** * VIR_DOMAIN_TUNABLE_CPU_VCPU_PERIOD: * * Macro represents the enforcement period for a quota, in microseconds, diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index be714ea..1e50bb1 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -15074,6 +15074,22 @@ virDomainDefParseXML(xmlDocPtr xml, goto error; } + if (virXPathLongLong("string(./cputune/global_quota[1])", ctxt, + &def->cputune.global_quota) < -1) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("can't parse cputune global quota value")); + goto error; + } + + if (def->cputune.global_quota > 0 && + (def->cputune.global_quota < 1000 || + def->cputune.global_quota > 18446744073709551LL)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Value of cputune global quota must be in range " + "[1000, 18446744073709551]")); + goto error; + } + if (virXPathULongLong("string(./cputune/emulator_period[1])", ctxt, &def->cputune.emulator_period) < -1) { virReportError(VIR_ERR_XML_ERROR, "%s", @@ -21836,6 +21852,9 @@ virDomainDefFormatInternal(virDomainDefPtr def, if (def->cputune.global_period) virBufferAsprintf(&childrenBuf, "<global_period>%llu</global_period>\n", def->cputune.global_period); + if (def->cputune.global_quota) + virBufferAsprintf(&childrenBuf, "<global_quota>%lld</global_quota>\n", + def->cputune.global_quota); if (def->cputune.emulator_period) virBufferAsprintf(&childrenBuf, "<emulator_period>%llu" diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 7c43e7d..bac82b3 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2136,6 +2136,7 @@ struct _virDomainCputune { unsigned long long period; long long quota; unsigned long long global_period; + long long global_quota; unsigned long long emulator_period; long long emulator_quota; size_t nvcpupin; -- 1.8.3.1
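
The range checks added to virDomainDefParseXML in the two patches above can be summarised as two predicates. This is only a sketch: the SKETCH_* macros and sketch_* function names are hypothetical, and only the numeric bounds are taken from the diffs. Note that the upper quota bound is numerically ULLONG_MAX / 1000, and that, as in the patches, only positive values are range-checked, so zero and negative quotas pass through.

```c
#include <stdbool.h>

/* Bounds copied from the patches; the macro names are hypothetical. */
#define SKETCH_MIN_PERIOD 1000ULL
#define SKETCH_MAX_PERIOD 1000000ULL
#define SKETCH_MIN_QUOTA  1000LL
#define SKETCH_MAX_QUOTA  18446744073709551LL

/* A period of 0 means "not set"; any other value must be in range. */
static bool
sketch_global_period_valid(unsigned long long period)
{
    return period == 0 ||
           (period >= SKETCH_MIN_PERIOD && period <= SKETCH_MAX_PERIOD);
}

/* Only positive quotas are range-checked, mirroring the "> 0 &&" guard. */
static bool
sketch_global_quota_valid(long long quota)
{
    return quota <= 0 ||
           (quota >= SKETCH_MIN_QUOTA && quota <= SKETCH_MAX_QUOTA);
}
```

Keeping both checks as small predicates like this would also make them easy to share with the live-update path, which uses SCHED_RANGE_CHECK for the same purpose.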

Signed-off-by: Alexander Burluka <aburluka@virtuozzo.com> --- src/qemu/qemu_command.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 5d3ab3a..087e9ad 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -9397,7 +9397,8 @@ qemuBuildCommandLine(virConnectPtr conn, } if (def->cputune.sharesSpecified || def->cputune.period || - def->cputune.quota || def->cputune.emulator_period || + def->cputune.quota || def->cputune.global_period || + def->cputune.global_quota || def->cputune.emulator_period || def->cputune.emulator_quota) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("CPU tuning is not available in session mode")); -- 1.8.3.1

Signed-off-by: Alexander Burluka <aburluka@virtuozzo.com> --- src/util/vircgroup.c | 4 ++++ src/util/vircgroup.h | 1 + 2 files changed, 5 insertions(+) diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 78f519c..b829794 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -1514,6 +1514,10 @@ virCgroupNewThread(virCgroupPtr domain, if (virAsprintf(&name, "iothread%d", id) < 0) goto cleanup; break; + case VIR_CGROUP_THREAD_GLOBAL: + if (VIR_STRDUP(name, "") < 0) + goto cleanup; + break; case VIR_CGROUP_THREAD_LAST: virReportError(VIR_ERR_INTERNAL_ERROR, _("unexpected name value %d"), nameval); diff --git a/src/util/vircgroup.h b/src/util/vircgroup.h index 63a9e1c..8ee1dad 100644 --- a/src/util/vircgroup.h +++ b/src/util/vircgroup.h @@ -56,6 +56,7 @@ typedef enum { VIR_CGROUP_THREAD_VCPU = 0, VIR_CGROUP_THREAD_EMULATOR, VIR_CGROUP_THREAD_IOTHREAD, + VIR_CGROUP_THREAD_GLOBAL, VIR_CGROUP_THREAD_LAST } virCgroupThreadName; -- 1.8.3.1

On Tue, 2016-01-12 at 19:42 +0300, Alexander Burluka wrote:
Signed-off-by: Alexander Burluka <aburluka@virtuozzo.com> --- src/util/vircgroup.c | 4 ++++ src/util/vircgroup.h | 1 + 2 files changed, 5 insertions(+)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c index 78f519c..b829794 100644 --- a/src/util/vircgroup.c +++ b/src/util/vircgroup.c @@ -1514,6 +1514,10 @@ virCgroupNewThread(virCgroupPtr domain, if (virAsprintf(&name, "iothread%d", id) < 0) goto cleanup; break; + case VIR_CGROUP_THREAD_GLOBAL: + if (VIR_STRDUP(name, "") < 0) + goto cleanup; + break; case VIR_CGROUP_THREAD_LAST:
When called with VIR_CGROUP_THREAD_GLOBAL, this function does nothing; see my comment on the 6th patch.
virReportError(VIR_ERR_INTERNAL_ERROR, _("unexpected name value %d"), nameval); diff --git a/src/util/vircgroup.h b/src/util/vircgroup.h index 63a9e1c..8ee1dad 100644 --- a/src/util/vircgroup.h +++ b/src/util/vircgroup.h @@ -56,6 +56,7 @@ typedef enum { VIR_CGROUP_THREAD_VCPU = 0, VIR_CGROUP_THREAD_EMULATOR, VIR_CGROUP_THREAD_IOTHREAD, + VIR_CGROUP_THREAD_GLOBAL, VIR_CGROUP_THREAD_LAST } virCgroupThreadName;

This rename is required to reuse this function in per-domain bandwidth setup routine Signed-off-by: Alexander Burluka <aburluka@virtuozzo.com> --- src/qemu/qemu_cgroup.c | 14 +++++++------- src/qemu/qemu_cgroup.h | 6 +++--- src/qemu/qemu_driver.c | 5 ++--- 3 files changed, 12 insertions(+), 13 deletions(-) diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index 1c406ce..e835ac4 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -944,9 +944,9 @@ qemuSetupCgroup(virQEMUDriverPtr driver, } int -qemuSetupCgroupVcpuBW(virCgroupPtr cgroup, - unsigned long long period, - long long quota) +qemuSetupBandwidthCgroup(virCgroupPtr cgroup, + unsigned long long period, + long long quota) { unsigned long long old_period; @@ -1053,7 +1053,7 @@ qemuSetupCgroupForVcpu(virDomainObjPtr vm) goto cleanup; if (period || quota) { - if (qemuSetupCgroupVcpuBW(cgroup_vcpu, period, quota) < 0) + if (qemuSetupBandwidthCgroup(cgroup_vcpu, period, quota) < 0) goto cleanup; } @@ -1155,8 +1155,8 @@ qemuSetupCgroupForEmulator(virDomainObjPtr vm) if (period || quota) { if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU) && - qemuSetupCgroupVcpuBW(cgroup_emulator, period, - quota) < 0) + qemuSetupBandwidthCgroup(cgroup_emulator, period, + quota) < 0) goto cleanup; } @@ -1224,7 +1224,7 @@ qemuSetupCgroupForIOThreads(virDomainObjPtr vm) goto cleanup; if (period || quota) { - if (qemuSetupCgroupVcpuBW(cgroup_iothread, period, quota) < 0) + if (qemuSetupBandwidthCgroup(cgroup_iothread, period, quota) < 0) goto cleanup; } diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h index 2bcf071..17da920 100644 --- a/src/qemu/qemu_cgroup.h +++ b/src/qemu/qemu_cgroup.h @@ -49,9 +49,9 @@ int qemuSetupCgroup(virQEMUDriverPtr driver, size_t nnicindexes, int *nicindexes); int qemuSetupCpusetMems(virDomainObjPtr vm); -int qemuSetupCgroupVcpuBW(virCgroupPtr cgroup, - unsigned long long period, - long long quota); +int qemuSetupBandwidthCgroup(virCgroupPtr cgroup, + 
unsigned long long period, + long long quota); int qemuSetupCgroupCpusetCpus(virCgroupPtr cgroup, virBitmapPtr cpumask); int qemuSetupCgroupForVcpu(virDomainObjPtr vm); int qemuSetupCgroupForIOThreads(virDomainObjPtr vm); diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 8ccf68b..48aeab6 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -10251,8 +10251,7 @@ qemuSetVcpusBWLive(virDomainObjPtr vm, virCgroupPtr cgroup, if (virCgroupNewThread(cgroup, VIR_CGROUP_THREAD_VCPU, i, false, &cgroup_vcpu) < 0) goto cleanup; - - if (qemuSetupCgroupVcpuBW(cgroup_vcpu, period, quota) < 0) + if (qemuSetupBandwidthCgroup(cgroup_vcpu, period, quota) < 0) goto cleanup; virCgroupFree(&cgroup_vcpu); @@ -10279,7 +10278,7 @@ qemuSetEmulatorBandwidthLive(virCgroupPtr cgroup, false, &cgroup_emulator) < 0) goto cleanup; - if (qemuSetupCgroupVcpuBW(cgroup_emulator, period, quota) < 0) + if (qemuSetupBandwidthCgroup(cgroup_emulator, period, quota) < 0) goto cleanup; virCgroupFree(&cgroup_emulator); -- 1.8.3.1

This function sets up per-domain CPU bandwidth parameters Signed-off-by: Alexander Burluka <aburluka@virtuozzo.com> --- src/qemu/qemu_cgroup.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_cgroup.h | 1 + src/qemu/qemu_process.c | 4 ++++ 3 files changed, 69 insertions(+) diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index e835ac4..53002b7 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -1002,6 +1002,70 @@ qemuSetupCgroupCpusetCpus(virCgroupPtr cgroup, } int +qemuSetupGlobalCpuCgroup(virDomainObjPtr vm) +{ + virCgroupPtr cgroup_vcpu = NULL; + qemuDomainObjPrivatePtr priv = vm->privateData; + unsigned long long period = vm->def->cputune.global_period; + long long quota = vm->def->cputune.global_quota; + char *mem_mask = NULL; + virDomainNumatuneMemMode mem_mode; + + if ((period || quota) && + !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("cgroup cpu is required for scheduler tuning")); + return -1; + } + + /* + * If CPU cgroup controller is not initialized here, then we need + * neither period nor quota settings. And if CPUSET controller is + * not initialized either, then there's nothing to do anyway. + */ + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU) && + !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) + return 0; + + /* We are trying to setup cgroups for CPU pinning, which can also be done + * with virProcessSetAffinity, thus the lack of cgroups is not fatal here. 
+ */ + if (priv->cgroup == NULL) + return 0; + + if (virDomainNumatuneGetMode(vm->def->numa, -1, &mem_mode) == 0 && + mem_mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT && + virDomainNumatuneMaybeFormatNodeset(vm->def->numa, + priv->autoNodeset, + &mem_mask, -1) < 0) + goto cleanup; + + if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_GLOBAL, 0, + true, &cgroup_vcpu) < 0) + goto cleanup; + + if (period || quota) { + if (qemuSetupBandwidthCgroup(cgroup_vcpu, period, quota) < 0) + goto cleanup; + } + + virCgroupFree(&cgroup_vcpu); + VIR_FREE(mem_mask); + + return 0; + + cleanup: + if (cgroup_vcpu) { + virCgroupRemove(cgroup_vcpu); + virCgroupFree(&cgroup_vcpu); + } + VIR_FREE(mem_mask); + + return -1; +} + + +int qemuSetupCgroupForVcpu(virDomainObjPtr vm) { virCgroupPtr cgroup_vcpu = NULL; diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h index 17da920..75f9eb7 100644 --- a/src/qemu/qemu_cgroup.h +++ b/src/qemu/qemu_cgroup.h @@ -54,6 +54,7 @@ int qemuSetupBandwidthCgroup(virCgroupPtr cgroup, long long quota); int qemuSetupCgroupCpusetCpus(virCgroupPtr cgroup, virBitmapPtr cpumask); int qemuSetupCgroupForVcpu(virDomainObjPtr vm); +int qemuSetupGlobalCpuCgroup(virDomainObjPtr vm); int qemuSetupCgroupForIOThreads(virDomainObjPtr vm); int qemuSetupCgroupForEmulator(virDomainObjPtr vm); int qemuRemoveCgroup(virQEMUDriverPtr driver, virDomainObjPtr vm); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 3d9e0e5..7a90457 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -4981,6 +4981,10 @@ qemuProcessLaunch(virConnectPtr conn, if (qemuProcessDetectIOThreadPIDs(driver, vm, asyncJob) < 0) goto cleanup; + VIR_DEBUG("Setting global CPU cgroup (if required)"); + if (qemuSetupGlobalCpuCgroup(vm) < 0) + goto cleanup; + VIR_DEBUG("Setting cgroup for each VCPU (if required)"); if (qemuSetupCgroupForVcpu(vm) < 0) goto cleanup; -- 1.8.3.1

On Tue, 2016-01-12 at 19:42 +0300, Alexander Burluka wrote:
This function sets up per-domain CPU bandwidth parameters
Signed-off-by: Alexander Burluka <aburluka@virtuozzo.com> --- src/qemu/qemu_cgroup.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_cgroup.h | 1 + src/qemu/qemu_process.c | 4 ++++ 3 files changed, 69 insertions(+)
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index e835ac4..53002b7 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -1002,6 +1002,70 @@ qemuSetupCgroupCpusetCpus(virCgroupPtr cgroup, } int +qemuSetupGlobalCpuCgroup(virDomainObjPtr vm) +{ + virCgroupPtr cgroup_vcpu = NULL; + qemuDomainObjPrivatePtr priv = vm->privateData; + unsigned long long period = vm->def->cputune.global_period; + long long quota = vm->def->cputune.global_quota; + char *mem_mask = NULL; + virDomainNumatuneMemMode mem_mode; + + if ((period || quota) && + !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("cgroup cpu is required for scheduler tuning")); + return -1; + } + + /* + * If CPU cgroup controller is not initialized here, then we need + * neither period nor quota settings. And if CPUSET controller is + * not initialized either, then there's nothing to do anyway. + */ + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU) && + !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) + return 0; + + /* We are trying to setup cgroups for CPU pinning, which can also be done + * with virProcessSetAffinity, thus the lack of cgroups is not fatal here. + */ + if (priv->cgroup == NULL) + return 0; + + if (virDomainNumatuneGetMode(vm->def->numa, -1, &mem_mode) == 0 && + mem_mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT && + virDomainNumatuneMaybeFormatNodeset(vm->def->numa, + priv->autoNodeset, + &mem_mask, -1) < 0) + goto cleanup; + + if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_GLOBAL, 0, + true, &cgroup_vcpu) < 0) + goto cleanup; + + if (period || quota) { + if (qemuSetupBandwidthCgroup(cgroup_vcpu, period, quota) < 0) + goto cleanup; + }
I think you could just use priv->cgroup and not call virCgroupNewThread, so the previous patch is not needed.
+ + virCgroupFree(&cgroup_vcpu); + VIR_FREE(mem_mask); + + return 0; + + cleanup: + if (cgroup_vcpu) { + virCgroupRemove(cgroup_vcpu); + virCgroupFree(&cgroup_vcpu); + } + VIR_FREE(mem_mask); + + return -1; +} + + +int qemuSetupCgroupForVcpu(virDomainObjPtr vm) { virCgroupPtr cgroup_vcpu = NULL; diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h index 17da920..75f9eb7 100644 --- a/src/qemu/qemu_cgroup.h +++ b/src/qemu/qemu_cgroup.h @@ -54,6 +54,7 @@ int qemuSetupBandwidthCgroup(virCgroupPtr cgroup, long long quota); int qemuSetupCgroupCpusetCpus(virCgroupPtr cgroup, virBitmapPtr cpumask); int qemuSetupCgroupForVcpu(virDomainObjPtr vm); +int qemuSetupGlobalCpuCgroup(virDomainObjPtr vm); int qemuSetupCgroupForIOThreads(virDomainObjPtr vm); int qemuSetupCgroupForEmulator(virDomainObjPtr vm); int qemuRemoveCgroup(virQEMUDriverPtr driver, virDomainObjPtr vm); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 3d9e0e5..7a90457 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -4981,6 +4981,10 @@ qemuProcessLaunch(virConnectPtr conn, if (qemuProcessDetectIOThreadPIDs(driver, vm, asyncJob) < 0) goto cleanup; + VIR_DEBUG("Setting global CPU cgroup (if required)"); + if (qemuSetupGlobalCpuCgroup(vm) < 0) + goto cleanup; + VIR_DEBUG("Setting cgroup for each VCPU (if required)"); if (qemuSetupCgroupForVcpu(vm) < 0) goto cleanup;
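
To illustrate the review suggestion above (apply the limits to the domain's own cgroup rather than obtaining a separate object through virCgroupNewThread with VIR_CGROUP_THREAD_GLOBAL), here is a self-contained sketch. Everything in it is a stand-in: sketch_cgroup models virCgroupPtr, and sketch_setup_bandwidth models qemuSetupBandwidthCgroup under the assumption that a zero argument leaves the corresponding knob untouched, which is how the series itself calls it when only one of the two values changes. It is not the code that was proposed or merged.

```c
#include <stddef.h>

/* Hypothetical stand-in for virCgroupPtr; only the two CFS knobs are modelled. */
typedef struct {
    unsigned long long period; /* models cpu.cfs_period_us */
    long long quota;           /* models cpu.cfs_quota_us */
} sketch_cgroup;

/* Stand-in for qemuSetupBandwidthCgroup(): a zero argument means "leave as is". */
static int
sketch_setup_bandwidth(sketch_cgroup *cgroup,
                       unsigned long long period,
                       long long quota)
{
    if (period)
        cgroup->period = period;
    if (quota)
        cgroup->quota = quota;
    return 0;
}

/*
 * Apply the per-domain limits directly to the domain cgroup.
 * No extra cgroup object is created, so nothing has to be removed
 * or freed on the error path.
 */
static int
sketch_setup_global(sketch_cgroup *domain_cgroup,
                    unsigned long long period,
                    long long quota)
{
    if (!period && !quota)
        return 0;               /* nothing requested */

    if (domain_cgroup == NULL)
        return -1;              /* tuning requested but no cgroup available */

    return sketch_setup_bandwidth(domain_cgroup, period, quota);
}
```

In this shape the function has a single success path and no cleanup label, which is the simplification the reviewer appears to be pointing at.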

On Thu, 2016-01-14 at 14:28 +0300, Dmitry Guryanov wrote:
On Tue, 2016-01-12 at 19:42 +0300, Alexander Burluka wrote:
This function sets up per-domain CPU bandwidth parameters
Signed-off-by: Alexander Burluka <aburluka@virtuozzo.com> --- src/qemu/qemu_cgroup.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_cgroup.h | 1 + src/qemu/qemu_process.c | 4 ++++ 3 files changed, 69 insertions(+)
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c index e835ac4..53002b7 100644 --- a/src/qemu/qemu_cgroup.c +++ b/src/qemu/qemu_cgroup.c @@ -1002,6 +1002,70 @@ qemuSetupCgroupCpusetCpus(virCgroupPtr cgroup, } int +qemuSetupGlobalCpuCgroup(virDomainObjPtr vm) +{ + virCgroupPtr cgroup_vcpu = NULL; + qemuDomainObjPrivatePtr priv = vm->privateData; + unsigned long long period = vm->def->cputune.global_period; + long long quota = vm->def->cputune.global_quota; + char *mem_mask = NULL; + virDomainNumatuneMemMode mem_mode; + + if ((period || quota) && + !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("cgroup cpu is required for scheduler tuning")); + return -1; + } + + /* + * If CPU cgroup controller is not initialized here, then we need + * neither period nor quota settings. And if CPUSET controller is + * not initialized either, then there's nothing to do anyway. + */ + if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU) && + !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) + return 0; + + /* We are trying to setup cgroups for CPU pinning, which can also be done + * with virProcessSetAffinity, thus the lack of cgroups is not fatal here. + */ + if (priv->cgroup == NULL) + return 0; + + if (virDomainNumatuneGetMode(vm->def->numa, -1, &mem_mode) == 0 && + mem_mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT && + virDomainNumatuneMaybeFormatNodeset(vm->def->numa, + priv->autoNodeset, + &mem_mask, -1) < 0) + goto cleanup; + + if (virCgroupNewThread(priv->cgroup, VIR_CGROUP_THREAD_GLOBAL, 0, + true, &cgroup_vcpu) < 0) + goto cleanup; + + if (period || quota) { + if (qemuSetupBandwidthCgroup(cgroup_vcpu, period, quota) < 0) + goto cleanup; + }
I think you could just use priv->cgroup and not call virCgroupNewThread, so the previous patch is not needed.
Sorry, not the previous patch, the 4th.
+ + virCgroupFree(&cgroup_vcpu); + VIR_FREE(mem_mask); + + return 0; + + cleanup: + if (cgroup_vcpu) { + virCgroupRemove(cgroup_vcpu); + virCgroupFree(&cgroup_vcpu); + } + VIR_FREE(mem_mask); + + return -1; +} + + +int qemuSetupCgroupForVcpu(virDomainObjPtr vm) { virCgroupPtr cgroup_vcpu = NULL; diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h index 17da920..75f9eb7 100644 --- a/src/qemu/qemu_cgroup.h +++ b/src/qemu/qemu_cgroup.h @@ -54,6 +54,7 @@ int qemuSetupBandwidthCgroup(virCgroupPtr cgroup, long long quota); int qemuSetupCgroupCpusetCpus(virCgroupPtr cgroup, virBitmapPtr cpumask); int qemuSetupCgroupForVcpu(virDomainObjPtr vm); +int qemuSetupGlobalCpuCgroup(virDomainObjPtr vm); int qemuSetupCgroupForIOThreads(virDomainObjPtr vm); int qemuSetupCgroupForEmulator(virDomainObjPtr vm); int qemuRemoveCgroup(virQEMUDriverPtr driver, virDomainObjPtr vm); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 3d9e0e5..7a90457 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -4981,6 +4981,10 @@ qemuProcessLaunch(virConnectPtr conn, if (qemuProcessDetectIOThreadPIDs(driver, vm, asyncJob) < 0) goto cleanup; + VIR_DEBUG("Setting global CPU cgroup (if required)"); + if (qemuSetupGlobalCpuCgroup(vm) < 0) + goto cleanup; + VIR_DEBUG("Setting cgroup for each VCPU (if required)"); if (qemuSetupCgroupForVcpu(vm) < 0) goto cleanup;
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

Signed-off-by: Alexander Burluka <aburluka@virtuozzo.com> --- src/qemu/qemu_driver.c | 120 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 118 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 48aeab6..6a0fa9b 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8906,7 +8906,7 @@ static char *qemuDomainGetSchedulerType(virDomainPtr dom, /* Domain not running, thus no cgroups - return defaults */ if (!virDomainObjIsActive(vm)) { if (nparams) - *nparams = 5; + *nparams = 7; ignore_value(VIR_STRDUP(ret, "posix")); goto cleanup; } @@ -8919,7 +8919,7 @@ static char *qemuDomainGetSchedulerType(virDomainPtr dom, if (nparams) { if (virCgroupSupportsCpuBW(priv->cgroup)) - *nparams = 5; + *nparams = 7; else *nparams = 1; } @@ -10234,6 +10234,31 @@ qemuDomainGetNumaParameters(virDomainPtr dom, } static int +qemuSetGlobalBWLive(virDomainObjPtr vm ATTRIBUTE_UNUSED, virCgroupPtr cgroup, + unsigned long long period, long long quota) +{ + virCgroupPtr cgroup_global = NULL; + + if (period == 0 && quota == 0) + return 0; + + if (virCgroupNewThread(cgroup, VIR_CGROUP_THREAD_GLOBAL, 0, + false, &cgroup_global) < 0) + goto cleanup; + + if (qemuSetupBandwidthCgroup(cgroup_global, period, quota) < 0) + goto cleanup; + + virCgroupFree(&cgroup_global); + + return 0; + + cleanup: + virCgroupFree(&cgroup_global); + return -1; +} + +static int qemuSetVcpusBWLive(virDomainObjPtr vm, virCgroupPtr cgroup, unsigned long long period, long long quota) { @@ -10329,6 +10354,10 @@ qemuDomainSetSchedulerParametersFlags(virDomainPtr dom, VIR_TYPED_PARAM_ULLONG, VIR_DOMAIN_SCHEDULER_VCPU_QUOTA, VIR_TYPED_PARAM_LLONG, + VIR_DOMAIN_SCHEDULER_GLOBAL_PERIOD, + VIR_TYPED_PARAM_ULLONG, + VIR_DOMAIN_SCHEDULER_GLOBAL_QUOTA, + VIR_TYPED_PARAM_LLONG, VIR_DOMAIN_SCHEDULER_EMULATOR_PERIOD, VIR_TYPED_PARAM_ULLONG, VIR_DOMAIN_SCHEDULER_EMULATOR_QUOTA, @@ -10445,6 +10474,46 @@ qemuDomainSetSchedulerParametersFlags(virDomainPtr dom, if 
(flags & VIR_DOMAIN_AFFECT_CONFIG) vmdef->cputune.quota = value_l; + } else if (STREQ(param->field, VIR_DOMAIN_SCHEDULER_GLOBAL_PERIOD)) { + SCHED_RANGE_CHECK(value_ul, VIR_DOMAIN_SCHEDULER_GLOBAL_PERIOD, + QEMU_SCHED_MIN_PERIOD, QEMU_SCHED_MAX_PERIOD); + + if (flags & VIR_DOMAIN_AFFECT_LIVE && value_ul) { + if ((rc = qemuSetGlobalBWLive(vm, priv->cgroup, value_ul, 0))) + goto endjob; + + vm->def->cputune.global_period = value_ul; + + if (virTypedParamsAddULLong(&eventParams, &eventNparams, + &eventMaxNparams, + VIR_DOMAIN_TUNABLE_CPU_GLOBAL_PERIOD, + value_ul) < 0) + goto endjob; + } + + if (flags & VIR_DOMAIN_AFFECT_CONFIG) + vmdef->cputune.global_period = params[i].value.ul; + + } else if (STREQ(param->field, VIR_DOMAIN_SCHEDULER_GLOBAL_QUOTA)) { + SCHED_RANGE_CHECK(value_l, VIR_DOMAIN_SCHEDULER_GLOBAL_QUOTA, + QEMU_SCHED_MIN_QUOTA, QEMU_SCHED_MAX_QUOTA); + + if (flags & VIR_DOMAIN_AFFECT_LIVE && value_l) { + if ((rc = qemuSetGlobalBWLive(vm, priv->cgroup, 0, value_l))) + goto endjob; + + vm->def->cputune.global_quota = value_l; + + if (virTypedParamsAddLLong(&eventParams, &eventNparams, + &eventMaxNparams, + VIR_DOMAIN_TUNABLE_CPU_GLOBAL_QUOTA, + value_l) < 0) + goto endjob; + } + + if (flags & VIR_DOMAIN_AFFECT_CONFIG) + vmdef->cputune.global_quota = value_l; + } else if (STREQ(param->field, VIR_DOMAIN_SCHEDULER_EMULATOR_PERIOD)) { SCHED_RANGE_CHECK(value_ul, VIR_DOMAIN_SCHEDULER_EMULATOR_PERIOD, QEMU_SCHED_MIN_PERIOD, QEMU_SCHED_MAX_PERIOD); @@ -10609,6 +10678,27 @@ qemuGetEmulatorBandwidthLive(virCgroupPtr cgroup, } static int +qemuGetGlobalBWLive(virCgroupPtr cgroup, unsigned long long *period, + long long *quota) +{ + virCgroupPtr global_cgroup= NULL; + int ret = -1; + + if (virCgroupNewThread(cgroup, VIR_CGROUP_THREAD_GLOBAL, 0, + false, &global_cgroup) < 0) + goto cleanup; + + if (qemuGetVcpuBWLive(global_cgroup, period, quota) < 0) + goto cleanup; + + ret = 0; + + cleanup: + virCgroupFree(&global_cgroup); + return ret; +} + +static int 
qemuDomainGetSchedulerParametersFlags(virDomainPtr dom, virTypedParameterPtr params, int *nparams, @@ -10619,6 +10709,8 @@ qemuDomainGetSchedulerParametersFlags(virDomainPtr dom, unsigned long long shares; unsigned long long period; long long quota; + unsigned long long global_period; + long long global_quota; unsigned long long emulator_period; long long emulator_quota; int ret = -1; @@ -10665,6 +10757,8 @@ qemuDomainGetSchedulerParametersFlags(virDomainPtr dom, if (*nparams > 1) { period = persistentDef->cputune.period; quota = persistentDef->cputune.quota; + global_period = persistentDef->cputune.global_period; + global_quota = persistentDef->cputune.global_quota; emulator_period = persistentDef->cputune.emulator_period; emulator_quota = persistentDef->cputune.emulator_quota; cpu_bw_status = true; /* Allow copy of data to params[] */ @@ -10694,6 +10788,12 @@ qemuDomainGetSchedulerParametersFlags(virDomainPtr dom, goto cleanup; } + if (*nparams > 5 && cpu_bw_status) { + rc = qemuGetGlobalBWLive(priv->cgroup, &global_period, &global_quota); + if (rc != 0) + goto cleanup; + } + out: if (virTypedParameterAssign(&params[0], VIR_DOMAIN_SCHEDULER_CPU_SHARES, VIR_TYPED_PARAM_ULLONG, shares) < 0) @@ -10734,6 +10834,22 @@ qemuDomainGetSchedulerParametersFlags(virDomainPtr dom, goto cleanup; saved_nparams++; } + + if (*nparams > saved_nparams) { + if (virTypedParameterAssign(&params[5], + VIR_DOMAIN_SCHEDULER_GLOBAL_PERIOD, + VIR_TYPED_PARAM_ULLONG, global_period) < 0) + goto cleanup; + saved_nparams++; + } + + if (*nparams > saved_nparams) { + if (virTypedParameterAssign(&params[6], + VIR_DOMAIN_SCHEDULER_GLOBAL_QUOTA, + VIR_TYPED_PARAM_LLONG, global_quota) < 0) + goto cleanup; + saved_nparams++; + } } *nparams = saved_nparams; -- 1.8.3.1

On 01/12/2016 11:42 AM, Alexander Burluka wrote:
We decided to add a global per-domain bandwidth setting, as discussed on the mailing list earlier. This patchset implements the hierarchy top-level cpu.cfs_period_us and cpu.cfs_quota_us control knobs. I've named these parameters global_period and global_quota.
I haven't looked into the details of the patches (and don't really feel qualified to do so), but wanted to mention a couple of things:
* Although there are examples in the RNG of elements and attributes with underscores in their names, we decided a few years ago that they should be avoided, and capitalization used instead (e.g. globalPeriod). I truthfully don't know how well that's been adhered to (and I also see that there are already other subelements within cputune that use _, e.g. emulator_period), so I don't know which way we should try to be consistent (going either way could be seen as wrong), but I thought I should mention it.
* There are no new XML test cases that use the new elements.

On 01/12/2016 08:19 PM, Laine Stump wrote:
* although there are examples in the RNG of elements and attributes with underscores in their names, we had decided a few years ago that they should be avoided, and capitalization be used instead (e.g. globalPeriod). I truthfully don't know how well that's been adhered to (and also see that there are already other subelements within cputune that use _, e.g. emulator_period), so I don't know which way we should try to be consistent (going either way could be seen as wrong), but just thought I should mention it.
Sorry, I missed this agreement, so I used names consistent with the existing ones. Thank you for pointing it out. Could you please point me to someone who can help with this style question?
* there are no new XML test cases that use the new elements.
My bad, will add them. Thank you!
-- Regards, Alexander Burluka

On 12.01.2016 17:42, Alexander Burluka wrote:
We decide to make a global per domain bandwidth setting as were discussed in mailing list earlier. This patchset implements hierarchy top level cpu.cfs_period_us and cpu.cfs_quota_us control knob. I've named this parameters as global_period and global_quota.
Similarly to Laine, I have not gone through the patches in detail, but does this patch set touch the domain top-level cgroup? If so, we may be in trouble the minute we want to pin the vcpus elsewhere, if it is touching the cpuset cgroup too.
Michal

You are absolutely right, this patchset allows setting the domain top-level cpu.cfs_period_us and cpu.cfs_quota_us cgroup values. Could you please explain the problem case in a little more detail? This code does not affect the top-level cpuset cgroup, only quota and period, so I don't see any trouble here. Thank you!
On 01/12/2016 08:30 PM, Michal Privoznik wrote:
On 12.01.2016 17:42, Alexander Burluka wrote:
We decide to make a global per domain bandwidth setting as were discussed in mailing list earlier. This patchset implements hierarchy top level cpu.cfs_period_us and cpu.cfs_quota_us control knob. I've named this parameters as global_period and global_quota.
Similarly to Laine, I have not gone through the patches in detail, but does this patch set touch the domain top-level cgroup? If so, we may be in trouble the minute we want to pin the vcpus elsewhere, if it is touching the cpuset cgroup too.
Michal
-- Regards, Alexander Burluka

On 13.01.2016 11:17, Alexander Burluka wrote:
You are absolutely right, this patchset allows to set domain top-level cpu.cfs_period_us and cpu.cfs_quota_us cgroups. Can you please explain problem case a little bit more detailed? This code does not affect top-level cpuset cgroup, only quota and period, so there is no visible troubles for me. Thank you!
Well, I would have to dig into the patches, but in theory, if you have the following cgroup layout:
A->B->C
where A is the top-level cgroup, B is a child of A, and C is a child of B, then B is inherently restricted by A, and C is inherently restricted by B (and transitively by A too). Therefore if, for instance, cpuset in A is set to 0-3, B can be only as good as A or more restrictive. So the values for cpuset in B must be a subset of those in A. And so on. The problem we were facing just recently, and that I'm mentioning, was that in this picture libvirt puts vCPUs into the C cgroup and doesn't touch A or B. Now when the user wants to pin vCPUs onto a different cpuset, we can simply change it in the C cgroup. If we, however, were to set cpuset in B too, it would be impossible for us to change C due to the reasoning above.
Now, I am not familiar with CFS and we probably don't allow tuning it at runtime either. I just want to make sure the reason I am mentioning above is kept in the picture when touching our CGroups code.
If I'm completely off, and just blabbing off the lines, disregard me. I will probably learn more once I'm through the patches.
Michal
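The subset rule described above for the A->B->C layout can be sketched as a small check. This is only an illustrative model, assuming one bit per host CPU; the cpu_mask type and the cpu_range(), cpuset_is_subset() and chain_is_valid() helpers are hypothetical names, not libvirt or kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per host CPU, so CPUs 0-3 are 0x0F. */
typedef uint64_t cpu_mask;

/* Build a mask covering CPUs first..last inclusive (only CPUs 0..63). */
static cpu_mask cpu_range(unsigned first, unsigned last)
{
    cpu_mask mask = 0;

    for (unsigned cpu = first; cpu <= last && cpu < 64; cpu++)
        mask |= (cpu_mask)1 << cpu;
    return mask;
}

/* A child cpuset is acceptable only if it is a subset of its parent. */
static bool cpuset_is_subset(cpu_mask child, cpu_mask parent)
{
    return (child & ~parent) == 0;
}

/* Check a whole chain such as A->B->C, top-level group first. */
static bool chain_is_valid(const cpu_mask *chain, int depth)
{
    for (int i = 1; i < depth; i++) {
        if (!cpuset_is_subset(chain[i], chain[i - 1]))
            return false;
    }
    return true;
}
```

With A = 0-3, B = 0-3 and C = 1-2 the chain is valid. Once B is narrowed to 0-1, a request to move C to 2-3 fails the check, which is the situation the message warns about.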

Oh, I got it. Your example is a real problem case, but it is specific to cpuset. cfs_period_us and cfs_quota_us are a little different: they limit only bandwidth and won't cause anything similar. For example:
A->B
A's cpu.cfs_period_us is set to 100000.
If you try to write 200000 to B's cpu.cfs_period_us, you will get a write error (I've tried this on a 3.10 Linux kernel). So the child period and quota cannot be greater than the parent's (of course, if the parent constraints are not set, they are ignored).
The main reason for adding this knob is flexibility and behavior similar to VMware and our Parallels Cloud Server 6. Our customers really want it.
By the way, period and quota can be set on a running domain.
On 01/13/2016 01:42 PM, Michal Privoznik wrote:
On 13.01.2016 11:17, Alexander Burluka wrote:
You are absolutely right, this patchset allows to set domain top-level cpu.cfs_period_us and cpu.cfs_quota_us cgroups. Can you please explain problem case a little bit more detailed? This code does not affect top-level cpuset cgroup, only quota and period, so there is no visible troubles for me. Thank you!
Well, I would have to dig into the patches, but in theory, if you have the following cgroup layout:
A->B->C
where A is the top-level cgroup, B is a child of A, and C is a child of B, then B is inherently restricted by A, and C is inherently restricted by B (and transitively by A too). Therefore if, for instance, cpuset in A is set to 0-3, B can be only as good as A or more restrictive. So the values for cpuset in B must be a subset of those in A. And so on. The problem we were facing just recently, and that I'm mentioning, was that in this picture libvirt puts vCPUs into the C cgroup and doesn't touch A or B. Now when the user wants to pin vCPUs onto a different cpuset, we can simply change it in the C cgroup. If we, however, were to set cpuset in B too, it would be impossible for us to change C due to the reasoning above.
Now, I am not familiar with CFS and we probably don't allow tuning it at runtime either. I just want to make sure the reason I am mentioning above is kept in picture when touching our CGroups code.
If I'm completely off, and just blabbing off the lines, disregard me. I will probably learn more once I'm through the patches.
Michal
-- Regards, Alexander Burluka
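The write test described in the message above (writing 200000 to B's cpu.cfs_period_us and getting an error) could be reproduced with a small helper along these lines. This is a sketch only: write_cgroup_value() is a hypothetical name, and the path passed to it depends on where the cpu controller is mounted on a given host, which the thread does not specify.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Write an unsigned value to a control file such as cpu.cfs_period_us.
 * Returns 0 on success, or a negative errno value if the file cannot be
 * opened or the write is rejected.
 */
static int write_cgroup_value(const char *path, unsigned long long value)
{
    FILE *fp = fopen(path, "w");
    int ret = 0;

    if (!fp)
        return errno ? -errno : -EIO;

    if (fprintf(fp, "%llu", value) < 0)
        ret = errno ? -errno : -EIO;

    /* Buffered data may reach the kernel only on close, so check it too. */
    if (fclose(fp) != 0 && ret == 0)
        ret = errno ? -errno : -EIO;

    return ret;
}
```

A caller would pass the full path of B's cpu.cfs_period_us file and the value 200000, then inspect the return value; a negative result corresponds to the write error mentioned in the message.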
participants (4)
- Alexander Burluka
- Dmitry Guryanov
- Laine Stump
- Michal Privoznik