[libvirt] [PATCH] Temporarily shut off CPUSET cgroup in libvirt.

When a cpu or memory is offlined and onlined again, cgroup will not recover any corresponding values. This is a kernel problem. Do not use the cpuset cgroup to limit threads' cpu and memory usage until the problem is fixed in the kernel.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
---
 src/qemu/qemu.conf   | 5 ++++-
 src/qemu/qemu_conf.c | 7 ++++++-
 src/util/cgroup.c    | 3 +--
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf
index 18105ca..6bf7290 100644
--- a/src/qemu/qemu.conf
+++ b/src/qemu/qemu.conf
@@ -211,7 +211,10 @@
 # can be mounted in different locations. libvirt will detect
 # where they are located.
 #
-#cgroup_controllers = [ "cpu", "devices", "memory", "blkio", "cpuset", "cpuacct" ]
+# When a cpu or memory is offlined and onlined again, cgroup will not
+# recover any corresponding values. This is a problem in kernel.
+# So DO NOT use cpuset cgroup until this problem is fixed in kernel.
+#cgroup_controllers = [ "cpu", "devices", "memory", "blkio", "cpuacct" ]
 
 # This is the basic set of devices allowed / required by
 # all virtual machines.

diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
index 91a56f1..80b0787 100644
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -397,12 +397,17 @@ int qemudLoadDriverConfig(struct qemud_driver *driver,
             driver->cgroupControllers |= (1 << ctl);
         }
     } else {
+        /*
+         * When a cpu or memory is offlined and onlined again, cgroup will not
+         * recover any corresponding values. This is a problem in kernel.
+         * Do not use CPUSET cgroup to limit threads using cpu and memory until
+         * the problem is fixed in kernel.
+         */
         driver->cgroupControllers =
             (1 << VIR_CGROUP_CONTROLLER_CPU) |
             (1 << VIR_CGROUP_CONTROLLER_DEVICES) |
             (1 << VIR_CGROUP_CONTROLLER_MEMORY) |
             (1 << VIR_CGROUP_CONTROLLER_BLKIO) |
-            (1 << VIR_CGROUP_CONTROLLER_CPUSET) |
             (1 << VIR_CGROUP_CONTROLLER_CPUACCT);
     }
     for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) {

diff --git a/src/util/cgroup.c b/src/util/cgroup.c
index 5dc0764..cd7d3fe 100644
--- a/src/util/cgroup.c
+++ b/src/util/cgroup.c
@@ -543,8 +543,7 @@ static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group,
         /* We need to control cpu bandwidth for each vcpu now */
         if ((flags & VIR_CGROUP_VCPU) &&
             (i != VIR_CGROUP_CONTROLLER_CPU &&
-             i != VIR_CGROUP_CONTROLLER_CPUACCT &&
-             i != VIR_CGROUP_CONTROLLER_CPUSET)) {
+             i != VIR_CGROUP_CONTROLLER_CPUACCT)) {
             /* treat it as unmounted and we can use virCgroupAddTask */
             VIR_FREE(group->controllers[i].mountPoint);
             continue;
--
1.7.10.1

On 09/21/2012 02:20 AM, Tang Chen wrote:
When a cpu or memory is offlined and onlined again, cgroup will not recover any corresponding values. This is a kernel problem. Do not use the cpuset cgroup to limit threads' cpu and memory usage until the problem is fixed in the kernel.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
---
 src/qemu/qemu.conf   | 5 ++++-
 src/qemu/qemu_conf.c | 7 ++++++-
 src/util/cgroup.c    | 3 +--
 3 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf
index 18105ca..6bf7290 100644
--- a/src/qemu/qemu.conf
+++ b/src/qemu/qemu.conf
@@ -211,7 +211,10 @@
 # can be mounted in different locations. libvirt will detect
 # where they are located.
 #
-#cgroup_controllers = [ "cpu", "devices", "memory", "blkio", "cpuset", "cpuacct" ]
+# When a cpu or memory is offlined and onlined again, cgroup will not
+# recover any corresponding values. This is a problem in kernel.
+# So DO NOT use cpuset cgroup until this problem is fixed in kernel.
+#cgroup_controllers = [ "cpu", "devices", "memory", "blkio", "cpuacct" ]
We ought to list which versions of the kernel are known to have the bug (that is, say something like 'at least through kernel 3.6'); when we finally know which kernel version fixes the bugs in the cpuset controller, we can update the text again to give the user better advice on whether they want to be using this.

I'm torn on whether to take this patch prior to 0.10.2. It makes sense to take (that is, the kernel is buggy, and we cause degraded performance as a result of the kernel bug any time the system goes through S3), but it's rather late, having missed rc2, for such a major change in default functionality.

Am I correct, though, that you can manually edit /etc/libvirt/qemu.conf to re-enable the use of cpuset even on a buggy kernel, for further testing the issues?
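For reference, re-enabling cpuset on a buggy kernel for testing would presumably just mean restoring the controller in the cgroup_controllers line of /etc/libvirt/qemu.conf, uncommented and with "cpuset" back in the list, i.e. the default this patch removes:

    cgroup_controllers = [ "cpu", "devices", "memory", "blkio", "cpuset", "cpuacct" ]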
 # This is the basic set of devices allowed / required by
 # all virtual machines.

diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
index 91a56f1..80b0787 100644
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -397,12 +397,17 @@ int qemudLoadDriverConfig(struct qemud_driver *driver,
             driver->cgroupControllers |= (1 << ctl);
         }
     } else {
+        /*
+         * When a cpu or memory is offlined and onlined again, cgroup will not
+         * recover any corresponding values. This is a problem in kernel.
+         * Do not use CPUSET cgroup to limit threads using cpu and memory until
+         * the problem is fixed in kernel.
+         */
         driver->cgroupControllers =
             (1 << VIR_CGROUP_CONTROLLER_CPU) |
             (1 << VIR_CGROUP_CONTROLLER_DEVICES) |
             (1 << VIR_CGROUP_CONTROLLER_MEMORY) |
             (1 << VIR_CGROUP_CONTROLLER_BLKIO) |
-            (1 << VIR_CGROUP_CONTROLLER_CPUSET) |
             (1 << VIR_CGROUP_CONTROLLER_CPUACCT);
Rather than hard-coding the non-use of cpuset as our default, can we do something like a uname() probe or some other clue of whether we have a buggy kernel?
+++ b/src/util/cgroup.c
@@ -543,8 +543,7 @@ static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group,
         /* We need to control cpu bandwidth for each vcpu now */
         if ((flags & VIR_CGROUP_VCPU) &&
             (i != VIR_CGROUP_CONTROLLER_CPU &&
-             i != VIR_CGROUP_CONTROLLER_CPUACCT &&
-             i != VIR_CGROUP_CONTROLLER_CPUSET)) {
+             i != VIR_CGROUP_CONTROLLER_CPUACCT)) {
             /* treat it as unmounted and we can use virCgroupAddTask */
             VIR_FREE(group->controllers[i].mountPoint);
             continue;
Is this hunk necessary? In other words, don't we just gracefully skip making per-vcpu groups for any controller that was not mounted, or do we loudly fail if all of these controllers are not present?

--
Eric Blake   eblake@redhat.com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Hi Eric,

On 09/22/2012 03:04 AM, Eric Blake wrote:
-#cgroup_controllers = [ "cpu", "devices", "memory", "blkio", "cpuset", "cpuacct" ]
+# When a cpu or memory is offlined and onlined again, cgroup will not
+# recover any corresponding values. This is a problem in kernel.
+# So DO NOT use cpuset cgroup until this problem is fixed in kernel.
+#cgroup_controllers = [ "cpu", "devices", "memory", "blkio", "cpuacct" ]
We ought to list which versions of the kernel are known to have the bug (that is, say something like 'at least through kernel 3.6'); when we finally know which kernel version fixes the bugs in the cpuset controller, then we can update the text again to give the user better advice on whether they want to be using this.
Sure, I'll add the comment. :)
I'm torn on whether to take this patch prior to 0.10.2. It makes sense to take (that is, the kernel is buggy, and we cause degraded performance as a result of the kernel bug any time the system goes through S3), but it's rather late, having missed rc2, for such a major change in default functionality.

Am I correct, though, that you can manually edit /etc/libvirt/qemu.conf to re-enable the use of cpuset even on a buggy kernel, for further testing the issues?
Well, I think we'd better keep cpuset off as the default. I like the idea below of adding a uname() probe. I will make it possible to re-enable cpuset for testing purposes and resend a new patch. You may take the new patch in a later version if necessary. :)
         driver->cgroupControllers =
             (1 << VIR_CGROUP_CONTROLLER_CPU) |
             (1 << VIR_CGROUP_CONTROLLER_DEVICES) |
             (1 << VIR_CGROUP_CONTROLLER_MEMORY) |
             (1 << VIR_CGROUP_CONTROLLER_BLKIO) |
-            (1 << VIR_CGROUP_CONTROLLER_CPUSET) |
             (1 << VIR_CGROUP_CONTROLLER_CPUACCT);
Rather than hard-coding the non-use of cpuset as our default, can we do something like a uname() probe or some other clue of whether we have a buggy kernel?
As I said above, this is a good idea! I will add the probe. Thanks. :)
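A minimal sketch of what such a uname() probe might look like follows. The kernel release that actually fixes the cpuset bug was not yet known when this thread took place, so the version constants below are placeholders, not real libvirt code:

#include <stdbool.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Placeholders: the first kernel release carrying the cpuset fix was
 * not known at the time of this thread. */
#define CPUSET_FIX_MAJOR 3
#define CPUSET_FIX_MINOR 7

static bool
kernelCpusetIsBuggy(void)
{
    struct utsname uts;
    int major, minor;

    if (uname(&uts) < 0)
        return true;   /* cannot probe, so be conservative */

    if (sscanf(uts.release, "%d.%d", &major, &minor) != 2)
        return true;   /* unparsable release string */

    /* Treat every kernel older than the (assumed) fix as buggy. */
    return major < CPUSET_FIX_MAJOR ||
           (major == CPUSET_FIX_MAJOR && minor < CPUSET_FIX_MINOR);
}

int main(void)
{
    printf("cpuset controller looks %s on this kernel\n",
           kernelCpusetIsBuggy() ? "buggy" : "usable");
    return 0;
}

The driver could then drop VIR_CGROUP_CONTROLLER_CPUSET from the default controller mask only when the probe reports a buggy kernel, while still honoring an explicit cgroup_controllers setting in qemu.conf for testing.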
+++ b/src/util/cgroup.c
@@ -543,8 +543,7 @@ static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group,
         /* We need to control cpu bandwidth for each vcpu now */
         if ((flags & VIR_CGROUP_VCPU) &&
             (i != VIR_CGROUP_CONTROLLER_CPU &&
-             i != VIR_CGROUP_CONTROLLER_CPUACCT &&
-             i != VIR_CGROUP_CONTROLLER_CPUSET)) {
+             i != VIR_CGROUP_CONTROLLER_CPUACCT)) {
             /* treat it as unmounted and we can use virCgroupAddTask */
             VIR_FREE(group->controllers[i].mountPoint);
             continue;
Is this hunk necessary? In other words, don't we just gracefully skip making per-vcpu groups for any controller that was not mounted, or do we loudly fail if all of these controllers are not present?

On 09/22/2012 03:04 AM, Eric Blake wrote:
+++ b/src/util/cgroup.c
@@ -543,8 +543,7 @@ static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group,
         /* We need to control cpu bandwidth for each vcpu now */
         if ((flags & VIR_CGROUP_VCPU) &&
             (i != VIR_CGROUP_CONTROLLER_CPU &&
-             i != VIR_CGROUP_CONTROLLER_CPUACCT &&
-             i != VIR_CGROUP_CONTROLLER_CPUSET)) {
+             i != VIR_CGROUP_CONTROLLER_CPUACCT)) {
             /* treat it as unmounted and we can use virCgroupAddTask */
             VIR_FREE(group->controllers[i].mountPoint);
             continue;
Is this hunk necessary? In other words, don't we just gracefully skip making per-vcpu groups for any controller that was not mounted, or do we loudly fail if all of these controllers are not present?
We have already checked whether the controller was mounted earlier in this function. If these controllers are not present, the function does not fail at this point; it will fail later, and that does not cause any problem. So I agree that this function should fail if all of these controllers are absent; that should be fixed in another patch.

Thanks,
Wen Congyang
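To make Wen's point concrete, here is a small standalone illustration of the "treat it as unmounted" skip in virCgroupMakeGroup(); the struct and constants are simplified stand-ins, not libvirt's real definitions:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { CTRL_CPU, CTRL_CPUACCT, CTRL_CPUSET, CTRL_LAST };

struct controller {
    char *mountPoint;   /* NULL means "not mounted": no group gets made */
};

int main(void)
{
    struct controller ctrls[CTRL_LAST] = {
        { strdup("/cgroup/cpu") },
        { strdup("/cgroup/cpuacct") },
        { strdup("/cgroup/cpuset") },
    };
    int making_vcpu_group = 1;   /* stands in for flags & VIR_CGROUP_VCPU */

    for (int i = 0; i < CTRL_LAST; i++) {
        /* With the patch applied, cpuset is no longer exempted here, so
         * per-vcpu groups are made only for cpu and cpuacct. */
        if (making_vcpu_group && i != CTRL_CPU && i != CTRL_CPUACCT) {
            free(ctrls[i].mountPoint);   /* treat it as unmounted */
            ctrls[i].mountPoint = NULL;
            continue;
        }
        printf("would create a per-vcpu group under %s\n",
               ctrls[i].mountPoint);
    }

    for (int i = 0; i < CTRL_LAST; i++)
        free(ctrls[i].mountPoint);
    return 0;
}

A controller skipped this way simply never gets a per-vcpu directory created for it; nothing fails at this point in the function.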

Hi Eric,

I found that even if I remove cpuset from qemu.conf, the corresponding cgroup directories are still created, such as:

/cgroup
 |-> cpuset
      |-> libvirt
           |-> qemu
           |    |-> vm1
           |-> lxc

Is this behaviour correct? IMHO, if we shut off the cpuset cgroup in qemu.conf, the qemu and vm1 directories should not be created. Am I right?

Thanks. :)
Participants (3): Eric Blake, Tang Chen, Wen Congyang