[libvirt] [Question] qemu cpu pinning

Hi,

When I run a VM (qemu-0.13) on my host with the latest libvirt, I used the following settings:
==
<domain type='kvm' id='1'>
  <name>RHEL6</name>
  <uuid>f7ad6bc3-e82a-1254-efb0-9e1a87d83d88</uuid>
  <memory>2048000</memory>
  <currentMemory>2048000</currentMemory>
  <vcpu cpuset='4-7'>2</vcpu>
==
I expected all work for this domain to be tied to CPUs 4-7.

After a few minutes, I checked the VM's behavior and it shows:
==
[root@bluextal src]# cat /cgroup/cpuacct/libvirt/qemu/RHEL6/cpuacct.usage_percpu
0 511342 3027636 94237 657712515 257104928 513463748303 252386161
==
Hmm, CPUs 1, 2 and 3 are being used for some purpose. All threads for this qemu should be the following:
==
[root@bluextal src]# cat /cgroup/cpuacct/libvirt/qemu/RHEL6/tasks
25707
25727
25728
25729
==
And I found:
==
[root@bluextal src]# grep Cpus /proc/25707/status
Cpus_allowed:   f0
Cpus_allowed_list:      4-7
[root@bluextal src]# grep Cpus /proc/25727/status
Cpus_allowed:   ff
Cpus_allowed_list:      0-7
[root@bluextal src]# grep Cpus /proc/25728/status
Cpus_allowed:   f0
Cpus_allowed_list:      4-7
[root@bluextal src]# grep Cpus /proc/25729/status
Cpus_allowed:   f0
Cpus_allowed_list:      4-7
==
Thread 25727 has no limitation. Is this expected behavior, or do I need more settings in the XML definition?

Thanks,
-Kame
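[The per-thread check performed by hand above can be scripted. A minimal sketch, assuming a Linux host with /proc mounted; the PID argument is illustrative (25707 is the QEMU process from the report) and defaults to the current shell:]

```shell
#!/bin/sh
# Print the CPU affinity of every thread of a process, i.e. the
# /proc/<tid>/status check done by hand above, in one loop.
pid=${1:-$$}
for t in /proc/"$pid"/task/*/; do
    tid=$(basename "$t")
    # Cpus_allowed_list is the human-readable form of the affinity mask
    cpus=$(awk '/^Cpus_allowed_list/ {print $2}' "${t}status")
    echo "tid $tid -> cpus $cpus"
done
```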

On Tue, Jun 21, 2011 at 12:59:32PM +0900, KAMEZAWA Hiroyuki wrote:
Hi,
When I run a VM (qemu-0.13) on my host with the latest libvirt,
[...]
Thread 25727 has no limitation.
Is this expected behavior, or do I need more settings in the XML definition?
Thanks, -Kame
I would understand this as: the KVM threads used for running the guest domain are properly pinned (and indeed the CPU usage for those threads is way higher than for the other ones), but the threads used internally by KVM for I/O processing are not pinned, leading to what you are seeing. But it's a wild guess... Another possibility could be that some kernel I/O or interrupt processing done on behalf of the KVM process is accounted as such, but the kernel doesn't respect the CPU pinning for those.
[root@bluextal src]# grep Cpus /proc/25727/status
Cpus_allowed:   ff
Cpus_allowed_list:      0-7
would clearly favor the first one for me, but it's a wild guess.

Daniel

--
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel@veillard.com  | Rpmfind RPM search engine      http://rpmfind.net/
http://veillard.com/ | virtualization library         http://libvirt.org/

On 2011-06-21 11:59, KAMEZAWA Hiroyuki wrote:
Hi,
When I run a VM (qemu-0.13) on my host with the latest libvirt,
[...]
Thread 25727 has no limitation.
Is this expected behavior, or do I need more settings in the XML definition?
The XML is correct; no further settings are needed. I'm trying to get a NUMA box to have a look.
Thanks, -Kame
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

On Tue, 21 Jun 2011 12:59:32 +0900 KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
Thread 25727 has no limitation.
Is this expected behavior, or do I need more settings in the XML definition?
I tested again, then...

[root@bluextal kamezawa]# cat /cgroup/cpuacct/libvirt/qemu/RHEL6/tasks
15702
15722
15723
15724
[root@bluextal kamezawa]# cat /proc/15722/status
Name:   vhost-15702
State:  S (sleeping)
Tgid:   15722
Pid:    15722
PPid:   2
TracerPid:      0
Uid:    0       0       0       0
Gid:    0       0       0       0
FDSize: 64
Groups:
Threads:        1
SigQ:   3/193032
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: ffffffffffffffff
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: ffffffffffffffff
CapEff: ffffffffffffffff
CapBnd: ffffffffffffffff
Cpus_allowed:   ff
Cpus_allowed_list:      0-7
Mems_allowed:   00000000,00000003
Mems_allowed_list:      0-1
voluntary_ctxt_switches:        288
nonvoluntary_ctxt_switches:     1

The thread was not part of qemu... Hmm, a thread for virtio-net?

Thanks,
-Kame
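[The clues that identify 15722 as a kernel thread here (a PPid of 2, i.e. a child of kthreadd, and no userspace executable behind /proc/<pid>/exe) can be checked mechanically. A sketch, assuming a Linux host; `is_kernel_thread` is a hypothetical helper, not part of libvirt:]

```shell
#!/bin/sh
# Classify a PID as a kernel thread or a user thread. Kernel threads
# (like vhost-15702 above) are children of kthreadd (PID 2) and have no
# userspace executable, so readlink on /proc/<pid>/exe fails for them.
is_kernel_thread() {
    pid=$1
    ppid=$(awk '/^PPid:/ {print $2}' /proc/"$pid"/status)
    if [ "$ppid" = "2" ] && ! readlink /proc/"$pid"/exe >/dev/null 2>&1; then
        echo "kernel thread"
    else
        echo "user thread"
    fi
}
is_kernel_thread $$   # the current shell is a user thread
```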

Currently, libvirt makes use of sched_setaffinity() to set guest processes' CPU affinity. But sometimes, for instance when QEMU uses vhost-net, the kernel part of vhost will create a kernel thread for some purpose. In this case, such a kernel thread won't inherit QEMU's CPU affinity.

This patch enables the cpuset cgroup in libvirt and sets CPU affinity by configuring the cpuset cgroup.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
---
 src/libvirt_private.syms |    1 +
 src/qemu/qemu_cgroup.c   |   22 ++++++++++++++++++++++
 src/qemu/qemu_conf.c     |    3 ++-
 src/util/cgroup.c        |   18 ++++++++++++++++++
 src/util/cgroup.h        |    2 ++
 5 files changed, 45 insertions(+), 1 deletions(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 626ac6c..e7aebc7 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -83,6 +83,7 @@ virCgroupMounted;
 virCgroupPathOfController;
 virCgroupRemove;
 virCgroupSetBlkioWeight;
+virCgroupCpusetSetcpus;
 virCgroupSetCpuShares;
 virCgroupSetFreezerState;
 virCgroupSetMemory;
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 1298924..eb92409 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -296,6 +296,28 @@ int qemuSetupCgroup(struct qemud_driver *driver,
         }
     }
 
+    if (vm->def->cpumask != NULL) {
+        if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) {
+            char *cpumask = NULL;
+            if ((cpumask =
+                 virDomainCpuSetFormat(vm->def->cpumask, vm->def->cpumasklen)) == NULL)
+                goto cleanup;
+
+            rc = virCgroupCpusetSetcpus(cgroup, cpumask);
+            if(rc != 0) {
+                virReportSystemError(-rc,
+                                     _("Unable to set cpus for domain %s"),
+                                     vm->def->name);
+                VIR_FREE(cpumask);
+                goto cleanup;
+            }
+            VIR_FREE(cpumask);
+        } else {
+            qemuReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                            _("Cpuset is not available on this host"));
+        }
+    }
+
     if (vm->def->blkio.weight != 0) {
         if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) {
             rc = virCgroupSetBlkioWeight(cgroup, vm->def->blkio.weight);
diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
index 3d8aba4..8b478c4 100644
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -307,7 +307,8 @@ int qemudLoadDriverConfig(struct qemud_driver *driver,
                                   (1 << VIR_CGROUP_CONTROLLER_CPU) |
                                   (1 << VIR_CGROUP_CONTROLLER_DEVICES) |
                                   (1 << VIR_CGROUP_CONTROLLER_MEMORY) |
-                                  (1 << VIR_CGROUP_CONTROLLER_BLKIO);
+                                  (1 << VIR_CGROUP_CONTROLLER_BLKIO) |
+                                  (1 << VIR_CGROUP_CONTROLLER_CPUSET);
     }
     for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) {
         if (driver->cgroupControllers & (1 << i)) {
diff --git a/src/util/cgroup.c b/src/util/cgroup.c
index 2e5ef46..89b2ad4 100644
--- a/src/util/cgroup.c
+++ b/src/util/cgroup.c
@@ -472,6 +472,24 @@ static int virCgroupCpuSetInherit(virCgroupPtr parent, virCgroupPtr group)
     return rc;
 }
 
+int virCgroupCpusetSetcpus(virCgroupPtr group, char *cpustring)
+{
+    int rc = 0;
+    const char *key = "cpuset.cpus";
+
+    VIR_DEBUG("Cpuset: set %s for %s/%s", cpustring, group->path, key);
+
+    rc = virCgroupSetValueStr(group,
+                              VIR_CGROUP_CONTROLLER_CPUSET,
+                              key,
+                              cpustring);
+
+    if (rc != 0)
+        VIR_ERROR("Failed to set %s for %s/%s", cpustring, group->path, key);
+
+    return rc;
+}
+
 static int virCgroupSetMemoryUseHierarchy(virCgroupPtr group)
 {
     int rc = 0;
diff --git a/src/util/cgroup.h b/src/util/cgroup.h
index 8ae756d..ca6a68a 100644
--- a/src/util/cgroup.h
+++ b/src/util/cgroup.h
@@ -30,6 +30,8 @@ enum {
 
 VIR_ENUM_DECL(virCgroupController);
 
+int virCgroupCpusetSetcpus(virCgroupPtr group, char *cpustring);
+
 int virCgroupForDriver(const char *name,
                        virCgroupPtr *group,
                        int privileged,
--
1.7.1
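[For reference, the mechanism the patch relies on can be exercised by hand: a CPU list written to a cpuset group's cpuset.cpus constrains every task later attached to the group, including kernel threads such as vhost workers, which sched_setaffinity() on the QEMU process cannot reach. A sketch assuming a cgroup-v1 cpuset controller mounted at /cgroup/cpuset, matching the paths in the report; the path and group name are illustrative, and root is required:]

```shell
#!/bin/sh
# Demonstrate cpuset-based pinning outside libvirt. Guarded so it only
# acts when run as root with a v1 cpuset hierarchy at /cgroup/cpuset.
CG=/cgroup/cpuset/demo
if [ "$(id -u)" = "0" ] && [ -d /cgroup/cpuset ]; then
    mkdir -p "$CG"
    echo 4-7 > "$CG/cpuset.cpus"    # CPUs the group may run on
    echo 0   > "$CG/cpuset.mems"    # cpuset also requires a memory node
    echo $$  > "$CG/tasks"          # move this shell into the group
    grep Cpus_allowed_list /proc/self/status
else
    echo "needs root and a mounted cpuset controller; skipping"
fi
```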

Since we now control guest CPU affinity by using the cpuset cgroup, get rid of this part.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
---
 src/qemu/qemu_process.c |    6 ------
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 88a31a3..079666f 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -2056,12 +2056,6 @@ static int qemuProcessHook(void *data)
     if (qemuAddToCgroup(h->driver, h->vm->def) < 0)
         goto cleanup;
 
-    /* This must be done after cgroup placement to avoid resetting CPU
-     * affinity */
-    VIR_DEBUG("Setup CPU affinity");
-    if (qemuProcessInitCpuAffinity(h->vm) < 0)
-        goto cleanup;
-
     if (qemuProcessInitNumaMemoryPolicy(h->vm) < 0)
         return -1;
--
1.7.1

On Thu, Jun 30, 2011 at 11:16:27AM +0800, Gui Jianfeng wrote:
Since we now control guest CPU affinity by using the cpuset cgroup, get rid of this part.
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
---
 src/qemu/qemu_process.c |    6 ------
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 88a31a3..079666f 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -2056,12 +2056,6 @@ static int qemuProcessHook(void *data)
     if (qemuAddToCgroup(h->driver, h->vm->def) < 0)
         goto cleanup;
 
-    /* This must be done after cgroup placement to avoid resetting CPU
-     * affinity */
-    VIR_DEBUG("Setup CPU affinity");
-    if (qemuProcessInitCpuAffinity(h->vm) < 0)
-        goto cleanup;
-
     if (qemuProcessInitNumaMemoryPolicy(h->vm) < 0)
         return -1;
This will cause a regression for anyone who hasn't got cgroups support in their kernel, or who has not mounted the cpuset controller. I don't believe there should be any harm in just leaving this in place.

Daniel

--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Thu, 30 Jun 2011 11:08:32 +0800 Gui Jianfeng <guijianfeng@cn.fujitsu.com> wrote:
Currently, libvirt makes use of sched_setaffinity() to set guest processes' CPU affinity. But sometimes, for instance when QEMU uses vhost-net, the kernel part of vhost will create a kernel thread for some purpose. In this case, such a kernel thread won't inherit QEMU's CPU affinity.
Can that issue be fixed by cpuset?

Thanks,
-Kame

KAMEZAWA Hiroyuki wrote:
On Thu, 30 Jun 2011 11:08:32 +0800 Gui Jianfeng <guijianfeng@cn.fujitsu.com> wrote:
Currently, libvirt makes use of sched_setaffinity() to set guest processes' CPU affinity. But sometimes, for instance when QEMU uses vhost-net, the kernel part of vhost will create a kernel thread for some purpose. In this case, such a kernel thread won't inherit QEMU's CPU affinity.
Can that issue be fixed by cpuset?
Yes, I think so.

Thanks,
Gui
--
Regards
Gui Jianfeng

On Thu, Jun 30, 2011 at 11:08:32AM +0800, Gui Jianfeng wrote:
Currently, libvirt makes use of sched_setaffinity() to set guest processes' CPU affinity. But sometimes, for instance when QEMU uses vhost-net, the kernel part of vhost will create a kernel thread for some purpose. In this case, such a kernel thread won't inherit QEMU's CPU affinity.
This patch enables the cpuset cgroup in libvirt and sets CPU affinity by configuring the cpuset cgroup.
[...]
+    if (vm->def->cpumask != NULL) {
+        if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) {
+            char *cpumask = NULL;
+            if ((cpumask =
+                 virDomainCpuSetFormat(vm->def->cpumask, vm->def->cpumasklen)) == NULL)
+                goto cleanup;
+
+            rc = virCgroupCpusetSetcpus(cgroup, cpumask);
+            if(rc != 0) {
+                virReportSystemError(-rc,
+                                     _("Unable to set cpus for domain %s"),
+                                     vm->def->name);
+                VIR_FREE(cpumask);
+                goto cleanup;
+            }
+            VIR_FREE(cpumask);
+        } else {
+            qemuReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                            _("Cpuset is not available on this host"));
This is an effective regression for any existing deployments which are not using cgroups, or do not have the cpuset controller mounted. IMHO, this 'else' clause should just be removed, to allow the existing cpu affinity code to run normally.

Daniel

--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 2011-6-30 17:34, Daniel P. Berrange wrote:
On Thu, Jun 30, 2011 at 11:08:32AM +0800, Gui Jianfeng wrote:
Currently, libvirt makes use of sched_setaffinity() to set guest processes' CPU affinity. But sometimes, for instance when QEMU uses vhost-net, the kernel part of vhost will create a kernel thread for some purpose. In this case, such a kernel thread won't inherit QEMU's CPU affinity.
This patch enables the cpuset cgroup in libvirt and sets CPU affinity by configuring the cpuset cgroup.
[...]
+    if (vm->def->cpumask != NULL) {
+        if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) {
+            char *cpumask = NULL;
+            if ((cpumask =
+                 virDomainCpuSetFormat(vm->def->cpumask, vm->def->cpumasklen)) == NULL)
+                goto cleanup;
+
+            rc = virCgroupCpusetSetcpus(cgroup, cpumask);
+            if(rc != 0) {
+                virReportSystemError(-rc,
+                                     _("Unable to set cpus for domain %s"),
+                                     vm->def->name);
+                VIR_FREE(cpumask);
+                goto cleanup;
+            }
+            VIR_FREE(cpumask);
+        } else {
+            qemuReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                            _("Cpuset is not available on this host"));

This is an effective regression for any existing deployments which are not using cgroups, or do not have the cpuset controller mounted.
IMHO, this 'else' clause should just be removed, to allow the existing cpu affinity code to run normally.
Hmm... yes, I think so. Will update.

Gui

Currently, libvirt makes use of sched_setaffinity() to set a guest's CPU affinity. But sometimes, for instance when QEMU uses vhost-net, the kernel part of vhost will create a kernel thread for some purpose. In this case, such a kernel thread won't inherit QEMU's CPU affinity.

This patch enables the cpuset cgroup in libvirt and sets CPU affinity by configuring the cpuset cgroup.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
---
 src/libvirt_private.syms |    1 +
 src/qemu/qemu_cgroup.c   |   18 ++++++++++++++++++
 src/qemu/qemu_conf.c     |    3 ++-
 src/util/cgroup.c        |   18 ++++++++++++++++++
 src/util/cgroup.h        |    2 ++
 5 files changed, 41 insertions(+), 1 deletions(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 626ac6c..e7aebc7 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -83,6 +83,7 @@ virCgroupMounted;
 virCgroupPathOfController;
 virCgroupRemove;
 virCgroupSetBlkioWeight;
+virCgroupCpusetSetcpus;
 virCgroupSetCpuShares;
 virCgroupSetFreezerState;
 virCgroupSetMemory;
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 1298924..413bee4 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -296,6 +296,24 @@ int qemuSetupCgroup(struct qemud_driver *driver,
         }
     }
 
+    if (vm->def->cpumask != NULL &&
+        qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) {
+        char *cpumask = NULL;
+        if ((cpumask =
+             virDomainCpuSetFormat(vm->def->cpumask, vm->def->cpumasklen)) == NULL)
+            goto cleanup;
+
+        rc = virCgroupCpusetSetcpus(cgroup, cpumask);
+        if(rc != 0) {
+            virReportSystemError(-rc,
+                                 _("Unable to set cpus for domain %s"),
+                                 vm->def->name);
+            VIR_FREE(cpumask);
+            goto cleanup;
+        }
+        VIR_FREE(cpumask);
+    }
+
     if (vm->def->blkio.weight != 0) {
         if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_BLKIO)) {
             rc = virCgroupSetBlkioWeight(cgroup, vm->def->blkio.weight);
diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
index 3d8aba4..8b478c4 100644
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -307,7 +307,8 @@ int qemudLoadDriverConfig(struct qemud_driver *driver,
                                   (1 << VIR_CGROUP_CONTROLLER_CPU) |
                                   (1 << VIR_CGROUP_CONTROLLER_DEVICES) |
                                   (1 << VIR_CGROUP_CONTROLLER_MEMORY) |
-                                  (1 << VIR_CGROUP_CONTROLLER_BLKIO);
+                                  (1 << VIR_CGROUP_CONTROLLER_BLKIO) |
+                                  (1 << VIR_CGROUP_CONTROLLER_CPUSET);
     }
     for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) {
         if (driver->cgroupControllers & (1 << i)) {
diff --git a/src/util/cgroup.c b/src/util/cgroup.c
index 2e5ef46..89b2ad4 100644
--- a/src/util/cgroup.c
+++ b/src/util/cgroup.c
@@ -472,6 +472,24 @@ static int virCgroupCpuSetInherit(virCgroupPtr parent, virCgroupPtr group)
     return rc;
 }
 
+int virCgroupCpusetSetcpus(virCgroupPtr group, char *cpustring)
+{
+    int rc = 0;
+    const char *key = "cpuset.cpus";
+
+    VIR_DEBUG("Cpuset: set %s for %s/%s", cpustring, group->path, key);
+
+    rc = virCgroupSetValueStr(group,
+                              VIR_CGROUP_CONTROLLER_CPUSET,
+                              key,
+                              cpustring);
+
+    if (rc != 0)
+        VIR_ERROR("Failed to set %s for %s/%s", cpustring, group->path, key);
+
+    return rc;
+}
+
 static int virCgroupSetMemoryUseHierarchy(virCgroupPtr group)
 {
     int rc = 0;
diff --git a/src/util/cgroup.h b/src/util/cgroup.h
index 8ae756d..ca6a68a 100644
--- a/src/util/cgroup.h
+++ b/src/util/cgroup.h
@@ -30,6 +30,8 @@ enum {
 
 VIR_ENUM_DECL(virCgroupController);
 
+int virCgroupCpusetSetcpus(virCgroupPtr group, char *cpustring);
+
 int virCgroupForDriver(const char *name,
                        virCgroupPtr *group,
                        int privileged,
--
1.7.1

On 2011-7-2 9:42, Gui Jianfeng wrote:
Currently, libvirt makes use of sched_setaffinity() to set a guest's CPU affinity. But sometimes, for instance when QEMU uses vhost-net, the kernel part of vhost will create a kernel thread for some purpose. In this case, such a kernel thread won't inherit QEMU's CPU affinity.
This patch enables the cpuset cgroup in libvirt and sets CPU affinity by configuring the cpuset cgroup.
Ping? Gui
Participants (5):
- Daniel P. Berrange
- Daniel Veillard
- Gui Jianfeng
- KAMEZAWA Hiroyuki
- Osier Yang