[libvirt] [PATCH v4 00/12] Support hypervisor-threads-pin in vcpupin.

Users can use the vcpupin command to bind a vcpu thread to a specific
physical cpu. But besides vcpu threads, there are also some other
threads created by qemu (known as hypervisor threads) that could not
be explicitly bound to physical cpus.

The first 3 patches are from Wen Congyang, which implement the cgroup
infrastructure for hypervisor threads. The other 9 patches implement
hypervisor thread binding, in two ways:
1) using the sched_setaffinity() function;
2) using the cpuset cgroup.

A new xml element is introduced, and the vcpupin command is improved,
see below.

1. Introduce a new xml element:
   <cputune>
     ......
     <hypervisorpin cpuset='1'/>
   </cputune>

2. Improve the vcpupin command to support hypervisor thread binding.
   For example, vm1 has the following configuration:

   <cputune>
     <vcpupin vcpu='1' cpuset='1'/>
     <vcpupin vcpu='0' cpuset='0'/>
     <hypervisorpin cpuset='1'/>
   </cputune>

   1) query all threads' pinning

   # vcpupin vm1
   VCPU: CPU Affinity
   ----------------------------------
      0: 0
      1: 1
   Hypervisor: CPU Affinity
   ----------------------------------
      *: 1

   2) query hypervisor threads' pinning only

   # vcpupin vm1 --hypervisor
   Hypervisor: CPU Affinity
   ----------------------------------
      *: 1

   3) change hypervisor threads' pinning

   # vcpupin vm1 --hypervisor 0-1
   # vcpupin vm1 --hypervisor
   Hypervisor: CPU Affinity
   ----------------------------------
      *: 0-1

   # taskset -p 397
   pid 397's current affinity mask: 3

Note: if users want to pin a vcpu thread to a pcpu, the --vcpu option
can no longer be omitted.

Tang Chen (9):
  Enable cpuset cgroup and synchronize vcpupin info to cgroup.
  Support hypervisorpin xml parse.
  Introduce qemuSetupCgroupHypervisorPin and synchronize hypervisorpin
    info to cgroup.
  Add qemuProcessSetHypervisorAffinites and set hypervisor threads
    affinities.
  Introduce virDomainHypervisorPinAdd and virDomainHypervisorPinDel
    functions.
  Introduce virDomainPinHypervisorFlags and
    virDomainGetHypervisorPinInfo functions.
  Introduce qemudDomainPinHypervisorFlags and
    qemudDomainGetHypervisorPinInfo in qemu driver.
  Introduce remoteDomainPinHypervisorFlags and
    remoteDomainGetHypervisorPinInfo functions in remote driver.
  Improve vcpupin to support hypervisorpin dynamically.

Wen Congyang (3):
  Introduce the function virCgroupForHypervisor
  Introduce the function virCgroupMoveTask
  create a new cgroup and move all hypervisor threads to the new cgroup

 daemon/remote.c                                 |  103 +++++++++
 docs/schemas/domaincommon.rng                   |    7 +
 include/libvirt/libvirt.h.in                    |   10 +
 src/conf/domain_conf.c                          |  169 +++++++++++++-
 src/conf/domain_conf.h                          |    7 +
 src/driver.h                                    |   13 +-
 src/libvirt.c                                   |  147 +++++++++++++
 src/libvirt_private.syms                        |    7 +
 src/libvirt_public.syms                         |    2 +
 src/qemu/qemu_cgroup.c                          |  151 ++++++++++++-
 src/qemu/qemu_cgroup.h                          |    5 +
 src/qemu/qemu_driver.c                          |  267 ++++++++++++++++++++++-
 src/qemu/qemu_process.c                         |   60 ++++-
 src/remote/remote_driver.c                      |  102 +++++++++
 src/remote/remote_protocol.x                    |   23 +-
 src/remote_protocol-structs                     |   24 ++
 src/util/cgroup.c                               |  188 +++++++++++++++-
 src/util/cgroup.h                               |   15 ++
 tests/qemuxml2argvdata/qemuxml2argv-cputune.xml |    1 +
 tests/vcpupin                                   |    6 +-
 tools/virsh.c                                   |  147 ++++++++-----
 tools/virsh.pod                                 |   16 +-
 22 files changed, 1393 insertions(+), 77 deletions(-)

--
1.7.10.2

--
Best Regards,
Tang Chen
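As background for the two approaches: sched_setaffinity() installs a
per-thread mask in the kernel scheduler, while the cpuset cgroup
constrains every task in a group through a file in the cgroup
filesystem. A minimal standalone sketch of both, assuming a cpuset
mount at /cgroup/cpuset (the path is an assumption and varies by
distro; "vm1" is the example domain from above):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    cpu_set_t set;
    FILE *fp;

    /* 1) per-thread mask in the scheduler (what vcpupin uses today) */
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    CPU_SET(1, &set);
    if (sched_setaffinity(getpid(), sizeof(set), &set) < 0)
        perror("sched_setaffinity");

    /* 2) group-wide constraint via the cpuset cgroup; the path below
     *    is illustrative only */
    if ((fp = fopen("/cgroup/cpuset/libvirt/qemu/vm1/cpuset.cpus", "w"))) {
        fprintf(fp, "0-1\n");   /* same meaning as cpuset='0-1' */
        fclose(fp);
    }
    return 0;
}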

From: Wen Congyang <wency@cn.fujitsu.com>

Introduce the function virCgroupForHypervisor() to create a sub
directory for hypervisor threads (including I/O threads and vhost-net
threads).

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 src/libvirt_private.syms |    1 +
 src/util/cgroup.c        |   42 ++++++++++++++++++++++++++++++++++++++++++
 src/util/cgroup.h        |    4 ++++
 3 files changed, 47 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 734c881..d2c2844 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -71,6 +71,7 @@ virCgroupDenyDeviceMajor;
 virCgroupDenyDevicePath;
 virCgroupForDomain;
 virCgroupForDriver;
+virCgroupForHypervisor;
 virCgroupForVcpu;
 virCgroupFree;
 virCgroupGetBlkioWeight;
diff --git a/src/util/cgroup.c b/src/util/cgroup.c
index 5b32881..1ac8278 100644
--- a/src/util/cgroup.c
+++ b/src/util/cgroup.c
@@ -946,6 +946,48 @@ int virCgroupForVcpu(virCgroupPtr driver ATTRIBUTE_UNUSED,
 #endif

 /**
+ * virCgroupForHypervisor:
+ *
+ * @driver: group for the domain
+ * @group: Pointer to returned virCgroupPtr
+ *
+ * Returns: 0 on success or -errno on failure
+ */
+#if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R
+int virCgroupForHypervisor(virCgroupPtr driver,
+                           virCgroupPtr *group,
+                           int create)
+{
+    int rc;
+    char *path;
+
+    if (driver == NULL)
+        return -EINVAL;
+
+    if (virAsprintf(&path, "%s/hypervisor", driver->path) < 0)
+        return -ENOMEM;
+
+    rc = virCgroupNew(path, group);
+    VIR_FREE(path);
+
+    if (rc == 0) {
+        rc = virCgroupMakeGroup(driver, *group, create, VIR_CGROUP_VCPU);
+        if (rc != 0)
+            virCgroupFree(group);
+    }
+
+    return rc;
+}
+#else
+int virCgroupForHypervisor(virCgroupPtr driver ATTRIBUTE_UNUSED,
+                           virCgroupPtr *group ATTRIBUTE_UNUSED,
+                           int create ATTRIBUTE_UNUSED)
+{
+    return -ENXIO;
+}
+
+#endif
+/**
  * virCgroupSetBlkioWeight:
  *
  * @group: The cgroup to change io weight for
diff --git a/src/util/cgroup.h b/src/util/cgroup.h
index 05325ae..315ebd6 100644
--- a/src/util/cgroup.h
+++ b/src/util/cgroup.h
@@ -47,6 +47,10 @@ int virCgroupForVcpu(virCgroupPtr driver,
                      virCgroupPtr *group,
                      int create);

+int virCgroupForHypervisor(virCgroupPtr driver,
+                           virCgroupPtr *group,
+                           int create);
+
 int virCgroupPathOfController(virCgroupPtr group,
                               int controller,
                               const char *key,
--
1.7.10.2
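For context, the intended usage pattern is to look up the per-domain
group first and then create the "hypervisor" child beneath it, roughly
as below (a sketch with error reporting trimmed; patch 3 of this
series does this for real):

    virCgroupPtr cgroup = NULL;
    virCgroupPtr cgroup_hypervisor = NULL;
    int rc;

    /* look up the existing per-domain group ... */
    rc = virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0);
    if (rc == 0) {
        /* ... then create "<domain path>/hypervisor" beneath it */
        rc = virCgroupForHypervisor(cgroup, &cgroup_hypervisor, 1 /* create */);
    }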

On Wed, Jul 25, 2012 at 01:20:34PM +0800, tangchen wrote:
From: Wen Congyang <wency@cn.fujitsu.com>
Introduce the function virCgroupForHypervisor() to create a sub
directory for hypervisor threads (including I/O threads and vhost-net
threads).
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 src/libvirt_private.syms |    1 +
 src/util/cgroup.c        |   42 ++++++++++++++++++++++++++++++++++++++++++
 src/util/cgroup.h        |    4 ++++
 3 files changed, 47 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 734c881..d2c2844 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -71,6 +71,7 @@ virCgroupDenyDeviceMajor;
 virCgroupDenyDevicePath;
 virCgroupForDomain;
 virCgroupForDriver;
+virCgroupForHypervisor;
 virCgroupForVcpu;
 virCgroupFree;
 virCgroupGetBlkioWeight;
diff --git a/src/util/cgroup.c b/src/util/cgroup.c
index 5b32881..1ac8278 100644
--- a/src/util/cgroup.c
+++ b/src/util/cgroup.c
@@ -946,6 +946,48 @@ int virCgroupForVcpu(virCgroupPtr driver ATTRIBUTE_UNUSED,
 #endif
 /**
+ * virCgroupForHypervisor:
+ *
+ * @driver: group for the domain
+ * @group: Pointer to returned virCgroupPtr
+ *
+ * Returns: 0 on success or -errno on failure
+ */
+#if defined HAVE_MNTENT_H && defined HAVE_GETMNTENT_R
+int virCgroupForHypervisor(virCgroupPtr driver,
+                           virCgroupPtr *group,
+                           int create)
+{
+    int rc;
+    char *path;
+
+    if (driver == NULL)
+        return -EINVAL;
+
+    if (virAsprintf(&path, "%s/hypervisor", driver->path) < 0)
+        return -ENOMEM;
+
+    rc = virCgroupNew(path, group);
+    VIR_FREE(path);
+
+    if (rc == 0) {
+        rc = virCgroupMakeGroup(driver, *group, create, VIR_CGROUP_VCPU);
+        if (rc != 0)
+            virCgroupFree(group);
+    }
+
+    return rc;
+}
+#else
+int virCgroupForHypervisor(virCgroupPtr driver ATTRIBUTE_UNUSED,
+                           virCgroupPtr *group ATTRIBUTE_UNUSED,
+                           int create ATTRIBUTE_UNUSED)
+{
+    return -ENXIO;
+}
+
+#endif
+/**
  * virCgroupSetBlkioWeight:
  *
  * @group: The cgroup to change io weight for
diff --git a/src/util/cgroup.h b/src/util/cgroup.h
index 05325ae..315ebd6 100644
--- a/src/util/cgroup.h
+++ b/src/util/cgroup.h
@@ -47,6 +47,10 @@ int virCgroupForVcpu(virCgroupPtr driver,
                      virCgroupPtr *group,
                      int create);
+int virCgroupForHypervisor(virCgroupPtr driver,
+                           virCgroupPtr *group,
+                           int create);
+
 int virCgroupPathOfController(virCgroupPtr group,
                               int controller,
                               const char *key,
ACK.

--
Thanks,
Hu Tao

From: Wen Congyang <wency@cn.fujitsu.com>

Introduce a new API to move tasks of one controller from one cgroup to
another cgroup.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 src/libvirt_private.syms |    2 +
 src/util/cgroup.c        |  111 ++++++++++++++++++++++++++++++++++++++++++++++
 src/util/cgroup.h        |    8 ++++
 3 files changed, 121 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index d2c2844..738c70f 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -60,6 +60,7 @@ virCapabilitiesSetMacPrefix;

 # cgroup.h
 virCgroupAddTask;
+virCgroupAddTaskController;
 virCgroupAllowDevice;
 virCgroupAllowDeviceMajor;
 virCgroupAllowDevicePath;
@@ -91,6 +92,7 @@ virCgroupKill;
 virCgroupKillPainfully;
 virCgroupKillRecursive;
 virCgroupMounted;
+virCgroupMoveTask;
 virCgroupPathOfController;
 virCgroupRemove;
 virCgroupSetBlkioDeviceWeight;
diff --git a/src/util/cgroup.c b/src/util/cgroup.c
index 1ac8278..d94f07b 100644
--- a/src/util/cgroup.c
+++ b/src/util/cgroup.c
@@ -791,6 +791,117 @@ int virCgroupAddTask(virCgroupPtr group, pid_t pid)
     return rc;
 }

+/**
+ * virCgroupAddTaskController:
+ *
+ * @group: The cgroup to add a task to
+ * @pid: The pid of the task to add
+ * @controller: The cgroup controller to be operated on
+ *
+ * Returns: 0 on success or -errno on failure
+ */
+int virCgroupAddTaskController(virCgroupPtr group, pid_t pid, int controller)
+{
+    if (controller < VIR_CGROUP_CONTROLLER_CPU ||
+        controller > VIR_CGROUP_CONTROLLER_BLKIO)
+        return -EINVAL;
+
+    if (!group->controllers[controller].mountPoint)
+        return -EINVAL;
+
+    return virCgroupSetValueU64(group, controller, "tasks",
+                                (unsigned long long)pid);
+}
+
+
+static int virCgroupAddTaskStrController(virCgroupPtr group,
+                                         const char *pidstr,
+                                         int controller)
+{
+    char *str = NULL, *cur = NULL, *next = NULL;
+    unsigned long long pid = 0;
+    int rc = 0;
+
+    if (virAsprintf(&str, "%s", pidstr) < 0)
+        return -1;
+
+    cur = str;
+    while ((next = strchr(cur, '\n')) != NULL) {
+        *next = '\0';
+        rc = virStrToLong_ull(cur, NULL, 10, &pid);
+        if (rc != 0)
+            goto cleanup;
+        cur = next + 1;
+
+        rc = virCgroupAddTaskController(group, pid, controller);
+        if (rc != 0)
+            goto cleanup;
+    }
+    if (*cur != '\0') {
+        rc = virStrToLong_ull(cur, NULL, 10, &pid);
+        if (rc != 0)
+            goto cleanup;
+        rc = virCgroupAddTaskController(group, pid, controller);
+        if (rc != 0)
+            goto cleanup;
+    }
+
+cleanup:
+    VIR_FREE(str);
+    return rc;
+}
+
+/**
+ * virCgroupMoveTask:
+ *
+ * @src_group: The source cgroup where all tasks are removed from
+ * @dest_group: The destination where all tasks are added to
+ * @controller: The cgroup controller to be operated on
+ *
+ * Returns: 0 on success or -errno on failure
+ */
+int virCgroupMoveTask(virCgroupPtr src_group, virCgroupPtr dest_group,
+                      int controller)
+{
+    int rc = 0, err = 0;
+    char *content = NULL;
+
+    if (controller < VIR_CGROUP_CONTROLLER_CPU ||
+        controller > VIR_CGROUP_CONTROLLER_BLKIO)
+        return -EINVAL;
+
+    if (!src_group->controllers[controller].mountPoint ||
+        !dest_group->controllers[controller].mountPoint) {
+        VIR_WARN("no vm cgroup in controller %d", controller);
+        return 0;
+    }
+
+    rc = virCgroupGetValueStr(src_group, controller, "tasks", &content);
+    if (rc != 0)
+        return rc;
+
+    rc = virCgroupAddTaskStrController(dest_group, content, controller);
+    if (rc != 0)
+        goto cleanup;
+
+    VIR_FREE(content);
+
+    return 0;
+
+cleanup:
+    /*
+     * We don't need to recover dest_group because cgroup will make sure
+     * that one task only resides in one cgroup of the same controller.
+     */
+    err = virCgroupAddTaskStrController(src_group, content, controller);
+    if (err != 0)
+        VIR_ERROR(_("Cannot recover cgroup %s from %s"),
+                  src_group->controllers[controller].mountPoint,
+                  dest_group->controllers[controller].mountPoint);
+    VIR_FREE(content);
+
+    return rc;
+}

 /**
  * virCgroupForDriver:
diff --git a/src/util/cgroup.h b/src/util/cgroup.h
index 315ebd6..f14c167 100644
--- a/src/util/cgroup.h
+++ b/src/util/cgroup.h
@@ -58,6 +58,14 @@ int virCgroupPathOfController(virCgroupPtr group,

 int virCgroupAddTask(virCgroupPtr group, pid_t pid);

+int virCgroupAddTaskController(virCgroupPtr group,
+                               pid_t pid,
+                               int controller);
+
+int virCgroupMoveTask(virCgroupPtr src_group,
+                      virCgroupPtr dest_group,
+                      int controller);
+
 int virCgroupSetBlkioWeight(virCgroupPtr group, unsigned int weight);
 int virCgroupGetBlkioWeight(virCgroupPtr group, unsigned int *weight);
--
1.7.10.2
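For intuition: the "tasks" file read and written above holds one pid
per line, and the kernel migrates a task on each individual write. A
standalone sketch of the same move outside the libvirt helpers
(hypothetical, minimal error handling):

#include <stdio.h>

/* Move every task listed in src "tasks" into dest "tasks" for one
 * controller.  The kernel moves a task on each write; a task lives in
 * exactly one cgroup per controller, so no explicit removal is needed. */
static int move_tasks(const char *src_tasks, const char *dest_tasks)
{
    FILE *in = fopen(src_tasks, "r");
    FILE *out = fopen(dest_tasks, "w");
    unsigned long long pid;
    int rc = -1;

    if (!in || !out)
        goto cleanup;
    while (fscanf(in, "%llu", &pid) == 1) {
        fprintf(out, "%llu\n", pid);
        fflush(out);   /* each pid must go to the kernel as its own write */
    }
    rc = 0;
cleanup:
    if (in) fclose(in);
    if (out) fclose(out);
    return rc;
}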

On Wed, Jul 25, 2012 at 01:21:44PM +0800, tangchen wrote:
From: Wen Congyang <wency@cn.fujitsu.com>
Introduce a new API to move tasks of one controller from one cgroup to
another cgroup.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 src/libvirt_private.syms |    2 +
 src/util/cgroup.c        |  111 ++++++++++++++++++++++++++++++++++++++++++++++
 src/util/cgroup.h        |    8 ++++
 3 files changed, 121 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index d2c2844..738c70f 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -60,6 +60,7 @@ virCapabilitiesSetMacPrefix;
 # cgroup.h
 virCgroupAddTask;
+virCgroupAddTaskController;
 virCgroupAllowDevice;
 virCgroupAllowDeviceMajor;
 virCgroupAllowDevicePath;
@@ -91,6 +92,7 @@ virCgroupKill;
 virCgroupKillPainfully;
 virCgroupKillRecursive;
 virCgroupMounted;
+virCgroupMoveTask;
 virCgroupPathOfController;
 virCgroupRemove;
 virCgroupSetBlkioDeviceWeight;
diff --git a/src/util/cgroup.c b/src/util/cgroup.c
index 1ac8278..d94f07b 100644
--- a/src/util/cgroup.c
+++ b/src/util/cgroup.c
@@ -791,6 +791,117 @@ int virCgroupAddTask(virCgroupPtr group, pid_t pid)
     return rc;
 }
+/**
+ * virCgroupAddTaskController:
+ *
+ * @group: The cgroup to add a task to
+ * @pid: The pid of the task to add
+ * @controller: The cgroup controller to be operated on
+ *
+ * Returns: 0 on success or -errno on failure
+ */
+int virCgroupAddTaskController(virCgroupPtr group, pid_t pid, int controller)
+{
+    if (controller < VIR_CGROUP_CONTROLLER_CPU ||
+        controller > VIR_CGROUP_CONTROLLER_BLKIO)
+        return -EINVAL;
+
+    if (!group->controllers[controller].mountPoint)
+        return -EINVAL;
+
+    return virCgroupSetValueU64(group, controller, "tasks",
+                                (unsigned long long)pid);
+}
+
+
+static int virCgroupAddTaskStrController(virCgroupPtr group,
+                                         const char *pidstr,
+                                         int controller)
+{
+    char *str = NULL, *cur = NULL, *next = NULL;
+    unsigned long long pid = 0;
+    int rc = 0;
+
+    if (virAsprintf(&str, "%s", pidstr) < 0)
+        return -1;
+
+    cur = str;
+    while ((next = strchr(cur, '\n')) != NULL) {
+        *next = '\0';
+        rc = virStrToLong_ull(cur, NULL, 10, &pid);
+        if (rc != 0)
At least give a warning message here, because some pids may have been
added successfully while others failed.
+            goto cleanup;
+        cur = next + 1;
+
+        rc = virCgroupAddTaskController(group, pid, controller);
+        if (rc != 0)
Same as above.
+            goto cleanup;
+    }
+    if (*cur != '\0') {
+        rc = virStrToLong_ull(cur, NULL, 10, &pid);
+        if (rc != 0)
Same as above.
+            goto cleanup;
+        rc = virCgroupAddTaskController(group, pid, controller);
+        if (rc != 0)
Same as above.
+            goto cleanup;
+    }
+
+cleanup:
+    VIR_FREE(str);
+    return rc;
+}
+
+/**
+ * virCgroupMoveTask:
+ *
+ * @src_group: The source cgroup where all tasks are removed from
+ * @dest_group: The destination where all tasks are added to
+ * @controller: The cgroup controller to be operated on
+ *
+ * Returns: 0 on success or -errno on failure
+ */
+int virCgroupMoveTask(virCgroupPtr src_group, virCgroupPtr dest_group,
+                      int controller)
+{
+    int rc = 0, err = 0;
+    char *content = NULL;
+
+    if (controller < VIR_CGROUP_CONTROLLER_CPU ||
+        controller > VIR_CGROUP_CONTROLLER_BLKIO)
+        return -EINVAL;
+
+    if (!src_group->controllers[controller].mountPoint ||
+        !dest_group->controllers[controller].mountPoint) {
+        VIR_WARN("no vm cgroup in controller %d", controller);
+        return 0;
+    }
+
+    rc = virCgroupGetValueStr(src_group, controller, "tasks", &content);
+    if (rc != 0)
+        return rc;
+
+    rc = virCgroupAddTaskStrController(dest_group, content, controller);
+    if (rc != 0)
+        goto cleanup;
+
+    VIR_FREE(content);
+
+    return 0;
+
+cleanup:
+    /*
+     * We don't need to recover dest_group because cgroup will make sure
+     * that one task only resides in one cgroup of the same controller.
+     */
+    err = virCgroupAddTaskStrController(src_group, content, controller);
+    if (err != 0)
+        VIR_ERROR(_("Cannot recover cgroup %s from %s"),
+                  src_group->controllers[controller].mountPoint,
+                  dest_group->controllers[controller].mountPoint);
+    VIR_FREE(content);
+
+    return rc;
+}
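To illustrate the request above: each failure branch could warn before
bailing out, e.g. (a sketch only; the message text is illustrative):

    rc = virCgroupAddTaskController(group, pid, controller);
    if (rc != 0) {
        /* earlier pids in the list may already have been moved */
        VIR_WARN("failed to add pid %llu to cgroup, some pids may "
                 "already have been moved", pid);
        goto cleanup;
    }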
 /**
  * virCgroupForDriver:
diff --git a/src/util/cgroup.h b/src/util/cgroup.h
index 315ebd6..f14c167 100644
--- a/src/util/cgroup.h
+++ b/src/util/cgroup.h
@@ -58,6 +58,14 @@ int virCgroupPathOfController(virCgroupPtr group,
int virCgroupAddTask(virCgroupPtr group, pid_t pid);
+int virCgroupAddTaskController(virCgroupPtr group,
+                               pid_t pid,
+                               int controller);
+
+int virCgroupMoveTask(virCgroupPtr src_group,
+                      virCgroupPtr dest_group,
+                      int controller);
+
 int virCgroupSetBlkioWeight(virCgroupPtr group, unsigned int weight);
 int virCgroupGetBlkioWeight(virCgroupPtr group, unsigned int *weight);
--
1.7.10.2
--
Thanks,
Hu Tao

From: Wen Congyang <wency@cn.fujitsu.com>

Create a new cgroup and move all hypervisor threads to the new cgroup.
We can then do further things, such as:
1. limit only vcpu usage, rather than the whole qemu process
2. apply limits to the hypervisor threads (including vhost-net threads)

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 src/qemu/qemu_cgroup.c  |   71 ++++++++++++++++++++++++++++++++++++++++++++---
 src/qemu/qemu_cgroup.h  |    2 ++
 src/qemu/qemu_process.c |    6 +++-
 3 files changed, 74 insertions(+), 5 deletions(-)

diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 32184e7..46ae1db 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -521,11 +521,12 @@ int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm)
     }

     if (priv->nvcpupids == 0 || priv->vcpupids[0] == vm->pid) {
-        /* If we does not know VCPU<->PID mapping or all vcpu runs in the same
+        /* If we do not know the VCPU<->PID mapping or all vcpus run in the same
          * thread, we cannot control each vcpu.
          */
-        virCgroupFree(&cgroup);
-        return 0;
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Unable to get vcpus' pids."));
+        goto cleanup;
     }

     for (i = 0; i < priv->nvcpupids; i++) {
@@ -562,7 +563,11 @@ int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm)
     return 0;

 cleanup:
-    virCgroupFree(&cgroup_vcpu);
+    if (cgroup_vcpu) {
+        virCgroupRemove(cgroup_vcpu);
+        virCgroupFree(&cgroup_vcpu);
+    }
+
     if (cgroup) {
         virCgroupRemove(cgroup);
         virCgroupFree(&cgroup);
@@ -571,6 +576,64 @@ cleanup:
     return -1;
 }

+int qemuSetupCgroupForHypervisor(struct qemud_driver *driver,
+                                 virDomainObjPtr vm)
+{
+    virCgroupPtr cgroup = NULL;
+    virCgroupPtr cgroup_hypervisor = NULL;
+    int rc, i;
+
+    if (driver->cgroup == NULL)
+        return 0; /* Not supported, so claim success */
+
+    rc = virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0);
+    if (rc != 0) {
+        virReportSystemError(-rc,
+                             _("Unable to find cgroup for %s"),
+                             vm->def->name);
+        goto cleanup;
+    }
+
+    rc = virCgroupForHypervisor(cgroup, &cgroup_hypervisor, 1);
+    if (rc < 0) {
+        virReportSystemError(-rc,
+                             _("Unable to create hypervisor cgroup for %s"),
+                             vm->def->name);
+        goto cleanup;
+    }
+
+    for (i = 0; i < VIR_CGROUP_CONTROLLER_LAST; i++) {
+        if (!qemuCgroupControllerActive(driver, i)) {
+            VIR_WARN("cgroup %d is not active", i);
+            continue;
+        }
+        rc = virCgroupMoveTask(cgroup, cgroup_hypervisor, i);
+        if (rc < 0) {
+            virReportSystemError(-rc,
+                                 _("Unable to move tasks from domain cgroup to "
+                                   "hypervisor cgroup in controller %d for %s"),
+                                 i, vm->def->name);
+            goto cleanup;
+        }
+    }
+
+    virCgroupFree(&cgroup_hypervisor);
+    virCgroupFree(&cgroup);
+    return 0;
+
+cleanup:
+    if (cgroup_hypervisor) {
+        virCgroupRemove(cgroup_hypervisor);
+        virCgroupFree(&cgroup_hypervisor);
+    }
+
+    if (cgroup) {
+        virCgroupRemove(cgroup);
+        virCgroupFree(&cgroup);
+    }
+
+    return rc;
+}

 int qemuRemoveCgroup(struct qemud_driver *driver,
                      virDomainObjPtr vm,
diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h
index 5973430..3380ee2 100644
--- a/src/qemu/qemu_cgroup.h
+++ b/src/qemu/qemu_cgroup.h
@@ -54,6 +54,8 @@ int qemuSetupCgroupVcpuBW(virCgroupPtr cgroup,
                           unsigned long long period,
                           long long quota);
 int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm);
+int qemuSetupCgroupForHypervisor(struct qemud_driver *driver,
+                                 virDomainObjPtr vm);
 int qemuRemoveCgroup(struct qemud_driver *driver,
                      virDomainObjPtr vm,
                      int quiet);
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 685ea7c..d89b4d5 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3753,10 +3753,14 @@ int qemuProcessStart(virConnectPtr conn,
     if (qemuProcessDetectVcpuPIDs(driver, vm) < 0)
         goto cleanup;

-    VIR_DEBUG("Setting cgroup for each VCPU(if required)");
+    VIR_DEBUG("Setting cgroup for each VCPU (if required)");
     if (qemuSetupCgroupForVcpu(driver, vm) < 0)
         goto cleanup;

+    VIR_DEBUG("Setting cgroup for hypervisor (if required)");
+    if (qemuSetupCgroupForHypervisor(driver, vm) < 0)
+        goto cleanup;
+
     VIR_DEBUG("Setting VCPU affinities");
     if (qemuProcessSetVcpuAffinites(conn, vm) < 0)
         goto cleanup;
--
1.7.10.2
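Assuming the usual libvirt cgroup layout (mount points and group names
vary by configuration), the hierarchy for a running domain ends up
roughly as follows, where the per-vcpu groups come from
qemuSetupCgroupForVcpu() and "hypervisor" collects all remaining
qemu threads:

    $MOUNT/<controller>/libvirt/qemu/<domain>/              per-domain group
    $MOUNT/<controller>/libvirt/qemu/<domain>/vcpu0/        one group per vcpu
    $MOUNT/<controller>/libvirt/qemu/<domain>/vcpu1/
    $MOUNT/<controller>/libvirt/qemu/<domain>/hypervisor/   new in this patch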

On Wed, Jul 25, 2012 at 01:22:29PM +0800, tangchen wrote:
From: Wen Congyang <wency@cn.fujitsu.com>
Create a new cgroup and move all hypervisor threads to the new cgroup.
We can then do further things, such as:
1. limit only vcpu usage, rather than the whole qemu process
2. apply limits to the hypervisor threads (including vhost-net threads)
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 src/qemu/qemu_cgroup.c  |   71 ++++++++++++++++++++++++++++++++++++++++++++---
 src/qemu/qemu_cgroup.h  |    2 ++
 src/qemu/qemu_process.c |    6 +++-
 3 files changed, 74 insertions(+), 5 deletions(-)
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 32184e7..46ae1db 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -521,11 +521,12 @@ int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm)
     }
     if (priv->nvcpupids == 0 || priv->vcpupids[0] == vm->pid) {
-        /* If we does not know VCPU<->PID mapping or all vcpu runs in the same
+        /* If we do not know the VCPU<->PID mapping or all vcpus run in the same
          * thread, we cannot control each vcpu.
          */
-        virCgroupFree(&cgroup);
-        return 0;
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Unable to get vcpus' pids."));
+        goto cleanup;
     }
     for (i = 0; i < priv->nvcpupids; i++) {
@@ -562,7 +563,11 @@ int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm)
     return 0;
 cleanup:
-    virCgroupFree(&cgroup_vcpu);
+    if (cgroup_vcpu) {
+        virCgroupRemove(cgroup_vcpu);
+        virCgroupFree(&cgroup_vcpu);
+    }
+
     if (cgroup) {
         virCgroupRemove(cgroup);
         virCgroupFree(&cgroup);
@@ -571,6 +576,64 @@ cleanup:
     return -1;
 }
+int qemuSetupCgroupForHypervisor(struct qemud_driver *driver,
+                                 virDomainObjPtr vm)
+{
+    virCgroupPtr cgroup = NULL;
+    virCgroupPtr cgroup_hypervisor = NULL;
+    int rc, i;
+
+    if (driver->cgroup == NULL)
+        return 0; /* Not supported, so claim success */
+
+    rc = virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup, 0);
+    if (rc != 0) {
+        virReportSystemError(-rc,
+                             _("Unable to find cgroup for %s"),
+                             vm->def->name);
+        goto cleanup;
+    }
+
+    rc = virCgroupForHypervisor(cgroup, &cgroup_hypervisor, 1);
+    if (rc < 0) {
+        virReportSystemError(-rc,
+                             _("Unable to create hypervisor cgroup for %s"),
+                             vm->def->name);
+        goto cleanup;
+    }
+
+    for (i = 0; i < VIR_CGROUP_CONTROLLER_LAST; i++) {
+        if (!qemuCgroupControllerActive(driver, i)) {
+            VIR_WARN("cgroup %d is not active", i);
+            continue;
+        }
+        rc = virCgroupMoveTask(cgroup, cgroup_hypervisor, i);
+        if (rc < 0) {
+            virReportSystemError(-rc,
+                                 _("Unable to move tasks from domain cgroup to "
+                                   "hypervisor cgroup in controller %d for %s"),
+                                 i, vm->def->name);
+            goto cleanup;
+        }
+    }
+
+    virCgroupFree(&cgroup_hypervisor);
+    virCgroupFree(&cgroup);
+    return 0;
+
+cleanup:
+    if (cgroup_hypervisor) {
+        virCgroupRemove(cgroup_hypervisor);
+        virCgroupFree(&cgroup_hypervisor);
+    }
+
+    if (cgroup) {
+        virCgroupRemove(cgroup);
+        virCgroupFree(&cgroup);
+    }
+
+    return rc;
+}
 int qemuRemoveCgroup(struct qemud_driver *driver,
                      virDomainObjPtr vm,
diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h
index 5973430..3380ee2 100644
--- a/src/qemu/qemu_cgroup.h
+++ b/src/qemu/qemu_cgroup.h
@@ -54,6 +54,8 @@ int qemuSetupCgroupVcpuBW(virCgroupPtr cgroup,
                           unsigned long long period,
                           long long quota);
 int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm);
+int qemuSetupCgroupForHypervisor(struct qemud_driver *driver,
+                                 virDomainObjPtr vm);
 int qemuRemoveCgroup(struct qemud_driver *driver,
                      virDomainObjPtr vm,
                      int quiet);
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 685ea7c..d89b4d5 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3753,10 +3753,14 @@ int qemuProcessStart(virConnectPtr conn,
     if (qemuProcessDetectVcpuPIDs(driver, vm) < 0)
         goto cleanup;
-    VIR_DEBUG("Setting cgroup for each VCPU(if required)");
+    VIR_DEBUG("Setting cgroup for each VCPU (if required)");
     if (qemuSetupCgroupForVcpu(driver, vm) < 0)
         goto cleanup;
+    VIR_DEBUG("Setting cgroup for hypervisor (if required)");
+    if (qemuSetupCgroupForHypervisor(driver, vm) < 0)
+        goto cleanup;
+
     VIR_DEBUG("Setting VCPU affinities");
     if (qemuProcessSetVcpuAffinites(conn, vm) < 0)
         goto cleanup;
--
1.7.10.2
ACK.

--
Thanks,
Hu Tao

From: Tang Chen <tangchen@cn.fujitsu.com>

vcpu thread pinning is implemented using sched_setaffinity(), but is
not controlled by cgroup. This patch does the following things:
1) enable the cpuset cgroup
2) reflect all the vcpu thread pinning info to cgroup

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 src/libvirt_private.syms |    2 ++
 src/qemu/qemu_cgroup.c   |   44 ++++++++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_cgroup.h   |    2 ++
 src/qemu/qemu_driver.c   |   44 ++++++++++++++++++++++++++++++++++++--------
 src/util/cgroup.c        |   35 ++++++++++++++++++++++++++++++++++-
 src/util/cgroup.h        |    3 +++
 6 files changed, 121 insertions(+), 9 deletions(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 738c70f..867c2e7 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -82,6 +82,7 @@ virCgroupGetCpuShares;
 virCgroupGetCpuacctPercpuUsage;
 virCgroupGetCpuacctStat;
 virCgroupGetCpuacctUsage;
+virCgroupGetCpusetCpus;
 virCgroupGetCpusetMems;
 virCgroupGetFreezerState;
 virCgroupGetMemSwapHardLimit;
@@ -100,6 +101,7 @@ virCgroupSetBlkioWeight;
 virCgroupSetCpuCfsPeriod;
 virCgroupSetCpuCfsQuota;
 virCgroupSetCpuShares;
+virCgroupSetCpusetCpus;
 virCgroupSetCpusetMems;
 virCgroupSetFreezerState;
 virCgroupSetMemSwapHardLimit;
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 46ae1db..30346aa 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -479,11 +479,49 @@ cleanup:
     return -1;
 }

+int qemuSetupCgroupVcpuPin(virCgroupPtr cgroup, virDomainDefPtr def,
+                           int vcpuid)
+{
+    int i, rc = 0;
+    char *new_cpus = NULL;
+
+    if (vcpuid < 0 || vcpuid >= def->vcpus) {
+        virReportSystemError(EINVAL,
+                             _("invalid vcpuid: %d"), vcpuid);
+        return -EINVAL;
+    }
+
+    for (i = 0; i < def->cputune.nvcpupin; i++) {
+        if (vcpuid == def->cputune.vcpupin[i]->vcpuid) {
+            new_cpus = virDomainCpuSetFormat(def->cputune.vcpupin[i]->cpumask,
+                                             VIR_DOMAIN_CPUMASK_LEN);
+            if (!new_cpus) {
+                virReportError(VIR_ERR_INTERNAL_ERROR,
+                               _("failed to convert cpu mask"));
+                rc = -1;
+                goto cleanup;
+            }
+            rc = virCgroupSetCpusetCpus(cgroup, new_cpus);
+            if (rc != 0) {
+                virReportSystemError(-rc,
+                                     "%s",
+                                     _("Unable to set cpuset.cpus"));
+                goto cleanup;
+            }
+        }
+    }
+
+cleanup:
+    VIR_FREE(new_cpus);
+    return rc;
+}
+
 int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm)
 {
     virCgroupPtr cgroup = NULL;
     virCgroupPtr cgroup_vcpu = NULL;
     qemuDomainObjPrivatePtr priv = vm->privateData;
+    virDomainDefPtr def = vm->def;
     int rc;
     unsigned int i;
     unsigned long long period = vm->def->cputune.period;
@@ -555,6 +593,12 @@ int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm)
         }
     }

+        /* Set vcpupin in cgroup if vcpupin xml is provided */
+        if (def->cputune.nvcpupin &&
+            qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET) &&
+            qemuSetupCgroupVcpuPin(cgroup_vcpu, def, i) < 0)
+            goto cleanup;
+
         virCgroupFree(&cgroup_vcpu);
     }

diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h
index 3380ee2..62ec953 100644
--- a/src/qemu/qemu_cgroup.h
+++ b/src/qemu/qemu_cgroup.h
@@ -53,6 +53,8 @@ int qemuSetupCgroup(struct qemud_driver *driver,
 int qemuSetupCgroupVcpuBW(virCgroupPtr cgroup,
                           unsigned long long period,
                           long long quota);
+int qemuSetupCgroupVcpuPin(virCgroupPtr cgroup, virDomainDefPtr def,
+                           int vcpuid);
 int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm);
 int qemuSetupCgroupForHypervisor(struct qemud_driver *driver,
                                  virDomainObjPtr vm);
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index f3ff5b2..7641fa6 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -3613,6 +3613,8 @@ qemudDomainPinVcpuFlags(virDomainPtr dom,
     struct qemud_driver *driver = dom->conn->privateData;
     virDomainObjPtr vm;
     virDomainDefPtr persistentDef = NULL;
+    virCgroupPtr cgroup_dom = NULL;
+    virCgroupPtr cgroup_vcpu = NULL;
     int maxcpu, hostcpus;
     virNodeInfo nodeinfo;
     int ret = -1;
@@ -3667,9 +3669,38 @@ qemudDomainPinVcpuFlags(virDomainPtr dom,

     if (flags & VIR_DOMAIN_AFFECT_LIVE) {

         if (priv->vcpupids != NULL) {
+            /* Add config to vm->def first, because qemuSetupCgroupVcpuPin needs it. */
+            if (virDomainVcpuPinAdd(vm->def, cpumap, maplen, vcpu) < 0) {
+                virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                               _("failed to update or add vcpupin xml of "
+                                 "a running domain"));
+                goto cleanup;
+            }
+
+            /* Configure the corresponding cpuset cgroup before set affinity. */
+            if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) {
+                if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup_dom, 0) == 0 &&
+                    virCgroupForVcpu(cgroup_dom, vcpu, &cgroup_vcpu, 0) == 0 &&
+                    qemuSetupCgroupVcpuPin(cgroup_vcpu, vm->def, vcpu) < 0) {
+                    virReportError(VIR_ERR_OPERATION_INVALID,
+                                   _("failed to set cpuset.cpus in cgroup"
+                                     " for vcpu %d"), vcpu);
+                    goto cleanup;
+                }
+            } else {
+                /* Here, we should go on because even if cgroup is not active,
+                 * we can still use virProcessInfoSetAffinity.
+                 */
+                VIR_WARN("cpuset cgroup is not active");
+            }
+
             if (virProcessInfoSetAffinity(priv->vcpupids[vcpu],
-                                          cpumap, maplen, maxcpu) < 0)
+                                          cpumap, maplen, maxcpu) < 0) {
+                virReportError(VIR_ERR_SYSTEM_ERROR,
+                               _("failed to set cpu affinity for vcpu %d"),
+                               vcpu);
                 goto cleanup;
+            }
         } else {
             virReportError(VIR_ERR_OPERATION_INVALID, "%s",
                            _("cpu affinity is not supported"));
@@ -3683,13 +3714,6 @@ qemudDomainPinVcpuFlags(virDomainPtr dom,
                            "a running domain"));
             goto cleanup;
         }
-    } else {
-        if (virDomainVcpuPinAdd(vm->def, cpumap, maplen, vcpu) < 0) {
-            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
-                           _("failed to update or add vcpupin xml of "
-                             "a running domain"));
-            goto cleanup;
-        }
     }

     if (virDomainSaveStatus(driver->caps, driver->stateDir, vm) < 0)
@@ -3721,6 +3745,10 @@ qemudDomainPinVcpuFlags(virDomainPtr dom,
     ret = 0;

 cleanup:
+    if (cgroup_vcpu)
+        virCgroupFree(&cgroup_vcpu);
+    if (cgroup_dom)
+        virCgroupFree(&cgroup_dom);
     if (vm)
         virDomainObjUnlock(vm);
     return ret;
diff --git a/src/util/cgroup.c b/src/util/cgroup.c
index d94f07b..3499730 100644
--- a/src/util/cgroup.c
+++ b/src/util/cgroup.c
@@ -532,7 +532,8 @@ static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group,
         /* We need to control cpu bandwidth for each vcpu now */
         if ((flags & VIR_CGROUP_VCPU) &&
             (i != VIR_CGROUP_CONTROLLER_CPU &&
-             i != VIR_CGROUP_CONTROLLER_CPUACCT)) {
+             i != VIR_CGROUP_CONTROLLER_CPUACCT &&
+             i != VIR_CGROUP_CONTROLLER_CPUSET)) {
             /* treat it as unmounted and we can use virCgroupAddTask */
             VIR_FREE(group->controllers[i].mountPoint);
             continue;
@@ -1393,6 +1394,38 @@ int virCgroupGetCpusetMems(virCgroupPtr group, char **mems)
 }

 /**
+ * virCgroupSetCpusetCpus:
+ *
+ * @group: The cgroup to set cpuset.cpus for
+ * @cpus: the cpus to set
+ *
+ * Returns: 0 on success
+ */
+int virCgroupSetCpusetCpus(virCgroupPtr group, const char *cpus)
+{
+    return virCgroupSetValueStr(group,
+                                VIR_CGROUP_CONTROLLER_CPUSET,
+                                "cpuset.cpus",
+                                cpus);
+}
+
+/**
+ * virCgroupGetCpusetCpus:
+ *
+ * @group: The cgroup to get cpuset.cpus for
+ * @cpus: the cpus to get
+ *
+ * Returns: 0 on success
+ */
+int virCgroupGetCpusetCpus(virCgroupPtr group, char **cpus)
+{
+    return virCgroupGetValueStr(group,
+                                VIR_CGROUP_CONTROLLER_CPUSET,
+                                "cpuset.cpus",
+                                cpus);
+}
+
+/**
  * virCgroupDenyAllDevices:
  *
  * @group: The cgroup to deny all permissions, for all devices
diff --git a/src/util/cgroup.h b/src/util/cgroup.h
index f14c167..b196214 100644
--- a/src/util/cgroup.h
+++ b/src/util/cgroup.h
@@ -139,6 +139,9 @@ int virCgroupGetFreezerState(virCgroupPtr group, char **state);
 int virCgroupSetCpusetMems(virCgroupPtr group, const char *mems);
 int virCgroupGetCpusetMems(virCgroupPtr group, char **mems);

+int virCgroupSetCpusetCpus(virCgroupPtr group, const char *cpus);
+int virCgroupGetCpusetCpus(virCgroupPtr group, char **cpus);
+
 int virCgroupRemove(virCgroupPtr group);

 void virCgroupFree(virCgroupPtr *group);
--
1.7.10.2
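For reference, virDomainCpuSetFormat() renders the byte-per-cpu mask in
the kernel's cpu-list syntax that cpuset.cpus expects, e.g. "0-1,3". A
self-contained illustration of that format (format_cpuset() here is a
simplified stand-in, not libvirt code):

#include <stdio.h>
#include <string.h>

/* Render a mask (one byte per cpu, non-zero = allowed) as the kernel
 * cpu-list format, collapsing consecutive cpus into ranges. */
static void format_cpuset(const unsigned char *mask, int len,
                          char *buf, size_t buflen)
{
    size_t off = 0;
    int i = 0;

    buf[0] = '\0';
    while (i < len) {
        if (mask[i]) {
            int start = i;
            while (i + 1 < len && mask[i + 1])
                i++;
            off += snprintf(buf + off, buflen - off, "%s%d",
                            off ? "," : "", start);
            if (i > start)
                off += snprintf(buf + off, buflen - off, "-%d", i);
        }
        i++;
    }
}

int main(void)
{
    unsigned char mask[8] = { 1, 1, 0, 1, 0, 0, 0, 0 };
    char buf[64];

    format_cpuset(mask, 8, buf, sizeof(buf));
    printf("cpuset.cpus = %s\n", buf);   /* prints "0-1,3" */
    return 0;
}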

On Wed, Jul 25, 2012 at 01:23:16PM +0800, tangchen wrote:
From: Tang Chen <tangchen@cn.fujitsu.com>
vcpu thread pinning is implemented using sched_setaffinity(), but is
not controlled by cgroup. This patch does the following things:
1) enable the cpuset cgroup
2) reflect all the vcpu thread pinning info to cgroup
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 src/libvirt_private.syms |    2 ++
 src/qemu/qemu_cgroup.c   |   44 ++++++++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_cgroup.h   |    2 ++
 src/qemu/qemu_driver.c   |   44 ++++++++++++++++++++++++++++++++++++--------
 src/util/cgroup.c        |   35 ++++++++++++++++++++++++++++++++++-
 src/util/cgroup.h        |    3 +++
 6 files changed, 121 insertions(+), 9 deletions(-)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 738c70f..867c2e7 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -82,6 +82,7 @@ virCgroupGetCpuShares;
 virCgroupGetCpuacctPercpuUsage;
 virCgroupGetCpuacctStat;
 virCgroupGetCpuacctUsage;
+virCgroupGetCpusetCpus;
 virCgroupGetCpusetMems;
 virCgroupGetFreezerState;
 virCgroupGetMemSwapHardLimit;
@@ -100,6 +101,7 @@ virCgroupSetBlkioWeight;
 virCgroupSetCpuCfsPeriod;
 virCgroupSetCpuCfsQuota;
 virCgroupSetCpuShares;
+virCgroupSetCpusetCpus;
 virCgroupSetCpusetMems;
 virCgroupSetFreezerState;
 virCgroupSetMemSwapHardLimit;
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 46ae1db..30346aa 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -479,11 +479,49 @@ cleanup:
     return -1;
 }
+int qemuSetupCgroupVcpuPin(virCgroupPtr cgroup, virDomainDefPtr def,
+                           int vcpuid)
+{
+    int i, rc = 0;
+    char *new_cpus = NULL;
+
+    if (vcpuid < 0 || vcpuid >= def->vcpus) {
+        virReportSystemError(EINVAL,
+                             _("invalid vcpuid: %d"), vcpuid);
+        return -EINVAL;
+    }
+
+    for (i = 0; i < def->cputune.nvcpupin; i++) {
+        if (vcpuid == def->cputune.vcpupin[i]->vcpuid) {
+            new_cpus = virDomainCpuSetFormat(def->cputune.vcpupin[i]->cpumask,
+                                             VIR_DOMAIN_CPUMASK_LEN);
+            if (!new_cpus) {
+                virReportError(VIR_ERR_INTERNAL_ERROR,
+                               _("failed to convert cpu mask"));
+                rc = -1;
+                goto cleanup;
+            }
+            rc = virCgroupSetCpusetCpus(cgroup, new_cpus);
+            if (rc != 0) {
+                virReportSystemError(-rc,
+                                     "%s",
+                                     _("Unable to set cpuset.cpus"));
+                goto cleanup;
+            }
+        }
+    }
+
+cleanup:
+    VIR_FREE(new_cpus);
+    return rc;
+}
+
 int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm)
 {
     virCgroupPtr cgroup = NULL;
     virCgroupPtr cgroup_vcpu = NULL;
     qemuDomainObjPrivatePtr priv = vm->privateData;
+    virDomainDefPtr def = vm->def;
     int rc;
     unsigned int i;
     unsigned long long period = vm->def->cputune.period;
@@ -555,6 +593,12 @@ int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm)
         }
     }
+        /* Set vcpupin in cgroup if vcpupin xml is provided */
+        if (def->cputune.nvcpupin &&
+            qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET) &&
+            qemuSetupCgroupVcpuPin(cgroup_vcpu, def, i) < 0)
+            goto cleanup;
+
         virCgroupFree(&cgroup_vcpu);
     }
diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h
index 3380ee2..62ec953 100644
--- a/src/qemu/qemu_cgroup.h
+++ b/src/qemu/qemu_cgroup.h
@@ -53,6 +53,8 @@ int qemuSetupCgroup(struct qemud_driver *driver,
 int qemuSetupCgroupVcpuBW(virCgroupPtr cgroup,
                           unsigned long long period,
                           long long quota);
+int qemuSetupCgroupVcpuPin(virCgroupPtr cgroup, virDomainDefPtr def,
+                           int vcpuid);
 int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm);
 int qemuSetupCgroupForHypervisor(struct qemud_driver *driver,
                                  virDomainObjPtr vm);
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index f3ff5b2..7641fa6 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -3613,6 +3613,8 @@ qemudDomainPinVcpuFlags(virDomainPtr dom,
     struct qemud_driver *driver = dom->conn->privateData;
     virDomainObjPtr vm;
     virDomainDefPtr persistentDef = NULL;
+    virCgroupPtr cgroup_dom = NULL;
+    virCgroupPtr cgroup_vcpu = NULL;
     int maxcpu, hostcpus;
     virNodeInfo nodeinfo;
     int ret = -1;
@@ -3667,9 +3669,38 @@ qemudDomainPinVcpuFlags(virDomainPtr dom,

     if (flags & VIR_DOMAIN_AFFECT_LIVE) {
         if (priv->vcpupids != NULL) {
+            /* Add config to vm->def first, because qemuSetupCgroupVcpuPin needs it. */
+            if (virDomainVcpuPinAdd(vm->def, cpumap, maplen, vcpu) < 0) {
+                virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                               _("failed to update or add vcpupin xml of "
+                                 "a running domain"));
+                goto cleanup;
+            }
+
+            /* Configure the corresponding cpuset cgroup before set affinity. */
+            if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET)) {
+                if (virCgroupForDomain(driver->cgroup, vm->def->name, &cgroup_dom, 0) == 0 &&
+                    virCgroupForVcpu(cgroup_dom, vcpu, &cgroup_vcpu, 0) == 0 &&
+                    qemuSetupCgroupVcpuPin(cgroup_vcpu, vm->def, vcpu) < 0) {
+                    virReportError(VIR_ERR_OPERATION_INVALID,
+                                   _("failed to set cpuset.cpus in cgroup"
+                                     " for vcpu %d"), vcpu);
+                    goto cleanup;
+                }
+            } else {
+                /* Here, we should go on because even if cgroup is not active,
+                 * we can still use virProcessInfoSetAffinity.
+                 */
+                VIR_WARN("cpuset cgroup is not active");
+            }
+
             if (virProcessInfoSetAffinity(priv->vcpupids[vcpu],
-                                          cpumap, maplen, maxcpu) < 0)
+                                          cpumap, maplen, maxcpu) < 0) {
+                virReportError(VIR_ERR_SYSTEM_ERROR,
+                               _("failed to set cpu affinity for vcpu %d"),
+                               vcpu);
                 goto cleanup;
+            }
         } else {
             virReportError(VIR_ERR_OPERATION_INVALID, "%s",
                            _("cpu affinity is not supported"));
@@ -3683,13 +3714,6 @@ qemudDomainPinVcpuFlags(virDomainPtr dom,
                            "a running domain"));
             goto cleanup;
         }
-    } else {
-        if (virDomainVcpuPinAdd(vm->def, cpumap, maplen, vcpu) < 0) {
-            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
-                           _("failed to update or add vcpupin xml of "
-                             "a running domain"));
-            goto cleanup;
-        }
     }
     if (virDomainSaveStatus(driver->caps, driver->stateDir, vm) < 0)
@@ -3721,6 +3745,10 @@ qemudDomainPinVcpuFlags(virDomainPtr dom,
     ret = 0;
 cleanup:
+    if (cgroup_vcpu)
+        virCgroupFree(&cgroup_vcpu);
+    if (cgroup_dom)
+        virCgroupFree(&cgroup_dom);
     if (vm)
         virDomainObjUnlock(vm);
     return ret;
diff --git a/src/util/cgroup.c b/src/util/cgroup.c
index d94f07b..3499730 100644
--- a/src/util/cgroup.c
+++ b/src/util/cgroup.c
@@ -532,7 +532,8 @@ static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group,
         /* We need to control cpu bandwidth for each vcpu now */
         if ((flags & VIR_CGROUP_VCPU) &&
             (i != VIR_CGROUP_CONTROLLER_CPU &&
-             i != VIR_CGROUP_CONTROLLER_CPUACCT)) {
+             i != VIR_CGROUP_CONTROLLER_CPUACCT &&
+             i != VIR_CGROUP_CONTROLLER_CPUSET)) {
             /* treat it as unmounted and we can use virCgroupAddTask */
             VIR_FREE(group->controllers[i].mountPoint);
             continue;
@@ -1393,6 +1394,38 @@ int virCgroupGetCpusetMems(virCgroupPtr group, char **mems)
 }
 /**
+ * virCgroupSetCpusetCpus:
+ *
+ * @group: The cgroup to set cpuset.cpus for
+ * @cpus: the cpus to set
+ *
+ * Returns: 0 on success
+ */
+int virCgroupSetCpusetCpus(virCgroupPtr group, const char *cpus)
+{
+    return virCgroupSetValueStr(group,
+                                VIR_CGROUP_CONTROLLER_CPUSET,
+                                "cpuset.cpus",
+                                cpus);
+}
+
+/**
+ * virCgroupGetCpusetCpus:
+ *
+ * @group: The cgroup to get cpuset.cpus for
+ * @cpus: the cpus to get
+ *
+ * Returns: 0 on success
+ */
+int virCgroupGetCpusetCpus(virCgroupPtr group, char **cpus)
+{
+    return virCgroupGetValueStr(group,
+                                VIR_CGROUP_CONTROLLER_CPUSET,
+                                "cpuset.cpus",
+                                cpus);
+}
+
+/**
  * virCgroupDenyAllDevices:
  *
  * @group: The cgroup to deny all permissions, for all devices
diff --git a/src/util/cgroup.h b/src/util/cgroup.h
index f14c167..b196214 100644
--- a/src/util/cgroup.h
+++ b/src/util/cgroup.h
@@ -139,6 +139,9 @@ int virCgroupGetFreezerState(virCgroupPtr group, char **state);
 int virCgroupSetCpusetMems(virCgroupPtr group, const char *mems);
 int virCgroupGetCpusetMems(virCgroupPtr group, char **mems);
+int virCgroupSetCpusetCpus(virCgroupPtr group, const char *cpus);
+int virCgroupGetCpusetCpus(virCgroupPtr group, char **cpus);
+
 int virCgroupRemove(virCgroupPtr group);
 void virCgroupFree(virCgroupPtr *group);
--
1.7.10.2
ACK.

--
Thanks,
Hu Tao

From: Tang Chen <tangchen@cn.fujitsu.com>

This patch adds a new xml element <hypervisorpin cpuset='1'/>, and also
the parser functions, docs, and tests. hypervisorpin means pinning
hypervisor threads, and cpuset='1' means pinning all hypervisor threads
to cpu 1.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 docs/schemas/domaincommon.rng                   |    7 ++
 src/conf/domain_conf.c                          |   97 ++++++++++++++++++++++-
 src/conf/domain_conf.h                          |    1 +
 tests/qemuxml2argvdata/qemuxml2argv-cputune.xml |    1 +
 4 files changed, 103 insertions(+), 3 deletions(-)

diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index b7562ad..57a576a 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -576,6 +576,13 @@
             </attribute>
           </element>
         </zeroOrMore>
+        <optional>
+          <element name="hypervisorpin">
+            <attribute name="cpuset">
+              <ref name="cpuset"/>
+            </attribute>
+          </element>
+        </optional>
       </element>
     </optional>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index c53722a..bedbf79 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -7826,6 +7826,51 @@ error:
     goto cleanup;
 }

+/* Parse the XML definition for hypervisorpin */
+static virDomainVcpuPinDefPtr
+virDomainHypervisorPinDefParseXML(const xmlNodePtr node)
+{
+    virDomainVcpuPinDefPtr def = NULL;
+    char *tmp = NULL;
+
+    if (VIR_ALLOC(def) < 0) {
+        virReportOOMError();
+        return NULL;
+    }
+
+    def->vcpuid = -1;
+
+    tmp = virXMLPropString(node, "cpuset");
+
+    if (tmp) {
+        char *set = tmp;
+        int cpumasklen = VIR_DOMAIN_CPUMASK_LEN;
+
+        if (VIR_ALLOC_N(def->cpumask, cpumasklen) < 0) {
+            virReportOOMError();
+            goto error;
+        }
+
+        if (virDomainCpuSetParse(set, 0, def->cpumask,
+                                 cpumasklen) < 0)
+            goto error;
+
+        VIR_FREE(tmp);
+    } else {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       "%s", _("missing cpuset for hypervisor pin"));
+        goto error;
+    }
+
+cleanup:
+    return def;
+
+error:
+    VIR_FREE(tmp);
+    VIR_FREE(def);
+    goto cleanup;
+}
+
 static int
 virDomainDefMaybeAddController(virDomainDefPtr def,
                                int type,
                                int idx)
@@ -8219,6 +8264,34 @@ static virDomainDefPtr virDomainDefParseXML(virCapsPtr caps,
     }
     VIR_FREE(nodes);

+    if ((n = virXPathNodeSet("./cputune/hypervisorpin", ctxt, &nodes)) < 0) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                       _("cannot extract hypervisorpin nodes"));
+        goto error;
+    }
+
+    if (n > 1) {
+        virReportError(VIR_ERR_XML_ERROR, "%s",
+                       _("only one hypervisorpin is supported"));
+        VIR_FREE(nodes);
+        goto error;
+    }
+
+    if (n && VIR_ALLOC(def->cputune.hypervisorpin) < 0) {
+        goto no_memory;
+    }
+
+    if (n) {
+        virDomainVcpuPinDefPtr hypervisorpin = NULL;
+        hypervisorpin = virDomainHypervisorPinDefParseXML(nodes[0]);
+
+        if (!hypervisorpin)
+            goto error;
+
+        def->cputune.hypervisorpin = hypervisorpin;
+    }
+    VIR_FREE(nodes);
+
     /* Extract numatune if exists. */
     if ((n = virXPathNodeSet("./numatune", ctxt, &nodes)) < 0) {
         virReportError(VIR_ERR_INTERNAL_ERROR,
@@ -9226,7 +9299,7 @@ no_memory:
     virReportOOMError();
     /* fallthrough */

- error:
+error:
     VIR_FREE(tmp);
     VIR_FREE(nodes);
     virBitmapFree(bootMap);
@@ -12794,7 +12867,8 @@ virDomainDefFormatInternal(virDomainDefPtr def,
         virBufferAsprintf(buf, ">%u</vcpu>\n", def->maxvcpus);

     if (def->cputune.shares || def->cputune.vcpupin ||
-        def->cputune.period || def->cputune.quota)
+        def->cputune.period || def->cputune.quota ||
+        def->cputune.hypervisorpin)
         virBufferAddLit(buf, "  <cputune>\n");

     if (def->cputune.shares)
@@ -12826,8 +12900,25 @@ virDomainDefFormatInternal(virDomainDefPtr def,
         }
     }

+    if (def->cputune.hypervisorpin) {
+        virBufferAsprintf(buf, "    <hypervisorpin ");
+
+        char *cpumask = NULL;
+        cpumask = virDomainCpuSetFormat(def->cputune.hypervisorpin->cpumask,
+                                        VIR_DOMAIN_CPUMASK_LEN);
+        if (cpumask == NULL) {
+            virReportError(VIR_ERR_INTERNAL_ERROR,
+                           "%s", _("failed to format cpuset for hypervisor"));
+            goto cleanup;
+        }
+
+        virBufferAsprintf(buf, "cpuset='%s'/>\n", cpumask);
+        VIR_FREE(cpumask);
+    }
+
     if (def->cputune.shares || def->cputune.vcpupin ||
-        def->cputune.period || def->cputune.quota)
+        def->cputune.period || def->cputune.quota ||
+        def->cputune.hypervisorpin)
         virBufferAddLit(buf, "  </cputune>\n");

     if (def->numatune.memory.nodemask ||
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 469d3b6..11f0eb0 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -1607,6 +1607,7 @@ struct _virDomainDef {
         long long quota;
         int nvcpupin;
         virDomainVcpuPinDefPtr *vcpupin;
+        virDomainVcpuPinDefPtr hypervisorpin;
     } cputune;

     virDomainNumatuneDef numatune;
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-cputune.xml b/tests/qemuxml2argvdata/qemuxml2argv-cputune.xml
index df3101d..b72af1b 100644
--- a/tests/qemuxml2argvdata/qemuxml2argv-cputune.xml
+++ b/tests/qemuxml2argvdata/qemuxml2argv-cputune.xml
@@ -10,6 +10,7 @@
     <quota>-1</quota>
     <vcpupin vcpu='0' cpuset='0'/>
     <vcpupin vcpu='1' cpuset='1'/>
+    <hypervisorpin cpuset='1'/>
   </cputune>
   <os>
     <type arch='i686' machine='pc'>hvm</type>
--
1.7.10.2

On Wed, Jul 25, 2012 at 01:24:00PM +0800, tangchen wrote:
From: Tang Chen <tangchen@cn.fujitsu.com>
This patch adds a new xml element <hypervisorpin cpuset='1'/>, and also
the parser functions, docs, and tests. hypervisorpin means pinning
hypervisor threads, and cpuset='1' means pinning all hypervisor threads
to cpu 1.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 docs/schemas/domaincommon.rng                   |    7 ++
 src/conf/domain_conf.c                          |   97 ++++++++++++++++++++++-
 src/conf/domain_conf.h                          |    1 +
 tests/qemuxml2argvdata/qemuxml2argv-cputune.xml |    1 +
 4 files changed, 103 insertions(+), 3 deletions(-)
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index b7562ad..57a576a 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -576,6 +576,13 @@
             </attribute>
           </element>
         </zeroOrMore>
+        <optional>
+          <element name="hypervisorpin">
+            <attribute name="cpuset">
+              <ref name="cpuset"/>
+            </attribute>
+          </element>
+        </optional>
       </element>
     </optional>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index c53722a..bedbf79 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -7826,6 +7826,51 @@ error:
     goto cleanup;
 }
+/* Parse the XML definition for hypervisorpin */
+static virDomainVcpuPinDefPtr
+virDomainHypervisorPinDefParseXML(const xmlNodePtr node)
+{
+    virDomainVcpuPinDefPtr def = NULL;
+    char *tmp = NULL;
+
+    if (VIR_ALLOC(def) < 0) {
+        virReportOOMError();
+        return NULL;
+    }
+
+    def->vcpuid = -1;
+
+    tmp = virXMLPropString(node, "cpuset");
+
+    if (tmp) {
+        char *set = tmp;
+        int cpumasklen = VIR_DOMAIN_CPUMASK_LEN;
+
+        if (VIR_ALLOC_N(def->cpumask, cpumasklen) < 0) {
+            virReportOOMError();
+            goto error;
+        }
+
+        if (virDomainCpuSetParse(set, 0, def->cpumask,
+                                 cpumasklen) < 0)
+            goto error;
+
+        VIR_FREE(tmp);
+    } else {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       "%s", _("missing cpuset for hypervisor pin"));
+        goto error;
+    }
+
+cleanup:
+    return def;
+
+error:
+    VIR_FREE(tmp);
+    VIR_FREE(def);
+    goto cleanup;
+}
+
 static int
 virDomainDefMaybeAddController(virDomainDefPtr def,
                                int type,
                                int idx)
@@ -8219,6 +8264,34 @@ static virDomainDefPtr virDomainDefParseXML(virCapsPtr caps,
     }
     VIR_FREE(nodes);
+    if ((n = virXPathNodeSet("./cputune/hypervisorpin", ctxt, &nodes)) < 0) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                       _("cannot extract hypervisorpin nodes"));
+        goto error;
+    }
+
+    if (n > 1) {
+        virReportError(VIR_ERR_XML_ERROR, "%s",
+                       _("only one hypervisorpin is supported"));
+        VIR_FREE(nodes);
+        goto error;
+    }
+
+    if (n && VIR_ALLOC(def->cputune.hypervisorpin) < 0) {
+        goto no_memory;
+    }
+
+    if (n) {
+        virDomainVcpuPinDefPtr hypervisorpin = NULL;
+        hypervisorpin = virDomainHypervisorPinDefParseXML(nodes[0]);
+
+        if (!hypervisorpin)
+            goto error;
+
+        def->cputune.hypervisorpin = hypervisorpin;
+    }
+    VIR_FREE(nodes);
+
     /* Extract numatune if exists. */
     if ((n = virXPathNodeSet("./numatune", ctxt, &nodes)) < 0) {
         virReportError(VIR_ERR_INTERNAL_ERROR,
@@ -9226,7 +9299,7 @@ no_memory:
     virReportOOMError();
     /* fallthrough */
- error:
+error:
     VIR_FREE(tmp);
     VIR_FREE(nodes);
     virBitmapFree(bootMap);
@@ -12794,7 +12867,8 @@ virDomainDefFormatInternal(virDomainDefPtr def,
         virBufferAsprintf(buf, ">%u</vcpu>\n", def->maxvcpus);
     if (def->cputune.shares || def->cputune.vcpupin ||
-        def->cputune.period || def->cputune.quota)
+        def->cputune.period || def->cputune.quota ||
+        def->cputune.hypervisorpin)
         virBufferAddLit(buf, "  <cputune>\n");
     if (def->cputune.shares)
@@ -12826,8 +12900,25 @@ virDomainDefFormatInternal(virDomainDefPtr def,
         }
     }
+    if (def->cputune.hypervisorpin) {
+        virBufferAsprintf(buf, "    <hypervisorpin ");
+
+        char *cpumask = NULL;
+        cpumask = virDomainCpuSetFormat(def->cputune.hypervisorpin->cpumask,
+                                        VIR_DOMAIN_CPUMASK_LEN);
+        if (cpumask == NULL) {
+            virReportError(VIR_ERR_INTERNAL_ERROR,
+                           "%s", _("failed to format cpuset for hypervisor"));
+            goto cleanup;
+        }
+
+        virBufferAsprintf(buf, "cpuset='%s'/>\n", cpumask);
+        VIR_FREE(cpumask);
+    }
+
     if (def->cputune.shares || def->cputune.vcpupin ||
-        def->cputune.period || def->cputune.quota)
+        def->cputune.period || def->cputune.quota ||
+        def->cputune.hypervisorpin)
         virBufferAddLit(buf, "  </cputune>\n");
     if (def->numatune.memory.nodemask ||
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 469d3b6..11f0eb0 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -1607,6 +1607,7 @@ struct _virDomainDef {
         long long quota;
         int nvcpupin;
         virDomainVcpuPinDefPtr *vcpupin;
+        virDomainVcpuPinDefPtr hypervisorpin;
     } cputune;
     virDomainNumatuneDef numatune;
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-cputune.xml b/tests/qemuxml2argvdata/qemuxml2argv-cputune.xml
index df3101d..b72af1b 100644
--- a/tests/qemuxml2argvdata/qemuxml2argv-cputune.xml
+++ b/tests/qemuxml2argvdata/qemuxml2argv-cputune.xml
@@ -10,6 +10,7 @@
     <quota>-1</quota>
     <vcpupin vcpu='0' cpuset='0'/>
     <vcpupin vcpu='1' cpuset='1'/>
+    <hypervisorpin cpuset='1'/>
   </cputune>
   <os>
     <type arch='i686' machine='pc'>hvm</type>
--
1.7.10.2
ACK.

--
Thanks,
Hu Tao

From: Tang Chen <tangchen@cn.fujitsu.com>

Introduce the qemuSetupCgroupHypervisorPin() function to add hypervisor
thread pinning info to the cpuset cgroup, the same as vcpupin.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 src/qemu/qemu_cgroup.c |   36 ++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_cgroup.h |    1 +
 2 files changed, 37 insertions(+)

diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 30346aa..43f71dc 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -516,6 +516,36 @@ cleanup:
     return rc;
 }

+int qemuSetupCgroupHypervisorPin(virCgroupPtr cgroup, virDomainDefPtr def)
+{
+    int rc = 0;
+    char *new_cpus = NULL;
+
+    if (!def->cputune.hypervisorpin)
+        return 0;
+
+    new_cpus = virDomainCpuSetFormat(def->cputune.hypervisorpin->cpumask,
+                                     VIR_DOMAIN_CPUMASK_LEN);
+    if (!new_cpus) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("failed to convert cpu mask"));
+        rc = -1;
+        goto cleanup;
+    }
+
+    rc = virCgroupSetCpusetCpus(cgroup, new_cpus);
+    if (rc < 0) {
+        virReportSystemError(-rc,
+                             "%s",
+                             _("Unable to set cpuset.cpus"));
+        goto cleanup;
+    }
+
+cleanup:
+    VIR_FREE(new_cpus);
+    return rc;
+}
+
 int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm)
 {
     virCgroupPtr cgroup = NULL;
@@ -625,6 +655,7 @@ int qemuSetupCgroupForHypervisor(struct qemud_driver *driver,
 {
     virCgroupPtr cgroup = NULL;
     virCgroupPtr cgroup_hypervisor = NULL;
+    virDomainDefPtr def = vm->def;
     int rc, i;

     if (driver->cgroup == NULL)
@@ -661,6 +692,11 @@ int qemuSetupCgroupForHypervisor(struct qemud_driver *driver,
         }
     }

+    if (def->cputune.hypervisorpin &&
+        qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET) &&
+        qemuSetupCgroupHypervisorPin(cgroup_hypervisor, def) < 0)
+        goto cleanup;
+
     virCgroupFree(&cgroup_hypervisor);
     virCgroupFree(&cgroup);
     return 0;
diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h
index 62ec953..f321683 100644
--- a/src/qemu/qemu_cgroup.h
+++ b/src/qemu/qemu_cgroup.h
@@ -55,6 +55,7 @@ int qemuSetupCgroupVcpuBW(virCgroupPtr cgroup,
                           long long quota);
 int qemuSetupCgroupVcpuPin(virCgroupPtr cgroup, virDomainDefPtr def,
                            int vcpuid);
+int qemuSetupCgroupHypervisorPin(virCgroupPtr cgroup, virDomainDefPtr def);
 int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm);
 int qemuSetupCgroupForHypervisor(struct qemud_driver *driver,
                                  virDomainObjPtr vm);
--
1.7.10.2
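Both qemuSetupCgroupHypervisorPin() and qemuSetupCgroupVcpuPin() reduce
to "format the mask, write cpuset.cpus". A possible shared shape
(hypothetical refactor sketch, not part of this series;
qemuSetupCgroupCpusetCpus() is an invented name):

/* Hypothetical common helper: format a cpumask and push it into a
 * group's cpuset.cpus.  Both the vcpu and hypervisor paths could call
 * this instead of duplicating the logic. */
static int qemuSetupCgroupCpusetCpus(virCgroupPtr cgroup, char *cpumask)
{
    char *new_cpus;
    int rc;

    if (!(new_cpus = virDomainCpuSetFormat(cpumask, VIR_DOMAIN_CPUMASK_LEN)))
        return -1;

    rc = virCgroupSetCpusetCpus(cgroup, new_cpus);
    VIR_FREE(new_cpus);
    return rc;
}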

On Wed, Jul 25, 2012 at 01:24:37PM +0800, tangchen wrote:
From: Tang Chen <tangchen@cn.fujitsu.com>
Introduce the qemuSetupCgroupHypervisorPin() function to add hypervisor
thread pinning info to the cpuset cgroup, the same as vcpupin.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 src/qemu/qemu_cgroup.c |   36 ++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_cgroup.h |    1 +
 2 files changed, 37 insertions(+)
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 30346aa..43f71dc 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -516,6 +516,36 @@ cleanup:
     return rc;
 }
+int qemuSetupCgroupHypervisorPin(virCgroupPtr cgroup, virDomainDefPtr def)
+{
+    int rc = 0;
+    char *new_cpus = NULL;
+
+    if (!def->cputune.hypervisorpin)
+        return 0;
+
+    new_cpus = virDomainCpuSetFormat(def->cputune.hypervisorpin->cpumask,
+                                     VIR_DOMAIN_CPUMASK_LEN);
+    if (!new_cpus) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("failed to convert cpu mask"));
+        rc = -1;
+        goto cleanup;
+    }
+
+    rc = virCgroupSetCpusetCpus(cgroup, new_cpus);
+    if (rc < 0) {
+        virReportSystemError(-rc,
+                             "%s",
+                             _("Unable to set cpuset.cpus"));
+        goto cleanup;
+    }
+
+cleanup:
+    VIR_FREE(new_cpus);
+    return rc;
+}
+
 int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm)
 {
     virCgroupPtr cgroup = NULL;
@@ -625,6 +655,7 @@ int qemuSetupCgroupForHypervisor(struct qemud_driver *driver,
 {
     virCgroupPtr cgroup = NULL;
     virCgroupPtr cgroup_hypervisor = NULL;
+    virDomainDefPtr def = vm->def;
     int rc, i;
     if (driver->cgroup == NULL)
@@ -661,6 +692,11 @@ int qemuSetupCgroupForHypervisor(struct qemud_driver *driver,
         }
     }
+    if (def->cputune.hypervisorpin &&
+        qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_CPUSET) &&
+        qemuSetupCgroupHypervisorPin(cgroup_hypervisor, def) < 0)
+        goto cleanup;
+
     virCgroupFree(&cgroup_hypervisor);
     virCgroupFree(&cgroup);
     return 0;
diff --git a/src/qemu/qemu_cgroup.h b/src/qemu/qemu_cgroup.h
index 62ec953..f321683 100644
--- a/src/qemu/qemu_cgroup.h
+++ b/src/qemu/qemu_cgroup.h
@@ -55,6 +55,7 @@ int qemuSetupCgroupVcpuBW(virCgroupPtr cgroup,
                           long long quota);
 int qemuSetupCgroupVcpuPin(virCgroupPtr cgroup, virDomainDefPtr def,
                            int vcpuid);
+int qemuSetupCgroupHypervisorPin(virCgroupPtr cgroup, virDomainDefPtr def);
 int qemuSetupCgroupForVcpu(struct qemud_driver *driver, virDomainObjPtr vm);
 int qemuSetupCgroupForHypervisor(struct qemud_driver *driver,
                                  virDomainObjPtr vm);
--
1.7.10.2
ACK.

--
Thanks,
Hu Tao
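As background for this patch: the cpuset controller applies a pin by writing a CPU-list string such as "0-1,3" into the group's cpuset.cpus file, which is what virCgroupSetCpusetCpus() boils down to. A minimal standalone sketch, with a hypothetical cgroup v1 path rather than code from the series:

    #include <stdio.h>

    /* Write a CPU list into a cgroup's cpuset.cpus file.  The group path
     * is a made-up example of where libvirt might place a per-VM
     * hypervisor group under a cgroup v1 cpuset mount. */
    static int
    set_cpuset_cpus(const char *group_path, const char *cpulist)
    {
        char file[4096];
        FILE *fp;

        snprintf(file, sizeof(file), "%s/cpuset.cpus", group_path);
        if (!(fp = fopen(file, "w")))
            return -1;

        fprintf(fp, "%s", cpulist);      /* kernel parses "0-1,3" itself */
        return fclose(fp) == 0 ? 0 : -1;
    }

A caller would do something like set_cpuset_cpus("/sys/fs/cgroup/cpuset/libvirt/qemu/vm1/hypervisor", "0-1"); all threads already in that group then inherit the new mask.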


On Wed, Jul 25, 2012 at 01:25:16PM +0800, tangchen wrote:
From: Tang Chen <tangchen@cn.fujitsu.com>
Hypervisor threads should also be pinned with sched_setaffinity(), just as vcpu threads are.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 src/qemu/qemu_process.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index d89b4d5..bb1640a 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -2005,6 +2005,56 @@ cleanup:
     return ret;
 }
+/* Set CPU affinities for hypervisor threads if hypervisorpin xml provided. */
+static int
+qemuProcessSetHypervisorAffinites(virConnectPtr conn,
+                                  virDomainObjPtr vm)
+{
+    virDomainDefPtr def = vm->def;
+    pid_t pid = vm->pid;
+    unsigned char *cpumask = NULL;
+    unsigned char *cpumap = NULL;
+    virNodeInfo nodeinfo;
+    int cpumaplen, hostcpus, maxcpu, i;
+    int ret = -1;
+
+    if (virNodeGetInfo(conn, &nodeinfo) != 0)
+        return -1;
+
+    if (!def->cputune.hypervisorpin)
+        return 0;
Reorder into:

    if (!def->cputune.hypervisorpin)
        return 0;

    if (virNodeGetInfo(conn, &nodeinfo) != 0)
        return -1;
+
+    hostcpus = VIR_NODEINFO_MAXCPUS(nodeinfo);
+    cpumaplen = VIR_CPU_MAPLEN(hostcpus);
+    maxcpu = cpumaplen * 8;
+
+    if (maxcpu > hostcpus)
+        maxcpu = hostcpus;
+
+    if (VIR_ALLOC_N(cpumap, cpumaplen) < 0) {
+        virReportOOMError();
+        return -1;
+    }
+
+    cpumask = (unsigned char *)def->cputune.hypervisorpin->cpumask;
+    for (i = 0; i < VIR_DOMAIN_CPUMASK_LEN; i++) {
+        if (cpumask[i])
+            VIR_USE_CPU(cpumap, i);
+    }
+
+    if (virProcessInfoSetAffinity(pid,
+                                  cpumap,
+                                  cpumaplen,
+                                  maxcpu) < 0) {
+        goto cleanup;
+    }
+
+    ret = 0;
+cleanup:
+    VIR_FREE(cpumap);
+    return ret;
+}
+
 static int
 qemuProcessInitPasswords(virConnectPtr conn,
                          struct qemud_driver *driver,
@@ -3765,6 +3815,10 @@ int qemuProcessStart(virConnectPtr conn,
     if (qemuProcessSetVcpuAffinites(conn, vm) < 0)
         goto cleanup;
+    VIR_DEBUG("Setting hypervisor threads affinities");
+    if (qemuProcessSetHypervisorAffinites(conn, vm) < 0)
+        goto cleanup;
+
     VIR_DEBUG("Setting any required VM passwords");
     if (qemuProcessInitPasswords(conn, driver, vm) < 0)
         goto cleanup;
--
1.7.10.2
--
Thanks,
Hu Tao
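The non-cgroup path used here reduces to sched_setaffinity(2) on the qemu process; virProcessInfoSetAffinity() wraps exactly that. A minimal sketch of the underlying system call, independent of libvirt's helpers:

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <sys/types.h>

    /* Pin the given process (pid 0 = calling process) to host CPUs 0 and 1.
     * This is roughly what the hypervisor-affinity path ends up asking the
     * kernel to do for the qemu pid. */
    static int
    pin_to_cpus_0_and_1(pid_t pid)
    {
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(0, &set);
        CPU_SET(1, &set);

        if (sched_setaffinity(pid, sizeof(set), &set) < 0) {
            perror("sched_setaffinity");
            return -1;
        }
        return 0;
    }

Note that sched_setaffinity() on a process pid affects the main thread; pinning every thread of a multi-threaded qemu is what the cgroup approach in the other patches handles more robustly.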


On Wed, Jul 25, 2012 at 01:25:52PM +0800, tangchen wrote:
From: Tang Chen <tangchen@cn.fujitsu.com>
Introduce 2 APIs to support hypervisor thread pinning:

1) virDomainHypervisorPinAdd: set up hypervisor thread pinning from a given cpumap.
2) virDomainHypervisorPinDel: remove all hypervisor thread pinning.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 src/conf/domain_conf.c   | 72 ++++++++++++++++++++++++++++++++++++++++++++++
 src/conf/domain_conf.h   |  6 ++++
 src/libvirt_private.syms |  2 ++
 3 files changed, 80 insertions(+)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index bedbf79..cec158d 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -10957,6 +10957,78 @@ virDomainVcpuPinDel(virDomainDefPtr def, int vcpu)
     return 0;
 }
+int
+virDomainHypervisorPinAdd(virDomainDefPtr def,
+                          unsigned char *cpumap,
+                          int maplen)
+{
+    virDomainVcpuPinDefPtr hypervisorpin = NULL;
+    char *cpumask = NULL;
+    int i;
+
+    if (VIR_ALLOC_N(cpumask, VIR_DOMAIN_CPUMASK_LEN) < 0) {
+        virReportOOMError();
+        goto cleanup;
+    }
+
We have to check maplen here to avoid accessing beyond cpumask.
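One possible shape for that guard, sketched as a small helper (the helper name is invented here, and the real fix may clamp instead of reject; VIR_DOMAIN_CPUMASK_LEN is the byte-map size from domain_conf.h):

    #include <stdbool.h>

    /* maplen bytes of bitmap expand to maplen * 8 byte-map entries, so any
     * maplen above VIR_DOMAIN_CPUMASK_LEN / 8 would index past the
     * VIR_DOMAIN_CPUMASK_LEN-byte cpumask allocation below. */
    static bool
    hypervisorpin_maplen_ok(int maplen)
    {
        return maplen >= 1 && maplen <= VIR_DOMAIN_CPUMASK_LEN / 8;
    }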
+    /* Convert bitmap (cpumap) to cpumask, which is a byte map. */
+    for (i = 0; i < maplen; i++) {
+        int cur;
+
+        for (cur = 0; cur < 8; cur++) {
+            if (cpumap[i] & (1 << cur))
+                cpumask[i * 8 + cur] = 1;
+        }
+    }
+
+    if (!def->cputune.hypervisorpin) {
+        /* No hypervisorpin exists yet. */
+        if (VIR_ALLOC(hypervisorpin) < 0) {
+            virReportOOMError();
+            goto cleanup;
+        }
+
+        hypervisorpin->vcpuid = -1;
+        hypervisorpin->cpumask = cpumask;
+        def->cputune.hypervisorpin = hypervisorpin;
+    } else {
+        /* Since there is only 1 hypervisorpin for each vm,
+         * just replace the old one.
+         */
+        VIR_FREE(def->cputune.hypervisorpin->cpumask);
+        def->cputune.hypervisorpin->cpumask = cpumask;
+    }
+
+    return 0;
+
+cleanup:
+    if (cpumask)
+        VIR_FREE(cpumask);
+    return -1;
+}
+
+int
+virDomainHypervisorPinDel(virDomainDefPtr def)
+{
+    virDomainVcpuPinDefPtr hypervisorpin = NULL;
+
+    /* No hypervisorpin exists yet */
+    if (!def->cputune.hypervisorpin) {
+        return 0;
+    }
+
+    hypervisorpin = def->cputune.hypervisorpin;
+
+    VIR_FREE(hypervisorpin->cpumask);
+    VIR_FREE(hypervisorpin);
+    def->cputune.hypervisorpin = NULL;
+
+    if (def->cputune.hypervisorpin)
+        return -1;
+
+    return 0;
+}
+
 static int
 virDomainLifecycleDefFormat(virBufferPtr buf,
                             int type,
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 11f0eb0..64096d5 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -1985,6 +1985,12 @@ int virDomainVcpuPinAdd(virDomainDefPtr def,
int virDomainVcpuPinDel(virDomainDefPtr def, int vcpu);
+int virDomainHypervisorPinAdd(virDomainDefPtr def,
+                              unsigned char *cpumap,
+                              int maplen);
+
+int virDomainHypervisorPinDel(virDomainDefPtr def);
+
 int virDomainDiskIndexByName(virDomainDefPtr def, const char *name,
                              bool allow_ambiguous);
 const char *virDomainDiskPathByName(virDomainDefPtr, const char *name);
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 867c2e7..a57486a 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -491,6 +491,8 @@ virDomainTimerTrackTypeFromString;
 virDomainTimerTrackTypeToString;
 virDomainVcpuPinAdd;
 virDomainVcpuPinDel;
+virDomainHypervisorPinAdd;
+virDomainHypervisorPinDel;
 virDomainVcpuPinFindByVcpu;
 virDomainVcpuPinIsDuplicate;
 virDomainVideoDefFree;
--
1.7.10.2
--
Thanks,
Hu Tao
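These helpers juggle two CPU-set representations: the public API's packed bitmap (one bit per CPU, eight per byte, CPU0 in the least significant bit of byte 0) and domain_conf's byte map (one byte per CPU). A self-contained sketch of the expansion that PinAdd performs, with the bounds check from the review folded in; CPUMASK_LEN stands in for VIR_DOMAIN_CPUMASK_LEN:

    #include <string.h>

    #define CPUMASK_LEN 1024   /* stand-in for VIR_DOMAIN_CPUMASK_LEN */

    /* Expand a packed bitmap (maplen bytes) into a byte map where
     * cpumask[i] != 0 means "CPU i allowed".  Bytes beyond CPUMASK_LEN / 8
     * are ignored rather than written out of bounds. */
    static void
    bitmap_to_bytemap(const unsigned char *cpumap, int maplen,
                      char cpumask[CPUMASK_LEN])
    {
        int i, bit;

        memset(cpumask, 0, CPUMASK_LEN);
        for (i = 0; i < maplen && (i + 1) * 8 <= CPUMASK_LEN; i++)
            for (bit = 0; bit < 8; bit++)
                if (cpumap[i] & (1 << bit))
                    cpumask[i * 8 + bit] = 1;
    }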


On Wed, Jul 25, 2012 at 01:26:29PM +0800, tangchen wrote:
From: Tang Chen <tangchen@cn.fujitsu.com>
Introduce 2 APIs for clients to use:

1) virDomainPinHypervisorFlags: call remote driver api, such as remoteDomainPinHypervisorFlags.
2) virDomainGetHypervisorPinInfo: call remote driver api, such as remoteDomainGetHypervisorPinInfo.
Not true. These two APIs call the underlying driver's corresponding implementation.
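In miniature, the shape being described: the libvirt.c entry point validates arguments, then forwards through whichever function table the connection carries — the qemu driver locally, or the remote driver, which ships the call over RPC. All names below are schematic stand-ins, not the real libvirt types:

    typedef struct {
        int (*pinHypervisor)(void *dom, unsigned char *cpumap,
                             int maplen, unsigned int flags);
    } driver_table;

    typedef struct {
        driver_table *driver;   /* bound at connection-open time */
        void *dom;
    } connection;

    /* Public entry point: validate, then dispatch to the bound driver. */
    static int
    public_pin_hypervisor(connection *conn, unsigned char *cpumap,
                          int maplen, unsigned int flags)
    {
        if (!cpumap || maplen < 1)
            return -1;                        /* VIR_ERR_INVALID_ARG */
        if (!conn->driver->pinHypervisor)
            return -1;                        /* VIR_ERR_NO_SUPPORT */
        return conn->driver->pinHypervisor(conn->dom, cpumap, maplen, flags);
    }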
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 include/libvirt/libvirt.h.in |  10 +++
 src/driver.h                 |  13 +++-
 src/libvirt.c                | 147 ++++++++++++++++++++++++++++++++++++++++++
 src/libvirt_public.syms      |   2 +
 4 files changed, 171 insertions(+), 1 deletion(-)
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
index fcef461..74d495f 100644
--- a/include/libvirt/libvirt.h.in
+++ b/include/libvirt/libvirt.h.in
@@ -1863,6 +1863,16 @@ int virDomainGetVcpuPinInfo (virDomainPtr domain,
                              int maplen,
                              unsigned int flags);
+int virDomainPinHypervisorFlags    (virDomainPtr domain,
+                                    unsigned char *cpumap,
+                                    int maplen,
+                                    unsigned int flags);
+
+int virDomainGetHypervisorPinInfo  (virDomainPtr domain,
+                                    unsigned char *cpumaps,
+                                    int maplen,
+                                    unsigned int flags);
+
 /**
  * VIR_USE_CPU:
  * @cpumap: pointer to a bit map of real CPUs (in 8-bit bytes) (IN/OUT)
diff --git a/src/driver.h b/src/driver.h
index 46d9846..a326754 100644
--- a/src/driver.h
+++ b/src/driver.h
@@ -305,7 +305,16 @@ typedef int
                          unsigned char *cpumaps,
                          int maplen,
                          unsigned int flags);
-
+typedef int
+    (*virDrvDomainPinHypervisorFlags)   (virDomainPtr domain,
+                                         unsigned char *cpumap,
+                                         int maplen,
+                                         unsigned int flags);
+typedef int
+    (*virDrvDomainGetHypervisorPinInfo) (virDomainPtr domain,
+                                         unsigned char *cpumaps,
+                                         int maplen,
+                                         unsigned int flags);
 typedef int
     (*virDrvDomainGetVcpus)     (virDomainPtr domain,
                                  virVcpuInfoPtr info,
@@ -937,6 +946,8 @@ struct _virDriver {
     virDrvDomainPinVcpu domainPinVcpu;
     virDrvDomainPinVcpuFlags domainPinVcpuFlags;
     virDrvDomainGetVcpuPinInfo domainGetVcpuPinInfo;
+    virDrvDomainPinHypervisorFlags domainPinHypervisorFlags;
+    virDrvDomainGetHypervisorPinInfo domainGetHypervisorPinInfo;
     virDrvDomainGetVcpus domainGetVcpus;
     virDrvDomainGetMaxVcpus domainGetMaxVcpus;
     virDrvDomainGetSecurityLabel domainGetSecurityLabel;
diff --git a/src/libvirt.c b/src/libvirt.c
index 8315b4f..a904cc5 100644
--- a/src/libvirt.c
+++ b/src/libvirt.c
@@ -8845,6 +8845,153 @@ error:
 }
 /**
+ * virDomainPinHypervisorFlags:
+ * @domain: pointer to domain object, or NULL for Domain0
+ * @cpumap: pointer to a bit map of real CPUs (in 8-bit bytes) (IN)
+ *      Each bit set to 1 means that corresponding CPU is usable.
+ *      Bytes are stored in little-endian order: CPU0-7, 8-15...
+ *      In each byte, lowest CPU number is least significant bit.
+ * @maplen: number of bytes in cpumap, from 1 up to size of CPU map in
+ *      underlying virtualization system (Xen...).
+ *      If maplen < size, missing bytes are set to zero.
+ *      If maplen > size, failure code is returned.
+ * @flags: bitwise-OR of virDomainModificationImpact
+ *
+ * Dynamically change the real CPUs which can be allocated to all hypervisor
+ * threads. This function may require privileged access to the hypervisor.
+ *
+ * @flags may include VIR_DOMAIN_AFFECT_LIVE or VIR_DOMAIN_AFFECT_CONFIG.
+ * Both flags may be set.
+ * If VIR_DOMAIN_AFFECT_LIVE is set, the change affects a running domain
+ * and may fail if domain is not alive.
+ * If VIR_DOMAIN_AFFECT_CONFIG is set, the change affects persistent state,
+ * and will fail for transient domains. If neither flag is specified (that is,
+ * @flags is VIR_DOMAIN_AFFECT_CURRENT), then an inactive domain modifies
+ * persistent setup, while an active domain is hypervisor-dependent on whether
+ * just live or both live and persistent state is changed.
+ * Not all hypervisors can support all flag combinations.
+ *
+ * See also virDomainGetHypervisorPinInfo for querying this information.
+ *
+ * Returns 0 in case of success, -1 in case of failure.
+ *
+ */
+int
+virDomainPinHypervisorFlags(virDomainPtr domain, unsigned char *cpumap,
+                            int maplen, unsigned int flags)
+{
+    virConnectPtr conn;
+
+    VIR_DOMAIN_DEBUG(domain, "cpumap=%p, maplen=%d, flags=%x",
+                     cpumap, maplen, flags);
+
+    virResetLastError();
+
+    if (!VIR_IS_CONNECTED_DOMAIN(domain)) {
+        virLibDomainError(VIR_ERR_INVALID_DOMAIN, __FUNCTION__);
+        virDispatchError(NULL);
+        return -1;
+    }
+
+    if (domain->conn->flags & VIR_CONNECT_RO) {
+        virLibDomainError(VIR_ERR_OPERATION_DENIED, __FUNCTION__);
+        goto error;
+    }
+
+    if ((cpumap == NULL) || (maplen < 1)) {
+        virLibDomainError(VIR_ERR_INVALID_ARG, __FUNCTION__);
+        goto error;
+    }
+
+    conn = domain->conn;
+
+    if (conn->driver->domainPinHypervisorFlags) {
+        int ret;
+        ret = conn->driver->domainPinHypervisorFlags(domain, cpumap,
+                                                     maplen, flags);
+        if (ret < 0)
+            goto error;
+        return ret;
+    }
+
+    virLibConnError(VIR_ERR_NO_SUPPORT, __FUNCTION__);
+
+error:
+    virDispatchError(domain->conn);
+    return -1;
+}
+
+/**
+ * virDomainGetHypervisorPinInfo:
+ * @domain: pointer to domain object, or NULL for Domain0
+ * @cpumap: pointer to a bit map of real CPUs for all hypervisor threads of
+ *     this domain (in 8-bit bytes) (OUT)
+ *     There is only one cpumap for all hypervisor threads.
+ *     Must not be NULL.
+ * @maplen: the number of bytes in one cpumap, from 1 up to size of CPU map.
+ *     Must be positive.
+ * @flags: bitwise-OR of virDomainModificationImpact
+ *     Must not be VIR_DOMAIN_AFFECT_LIVE and
+ *     VIR_DOMAIN_AFFECT_CONFIG concurrently.
+ *
+ * Query the CPU affinity setting of all hypervisor threads of domain, store
+ * it in cpumap.
+ *
+ * Returns 1 in case of success,
+ * 0 in case no hypervisor threads are pinned to pcpus,
+ * -1 in case of failure.
+ */
+int
+virDomainGetHypervisorPinInfo(virDomainPtr domain, unsigned char *cpumap,
+                              int maplen, unsigned int flags)
+{
+    virConnectPtr conn;
+
+    VIR_DOMAIN_DEBUG(domain, "cpumap=%p, maplen=%d, flags=%x",
+                     cpumap, maplen, flags);
+
+    virResetLastError();
+
+    if (!VIR_IS_CONNECTED_DOMAIN(domain)) {
+        virLibDomainError(VIR_ERR_INVALID_DOMAIN, __FUNCTION__);
+        virDispatchError(NULL);
+        return -1;
+    }
+
+    if (!cpumap || maplen <= 0) {
+        virLibDomainError(VIR_ERR_INVALID_ARG, __FUNCTION__);
+        goto error;
+    }
+    if (INT_MULTIPLY_OVERFLOW(1, maplen)) {
+        virLibDomainError(VIR_ERR_OVERFLOW, _("input too large: 1 * %d"),
+                          maplen);
+        goto error;
+    }
+
+    /* At most one of these two flags should be set. */
+    if ((flags & VIR_DOMAIN_AFFECT_LIVE) &&
+        (flags & VIR_DOMAIN_AFFECT_CONFIG)) {
+        virLibDomainError(VIR_ERR_INVALID_ARG, __FUNCTION__);
+        goto error;
+    }
+    conn = domain->conn;
+
+    if (conn->driver->domainGetHypervisorPinInfo) {
+        int ret;
+        ret = conn->driver->domainGetHypervisorPinInfo(domain, cpumap,
+                                                       maplen, flags);
+        if (ret < 0)
+            goto error;
+        return ret;
+    }
+
+    virLibConnError(VIR_ERR_NO_SUPPORT, __FUNCTION__);
+
+error:
+    virDispatchError(domain->conn);
+    return -1;
+}
+
+/**
  * virDomainGetVcpus:
  * @domain: pointer to domain object, or NULL for Domain0
  * @info: pointer to an array of virVcpuInfo structures (OUT)
diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms
index 1a8e58a..0a57331 100644
--- a/src/libvirt_public.syms
+++ b/src/libvirt_public.syms
@@ -547,6 +547,8 @@ LIBVIRT_0.9.13 {
 LIBVIRT_0.9.14 {
     global:
         virDomainGetHostname;
+        virDomainPinHypervisorFlags;
+        virDomainGetHypervisorPinInfo;
 } LIBVIRT_0.9.13;
 # .... define new API here using predicted next version number ....
--
1.7.10.2
--
Thanks,
Hu Tao
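For a sense of how the two new calls compose from a client, a hedged usage sketch against the signatures above (error handling trimmed, domain name invented; the one-byte map covers hosts with up to 8 CPUs):

    #include <libvirt/libvirt.h>
    #include <stdio.h>

    int main(void)
    {
        virConnectPtr conn = virConnectOpen("qemu:///system");
        virDomainPtr dom = virDomainLookupByName(conn, "vm1");
        unsigned char map = 0x03;   /* one byte: host CPUs 0 and 1 */

        /* Pin all hypervisor threads of vm1 to CPUs 0-1, live state only. */
        if (virDomainPinHypervisorFlags(dom, &map, 1,
                                        VIR_DOMAIN_AFFECT_LIVE) < 0)
            fprintf(stderr, "pin failed\n");

        /* Read it back: 1 = explicit pin returned, 0 = no pin recorded. */
        map = 0;
        if (virDomainGetHypervisorPinInfo(dom, &map, 1,
                                          VIR_DOMAIN_AFFECT_LIVE) == 1)
            printf("hypervisor affinity byte: 0x%02x\n", map);

        virDomainFree(dom);
        virConnectClose(conn);
        return 0;
    }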


On Wed, Jul 25, 2012 at 01:27:04PM +0800, tangchen wrote:
From: Tang Chen <tangchen@cn.fujitsu.com>
Introduce 2 APIs to support hypervisor thread pinning in the qemu driver:

1) qemudDomainPinHypervisorFlags: set up hypervisor thread pin info.
2) qemudDomainGetHypervisorPinInfo: get all hypervisor thread pin info.

They are similar to qemudDomainPinVcpuFlags and qemudDomainGetVcpuPinInfo. The remoteDispatchDomainPinHypervisorFlags and remoteDispatchDomainGetHypervisorPinInfo functions are also introduced.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 daemon/remote.c        | 103 ++++++++++++++++++++++
 src/qemu/qemu_driver.c | 223 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 326 insertions(+)
diff --git a/daemon/remote.c b/daemon/remote.c
index 80626a2..0e46d18 100644
--- a/daemon/remote.c
+++ b/daemon/remote.c
@@ -1533,6 +1533,109 @@ no_memory:
 }
 static int
+remoteDispatchDomainPinHypervisorFlags(virNetServerPtr server ATTRIBUTE_UNUSED,
+                                       virNetServerClientPtr client,
+                                       virNetMessagePtr msg ATTRIBUTE_UNUSED,
+                                       virNetMessageErrorPtr rerr,
+                                       remote_domain_pin_hypervisor_flags_args *args)
+{
+    int rv = -1;
+    virDomainPtr dom = NULL;
+    struct daemonClientPrivate *priv =
+        virNetServerClientGetPrivateData(client);
+
+    if (!priv->conn) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("connection not open"));
+        goto cleanup;
+    }
+
+    if (!(dom = get_nonnull_domain(priv->conn, args->dom)))
+        goto cleanup;
+
+    if (virDomainPinHypervisorFlags(dom,
+                                    (unsigned char *) args->cpumap.cpumap_val,
+                                    args->cpumap.cpumap_len,
+                                    args->flags) < 0)
+        goto cleanup;
+
+    rv = 0;
+
+cleanup:
+    if (rv < 0)
+        virNetMessageSaveError(rerr);
+    if (dom)
+        virDomainFree(dom);
+    return rv;
+}
+
+
+static int
+remoteDispatchDomainGetHypervisorPinInfo(virNetServerPtr server ATTRIBUTE_UNUSED,
+                                         virNetServerClientPtr client ATTRIBUTE_UNUSED,
+                                         virNetMessagePtr msg ATTRIBUTE_UNUSED,
+                                         virNetMessageErrorPtr rerr,
+                                         remote_domain_get_hypervisor_pin_info_args *args,
+                                         remote_domain_get_hypervisor_pin_info_ret *ret)
+{
+    virDomainPtr dom = NULL;
+    unsigned char *cpumaps = NULL;
+    int num;
+    int rv = -1;
+    struct daemonClientPrivate *priv =
+        virNetServerClientGetPrivateData(client);
+
+    if (!priv->conn) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("connection not open"));
+        goto cleanup;
+    }
+
+    if (!(dom = get_nonnull_domain(priv->conn, args->dom)))
+        goto cleanup;
+
+    /* There is only one cpumap struct for all hypervisor threads */
+    if (args->ncpumaps != 1) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("ncpumaps != 1"));
+        goto cleanup;
+    }
+
+    if (INT_MULTIPLY_OVERFLOW(args->ncpumaps, args->maplen) ||
+        args->ncpumaps * args->maplen > REMOTE_CPUMAPS_MAX) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                       _("maxinfo * maplen > REMOTE_CPUMAPS_MAX"));
+        goto cleanup;
+    }
+
+    /* Allocate buffers to take the results */
+    if (args->maplen > 0 &&
+        VIR_ALLOC_N(cpumaps, args->maplen) < 0)
+        goto no_memory;
+
+    if ((num = virDomainGetHypervisorPinInfo(dom,
+                                             cpumaps,
+                                             args->maplen,
+                                             args->flags)) < 0)
+        goto cleanup;
+
+    ret->num = num;
+    ret->cpumaps.cpumaps_len = args->maplen;
+    ret->cpumaps.cpumaps_val = (char *) cpumaps;
+    cpumaps = NULL;
+
+    rv = 0;
+
+cleanup:
+    if (rv < 0)
+        virNetMessageSaveError(rerr);
+    VIR_FREE(cpumaps);
+    if (dom)
+        virDomainFree(dom);
+    return rv;
+
+no_memory:
+    virReportOOMError();
+    goto cleanup;
+}
+
+static int
Please move this part to patch 11.
 remoteDispatchDomainGetVcpus(virNetServerPtr server ATTRIBUTE_UNUSED,
                              virNetServerClientPtr client ATTRIBUTE_UNUSED,
                              virNetMessagePtr msg ATTRIBUTE_UNUSED,
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 7641fa6..5cc8e94 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -3851,6 +3851,227 @@ cleanup:
 }
 static int
+qemudDomainPinHypervisorFlags(virDomainPtr dom,
+                              unsigned char *cpumap,
+                              int maplen,
+                              unsigned int flags)
+{
+    struct qemud_driver *driver = dom->conn->privateData;
+    virDomainObjPtr vm;
+    virCgroupPtr cgroup_dom = NULL;
+    virCgroupPtr cgroup_hypervisor = NULL;
+    pid_t pid;
+    virDomainDefPtr persistentDef = NULL;
+    int maxcpu, hostcpus;
+    virNodeInfo nodeinfo;
+    int ret = -1;
+    qemuDomainObjPrivatePtr priv;
+    bool canResetting = true;
+    int pcpu;
+
+    virCheckFlags(VIR_DOMAIN_AFFECT_LIVE |
+                  VIR_DOMAIN_AFFECT_CONFIG, -1);
+
+    qemuDriverLock(driver);
+    vm = virDomainFindByUUID(&driver->domains, dom->uuid);
+    qemuDriverUnlock(driver);
+
+    if (!vm) {
+        char uuidstr[VIR_UUID_STRING_BUFLEN];
+        virUUIDFormat(dom->uuid, uuidstr);
+        virReportError(VIR_ERR_NO_DOMAIN,
+                       _("no domain with matching uuid '%s'"), uuidstr);
+        goto cleanup;
+    }
+
+    if (virDomainLiveConfigHelperMethod(driver->caps, vm, &flags,
+                                        &persistentDef) < 0)
+        goto cleanup;
+
+    priv = vm->privateData;
+
+    if (nodeGetInfo(dom->conn, &nodeinfo) < 0)
+        goto cleanup;
+    hostcpus = VIR_NODEINFO_MAXCPUS(nodeinfo);
+    maxcpu = maplen * 8;
+    if (maxcpu > hostcpus)
+        maxcpu = hostcpus;
+    /* pinning to all physical cpus means resetting,
+     * so check if we can reset setting.
+     */
+    for (pcpu = 0; pcpu < hostcpus; pcpu++) {
+        if ((cpumap[pcpu/8] & (1 << (pcpu % 8))) == 0) {
+            canResetting = false;
+            break;
+        }
+    }
+
+    pid = vm->pid;
+
+    if (flags & VIR_DOMAIN_AFFECT_LIVE) {
+
+        if (priv->vcpupids != NULL) {
+            if (virDomainHypervisorPinAdd(vm->def, cpumap, maplen) < 0) {
+                virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                               _("failed to update or add hypervisorpin xml "
+                                 "of a running domain"));
+                goto cleanup;
+            }
+
+            if (qemuCgroupControllerActive(driver,
+                                           VIR_CGROUP_CONTROLLER_CPUSET)) {
+                /*
+                 * Configure the corresponding cpuset cgroup.
+                 * If no cgroup for domain or hypervisor exists, do nothing.
+                 */
+                if (virCgroupForDomain(driver->cgroup, vm->def->name,
+                                       &cgroup_dom, 0) == 0) {
+                    if (virCgroupForHypervisor(cgroup_dom,
+                                               &cgroup_hypervisor, 0) == 0) {
+                        if (qemuSetupCgroupHypervisorPin(cgroup_hypervisor,
+                                                         vm->def) < 0) {
+                            virReportError(VIR_ERR_OPERATION_INVALID, "%s",
+                                           _("failed to set cpuset.cpus in cgroup"
+                                             " for hypervisor threads"));
+                            goto cleanup;
+                        }
+                    }
+                }
+            }
+
+            if (canResetting) {
+                if (virDomainHypervisorPinDel(vm->def) < 0) {
+                    virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                                   _("failed to delete hypervisorpin xml of "
+                                     "a running domain"));
+                    goto cleanup;
+                }
+            }
+
+            if (virProcessInfoSetAffinity(pid, cpumap, maplen, maxcpu) < 0) {
+                virReportError(VIR_ERR_SYSTEM_ERROR, "%s",
+                               _("failed to set cpu affinity for "
+                                 "hypervisor threads"));
+                goto cleanup;
+            }
+        } else {
+            virReportError(VIR_ERR_OPERATION_INVALID,
+                           "%s", _("cpu affinity is not supported"));
+            goto cleanup;
+        }
+
+        if (virDomainSaveStatus(driver->caps, driver->stateDir, vm) < 0)
+            goto cleanup;
+    }
+
+    if (flags & VIR_DOMAIN_AFFECT_CONFIG) {
+
+        if (canResetting) {
+            if (virDomainHypervisorPinDel(persistentDef) < 0) {
+                virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                               _("failed to delete hypervisorpin xml of "
+                                 "a persistent domain"));
+                goto cleanup;
+            }
+        } else {
+            if (virDomainHypervisorPinAdd(persistentDef, cpumap, maplen) < 0) {
+                virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                               _("failed to update or add hypervisorpin xml "
+                                 "of a persistent domain"));
+                goto cleanup;
+            }
+        }
+
+        ret = virDomainSaveConfig(driver->configDir, persistentDef);
+        goto cleanup;
+    }
+
+    ret = 0;
+
+cleanup:
+    if (cgroup_hypervisor)
+        virCgroupFree(&cgroup_hypervisor);
+    if (cgroup_dom)
+        virCgroupFree(&cgroup_dom);
+
+    if (vm)
+        virDomainObjUnlock(vm);
+    return ret;
+}
+
+static int
+qemudDomainGetHypervisorPinInfo(virDomainPtr dom,
+                                unsigned char *cpumaps,
+                                int maplen,
+                                unsigned int flags)
+{
+    struct qemud_driver *driver = dom->conn->privateData;
+    virDomainObjPtr vm = NULL;
+    virNodeInfo nodeinfo;
+    virDomainDefPtr targetDef = NULL;
+    int ret = -1;
+    int maxcpu, hostcpus, pcpu;
+    virDomainVcpuPinDefPtr hypervisorpin = NULL;
+    char *cpumask = NULL;
+
+    virCheckFlags(VIR_DOMAIN_AFFECT_LIVE |
+                  VIR_DOMAIN_AFFECT_CONFIG, -1);
+
+    qemuDriverLock(driver);
+    vm = virDomainFindByUUID(&driver->domains, dom->uuid);
+    qemuDriverUnlock(driver);
+
+    if (!vm) {
+        char uuidstr[VIR_UUID_STRING_BUFLEN];
+        virUUIDFormat(dom->uuid, uuidstr);
+        virReportError(VIR_ERR_NO_DOMAIN,
+                       _("no domain with matching uuid '%s'"), uuidstr);
+        goto cleanup;
+    }
+
+    if (virDomainLiveConfigHelperMethod(driver->caps, vm, &flags,
+                                        &targetDef) < 0)
+        goto cleanup;
+
+    if (flags & VIR_DOMAIN_AFFECT_LIVE)
+        targetDef = vm->def;
+
+    /* Coverity didn't realize that targetDef must be set if we got here.
+     */
+    sa_assert(targetDef);
+
+    if (nodeGetInfo(dom->conn, &nodeinfo) < 0)
+        goto cleanup;
+    hostcpus = VIR_NODEINFO_MAXCPUS(nodeinfo);
+    maxcpu = maplen * 8;
+    if (maxcpu > hostcpus)
+        maxcpu = hostcpus;
+
+    /* initialize cpumaps */
+    memset(cpumaps, 0xff, maplen);
+    if (maxcpu % 8) {
+        cpumaps[maplen - 1] &= (1 << maxcpu % 8) - 1;
+    }
+
+    /* If no hypervisorpin, all cpus should be used */
+    hypervisorpin = targetDef->cputune.hypervisorpin;
+    if (!hypervisorpin) {
+        ret = 0;
+        goto cleanup;
+    }
+
+    cpumask = hypervisorpin->cpumask;
+    for (pcpu = 0; pcpu < maxcpu; pcpu++) {
+        if (cpumask[pcpu] == 0)
+            VIR_UNUSE_CPU(cpumaps, pcpu);
+    }
+
+    ret = 1;
+
+cleanup:
+    if (vm)
+        virDomainObjUnlock(vm);
+    return ret;
+}
+
+static int
 qemudDomainGetVcpus(virDomainPtr dom,
                     virVcpuInfoPtr info,
                     int maxinfo,
@@ -13249,6 +13470,8 @@ static virDriver qemuDriver = {
     .domainPinVcpu = qemudDomainPinVcpu, /* 0.4.4 */
     .domainPinVcpuFlags = qemudDomainPinVcpuFlags, /* 0.9.3 */
     .domainGetVcpuPinInfo = qemudDomainGetVcpuPinInfo, /* 0.9.3 */
+    .domainPinHypervisorFlags = qemudDomainPinHypervisorFlags, /* 0.9.13 */
+    .domainGetHypervisorPinInfo = qemudDomainGetHypervisorPinInfo, /* 0.9.13 */
     .domainGetVcpus = qemudDomainGetVcpus, /* 0.4.4 */
     .domainGetMaxVcpus = qemudDomainGetMaxVcpus, /* 0.4.4 */
     .domainGetSecurityLabel = qemudDomainGetSecurityLabel, /* 0.6.1 */
--
1.7.10.2
--
Thanks,
Hu Tao


On Wed, Jul 25, 2012 at 01:27:40PM +0800, tangchen wrote:
From: Tang Chen <tangchen@cn.fujitsu.com>
Introduce 2 APIs to support hypervisor threads in the remote driver:

1) remoteDomainPinHypervisorFlags: call driver api, such as qemudDomainPinHypervisorFlags.
2) remoteDomainGetHypervisorPinInfo: call driver api, such as qemudDomainGetHypervisorPinInfo.

They are similar to remoteDomainPinVcpuFlags and remoteDomainGetVcpuPinInfo.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 src/remote/remote_driver.c   | 102 ++++++++++++++++++++++++++++++++++++++++++
 src/remote/remote_protocol.x |  23 +++++++++-
 src/remote_protocol-structs  |  24 ++++++++++
 3 files changed, 148 insertions(+), 1 deletion(-)
diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c
index a69e69a..6e33d67 100644
--- a/src/remote/remote_driver.c
+++ b/src/remote/remote_driver.c
@@ -1810,6 +1810,106 @@ done:
 }
 static int
+remoteDomainPinHypervisorFlags (virDomainPtr dom,
+                                unsigned char *cpumap,
+                                int cpumaplen,
+                                unsigned int flags)
+{
+    int rv = -1;
+    struct private_data *priv = dom->conn->privateData;
+    remote_domain_pin_hypervisor_flags_args args;
+
+    remoteDriverLock(priv);
+
+    if (cpumaplen > REMOTE_CPUMAP_MAX) {
+        virReportError(VIR_ERR_RPC,
+                       _("%s length greater than maximum: %d > %d"),
+                       "cpumap", (int)cpumaplen, REMOTE_CPUMAP_MAX);
+        goto done;
+    }
+
+    make_nonnull_domain(&args.dom, dom);
+    args.vcpu = -1;
+    args.cpumap.cpumap_val = (char *)cpumap;
+    args.cpumap.cpumap_len = cpumaplen;
+    args.flags = flags;
+
+    if (call(dom->conn, priv, 0, REMOTE_PROC_DOMAIN_PIN_HYPERVISOR_FLAGS,
+             (xdrproc_t) xdr_remote_domain_pin_hypervisor_flags_args,
+             (char *) &args,
+             (xdrproc_t) xdr_void, (char *) NULL) == -1) {
+        goto done;
+    }
+
+    rv = 0;
+
+done:
+    remoteDriverUnlock(priv);
+    return rv;
+}
+
+
+static int
+remoteDomainGetHypervisorPinInfo (virDomainPtr domain,
+                                  unsigned char *cpumaps,
+                                  int maplen,
+                                  unsigned int flags)
+{
+    int rv = -1;
+    int i;
+    remote_domain_get_hypervisor_pin_info_args args;
+    remote_domain_get_hypervisor_pin_info_ret ret;
+    struct private_data *priv = domain->conn->privateData;
+
+    remoteDriverLock(priv);
+
+    /* There is only one cpumap for all hypervisor threads */
+    if (INT_MULTIPLY_OVERFLOW(1, maplen) ||
+        maplen > REMOTE_CPUMAPS_MAX) {
+        virReportError(VIR_ERR_RPC,
+                       _("vCPU map buffer length exceeds maximum: %d > %d"),
+                       maplen, REMOTE_CPUMAPS_MAX);
+        goto done;
+    }
+
+    make_nonnull_domain(&args.dom, domain);
+    args.ncpumaps = 1;
+    args.maplen = maplen;
+    args.flags = flags;
+
+    memset(&ret, 0, sizeof ret);
sizeof(ret)
+
+    if (call (domain->conn, priv, 0, REMOTE_PROC_DOMAIN_GET_HYPERVISOR_PIN_INFO,
+              (xdrproc_t) xdr_remote_domain_get_hypervisor_pin_info_args,
+              (char *) &args,
+              (xdrproc_t) xdr_remote_domain_get_hypervisor_pin_info_ret,
+              (char *) &ret) == -1)
+        goto done;
+
+    if (ret.cpumaps.cpumaps_len > maplen) {
+        virReportError(VIR_ERR_RPC,
+                       _("host reports map buffer length exceeds maximum: %d > %d"),
+                       ret.cpumaps.cpumaps_len, maplen);
+        goto cleanup;
+    }
+
+    memset(cpumaps, 0, maplen);
+
+    for (i = 0; i < ret.cpumaps.cpumaps_len; ++i)
+        cpumaps[i] = ret.cpumaps.cpumaps_val[i];
+
+    rv = ret.num;
+
+cleanup:
+    xdr_free ((xdrproc_t) xdr_remote_domain_get_hypervisor_pin_info_ret,
+              (char *) &ret);
+
+done:
+    remoteDriverUnlock(priv);
+    return rv;
+}
+
+static int
 remoteDomainGetVcpus (virDomainPtr domain,
                       virVcpuInfoPtr info,
                       int maxinfo,
@@ -5235,6 +5335,8 @@ static virDriver remote_driver = {
     .domainPinVcpu = remoteDomainPinVcpu, /* 0.3.0 */
     .domainPinVcpuFlags = remoteDomainPinVcpuFlags, /* 0.9.3 */
     .domainGetVcpuPinInfo = remoteDomainGetVcpuPinInfo, /* 0.9.3 */
+    .domainPinHypervisorFlags = remoteDomainPinHypervisorFlags, /* 0.9.13 */
+    .domainGetHypervisorPinInfo = remoteDomainGetHypervisorPinInfo, /* 0.9.13 */
version number should be 0.9.14 here.
     .domainGetVcpus = remoteDomainGetVcpus, /* 0.3.0 */
     .domainGetMaxVcpus = remoteDomainGetMaxVcpus, /* 0.3.0 */
     .domainGetSecurityLabel = remoteDomainGetSecurityLabel, /* 0.6.1 */
diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x
index dd460d4..b5b23e0 100644
--- a/src/remote/remote_protocol.x
+++ b/src/remote/remote_protocol.x
@@ -1054,6 +1054,25 @@ struct remote_domain_get_vcpu_pin_info_ret {
     int num;
 };
+struct remote_domain_pin_hypervisor_flags_args {
+    remote_nonnull_domain dom;
+    unsigned int vcpu;
+    opaque cpumap<REMOTE_CPUMAP_MAX>; /* (unsigned char *) */
+    unsigned int flags;
+};
+
+struct remote_domain_get_hypervisor_pin_info_args {
+    remote_nonnull_domain dom;
+    int ncpumaps;
+    int maplen;
+    unsigned int flags;
+};
+
+struct remote_domain_get_hypervisor_pin_info_ret {
+    opaque cpumaps<REMOTE_CPUMAPS_MAX>;
+    int num;
+};
+
 struct remote_domain_get_vcpus_args {
     remote_nonnull_domain dom;
     int maxinfo;
@@ -2854,7 +2873,9 @@ enum remote_procedure {
     REMOTE_PROC_DOMAIN_LIST_ALL_SNAPSHOTS = 274, /* skipgen skipgen priority:high */
     REMOTE_PROC_DOMAIN_SNAPSHOT_LIST_ALL_CHILDREN = 275, /* skipgen skipgen priority:high */
     REMOTE_PROC_DOMAIN_EVENT_BALLOON_CHANGE = 276, /* autogen autogen */
-    REMOTE_PROC_DOMAIN_GET_HOSTNAME = 277 /* autogen autogen */
+    REMOTE_PROC_DOMAIN_GET_HOSTNAME = 277, /* autogen autogen */
+    REMOTE_PROC_DOMAIN_PIN_HYPERVISOR_FLAGS = 278, /* skipgen skipgen */
+    REMOTE_PROC_DOMAIN_GET_HYPERVISOR_PIN_INFO = 279 /* skipgen skipgen */
 
     /*
      * Notice how the entries are grouped in sets of 10 ?
diff --git a/src/remote_protocol-structs b/src/remote_protocol-structs
index 8d09138..b9f5260 100644
--- a/src/remote_protocol-structs
+++ b/src/remote_protocol-structs
@@ -718,6 +718,28 @@ struct remote_domain_get_vcpu_pin_info_ret {
         } cpumaps;
         int num;
 };
+struct remote_domain_pin_hypervisor_flags_args {
+        remote_nonnull_domain dom;
+        u_int vcpu;
+        struct {
+                u_int cpumap_len;
+                char * cpumap_val;
+        } cpumap;
+        u_int flags;
+};
+struct remote_domain_get_hypervisor_pin_info_args {
+        remote_nonnull_domain dom;
+        int ncpumaps;
+        int maplen;
+        u_int flags;
+};
+struct remote_domain_get_hypervisor_pin_info_ret {
+        struct {
+                u_int cpumaps_len;
+                char * cpumaps_val;
+        } cpumaps;
+        int num;
+};
 struct remote_domain_get_vcpus_args {
         remote_nonnull_domain dom;
         int maxinfo;
@@ -2259,4 +2281,6 @@ enum remote_procedure {
         REMOTE_PROC_DOMAIN_SNAPSHOT_LIST_ALL_CHILDREN = 275,
         REMOTE_PROC_DOMAIN_EVENT_BALLOON_CHANGE = 276,
         REMOTE_PROC_DOMAIN_GET_HOSTNAME = 277,
+        REMOTE_PROC_DOMAIN_PIN_HYPERVISOR_FLAGS = 278,
+        REMOTE_PROC_DOMAIN_GET_HYPERVISOR_PIN_INFO = 279,
 };
-- 
1.7.10.2
-- 
Thanks,
Hu Tao

From: Tang Chen <tangchen@cn.fujitsu.com>

Modify the vcpupin command to support hypervisor thread pinning:
1) add a "--hypervisor" option to query hypervisor thread info;
2) add "--hypervisor cpuset" to pin hypervisor threads to the specified
   cpuset.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
---
 tests/vcpupin   |    6 +--
 tools/virsh.c   |  147 ++++++++++++++++++++++++++++++++++++-------------------
 tools/virsh.pod |   16 +++---
 3 files changed, 111 insertions(+), 58 deletions(-)

diff --git a/tests/vcpupin b/tests/vcpupin
index 5952862..ffd16fa 100755
--- a/tests/vcpupin
+++ b/tests/vcpupin
@@ -30,16 +30,16 @@ fi
 
 fail=0
 
 # Invalid syntax.
-$abs_top_builddir/tools/virsh --connect test:///default vcpupin test a 0,1 > out 2>&1
+$abs_top_builddir/tools/virsh --connect test:///default vcpupin test a --vcpu 0,1 > out 2>&1
 test $? = 1 || fail=1
 cat <<\EOF > exp || fail=1
-error: vcpupin: Invalid or missing vCPU number.
+error: vcpupin: Invalid or missing vCPU number, or missing --hypervisor option.
 EOF
 compare exp out || fail=1
 
 # An out-of-range vCPU number deserves a diagnostic, too.
-$abs_top_builddir/tools/virsh --connect test:///default vcpupin test 100 0,1 > out 2>&1
+$abs_top_builddir/tools/virsh --connect test:///default vcpupin test --vcpu 100 0,1 > out 2>&1
 test $? = 1 || fail=1
 cat <<\EOF > exp || fail=1
 error: vcpupin: Invalid vCPU number.
diff --git a/tools/virsh.c b/tools/virsh.c
index 6bb0fc8..c12fb41 100644
--- a/tools/virsh.c
+++ b/tools/virsh.c
@@ -5466,14 +5466,15 @@ cmdVcpuinfo(vshControl *ctl, const vshCmd *cmd)
  * "vcpupin" command
  */
 static const vshCmdInfo info_vcpupin[] = {
-    {"help", N_("control or query domain vcpu affinity")},
-    {"desc", N_("Pin domain VCPUs to host physical CPUs.")},
+    {"help", N_("control or query domain vcpu and hypervisor threads affinities")},
+    {"desc", N_("Pin domain VCPUs or hypervisor threads to host physical CPUs.")},
     {NULL, NULL}
 };
 
 static const vshCmdOptDef opts_vcpupin[] = {
     {"domain", VSH_OT_DATA, VSH_OFLAG_REQ, N_("domain name, id or uuid")},
-    {"vcpu", VSH_OT_INT, 0, N_("vcpu number")},
+    {"vcpu", VSH_OT_INT, VSH_OFLAG_REQ_OPT, N_("vcpu number")},
+    {"hypervisor", VSH_OT_BOOL, VSH_OFLAG_REQ_OPT, N_("pin hypervisor threads")},
     {"cpulist", VSH_OT_DATA, VSH_OFLAG_EMPTY_OK,
      N_("host cpu number(s) to set, or omit option to query")},
     {"config", VSH_OT_BOOL, 0, N_("affect next boot")},
@@ -5482,6 +5483,45 @@ static const vshCmdOptDef opts_vcpupin[] = {
     {NULL, 0, 0, NULL}
 };
 
+/*
+ * Helper function to print vcpupin and hypervisorpin info.
+ */
+static bool
+printPinInfo(unsigned char *cpumaps, size_t cpumaplen,
+             int maxcpu, int vcpuindex)
+{
+    int cpu, lastcpu;
+    bool bit, lastbit, isInvert;
+
+    if (!cpumaps || cpumaplen <= 0 || maxcpu <= 0 || vcpuindex < 0) {
+        return false;
+    }
+
+    bit = lastbit = isInvert = false;
+    lastcpu = -1;
+
+    for (cpu = 0; cpu < maxcpu; cpu++) {
+        bit = VIR_CPU_USABLE(cpumaps, cpumaplen, vcpuindex, cpu);
+
+        isInvert = (bit ^ lastbit);
+        if (bit && isInvert) {
+            if (lastcpu == -1)
+                vshPrint(ctl, "%d", cpu);
+            else
+                vshPrint(ctl, ",%d", cpu);
+            lastcpu = cpu;
+        }
+        if (!bit && isInvert && lastcpu != cpu - 1)
+            vshPrint(ctl, "-%d", cpu - 1);
+        lastbit = bit;
+    }
+    if (bit && !isInvert) {
+        vshPrint(ctl, "-%d", maxcpu - 1);
+    }
+
+    return true;
+}
+
 static bool
 cmdVcpuPin(vshControl *ctl, const vshCmd *cmd)
 {
@@ -5494,13 +5534,13 @@ cmdVcpuPin(vshControl *ctl, const vshCmd *cmd)
     unsigned char *cpumap = NULL;
     unsigned char *cpumaps = NULL;
     size_t cpumaplen;
-    bool bit, lastbit, isInvert;
-    int i, cpu, lastcpu, maxcpu, ncpus;
+    int i, cpu, lastcpu, maxcpu, ncpus, nhyper;
     bool unuse = false;
     const char *cur;
     bool config = vshCommandOptBool(cmd, "config");
     bool live = vshCommandOptBool(cmd, "live");
     bool current = vshCommandOptBool(cmd, "current");
+    bool hypervisor = vshCommandOptBool(cmd, "hypervisor");
     bool query = false; /* Query mode if no cpulist */
     unsigned int flags = 0;
 
@@ -5535,8 +5575,18 @@ cmdVcpuPin(vshControl *ctl, const vshCmd *cmd)
 
     /* In query mode, "vcpu" is optional */
     if (vshCommandOptInt(cmd, "vcpu", &vcpu) < !query) {
-        vshError(ctl, "%s",
-                 _("vcpupin: Invalid or missing vCPU number."));
+        if (!hypervisor) {
+            vshError(ctl, "%s",
+                     _("vcpupin: Invalid or missing vCPU number, "
+                       "or missing --hypervisor option."));
+            virDomainFree(dom);
+            return false;
+        }
+    }
+
+    if (hypervisor && vcpu != -1) {
+        vshError(ctl, "%s", _("vcpupin: --hypervisor must be specified "
+                              "exclusively to --vcpu."));
         virDomainFree(dom);
         return false;
     }
@@ -5568,47 +5618,45 @@ cmdVcpuPin(vshControl *ctl, const vshCmd *cmd)
         if (flags == -1)
             flags = VIR_DOMAIN_AFFECT_CURRENT;
 
-        cpumaps = vshMalloc(ctl, info.nrVirtCpu * cpumaplen);
-        if ((ncpus = virDomainGetVcpuPinInfo(dom, info.nrVirtCpu,
-                                             cpumaps, cpumaplen, flags)) >= 0) {
-
-            vshPrint(ctl, "%s %s\n", _("VCPU:"), _("CPU Affinity"));
-            vshPrint(ctl, "----------------------------------\n");
-            for (i = 0; i < ncpus; i++) {
-
-                if (vcpu != -1 && i != vcpu)
-                    continue;
-
-                bit = lastbit = isInvert = false;
-                lastcpu = -1;
-
-                vshPrint(ctl, "%4d: ", i);
-                for (cpu = 0; cpu < maxcpu; cpu++) {
-
-                    bit = VIR_CPU_USABLE(cpumaps, cpumaplen, i, cpu);
-
-                    isInvert = (bit ^ lastbit);
-                    if (bit && isInvert) {
-                        if (lastcpu == -1)
-                            vshPrint(ctl, "%d", cpu);
-                        else
-                            vshPrint(ctl, ",%d", cpu);
-                        lastcpu = cpu;
-                    }
-                    if (!bit && isInvert && lastcpu != cpu - 1)
-                        vshPrint(ctl, "-%d", cpu - 1);
-                    lastbit = bit;
-                }
-                if (bit && !isInvert) {
-                    vshPrint(ctl, "-%d", maxcpu - 1);
-                }
-                vshPrint(ctl, "\n");
+        if (!hypervisor) {
+            cpumaps = vshMalloc(ctl, info.nrVirtCpu * cpumaplen);
+            if ((ncpus = virDomainGetVcpuPinInfo(dom, info.nrVirtCpu,
+                                                 cpumaps, cpumaplen, flags)) >= 0) {
+                vshPrint(ctl, "%s %s\n", _("VCPU:"), _("CPU Affinity"));
+                vshPrint(ctl, "----------------------------------\n");
+                for (i = 0; i < ncpus; i++) {
+                    if (vcpu != -1 && i != vcpu)
+                        continue;
+
+                    vshPrint(ctl, "%4d: ", i);
+                    ret = printPinInfo(cpumaps, cpumaplen, maxcpu, i);
+                    vshPrint(ctl, "\n");
+                    if (!ret)
+                        break;
+                }
+            } else {
+                ret = false;
             }
+            VIR_FREE(cpumaps);
+        }
 
-        } else {
-            ret = false;
+        if (vcpu == -1) {
+            cpumaps = vshMalloc(ctl, cpumaplen);
+            if ((nhyper = virDomainGetHypervisorPinInfo(dom, cpumaps,
+                                                        cpumaplen, flags)) >= 0) {
+                if (!hypervisor)
+                    vshPrint(ctl, "\n");
+                vshPrint(ctl, "%s %s\n", _("Hypervisor:"), _("CPU Affinity"));
+                vshPrint(ctl, "----------------------------------\n");
+
+                vshPrint(ctl, "      *: ");
+                ret = printPinInfo(cpumaps, cpumaplen, maxcpu, 0);
+                vshPrint(ctl, "\n");
+            } else if (nhyper < 0) {
+                ret = false;
+            }
+            VIR_FREE(cpumaps);
         }
-        VIR_FREE(cpumaps);
 
         goto cleanup;
     }
@@ -5686,13 +5734,14 @@ cmdVcpuPin(vshControl *ctl, const vshCmd *cmd)
     }
 
     if (flags == -1) {
-        if (virDomainPinVcpu(dom, vcpu, cpumap, cpumaplen) != 0) {
+        flags = VIR_DOMAIN_AFFECT_LIVE;
+    }
+
+    if (!hypervisor) {
+        if (virDomainPinVcpuFlags(dom, vcpu, cpumap, cpumaplen, flags) != 0)
             ret = false;
-        }
     } else {
-        if (virDomainPinVcpuFlags(dom, vcpu, cpumap, cpumaplen, flags) != 0) {
+        if (virDomainPinHypervisorFlags(dom, cpumap, cpumaplen, flags) != 0)
             ret = false;
-        }
     }
 
 cleanup:
diff --git a/tools/virsh.pod b/tools/virsh.pod
index b4a3d5c..9b510b7 100644
--- a/tools/virsh.pod
+++ b/tools/virsh.pod
@@ -1589,12 +1589,16 @@ Thus, this command always takes exactly zero or two flags.
 Returns basic information about the domain virtual CPUs, like the number of
 vCPUs, the running time, the affinity to physical processors.
 
-=item B<vcpupin> I<domain-id> [I<vcpu>] [I<cpulist>] [[I<--live>]
-[I<--config>] | [I<--current>]]
-
-Query or change the pinning of domain VCPUs to host physical CPUs. To
-pin a single I<vcpu>, specify I<cpulist>; otherwise, you can query one
-I<vcpu> or omit I<vcpu> to list all at once.
+=item B<vcpupin> I<domain-id> [I<vcpu>] [I<hypervisor>] [I<cpulist>]
+[[I<--live>] [I<--config>] | [I<--current>]]
+
+Query or change the pinning of domain VCPUs or hypervisor threads to host
+physical CPUs. To pin a single I<vcpu>, specify I<cpulist>; otherwise, you
+can query one I<vcpu>. To pin all I<hypervisor> threads, specify I<cpulist>;
+otherwise, you can query I<hypervisor>. You can also omit both I<vcpu> and
+I<hypervisor> to list vcpus and hypervisor threads all at once.
 
 I<cpulist> is a list of physical CPU numbers. Its syntax is a comma separated
 list and a special markup using '-' and '^' (ex. '0-4', '0-3,^2') can
-- 
1.7.10.2
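A quick sketch of the resulting command-line usage may help reviewers; the domain name and cpulists below are made-up examples, assuming the patch behaves as posted:

    # Setting a vcpu pinning now requires an explicit --vcpu option:
    virsh vcpupin vm2 --vcpu 3 0,2

    # --hypervisor pins all hypervisor threads at once and cannot be
    # combined with --vcpu:
    virsh vcpupin vm2 --hypervisor 0-3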

Hey guys,

Any objection to this series? If not, would someone help to commit this
patchset?

Thanks,
Gui