[libvirt] [RFC PATCH v1 0/4] Add cpu hotplug support to libvirt.

It seems that libvirt is not cpu hotplug aware.
Please refer to the following problem.

1. At first, we have 2 cpus.
# cat /cgroup/cpuset/cpuset.cpus
0-1
# cat /cgroup/cpuset/libvirt/qemu/cpuset.cpus
0-1

2. And we have a vm1 with the following configuration.
<cputune>
    <vcpupin vcpu='0' cpuset='1'/>
    <hypervisorpin cpuset='1'/>
</cputune>

3. Offline cpu1.
# echo 0 > /sys/devices/system/cpu/cpu1/online
# cat /sys/devices/system/cpu/cpu1/online
0
# cat /cgroup/cpuset/cpuset.cpus
0
# cat /cgroup/cpuset/libvirt/qemu/cpuset.cpus
0
# cat /cgroup/cpuset/libvirt/lxc/cpuset.cpus
0

4. Online cpu1.
# echo 1 > /sys/devices/system/cpu/cpu1/online
# cat /sys/devices/system/cpu/cpu1/online
1
# cat /cgroup/cpuset/cpuset.cpus
0-1
# cat /cgroup/cpuset/libvirt/cpuset.cpus
0
# cat /cgroup/cpuset/libvirt/qemu/cpuset.cpus
0
# cat /cgroup/cpuset/libvirt/lxc/cpuset.cpus
0

Here, the root cgroup updated its cpuset.cpus, but the libvirt directory
and the qemu and lxc directories below it were not updated.

vm1 cannot be started again.
# virsh start vm1
error: Failed to start domain vm1
error: Unable to set cpuset.cpus: Permission denied

And libvirtd gave the following error.
2012-07-17 07:30:22.478+0000: 3118: error : qemuSetupCgroupVcpuPin:498 : Unable to set cpuset.cpus: Permission denied

These patches resolve this problem by listening on netlink for cpu
hotplug events. When the netlink service gets a cpu hotplug event, it
will extract the cpuid from the message and add it to cpuset.cpus in:
    /cgroup/cpuset/libvirt
    /cgroup/cpuset/libvirt/qemu
    /cgroup/cpuset/libvirt/lxc

Tang Chen (4):
  Add cpu hotplug handler for netlink service.
  Register cpu hotplug netlink handler for libvirtd.
  Register cpu hotplug netlink handler for qemu driver.
  Register cpu hotplug netlink handler for lxc driver.

 daemon/libvirtd.c           |   11 +++
 include/libvirt/virterror.h |    2 +
 src/Makefile.am             |    1 +
 src/libvirt_private.syms    |    5 +
 src/lxc/lxc_driver.c        |    8 ++
 src/qemu/qemu_driver.c      |    8 ++
 src/util/cgroup.c           |    6 +-
 src/util/cgroup.h           |    4 +
 src/util/hotplug.c          |  221 +++++++++++++++++++++++++++++++++++++++++++
 src/util/hotplug.h          |   32 +++++++
 src/util/virterror.c        |    3 +-
 11 files changed, 297 insertions(+), 4 deletions(-)
 create mode 100644 src/util/hotplug.c
 create mode 100644 src/util/hotplug.h

-- 
1.7.10.1
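The "add it to cpuset.cpus" step described in the cover letter can be sketched in C roughly as below. This is only an illustration: `cpu_list_append` is a hypothetical helper that does not appear in the series, and it assumes that the kernel accepts a comma-separated list with unsorted or duplicate entries and normalizes it on write, which should be checked on the target kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the series: build a new cpu list
 * string from the current contents of cpuset.cpus and one cpu id.
 * A trailing newline in the input (as read from the file) is ignored.
 * Returns a malloc()ed string the caller must free(), or NULL on OOM.
 */
static char *
cpu_list_append(const char *cpus, unsigned int cpu)
{
    size_t len, size;
    char *out;

    assert(cpus != NULL);

    len = strcspn(cpus, "\n");
    /* old list + ',' + up to 10 digits for a 32-bit id + NUL */
    size = len + 1 + 10 + 1;

    out = malloc(size);
    if (out == NULL)
        return NULL;

    if (len == 0)
        snprintf(out, size, "%u", cpu);
    else
        snprintf(out, size, "%.*s,%u", (int)len, cpus, cpu);

    return out;
}
```

Building the result in a fresh buffer keeps the old list valid until the new one is complete, so the caller can free each string exactly once.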

This patch adds a callback for cpu hotplug events.
The cpu hotplug netlink message has the following format:
    {online|offline}@/devices/system/cpu/cpuxx (xx is cpuid)

When a cpu online message is received, the callback gets the newly
added cpuid from the message and adds it to the cpuset.cpus of a
specific cgroup, such as the cpuset cgroup of libvirtd, the qemu
driver, or the lxc driver.

When a cpu offline message is received, nothing is done for now.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
---
 include/libvirt/virterror.h |    2 +
 src/Makefile.am             |    1 +
 src/libvirt_private.syms    |    5 +
 src/util/cgroup.c           |    6 +-
 src/util/cgroup.h           |    4 +
 src/util/hotplug.c          |  221 +++++++++++++++++++++++++++++++++++++++++++
 src/util/hotplug.h          |   32 +++++++
 src/util/virterror.c        |    3 +-
 8 files changed, 270 insertions(+), 4 deletions(-)
 create mode 100644 src/util/hotplug.c
 create mode 100644 src/util/hotplug.h

diff --git a/include/libvirt/virterror.h b/include/libvirt/virterror.h
index 69c64aa..5e10338 100644
--- a/include/libvirt/virterror.h
+++ b/include/libvirt/virterror.h
@@ -114,6 +114,8 @@ typedef enum {
 
     VIR_FROM_SSH = 50,          /* Error from libssh2 connection transport */
 
+    VIR_FROM_HOTPLUG = 51,      /* Error from Hotplug driver */
+
 # ifdef VIR_ENUM_SENTINELS
     VIR_ERR_DOMAIN_LAST
 # endif
diff --git a/src/Makefile.am b/src/Makefile.am
index 95e1bea..c65ee37 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -60,6 +60,7 @@ UTIL_SOURCES = \
 		util/event.c util/event.h \
 		util/event_poll.c util/event_poll.h \
 		util/hooks.c util/hooks.h \
+		util/hotplug.c util/hotplug.h \
 		util/iptables.c util/iptables.h \
 		util/ebtables.c util/ebtables.h \
 		util/dnsmasq.c util/dnsmasq.h \
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 27eb43e..97f9c7b 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -64,6 +64,7 @@ virCgroupAddTaskController;
 virCgroupAllowDevice;
 virCgroupAllowDeviceMajor;
 virCgroupAllowDevicePath;
+virCgroupAppRoot;
 virCgroupControllerTypeFromString;
 virCgroupControllerTypeToString;
 virCgroupDenyAllDevices;
@@ -643,6 +644,10 @@ virHookInitialize;
 virHookPresent;
 
 
+# hotplug.h
+virCpuHotplugRegisterCallback;
+
+
 # interface_conf.h
 virInterfaceAssignDef;
 virInterfaceDefFormat;
diff --git a/src/util/cgroup.c b/src/util/cgroup.c
index 8541c7f..df5f31a 100644
--- a/src/util/cgroup.c
+++ b/src/util/cgroup.c
@@ -641,9 +641,9 @@ err:
     return rc;
 }
 
-static int virCgroupAppRoot(int privileged,
-                            virCgroupPtr *group,
-                            int create)
+int virCgroupAppRoot(int privileged,
+                     virCgroupPtr *group,
+                     int create)
 {
     virCgroupPtr rootgrp = NULL;
     int rc;
diff --git a/src/util/cgroup.h b/src/util/cgroup.h
index 68ac232..ef9b022 100644
--- a/src/util/cgroup.h
+++ b/src/util/cgroup.h
@@ -44,6 +44,10 @@ enum {
 
 VIR_ENUM_DECL(virCgroupController);
 
+int virCgroupAppRoot(int privileged,
+                     virCgroupPtr *group,
+                     int create);
+
 int virCgroupForDriver(const char *name,
                        virCgroupPtr *group,
                        int privileged,
diff --git a/src/util/hotplug.c b/src/util/hotplug.c
new file mode 100644
index 0000000..d5ffd67
--- /dev/null
+++ b/src/util/hotplug.c
@@ -0,0 +1,221 @@
+/*
+ * Copyright (C) 2012 FUJITSU, Inc.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; If not, see
+ * <http://www.gnu.org/licenses/>.
+ *
+ * Authors:
+ *     Tang Chen <tangchen@cn.fujitsu.com>
+ */
+
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <string.h>
+#include <stdlib.h>
+
+#include "hotplug.h"
+#include "cgroup.h"
+#include "virterror_internal.h"
+#include "virnetlink.h"
+#include "logging.h"
+#include "memory.h"
+
+#define VIR_FROM_THIS VIR_FROM_HOTPLUG
+
+#ifdef __linux__
+
+/**
+ * CPU hotplug message is of the following format:
+ *     {online|offline}@/devices/system/cpu/cpuxx (xx is cpuid)
+ */
+# define CPU_ONLINE_MSG "online@/devices/system/cpu/cpu"
+# define CPU_OFFLINE_MSG "offline@/devices/system/cpu/cpu"
+
+/**
+ * getCpuidFromNetlinkMsg:
+ *
+ * @msg: The buffer containing the received netlink message
+ * @cpuid: Contains the cpuid in the message
+ *
+ * This function gets the cpuid from the following netlink message,
+ *     {online|offline}@/devices/system/cpu/cpuxx (xx is cpuid)
+ *
+ * Returns 0 on success, or -1 on error.
+ */
+static int getCpuidFromNetlinkMsg(unsigned char *msg,
+                                  char **cpuid)
+{
+    char *p = NULL;
+    size_t len_online = strlen(CPU_ONLINE_MSG);
+    size_t len_offline = strlen(CPU_OFFLINE_MSG);
+
+    if (VIR_ALLOC_N(*cpuid, 64) < 0)
+        goto memory_error;
+
+    /**
+     * For now, we aren't sure if the message is a '\0' ended string.
+     * So we only test the first len_online or len_offline characters.
+     */
+    if (strncmp((const char *)msg, CPU_ONLINE_MSG, len_online) == 0 ||
+        strncmp((const char *)msg, CPU_OFFLINE_MSG, len_offline) == 0) {
+        p = strrchr((const char *)msg, '/');
+        p = p + 4;
+        strcpy(*cpuid, p);
+        return 0;
+    } else {
+        VIR_DEBUG("Event is not a cpu hotplug event.");
+        VIR_FREE(*cpuid);
+        return -1;
+    }
+
+memory_error:
+    virReportOOMError();
+    return -1;
+}
+
+/**
+ * virCpuHotplugCallback:
+ *
+ * @msg: The buffer containing the received netlink message
+ * @length: The length of the received netlink message
+ * @peer: The netlink sockaddr containing the peer information
+ * @handled: Contains information if the message has been replied to yet
+ * @opaque: Contains a cgroup identifier to be modified
+ *
+ * Cpu hotplug netlink event handler. It is called when libvirtd
+ * receives netlink message from kernel, and modifies cpuset.cpus
+ * of specified cgroup.
+ */
+static void
+virCpuHotplugCallback(unsigned char *msg,
+                      int length,
+                      struct sockaddr_nl *peer,
+                      bool *handled,
+                      void *opaque)
+{
+    char *cpuid = NULL;
+    char *cpus = NULL;
+    virCgroupPtr cgroup = opaque;
+
+    if (!cgroup)
+        return;
+
+    if (VIR_ALLOC_N(cpus, 1024) < 0) {
+        virReportOOMError();
+        return;
+    }
+
+    if (getCpuidFromNetlinkMsg(msg, &cpuid) < 0) {
+        goto error_2;
+    }
+
+    /**
+     * If getCpuidFromNetlinkMsg() succeeds, we are sure the
+     * netlink message is a '\0' ended string.
+     */
+    VIR_DEBUG("netlink (cpu hotplug): %s", msg);
+
+    if (msg == strstr((const char *)msg, "online")) {
+        VIR_DEBUG("CPU %s online message received.", cpuid);
+
+        if (virCgroupGetCpusetCpus(cgroup, &cpus) < 0) {
+            virReportSystemError(errno,
+                                 "%s",
+                                 _("Unable to get cpuset.cpus"));
+            goto error_1;
+        }
+
+        if (virAsprintf(&cpus, "%s,%s", cpus, cpuid) < 0) {
+            virReportOOMError();
+            goto error_1;
+        }
+
+        if (virCgroupSetCpusetCpus(cgroup, cpus) < 0) {
+            virReportSystemError(errno,
+                                 "%s",
+                                 _("Unable to set cpuset.cpus"));
+            goto error_1;
+        }
+
+    } else if (msg == strstr((const char *)msg, "offline")) {
+        VIR_DEBUG("CPU %s offline message received.", cpuid);
+    }
+
+error_1:
+    VIR_FREE(cpuid);
+
+error_2:
+    VIR_FREE(cpus);
+
+    return;
+}
+
+/**
+ * virCpuHotplugDestroyCallback:
+ *
+ * @watch: watch whose handle to remove
+ * @macaddr: macaddr whose handle to remove
+ * @opaque: Contains user data to pass to the callback
+ *
+ * This function is called when a netlink message handler is terminated.
+ * For now, nothing to do.
+ */
+static void
+virCpuHotplugDestroyCallback(int watch ATTRIBUTE_UNUSED,
+                             const virMacAddrPtr macaddr ATTRIBUTE_UNUSED,
+                             void *opaque ATTRIBUTE_UNUSED)
+{
+    VIR_DEBUG("CPU hotplug netlink handler has been removed.");
+}
+
+/**
+ * virCpuHotplugRegisterCallback:
+ *
+ * @opaque: Contains user data to pass to the callback, which is a
+ *          cgroup identifier to be modified
+ *
+ * Register a callback for cpu hotplug event.
+ *
+ * Returns -1 if the file handle cannot be registered, number of
+ * monitor upon success.
+ */
+int
+virCpuHotplugRegisterCallback(void *opaque)
+{
+    int ret = 0;
+    virCgroupPtr cgroup = (virCgroupPtr)opaque;
+
+    if (virNetlinkEventServiceIsRunning(NETLINK_KOBJECT_UEVENT))
+        ret = virNetlinkEventAddClient(virCpuHotplugCallback,
+                                       virCpuHotplugDestroyCallback,
+                                       cgroup, NULL, NETLINK_KOBJECT_UEVENT);
+
+    return ret;
+}
+
+#else
+
+static const char *unsupported = N_("Not a linux system.");
+
+/**
+ * virCpuHotplugRegisterCallback: Register a callback for cpu hotplug event.
+ */
+int
+virCpuHotplugRegisterCallback(void *opaque ATTRIBUTE_UNUSED)
+{
+    VIR_DEBUG("%s", _(unsupported));
+    return 0;
+}
+
+#endif /* __linux__ */
diff --git a/src/util/hotplug.h b/src/util/hotplug.h
new file mode 100644
index 0000000..8f2cdf3
--- /dev/null
+++ b/src/util/hotplug.h
@@ -0,0 +1,32 @@
+/*
+ * Copyright (C) 2012 FUJITSU, Inc.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; If not, see
+ * <http://www.gnu.org/licenses/>.
+ *
+ * Authors:
+ *     Tang Chen <tangchen@cn.fujitsu.com>
+ */
+
+#ifndef __HOTPLUG_H_
+# define __HOTPLUG_H_
+
+# include "internal.h"
+
+/**
+ * virCpuHotplugRegisterCallback: Register a callback for cpu hotplug event.
+ */
+int virCpuHotplugRegisterCallback(void *opaque);
+
+#endif
diff --git a/src/util/virterror.c b/src/util/virterror.c
index 3ee2ae0..1ae26f9 100644
--- a/src/util/virterror.c
+++ b/src/util/virterror.c
@@ -115,7 +115,8 @@ VIR_ENUM_IMPL(virErrorDomain, VIR_ERR_DOMAIN_LAST,
               "Parallels Cloud Server",
               "Device Config",
 
-              "SSH transport layer" /* 50 */
+              "SSH transport layer", /* 50 */
+              "Hotplug driver" /* 51 */
     )
 
 
-- 
1.7.10.1
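Since the patch notes that the message may not be a '\0' ended string, a length-bounded parse of the documented `{online|offline}@/devices/system/cpu/cpuxx` format might look like the sketch below. `parse_cpu_uevent` and `CPU_UEVENT_PATH` are hypothetical names introduced only for illustration; the sketch assumes the caller passes the received length along with the buffer, and it is not the code from this series.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define CPU_UEVENT_PATH "@/devices/system/cpu/cpu"

/*
 * Hypothetical helper, not part of the patch: parse
 * "{online|offline}@/devices/system/cpu/cpuN" from a buffer of known
 * length. The buffer does not need to be NUL-terminated.
 * Returns 0 and fills *online and *cpu on success, -1 otherwise.
 */
static int
parse_cpu_uevent(const char *msg, size_t len, int *online, unsigned int *cpu)
{
    static const char on[] = "online" CPU_UEVENT_PATH;
    static const char off[] = "offline" CPU_UEVENT_PATH;
    char digits[16];
    size_t prefix, n = 0;
    unsigned long val;
    char *end;
    int is_online;

    assert(msg != NULL && online != NULL && cpu != NULL);

    if (len > sizeof(on) - 1 && memcmp(msg, on, sizeof(on) - 1) == 0) {
        is_online = 1;
        prefix = sizeof(on) - 1;
    } else if (len > sizeof(off) - 1 &&
               memcmp(msg, off, sizeof(off) - 1) == 0) {
        is_online = 0;
        prefix = sizeof(off) - 1;
    } else {
        return -1;              /* not a cpu hotplug event */
    }

    /* Copy the id, stopping at the first NUL or the end of the buffer. */
    while (prefix + n < len && msg[prefix + n] != '\0' &&
           n < sizeof(digits) - 1) {
        digits[n] = msg[prefix + n];
        n++;
    }
    digits[n] = '\0';

    /* Reject an id that did not fit into digits[]. */
    if (prefix + n < len && msg[prefix + n] != '\0')
        return -1;

    /* Reject empty ids and names such as "cpufreq" or "cpuidle". */
    if (digits[0] < '0' || digits[0] > '9')
        return -1;

    errno = 0;
    val = strtoul(digits, &end, 10);
    if (errno != 0 || *end != '\0' || val > UINT_MAX)
        return -1;

    *online = is_online;
    *cpu = (unsigned int)val;
    return 0;
}
```

Returning the id as a number rather than a copied string avoids the fixed 64-byte allocation and the unbounded `strcpy()`, and the caller can format it with `%u` when it builds the new cpu list.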

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
---
 daemon/libvirtd.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/daemon/libvirtd.c b/daemon/libvirtd.c
index 19dd26b..e71cd79 100644
--- a/daemon/libvirtd.c
+++ b/daemon/libvirtd.c
@@ -56,6 +56,8 @@
 #include "uuid.h"
 #include "viraudit.h"
 #include "locking/lock_manager.h"
+#include "hotplug.h"
+#include "cgroup.h"
 
 #ifdef WITH_DRIVER_MODULES
 # include "driver.h"
@@ -948,6 +950,7 @@ int main(int argc, char **argv) {
     bool implicit_conf = false;
     char *run_dir = NULL;
     mode_t old_umask;
+    virCgroupPtr rootgrp = NULL;
 
     struct option opts[] = {
         { "verbose", no_argument, &verbose, 1},
@@ -1324,6 +1327,13 @@ int main(int argc, char **argv) {
         goto cleanup;
     }
 
+    /* Register cpu hotplug netlink handler for libvirtd */
+    if (virCgroupAppRoot(privileged, &rootgrp, 0) != 0 ||
+        virCpuHotplugRegisterCallback(rootgrp) < 0) {
+        ret = VIR_DAEMON_ERR_NETWORK;
+        goto cleanup;
+    }
+
     /* Run event loop. */
     virNetServerRun(srv);
 
@@ -1352,6 +1362,7 @@ cleanup:
     if (pid_file_fd != -1)
         virPidFileReleasePath(pid_file, pid_file_fd);
 
+    virCgroupFree(&rootgrp);
     VIR_FREE(sock_file);
     VIR_FREE(sock_file_ro);
     VIR_FREE(pid_file);
-- 
1.7.10.1

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
---
 src/qemu/qemu_driver.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 64c407d..509cdd7 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -92,6 +92,7 @@
 #include "virnodesuspend.h"
 #include "virtime.h"
 #include "virtypedparam.h"
+#include "hotplug.h"
 
 #define VIR_FROM_THIS VIR_FROM_QEMU
 
@@ -710,6 +711,13 @@ qemudStartup(int privileged) {
                   virStrerror(-rc, ebuf, sizeof(ebuf)));
     }
 
+    /* Register cpu hotplug netlink handler for qemu driver */
+    if (virCpuHotplugRegisterCallback(qemu_driver->cgroup) < 0) {
+        VIR_ERROR(_("Unable to register cpu hotplug netlink handler"
+                    " for qemu driver"));
+        goto error;
+    }
+
     if (qemudLoadDriverConfig(qemu_driver, driverConf) < 0) {
         goto error;
     }
-- 
1.7.10.1

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
---
 src/lxc/lxc_driver.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c
index ff11c2c..45f6cc0 100644
--- a/src/lxc/lxc_driver.c
+++ b/src/lxc/lxc_driver.c
@@ -63,6 +63,7 @@
 #include "virtime.h"
 #include "virtypedparam.h"
 #include "viruri.h"
+#include "hotplug.h"
 
 #define VIR_FROM_THIS VIR_FROM_LXC
 
@@ -1453,6 +1454,13 @@ static int lxcStartup(int privileged)
          */
     }
 
+    /* Register cpu hotplug netlink handler for lxc driver */
+    if (virCpuHotplugRegisterCallback(lxc_driver->cgroup) < 0) {
+        VIR_ERROR(_("Unable to register cpu hotplug netlink handler"
+                    " for lxc driver"));
+        goto cleanup;
+    }
+
     /* Call function to load lxc driver configuration information */
     if (lxcLoadDriverConfig(lxc_driver) < 0)
         goto cleanup;
-- 
1.7.10.1

On 09/03/2012 08:06 AM, Tang Chen wrote:
> It seems that libvirt is not cpu hotplug aware.
> Please refer to the following problem.
>
> 1. At first, we have 2 cpus.
> # cat /cgroup/cpuset/cpuset.cpus
> 0-1
> # cat /cgroup/cpuset/libvirt/qemu/cpuset.cpus
> 0-1
>
> 2. And we have a vm1 with the following configuration.
> <cputune>
>     <vcpupin vcpu='0' cpuset='1'/>
>     <hypervisorpin cpuset='1'/>
> </cputune>
>
> 3. Offline cpu1.
> # echo 0 > /sys/devices/system/cpu/cpu1/online
> # cat /sys/devices/system/cpu/cpu1/online
> 0
> # cat /cgroup/cpuset/cpuset.cpus
> 0
> # cat /cgroup/cpuset/libvirt/qemu/cpuset.cpus
> 0
> # cat /cgroup/cpuset/libvirt/lxc/cpuset.cpus
> 0
>
> 4. Online cpu1.
> # echo 1 > /sys/devices/system/cpu/cpu1/online
> # cat /sys/devices/system/cpu/cpu1/online
> 1
> # cat /cgroup/cpuset/cpuset.cpus
> 0-1
> # cat /cgroup/cpuset/libvirt/cpuset.cpus
> 0
> # cat /cgroup/cpuset/libvirt/qemu/cpuset.cpus
> 0
> # cat /cgroup/cpuset/libvirt/lxc/cpuset.cpus
> 0
>
> Here, the root cgroup updated its cpuset.cpus, but the libvirt directory
> and the qemu and lxc directories below it were not updated.
>
> vm1 cannot be started again.
> # virsh start vm1
> error: Failed to start domain vm1
> error: Unable to set cpuset.cpus: Permission denied
>
> And libvirtd gave the following error.
> 2012-07-17 07:30:22.478+0000: 3118: error : qemuSetupCgroupVcpuPin:498 : Unable to set cpuset.cpus: Permission denied
>
> These patches resolve this problem by listening on netlink for cpu
> hotplug events. When the netlink service gets a cpu hotplug event, it
> will extract the cpuid from the message and add it to cpuset.cpus in:
>     /cgroup/cpuset/libvirt
>     /cgroup/cpuset/libvirt/qemu
>     /cgroup/cpuset/libvirt/lxc

Hi,

this approach requires that libvirtd keeps running through the entire
lifecycle of a guest. That cannot be safely assumed, so hotplug events
can be missed.
That means that libvirt must synchronize the hypervisor's cpusets with
the host's current CPU states. You could do that, for instance, when
registering the callback.

-- 

Mit freundlichen Grüßen/Kind Regards
   Viktor Mihajlovski

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

On 09/03/2012 04:42 PM, Viktor Mihajlovski wrote:
> Hi,
>
> this approach requires that libvirtd keeps running through the entire
> lifecycle of a guest. That is something that cannot be safely assumed
> and therefore hotplug events can be missed.
> That means that libvirt must synchronize the hypervisor's cpusets with
> the host's current CPU states. You could do that for instance when
> registering the callback.

Yes, I will fix it soon in the next version. Thanks. :)
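
The synchronization suggested in this thread could be sketched in C as below. This is a hedged illustration and not the planned fix: `sync_cpu_list` is a hypothetical helper, and it assumes that the wanted state is simply "the child cgroup mirrors the source list" (for example copying /cgroup/cpuset/cpuset.cpus into /cgroup/cpuset/libvirt/cpuset.cpus), which would also overwrite a cpu set that an administrator narrowed on purpose.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not from the patch series: copy the cpu list
 * found in src_path (one line, e.g. "0-1") into dst_path.
 * Returns 0 on success, -1 on any error.
 */
static int
sync_cpu_list(const char *src_path, const char *dst_path)
{
    char list[4096];
    FILE *src, *dst;
    int rc = -1;

    assert(src_path != NULL && dst_path != NULL);

    src = fopen(src_path, "r");
    if (src == NULL)
        return -1;
    if (fgets(list, sizeof(list), src) == NULL) {
        fclose(src);
        return -1;
    }
    fclose(src);

    /* Drop the trailing newline; refuse to write an empty list. */
    list[strcspn(list, "\n")] = '\0';
    if (list[0] == '\0')
        return -1;

    dst = fopen(dst_path, "w");
    if (dst == NULL)
        return -1;
    if (fputs(list, dst) >= 0)
        rc = 0;
    /*
     * stdio buffers the write, so a rejection by the kernel may only
     * show up when the stream is flushed by fclose().
     */
    if (fclose(dst) != 0)
        rc = -1;

    return rc;
}
```

Running such a step once when the callback is registered would cover events that were missed while libvirtd was not running; the netlink handler then only has to deal with changes that happen afterwards.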