[libvirt] [RFC] cgroups net_cls controller implementation

This is a basic implementation to support the net_cls feature of cgroups. It adds the setting of a net_cls.classid value to the existing cgroups setup in the qemu driver. The classid is specified in the qemu.conf file. This enables the use of the tc utility to manage traffic from/to virtual machines based on the combination of classid and network interface.

Signed-off-by: D.Herrendoerfer <d.herrendoerfer [at] herrendoerfer [dot] name>

 src/libvirt_private.syms |  1 +
 src/qemu/qemu.conf       |  6 +++++-
 src/qemu/qemu_conf.c     |  7 ++++++-
 src/qemu/qemu_conf.h     |  1 +
 src/qemu/qemu_driver.c   | 12 ++++++++++++
 src/util/cgroup.c        | 18 +++++++++++++++++-
 src/util/cgroup.h        |  3 +++
 7 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index f251c94..771911e 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -80,6 +80,7 @@
 virCgroupSetFreezerState;
 virCgroupSetMemory;
 virCgroupSetMemoryHardLimit;
 virCgroupSetMemorySoftLimit;
+virCgroupSetNetworkClassID;
 virCgroupSetSwapHardLimit;
diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf
index f4f965e..591d8dc 100644
--- a/src/qemu/qemu.conf
+++ b/src/qemu/qemu.conf
@@ -157,7 +157,7 @@
 # can be mounted in different locations. libvirt will detect
 # where they are located.
 #
-# cgroup_controllers = [ "cpu", "devices", "memory" ]
+# cgroup_controllers = [ "cpu", "devices", "memory", "net_cls" ]

 # This is the basic set of devices allowed / required by
 # all virtual machines.
@@ -175,6 +175,10 @@
 #    "/dev/rtc", "/dev/hpet", "/dev/net/tun",
 #]

+# This is the default classid that will be assigned
+# to all virtual machines.
+# cgroup_net_cls_classid = 4096
+
 # The default format for Qemu/KVM guest save images is raw; that is, the
 # memory from the domain is dumped out directly to a file. If you have
diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
index 7cd0603..46ac040 100644
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -326,7 +326,8 @@ int qemudLoadDriverConfig(struct qemud_driver *driver,
         driver->cgroupControllers =
             (1 << VIR_CGROUP_CONTROLLER_CPU) |
             (1 << VIR_CGROUP_CONTROLLER_DEVICES) |
-            (1 << VIR_CGROUP_CONTROLLER_MEMORY);
+            (1 << VIR_CGROUP_CONTROLLER_MEMORY) |
+            (1 << VIR_CGROUP_CONTROLLER_NETWORK);
     }
     for (i = 0 ; i < VIR_CGROUP_CONTROLLER_LAST ; i++) {
         if (driver->cgroupControllers & (1 << i)) {
@@ -364,6 +365,10 @@ int qemudLoadDriverConfig(struct qemud_driver *driver,
         driver->cgroupDeviceACL[i] = NULL;
     }

+    p = virConfGetValue (conf, "cgroup_net_cls_classid");
+    CHECK_TYPE ("cgroup_net_cls_classid", VIR_CONF_LONG);
+    if (p) driver->cgroupNetClsClassid = p->l;
+
     p = virConfGetValue (conf, "save_image_format");
     CHECK_TYPE ("save_image_format", VIR_CONF_STRING);
     if (p && p->str) {
diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h
index aba64d6..961c6cd 100644
--- a/src/qemu/qemu_conf.h
+++ b/src/qemu/qemu_conf.h
@@ -119,6 +119,7 @@ struct qemud_driver {
     virCgroupPtr cgroup;
     int cgroupControllers;
     char **cgroupDeviceACL;
+    int cgroupNetClsClassid;

     virDomainObjList domains;
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 1a7c1ad..42448b5 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -186,6 +186,8 @@ static int qemudVMFiltersInstantiate(virConnectPtr conn,

 static struct qemud_driver *qemu_driver = NULL;

+#include "interface.h"
+
 static void *qemuDomainObjPrivateAlloc(void)
 {
@@ -3597,6 +3599,16 @@ static int qemuSetupCgroup(struct qemud_driver *driver,
                  vm->def->name);
     }

+    if (qemuCgroupControllerActive(driver, VIR_CGROUP_CONTROLLER_NETWORK)) {
+        if (driver->cgroupNetClsClassid != 0) {
+            rc = virCgroupSetNetworkClassID(cgroup, driver->cgroupNetClsClassid);
+            if (rc != 0) {
+                VIR_WARN("Cannot set net_cls.classid for: %s",
+                         vm->def->name);
+            }
+        }
+    }
+
 done:
     virCgroupFree(&cgroup);
     return 0;
diff --git a/src/util/cgroup.c b/src/util/cgroup.c
index 2758a8f..a2ed0ed 100644
--- a/src/util/cgroup.c
+++ b/src/util/cgroup.c
@@ -37,7 +37,7 @@
 VIR_ENUM_IMPL(virCgroupController, VIR_CGROUP_CONTROLLER_LAST,
               "cpu", "cpuacct", "cpuset", "memory", "devices",
-              "freezer");
+              "freezer", "net_cls");

 struct virCgroupController {
     int type;
@@ -851,6 +851,22 @@ int virCgroupForDomain(virCgroupPtr driver ATTRIBUTE_UNUSED,
 #endif

 /**
+ * virCgroupSetNetworkClassID:
+ *
+ * @group: The cgroup to set the classid for
+ * @classid: The classid number
+ *
+ * Returns: 0 on success
+ */
+int virCgroupSetNetworkClassID(virCgroupPtr group, unsigned long classid)
+{
+    return virCgroupSetValueU64(group,
+                                VIR_CGROUP_CONTROLLER_NETWORK,
+                                "net_cls.classid",
+                                classid);
+}
+
+/**
  * virCgroupSetMemory:
  *
  * @group: The cgroup to change memory for
diff --git a/src/util/cgroup.h b/src/util/cgroup.h
index 9e1c61f..9626e82 100644
--- a/src/util/cgroup.h
+++ b/src/util/cgroup.h
@@ -22,6 +22,7 @@
 enum {
     VIR_CGROUP_CONTROLLER_MEMORY,
     VIR_CGROUP_CONTROLLER_DEVICES,
     VIR_CGROUP_CONTROLLER_FREEZER,
+    VIR_CGROUP_CONTROLLER_NETWORK,

     VIR_CGROUP_CONTROLLER_LAST
 };
@@ -40,6 +41,8 @@ int virCgroupForDomain(virCgroupPtr driver,
 int virCgroupAddTask(virCgroupPtr group, pid_t pid);

+int virCgroupSetNetworkClassID(virCgroupPtr group, unsigned long classid);
+
 int virCgroupSetMemory(virCgroupPtr group, unsigned long kb);
 int virCgroupGetMemoryUsage(virCgroupPtr group, unsigned long *kb);
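[Editor's note] As context for the patch above: virCgroupSetNetworkClassID() ultimately just writes a decimal number into the controller's net_cls.classid file, which the kernel interprets as 0xAAAABBBB, i.e. a tc major:minor handle. A minimal sketch of that write, using a temporary directory to stand in for the real cgroup mount (the real path, such as /sys/fs/cgroup/net_cls/libvirt/qemu/<vm>, and the need for root there are assumptions about the host setup):

```shell
# Stand-in directory for the net_cls controller mount (hypothetical layout;
# on a real host this would be under the cgroup mount point and need root).
CG="$(mktemp -d)"
mkdir -p "$CG/demo"

# What virCgroupSetNetworkClassID() boils down to: write the decimal
# classid into the group's net_cls.classid file.
echo 4096 > "$CG/demo/net_cls.classid"

# The kernel treats the value as 0xAAAABBBB (tc major:minor), so the
# example default 4096 (0x1000) corresponds to tc handle 0:1000.
v="$(cat "$CG/demo/net_cls.classid")"
printf '%x:%x\n' "$((v >> 16))" "$((v & 0xffff))"   # prints 0:1000
```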

On Thu, 2010-12-02 at 14:47 +0100, D. Herrendoerfer wrote:
This is a basic implementation to support the net_cls feature of cgroups. It adds the setting of a net_cls.classid value to the existing cgroups setup in the qemu driver. The classid is specified in the qemu.conf file.
This enables the use of the tc utility to manage traffic from/to virtual machines based on the combination of classid and network interface.
Signed-off-by: D.Herrendoerfer <d.herrendoerfer [at] herrendoerfer [dot] name>
I verified that the patch works as intended. Are there any objections or comments regarding the patch or the approach it implements? We would very much appreciate it if this would make it into 0.8.7.

--
Best regards,
Gerhard Stenzel
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martin Jetter
Management: Dirk Wittkopp
Registered office: Böblingen; Registration court: Amtsgericht Stuttgart, HRB 243294

On Thu, Dec 02, 2010 at 02:47:15PM +0100, D. Herrendoerfer wrote:
This is a basic implementation to support the net_cls feature of cgroups. It adds the setting of a net_cls.classid value to the existing cgroups setup in the qemu driver. The classid is specified in the qemu.conf file.
This enables the use of the tc utility to manage traffic from/to virtual machines based on the combination of classid and network interface.
I don't think this patch is a good approach. The goal of libvirt is that you can configure & control guests using terminology & APIs that are platform & hypervisor independent. This precludes exposing classid as a direct concept. Requiring half of the configuration job to be performed via the tc command line utility is also not a viable solution for apps that are communicating with libvirt over a remote connection.

If we were to support this patch in libvirt, it would make it harder for us to incorporate an alternative solution for networking traffic controls without causing behavioural regressions for anyone who had started depending on this patch.

Regards, Daniel

I disagree. The concept of QoS is not pertinent to a single VM description; it is a constraint of the host. Libvirt, for example, has no concept of multiple VLAN distribution on a single machine and depends on other tools to provide it. True network QoS (802.1p), on the other hand, is bound in Linux to a specific VLAN interface, and having a single VM modify these settings would affect all users of that VLAN.

Using cgroups in this context is a wonderful way of managing bandwidth between groups of VMs, to make sure one cannot starve out the other, but again, this is a host feature.

Apart from the above, I don't see great potential for regressions from simply introducing a sensible default behavior; it is still possible to simply leave zero in the classid file, and everything behaves as before.

D.Herrendoerfer

On Dec 8, 2010, at 11:43 AM, Daniel P. Berrange wrote:
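[Editor's note] The host-side tc setup that would pair with the patch's classid is not part of the patch itself; a rough sketch is below. The device name (eth0), rates, and handles are made-up illustration values, and the commands need root plus a real interface, so the sketch only prints them. Note that for the tc cgroup filter to steer packets into class 1:1000 under qdisc 1:, net_cls.classid must encode the full major:minor, i.e. 0x00011000 (decimal 69632), not just the minor.

```shell
# Hypothetical tc commands matching a net_cls.classid of 0x00011000
# (tc class 1:1000). Printed rather than executed, since running them
# requires root and a real NIC.
TC_SKETCH='tc qdisc add dev eth0 root handle 1: htb
tc class add dev eth0 parent 1: classid 1:1000 htb rate 100mbit ceil 200mbit
tc filter add dev eth0 parent 1: handle 1: cgroup'
printf '%s\n' "$TC_SKETCH"
```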
On Thu, Dec 02, 2010 at 02:47:15PM +0100, D. Herrendoerfer wrote:
This is a basic implementation to support the net_cls feature of cgroups. It adds the setting of a net_cls.classid value to the existing cgroups setup in the qemu driver. The classid is specified in the qemu.conf file.
This enables the use of the tc utility to manage traffic from/to virtual machines based on the combination of classid and network interface.
I don't think this patch is a good approach. The goal of libvirt is that you can configure & control guests using terminology & APIs that are platform & hypervisor independent. This precludes exposing classid as a direct concept. Requiring half of the configuration job to be performed via the tc command line utility is also not a viable solution for apps that are communicating with libvirt over a remote connection.
If we were to support this patch in libvirt, it would make it harder for us to incorporate an alternative solution for networking traffic controls without causing behavioural regressions for anyone who had started depending on this patch.
Regards, Daniel

On Wed, Dec 08, 2010 at 01:08:36PM +0100, D. Herrendoerfer wrote:
I disagree. The concept of QoS is not pertinent to a single VM description; it is a constraint of the host. Libvirt, for example, has no concept of multiple VLAN distribution on a single machine and depends on other tools to provide it. True network QoS (802.1p), on the other hand, is bound in Linux to a specific VLAN interface, and having a single VM modify these settings would affect all users of that VLAN.
Using cgroups in this context is a wonderful way of managing bandwidth between groups of VMs, to make sure one cannot starve out the other, but again, this is a host feature.
Where the network settings need to be configured & applied can vary depending on what network settings you're trying to manage. There's been no description of what the design goals or required network controls are, so it is impossible to say where they must be configured in libvirt. Even if there are settings which are not applicable to the VM configuration, there are other existing areas of libvirt where they can be applied, and scope for adding further APIs if no existing functionality is sufficient.

As an example, VMware has a 'Virtual Switch' which in turn has a number of 'Port Groups'. A VM NIC is associated with a port group, and network QoS is configured against the port groups. I could easily envisage such a model being something we should have in the libvirt API and build for QEMU, potentially backed by cgroups+tc. Cgroups could even be the wrong approach if we want a VM to have multiple NICs, each with a different policy, since cgroups control the VM as a whole, not individual VM NICs.

Starting from "We need to let admins set the 'tc' classid" is the wrong approach to dealing with this in libvirt. We need to start from a consideration of what general capabilities & concepts we want to represent & model in the XML/APIs, and then consider how these concepts could be mapped to the platforms and hypervisors libvirt wants to support. Directly exposing 'tc' as an implementation to the end administrator or app developer is at odds with the libvirt goals.

Regards, Daniel
participants (3)
- D. Herrendoerfer
- Daniel P. Berrange
- Gerhard Stenzel