
On Fri, Oct 3, 2008 at 11:43 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
On Fri, Oct 03, 2008 at 09:31:52PM +0530, Balbir Singh wrote:
I understand that in the past there has been a perception that libcgroups might not yet be ready, because we did not have ABI stability built into the library and the header file had old comments about things changing. I would urge the group to look at the current implementation of libcgroups (look at v0.32) and help us
1. Fix any issues you see or point them out to us
2. Add new API, or request new API, that can help us integrate better with libvirt
To expand on what I said in my other mail about providing value-add over the representation exposed by the kernel, here are some thoughts on the API exposed.
Consider the following high level use case of libvirt
- A set of groups, in a 3 level hierarchy <APPNAME>/<DRIVER>/<DOMAIN>
- Control the ACL for block/char devices
- Control memory limits
This translates into an underlying implementation in which I need to create 3 levels of cgroups in the filesystem, use the memory and device controllers, attach my PIDs at the 3rd level, and set values for the attributes exposed by the controllers. Notice I'm not actually setting any config params at the 1st & 2nd levels, but they still need to exist to ensure namespace uniqueness amongst different applications using cgroups.
The current cgroups API provides calls that map directly to individual actions on the filesystem the kernel exposes. So as an application developer I have to explicitly create the 3 levels of hierarchy, tell it I want to use the memory & device controllers, format config values into the syntax required for each attribute, and remember the attribute names.
  // Create the hierarchy <APPNAME>/<DRIVER>/<DOMAIN>
  c1 = cgroup_new_cgroup("libvirt")
  c2 = cgroup_new_cgroup_parent(c1, "lxc")
  c3 = cgroup_new_cgroup_parent(c2, domain.name)

  // Set up the controllers I want to use
  cgroup_add_controller(c3, "devices")
  cgroup_add_controller(c3, "memory")

  // Add my domain's PID to the cgroup
  cgroup_attach_task(c3, domain.pid)
  // Set the device ACL limits (on the 3rd level group, c3)
  cgroup_set_value_string(c3, "devices.deny", "a");

  char buf[1024];
  sprintf(buf, "%c %d:%d", 'c', 1, 3);
  cgroup_set_value_string(c3, "devices.allow", buf);

  // Set memory limit
  cgroup_set_value_uint64(c3, "memory.limit_in_bytes", domain.memory * 1024);
This really isn't providing any semantically useful abstraction over direct filesystem manipulation. It is just a bunch of wrappers for mkdir(), mount() and read()/write() calls. My application still has to know far too much about the details of cgroups as exposed by the kernel.
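To make the "wrappers for mkdir() and write()" point concrete, here is a minimal sketch of what the same steps look like when done directly against a directory tree. The function names make_group_dirs() and write_attr() are hypothetical and are not part of libcgroup. The root directory is a parameter, so the sketch runs against any writable directory; on a real system it would be wherever the cgroup filesystem is mounted, which is an assumption left to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: create root/levels[0]/levels[1]/... one directory
 * at a time, mirroring the three cgroup_new_cgroup*() calls above.
 * The final directory path is left in 'out'. Returns 0 on success and
 * -1 on error (including a path that does not fit in 'out'). */
int make_group_dirs(const char *root, const char *const *levels,
                    size_t nlevels, char *out, size_t outlen)
{
    assert(root != NULL && levels != NULL && out != NULL);

    int n = snprintf(out, outlen, "%s", root);
    if (n < 0 || (size_t)n >= outlen)
        return -1;

    for (size_t i = 0; i < nlevels; i++) {
        size_t used = strlen(out);
        n = snprintf(out + used, outlen - used, "/%s", levels[i]);
        if (n < 0 || (size_t)n >= outlen - used)
            return -1;
        /* An already existing level is fine; any other failure is not. */
        if (mkdir(out, 0755) != 0 && errno != EEXIST)
            return -1;
    }
    return 0;
}

/* Hypothetical helper: write an already formatted value into the file
 * group_dir/attr. The caller still has to know the attribute name and
 * the value syntax, which is exactly the complaint in the mail. */
int write_attr(const char *group_dir, const char *attr, const char *value)
{
    assert(group_dir != NULL && attr != NULL && value != NULL);

    char path[4096];
    int n = snprintf(path, sizeof path, "%s/%s", group_dir, attr);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    int rc = (fputs(value, f) == EOF) ? -1 : 0;
    if (fclose(f) != 0)
        rc = -1;
    return rc;
}
```

A caller would pass the levels { "libvirt", "lxc", domain name } to make_group_dirs() and then call write_attr() once per attribute, for example with "memory.limit_in_bytes" and a decimal byte count.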
True, it definitely does, and the way I look at APIs is that they are layers. We've built the first layer, which abstracts permissions, paths and strings into a set of useful APIs. The second layer does the things that you describe. The question then is why we don't have it yet. Let me try to answer that:
1. We've been trying to build configuration, classification and the low-level plumbing.
2. We've been planning to build exactly what you describe. We call it the pluggable architecture, where controllers plug in their logic and provide the abstractions you need, but we have not gotten there yet.
When you announced cgroup support in libvirt, it was definitely going to be a user, and we hoped that you would come to us with the exact requirements that you've mentioned now (believe me, your feedback is very useful). The question to ask then is whether it is cheaper for you to build these abstractions into libvirt or to help us or ask us to build them; had you done either, we would have gladly obliged. You might say that the onus is on the maintainers to do the right thing without feedback, but I would beg to differ. What you've asked for I consider a layer on top of the API we have now, and it should be easy to build.
I do not care that there is a concept of 'controllers' at all, I just want to set device ACLs and memory limits. I do not care what the attributes in the filesystem are called, again I just want to set device ACLs and memory limits. I do not care what the data format for device/memory settings must be. Memory settings could be stored in base-2, base-10 or base-16; I should not have to know this information.
With this style of API, the library provides no real value-add or compelling reason to use it.
What might a more useful API look like? At least from my point of view, I'd like to be able to say:
  // Tell it I want $PID placed in <APPNAME>/<DRIVER>/<DOMAIN>
  char *path[] = { "libvirt", "lxc", domain.name};
  cg = cgroup_new_path(path, domain.pid)

  // I want to deny all devices
  cgroup_deny_all_devices(cg);

  // Allow /dev/null - either by node/major/minor
  cgroup_allow_device_node(cg, 'c', 1, 3);

  // Or more conveniently just give it a node to copy info from
  cgroup_allow_device_node(cg, "/dev/null")

  // Set memory in KB
  cgroup_set_memory_limit_kb(cg, domain.memory)
Notice how with such a style of API, I don't need to know anything about the low level implementation details - I'm working entirely in terms of semantically meaningful concepts.
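As an illustration of the "give it a node to copy info from" idea, the following is a rough sketch of how a library could derive a device rule from a path such as /dev/null using stat(). The function name device_rule_from_node() is hypothetical. The "type major:minor access" layout follows the devices controller syntax, and the fixed "rwm" access string is an assumption added here; the mail itself only formats the type and the major:minor pair.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Hypothetical helper: look up a device node and format a rule of the
 * form "<type> <major>:<minor> rwm" into buf, where type is 'c' for a
 * character device and 'b' for a block device.
 * Returns 0 on success, -1 if the path cannot be stat()ed, is not a
 * device node, or the rule does not fit in buf. */
int device_rule_from_node(const char *node, char *buf, size_t len)
{
    assert(node != NULL && buf != NULL);

    struct stat st;
    if (stat(node, &st) != 0)
        return -1;

    char type;
    if (S_ISCHR(st.st_mode))
        type = 'c';
    else if (S_ISBLK(st.st_mode))
        type = 'b';
    else
        return -1;          /* regular file, directory, etc. */

    /* st_rdev holds the device numbers of the node itself. */
    int n = snprintf(buf, len, "%c %u:%u rwm", type,
                     (unsigned)major(st.st_rdev),
                     (unsigned)minor(st.st_rdev));
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```

With a helper like this underneath, the caller only names the node; the device type, the numbers and the value syntax stay inside the library.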
Yes, I agree this is definitely more usable and friendlier. These are not hard to implement today; in fact, implementing them would require only a few calls to the existing API, and they can be built as controller-specific layers (I call them plugins for each controller).
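One possible shape for such a controller-specific layer is sketched below. Everything here is hypothetical: struct memory_plugin, memory_plugin_set_limit_kb() and the set_uint64_fn callback are illustrative names, with the callback standing in for a generic low-level setter like the cgroup_set_value_uint64() call used earlier in the thread. record_setter() is an in-memory stand-in so the sketch can be exercised without a real cgroup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical signature for the generic low-level setter underneath. */
typedef int (*set_uint64_fn)(void *group, const char *attr, uint64_t value);

/* Hypothetical memory-controller layer: it knows the attribute name and
 * the unit so that the application does not have to. */
struct memory_plugin {
    set_uint64_fn set_value;
};

/* Convert KB to bytes (rejecting values that would overflow) and forward
 * to the low-level setter. Returns the setter's result, or -1 on overflow. */
int memory_plugin_set_limit_kb(const struct memory_plugin *p,
                               void *group, uint64_t kb)
{
    assert(p != NULL && p->set_value != NULL);
    if (kb > UINT64_MAX / 1024)
        return -1;
    return p->set_value(group, "memory.limit_in_bytes", kb * 1024);
}

/* In-memory stand-in for the low-level API: it only records the last
 * call in the struct passed as 'group'. */
struct recorded_call {
    const char *attr;
    uint64_t value;
};

int record_setter(void *group, const char *attr, uint64_t value)
{
    struct recorded_call *rec = group;
    rec->attr = attr;
    rec->value = value;
    return 0;
}
```

The application-facing call takes only a group handle and a size in KB; swapping record_setter() for the real low-level call would not change the caller.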
Now comes the hard bit: you have to figure out what semantic concepts you want to expose to applications. The example here would be suitable for libvirt, but not necessarily for other applications. Picking the right APIs is very much harder than just exposing the kernel capabilities directly as libcgroup.h does now, but the trade-off is that the resulting API would be much more useful and interesting to app developers.
I like what you've proposed very much and I am going to start building these abstractions and make them available in libcgroup. At some point, I hope you will find them useful enough so as to drop your abstractions (which I would hope you had directly built into libcgroup and used, so that more people would have benefited from it) and use them.
Balbir