On Fri, Oct 03, 2008 at 08:40:24AM -0700, Dan Smith wrote:
> This patch adds code to the controller to set up a cgroup named after
> the domain name, set the memory limit, and restrict devices. It also
> adds bits to lxc_driver to properly clean up the cgroup on domain death.

The device whitelisting is all very nice, but we completely forgot / ignored
the fact that there's nothing stopping a container from mounting the cgroups
device controller and giving itself the device access we just took away :-)
The kernel code says:

/*
 * Modify the whitelist using allow/deny rules.
 * CAP_SYS_ADMIN is needed for this. It's at least separate from CAP_MKNOD
 * so we can give a container CAP_MKNOD to let it create devices but not
 * modify the whitelist.
 * It seems likely we'll want to add a CAP_CONTAINER capability to allow
 * us to also grant CAP_SYS_ADMIN to containers without giving away the
 * device whitelist controls, but for now we'll stick with CAP_SYS_ADMIN
 *
 * Taking rules away is always allowed (given CAP_SYS_ADMIN). Granting
 * new access is only allowed if you're in the top-level cgroup, or your
 * parent cgroup has the access you're asking for.
 */

So, it looks like we need to explicitly set the capabilities of containers:
either mask out CAP_SYS_ADMIN from libvirtd's set, or construct an
explicit capability whitelist.
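
To make the two options concrete, here is a rough sketch, not code from the
patch. It assumes the container init process still holds CAP_SETPCAP and uses
prctl(PR_CAPBSET_DROP) on the capability bounding set; the helper names
(cap_is_kept, mask_without_sys_admin, drop_caps_not_in) are made up for
illustration. The "mask" option keeps everything except CAP_SYS_ADMIN, while
the "whitelist" option passes in only the bits we want to keep.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/prctl.h>
#include <linux/capability.h>

/* Hypothetical helper: 1 if capability 'cap' is set in keep_mask, else 0. */
static int cap_is_kept(uint64_t keep_mask, int cap)
{
    return (int)((keep_mask >> cap) & 1);
}

/* "Mask" option: keep every capability except CAP_SYS_ADMIN. */
static uint64_t mask_without_sys_admin(void)
{
    return ~(UINT64_C(1) << CAP_SYS_ADMIN);
}

/*
 * Drop every capability that is not in keep_mask from the bounding set.
 * Intended to run in the container init process before exec'ing the
 * container's program. Requires CAP_SETPCAP. Returns 0 on success, -1 on
 * failure (errno set by prctl).
 */
static int drop_caps_not_in(uint64_t keep_mask)
{
    int cap;

    /* The mask is 64 bits wide, so every known capability must fit in it. */
    assert(CAP_LAST_CAP < 64);

    for (cap = 0; cap <= CAP_LAST_CAP; cap++) {
        if (cap_is_kept(keep_mask, cap))
            continue;
        /* EINVAL means the running kernel does not know this capability. */
        if (prctl(PR_CAPBSET_DROP, cap, 0, 0, 0) < 0 && errno != EINVAL)
            return -1;
    }
    return 0;
}
```

With the mask option this would be called as
drop_caps_not_in(mask_without_sys_admin()); with the whitelist option the
caller would OR together only the capabilities it wants to keep, e.g.
(UINT64_C(1) << CAP_MKNOD). Note that dropping a capability from the bounding
set limits what later execve() calls can gain; it does not by itself change
the calling process's current effective set.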
Daniel
--
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|