When I wrote the private root filesystem stuff for LXC (which I just
committed), I noted that we couldn't actually make this secure, because
someone inside the chroot can just 'mknod' a device node and access the
host devices.
What I completely forgot was that cgroups, as of 2.6.26, have device ACLs.
If we place every container in a cgroup (which was planned anyway), then
we can trivially prevent containers from accessing host devices.
One-time setup:
  mkdir -p /dev/cgroups
  mount -t cgroup -o devices none /dev/cgroups
  mkdir /dev/cgroups/libvirt
  mkdir /dev/cgroups/libvirt/lxc
For each new container 'NAME':
  mkdir /dev/cgroups/libvirt/lxc/{NAME}
  echo "a" > /dev/cgroups/libvirt/lxc/{NAME}/devices.deny
  echo "c 1:3 rwm" > /dev/cgroups/libvirt/lxc/{NAME}/devices.allow
  echo "c 1:5 rwm" > /dev/cgroups/libvirt/lxc/{NAME}/devices.allow
  echo "c 1:7 rwm" > /dev/cgroups/libvirt/lxc/{NAME}/devices.allow
  echo "c 5:1 rwm" > /dev/cgroups/libvirt/lxc/{NAME}/devices.allow
  echo "c 1:8 rwm" > /dev/cgroups/libvirt/lxc/{NAME}/devices.allow
  echo "c 1:9 rwm" > /dev/cgroups/libvirt/lxc/{NAME}/devices.allow
This denies all devices, and then allows null, zero, full, console, random
and urandom. Allowing use of 'random' is debatable.
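
For anyone who wants to try this, the per-container steps could be
wrapped up roughly as follows. This is only a sketch: the function names
and the CGROUP_ROOT variable are made up for illustration, and it assumes
the devices controller is already mounted as in the one-time setup.

```shell
#!/bin/sh
# Sketch only. CGROUP_ROOT, lxc_device_rules and lxc_setup_devices are
# illustrative names, not part of libvirt or the kernel.
CGROUP_ROOT="${CGROUP_ROOT:-/dev/cgroups/libvirt/lxc}"

# Print the whitelist, one "type major:minor access" rule per line:
# null, zero, full, console, random, urandom.
lxc_device_rules() {
    printf '%s\n' \
        "c 1:3 rwm" \
        "c 1:5 rwm" \
        "c 1:7 rwm" \
        "c 5:1 rwm" \
        "c 1:8 rwm" \
        "c 1:9 rwm"
}

# Create the cgroup for container $1, deny all devices, then allow
# each whitelisted device with its own write to devices.allow.
lxc_setup_devices() {
    name="$1"
    dir="$CGROUP_ROOT/$name"
    mkdir -p "$dir" || return 1
    echo "a" > "$dir/devices.deny" || return 1
    lxc_device_rules | while read -r rule; do
        echo "$rule" > "$dir/devices.allow" || exit 1
    done
}
```

Run as root, 'lxc_setup_devices NAME' would perform the same writes as
the commands listed earlier. The ACL only applies to processes that are
actually in the cgroup, so the container's init PID still has to be
written to that cgroup's 'tasks' file.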
The 'devpts' namespace stuff is also needed to provide private PTYs.
The 'user' namespace stuff is needed to prevent an unprivileged user
in the host OS from killing off processes with the same UID inside the
container. There appear to be active patchsets for both of these under
discussion, so we're getting close to having a genuinely useful
container-based virt driver with LXC.
Daniel
--
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|