Re: [libvirt-users] Libvirt-LXC + systemd + user namespace

29 Jan 2014

      On 28.01.2014 12:46, Daniel P. Berrange wrote:
On Tue, Jan 28, 2014 at 12:32:41PM +0100, Jan Olszak wrote:
Hi there!
I am trying to turn on user namespaces by adding the following lines to the
config:
<idmap>
<uid start='0' target='0' count='100000'/>
<gid start='0' target='0' count='100000'/>
</idmap>
As you can see, root in the container is mapped to root outside. I
expected to see no difference after adding these lines, but unfortunately
there are some (see details below).
Am I missing something, or is there a problem with the system, libvirt or
the kernel?

I've not had any chance to try LXC + user namespaces + systemd yet, but
based on the list of things which fail, it seems like it might not be
detecting that it is inside a container. It seems almost like it has still
got the CAP_MKNOD permission and so is trying to start things it should
not, like udev and various filesystems.
Daniel

I was able to reduce the problem to a test case that uses neither libvirt
nor systemd.

I created a bash process inside a user namespace with the mapping
root_inside<->root_outside.
I've used a program from https://lwn.net/Articles/532593/ :
./userns_child_exec -U -M '0 0 1' -G '0 0 1' bash
This program simply calls clone() with the CLONE_NEWUSER flag and sets the
proper uid_map and gid_map.
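
To make the '0 0 1' arguments concrete, here is a minimal Python sketch of how
a uid_map/gid_map line is interpreted: each line holds the first ID inside the
namespace, the first ID outside it, and the length of the range. The helper
names parse_id_map and to_outside are hypothetical and only illustrate the
lookup; this is not code from userns_child_exec or the kernel.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map content into (inside, outside, count) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        inside, outside, count = (int(f) for f in fields)
        entries.append((inside, outside, count))
    return entries


def to_outside(entries, inside_id):
    """Translate an ID seen inside the namespace to the parent namespace.

    Returns None when no range covers the ID (it is unmapped).
    """
    for inside, outside, count in entries:
        if inside <= inside_id < inside + count:
            return outside + (inside_id - inside)
    return None


# The map produced by -M '0 0 1': only uid 0 is mapped, to uid 0 outside.
uid_map = parse_id_map("         0          0          1\n")
print(to_outside(uid_map, 0))     # uid 0 inside is uid 0 outside
print(to_outside(uid_map, 1000))  # uid 1000 has no mapping
```

With the libvirt <idmap> range of count='100000' the same lookup would cover
IDs 0 through 99999 instead of just 0.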

The test commands are as follows:
mkdir /test
mount debugfs /test -t debugfs

and strace shows:
mount("debugfs", "/test", "debugfs", MS_MGC_VAL, NULL) = -1 EPERM (Operation not permitted)

Now the question is:
Is this a kernel bug or expected behavior, i.e. do we always have limited
permissions inside a user namespace, even if uid=0 inside the container is
mapped to uid=0 outside?

# cat /proc/$$/uid_map
          0          0          1
# cat /proc/$$/gid_map
          0          0          1

# cat /proc/$$/status | grep Cap
CapInh:    0000000000000000
CapPrm:    0000001fffffffff
CapEff:    0000001fffffffff
CapBnd:    0000001fffffffff
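
For readers who want to check the masks above, a small Python sketch that
decodes them. The bit numbers for CAP_SYS_ADMIN (21) and CAP_MKNOD (27) are
taken from linux/capability.h; the helper name has_cap is hypothetical. It only
shows which bits are set in the mask as reported for this process; it does not
say how the kernel evaluates them for a given operation.

```python
# Capability bit numbers from linux/capability.h.
CAPS = {"CAP_SYS_ADMIN": 21, "CAP_MKNOD": 27}


def has_cap(mask_hex, bit):
    """Return True if the given capability bit is set in a /proc status mask."""
    return (int(mask_hex, 16) >> bit) & 1 == 1


cap_eff = "0000001fffffffff"  # CapEff value shown above

for name, bit in CAPS.items():
    print(name, has_cap(cap_eff, bit))

# Number of capability bits set in the mask (bits 0 through 36 here).
print(bin(int(cap_eff, 16)).count("1"))
```

So the mask itself has both bits set, and the EPERM from mount is not explained
by a bit missing from CapEff.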

-- 
Piotr Bartosiewicz