[libvirt] [PATCH v2]lxc: don't mount dir if ownership couldn't be known

From: Chen Hanxiao <chenhanxiao@cn.fujitsu.com> If we enable userns, we could bind mount some dirs from host to guest, which don't belong to the target mapped uid/gid. Such as we could bind mount root's dirs to guest. What is worse, we could even modify root's files in that bind dir inside container. So if we couldn't know the dir's ownership(without a proper uid/gid mapping), don't mount it. Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com> --- v2: add more description src/lxc/lxc_container.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c index 255c711..4cf209e 100644 --- a/src/lxc/lxc_container.c +++ b/src/lxc/lxc_container.c @@ -96,6 +96,8 @@ typedef char lxc_message_t; #define LXC_CONTINUE_MSG 'c' +#define OVERFLOW_UGID 65534 + typedef struct __lxc_child_argv lxc_child_argv_t; struct __lxc_child_argv { virDomainDefPtr config; @@ -1073,6 +1075,22 @@ static int lxcContainerMountFSBind(virDomainFSDefPtr fs, if (virAsprintf(&src, "%s%s", srcprefix, fs->src) < 0) goto cleanup; + if (stat(src, &st) < 0) { + virReportSystemError(errno, _("Unable to stat bind source %s"), + src); + goto cleanup; + } else { + if (OVERFLOW_UGID == st.st_uid || OVERFLOW_UGID == st.st_gid) { + errno = EPERM; + VIR_DEBUG("Unknown st_uid %d, st_gid %d for %s", + st.st_uid, st.st_gid, fs->src); + virReportSystemError(errno, + _("Check the permission of src dir '%s' provided for container") + ,fs->src); + goto cleanup; + } + } + if (stat(fs->dst, &st) < 0) { if (errno != ENOENT) { virReportSystemError(errno, _("Unable to stat bind target %s"), -- 1.8.2.1

On 11/13/2013 04:51 PM, Chen Hanxiao wrote:
From: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
If we enable userns, we could bind mount some dirs from host to guest, which don't belong to the target mapped uid/gid.
Such as we could bind mount root's dirs to guest. What is worse, we could even modify root's files in that bind dir inside container.
If this happend, this problem should be a generic permission problem. it should be fixed in kernel.
So if we couldn't know the dir's ownership(without a proper uid/gid mapping), don't mount it.
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com> --- v2: add more description
src/lxc/lxc_container.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c index 255c711..4cf209e 100644 --- a/src/lxc/lxc_container.c +++ b/src/lxc/lxc_container.c @@ -96,6 +96,8 @@ typedef char lxc_message_t; #define LXC_CONTINUE_MSG 'c'
+#define OVERFLOW_UGID 65534 + typedef struct __lxc_child_argv lxc_child_argv_t; struct __lxc_child_argv { virDomainDefPtr config; @@ -1073,6 +1075,22 @@ static int lxcContainerMountFSBind(virDomainFSDefPtr fs, if (virAsprintf(&src, "%s%s", srcprefix, fs->src) < 0) goto cleanup;
+ if (stat(src, &st) < 0) { + virReportSystemError(errno, _("Unable to stat bind source %s"), + src); + goto cleanup; + } else { + if (OVERFLOW_UGID == st.st_uid || OVERFLOW_UGID == st.st_gid) { + errno = EPERM; + VIR_DEBUG("Unknown st_uid %d, st_gid %d for %s", + st.st_uid, st.st_gid, fs->src); + virReportSystemError(errno, + _("Check the permission of src dir '%s' provided for container") + ,fs->src); + goto cleanup; + } + } + if (stat(fs->dst, &st) < 0) { if (errno != ENOENT) { virReportSystemError(errno, _("Unable to stat bind target %s"),

On Wed, Nov 13, 2013 at 04:51:43PM +0800, Chen Hanxiao wrote:
From: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
If we enable userns, we could bind mount some dirs from host to guest, which don't belong to the target mapped uid/gid.
Such as we could bind mount root's dirs to guest. What is worse, we could even modify root's files in that bind dir inside container.
I still can't see what the problem is from the description here. Please can you give a clear example of the config used and exactly what goes wrong. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

-----Original Message----- From: Daniel P. Berrange [mailto:berrange@redhat.com] Sent: Wednesday, November 13, 2013 6:35 PM To: Chen Hanxiao Cc: libvir-list@redhat.com Subject: Re: [libvirt] [PATCH v2]lxc: don't mount dir if ownership couldn't be known
On Wed, Nov 13, 2013 at 04:51:43PM +0800, Chen Hanxiao wrote:
From: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
If we enable userns, we could bind mount some dirs from host to guest, which don't belong to the target mapped uid/gid.
Such as we could bind mount root's dirs to guest. What is worse, we could even modify root's files in that bind dir inside container.
I still can't see what the problem is from the description here. Please can you give a clear example of the config used and exactly what goes wrong.
1. enable user namespace <idmap> <uid start='0' target='1001' count='10'/> <gid start='0' target='1001' count='10'/> </idmap> 2. bind mount some dirs to container, which belongs to root or other users. <filesystem type='mount' accessmode='passthrough'> <source dir='/media/LXC1'/> <target dir='/mnt'/> </filesystem> # ll /media/ ... drwxr-xr-x. 3 root root 4096 Nov 13 17:21 LXC1 ... 3. start container I used to encounter issues: inside container, we could modify files under /mnt So I think inside user namespace, if we do not have a proper id mapping, we should not bind mount it for containers, or at least set it as readonly.

On Thu, Nov 14, 2013 at 05:44:40PM +0800, Chen Hanxiao wrote:
-----Original Message----- From: Daniel P. Berrange [mailto:berrange@redhat.com] Sent: Wednesday, November 13, 2013 6:35 PM To: Chen Hanxiao Cc: libvir-list@redhat.com Subject: Re: [libvirt] [PATCH v2]lxc: don't mount dir if ownership couldn't be known
On Wed, Nov 13, 2013 at 04:51:43PM +0800, Chen Hanxiao wrote:
From: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
If we enable userns, we could bind mount some dirs from host to guest, which don't belong to the target mapped uid/gid.
Such as we could bind mount root's dirs to guest. What is worse, we could even modify root's files in that bind dir inside container.
I still can't see what the problem is from the description here. Please can you give a clear example of the config used and exactly what goes wrong.
1. enable user namespace <idmap> <uid start='0' target='1001' count='10'/> <gid start='0' target='1001' count='10'/> </idmap>
2. bind mount some dirs to container, which belongs to root or other users. <filesystem type='mount' accessmode='passthrough'> <source dir='/media/LXC1'/> <target dir='/mnt'/> </filesystem>
# ll /media/ ... drwxr-xr-x. 3 root root 4096 Nov 13 17:21 LXC1 ...
3. start container
I used to encounter issues: inside container, we could modify files under /mnt
So I think inside user namespace, if we do not have a proper id mapping, we should not bind mount it for containers, or at least set it as readonly.
FYI, I'm trying to reproduce the problem myself, but have discovered that current kernels cause a regression which prevents libirt starting any user namespace kernels - it fails mounting /proc and /dev/pts What kernel version are you testing with ? Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 11/16/2013 12:24 AM, Daniel P. Berrange wrote:
On Thu, Nov 14, 2013 at 05:44:40PM +0800, Chen Hanxiao wrote:
-----Original Message----- From: Daniel P. Berrange [mailto:berrange@redhat.com] Sent: Wednesday, November 13, 2013 6:35 PM To: Chen Hanxiao Cc: libvir-list@redhat.com Subject: Re: [libvirt] [PATCH v2]lxc: don't mount dir if ownership couldn't be known
On Wed, Nov 13, 2013 at 04:51:43PM +0800, Chen Hanxiao wrote:
From: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
If we enable userns, we could bind mount some dirs from host to guest, which don't belong to the target mapped uid/gid.
Such as we could bind mount root's dirs to guest. What is worse, we could even modify root's files in that bind dir inside container.
I still can't see what the problem is from the description here. Please can you give a clear example of the config used and exactly what goes wrong.
1. enable user namespace <idmap> <uid start='0' target='1001' count='10'/> <gid start='0' target='1001' count='10'/> </idmap>
2. bind mount some dirs to container, which belongs to root or other users. <filesystem type='mount' accessmode='passthrough'> <source dir='/media/LXC1'/> <target dir='/mnt'/> </filesystem>
# ll /media/ ... drwxr-xr-x. 3 root root 4096 Nov 13 17:21 LXC1 ...
3. start container
I used to encounter issues: inside container, we could modify files under /mnt
So I think inside user namespace, if we do not have a proper id mapping, we should not bind mount it for containers, or at least set it as readonly.
FYI, I'm trying to reproduce the problem myself, but have discovered that current kernels cause a regression which prevents libirt starting any user namespace kernels - it fails mounting /proc and /dev/pts
FYI,I'm working on fix this bug. both kernel and libvirt need to be changed.

On Thu, Nov 14, 2013 at 05:44:40PM +0800, Chen Hanxiao wrote:
-----Original Message----- From: Daniel P. Berrange [mailto:berrange@redhat.com] Sent: Wednesday, November 13, 2013 6:35 PM To: Chen Hanxiao Cc: libvir-list@redhat.com Subject: Re: [libvirt] [PATCH v2]lxc: don't mount dir if ownership couldn't be known
On Wed, Nov 13, 2013 at 04:51:43PM +0800, Chen Hanxiao wrote:
From: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
If we enable userns, we could bind mount some dirs from host to guest, which don't belong to the target mapped uid/gid.
Such as we could bind mount root's dirs to guest. What is worse, we could even modify root's files in that bind dir inside container.
I still can't see what the problem is from the description here. Please can you give a clear example of the config used and exactly what goes wrong.
1. enable user namespace <idmap> <uid start='0' target='1001' count='10'/> <gid start='0' target='1001' count='10'/> </idmap>
2. bind mount some dirs to container, which belongs to root or other users. <filesystem type='mount' accessmode='passthrough'> <source dir='/media/LXC1'/> <target dir='/mnt'/> </filesystem>
# ll /media/ ... drwxr-xr-x. 3 root root 4096 Nov 13 17:21 LXC1 ...
3. start container
I used to encounter issues: inside container, we could modify files under /mnt
So I think inside user namespace, if we do not have a proper id mapping, we should not bind mount it for containers, or at least set it as readonly.
I don't see any security problem in what we're doing already In the host I ran # mkdir /tmp/otheruser # echo foo > /tmp/otheruser/hello.txt # chown 500:500 /tmp/otheruser/ # chown 500:500 /tmp/otheruser/hello.txt # chmod o-rwx /tmp/otheruser/hello.txt And the container config has <idmap> <uid start='0' target='1001' count='10'/> <gid start='0' target='1001' count='10'/> </idmap> <filesystem type='mount' accessmode='passthrough'> <source dir='/tmp/otheruser'/> <target dir='/mnt'/> </filesystem> If I start the container now # virsh start --console shell Connected to domain shell Escape character is ^] # cd /mnt/ # ls -al total 8 drwxr-xr-x 2 65534 65534 60 Nov 18 15:51 . drwxr-xr-x 8 0 0 4096 Nov 18 15:52 .. -rw-r----- 1 65534 65534 4 Nov 18 15:51 hello.txt # cat hello.txt cat: can't open 'hello.txt': Permission denied Everything appears to be working as designed. The directory is set to the overflow users, and so my permissions inside the container are restricted to whatever the 'other' bit in the permission mask allows for. 'r-x' for the directory lets me see it, but '---' prevents we reading the file 'hello.txt'. So I don't see what your patch is trying to fix Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

-----Original Message----- From: Daniel P. Berrange [mailto:berrange@redhat.com] Sent: Monday, November 18, 2013 11:57 PM To: Chen Hanxiao Cc: libvir-list@redhat.com Subject: Re: [libvirt] [PATCH v2]lxc: don't mount dir if ownership couldn't be known
On Thu, Nov 14, 2013 at 05:44:40PM +0800, Chen Hanxiao wrote:
I used to encounter issues: inside container, we could modify files under /mnt
So I think inside user namespace, if we do not have a proper id mapping, we should not bind mount it for containers, or at least set it as readonly.
I don't see any security problem in what we're doing already
In the host I ran
# mkdir /tmp/otheruser # echo foo > /tmp/otheruser/hello.txt # chown 500:500 /tmp/otheruser/ # chown 500:500 /tmp/otheruser/hello.txt # chmod o-rwx /tmp/otheruser/hello.txt
And the container config has
<idmap> <uid start='0' target='1001' count='10'/> <gid start='0' target='1001' count='10'/> </idmap>
<filesystem type='mount' accessmode='passthrough'> <source dir='/tmp/otheruser'/> <target dir='/mnt'/> </filesystem>
If I start the container now
# virsh start --console shell Connected to domain shell Escape character is ^] # cd /mnt/ # ls -al total 8 drwxr-xr-x 2 65534 65534 60 Nov 18 15:51 . drwxr-xr-x 8 0 0 4096 Nov 18 15:52 .. -rw-r----- 1 65534 65534 4 Nov 18 15:51 hello.txt # cat hello.txt cat: can't open 'hello.txt': Permission denied
Everything appears to be working as designed. The directory is set to the overflow users, and so my permissions inside the container are restricted to whatever the 'other' bit in the permission mask allows for. 'r-x' for the directory lets me see it, but '---' prevents we reading the file 'hello.txt'.
So I don't see what your patch is trying to fix
Sorry for the late reply. On one of kernel version of 3.11-rcX, I do encounter an issue that we can MODIFY kernel's file without related permission mask inside container. Gao said that couldn't be happen and I couldn't reproduce that issue on 3.12. (I lost the original env) If I could encounter this issue again, I'll let Gao check it with me. Thanks for your experiment.
Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
participants (3)
-
Chen Hanxiao
-
Daniel P. Berrange
-
Gao feng