[libvirt] [PATCH v3 00/12] Add user namespace support for libvirt lxc

This patchset try to add userns support for libvirt lxc. Since userns is nearly completed in linux-3.9, the old kernel doesn't support userns, I add some New XML elements to let people decide if enable userns.The userns is enabled only when user configure the XML. The format of user namespace related XML file like below: <idmap> <uid start='0' target='1000' count='10'> <gid start='0' target='1000' count='10'> </idmap> it means the user in container (which uid:gid is 0:0) will be mapped to the user in host (uid:gid is 1000:1000), count is used to form an u/gid range: The users in container which uid in [start, start + count -1] will be mapped. You can have multiple lines to map differnet id ranges, caution, you must make sure the root user of container has been mapped. This patchset also does the below jobs. 1, Because the uninit userns has no right to create devices, we should create devices for container on host. 2, Changes the owner of fuse and tty device. Change from v2: 1, Mount tmpfs on /stateDir/domain.dev 2, Create devices under /stateDir/doamin.dev/ 3, Mount Move the /.oldroot/stateDir/doamin.dev/ on the /dev/ of container 4, Enhance the configuration, disallow the semi configuration Gao feng (12): LXC: Introduce New XML element for user namespace LXC: enable user namespace only when user set the uidmap LXC: sort the uidmap/gidmap of domain LXC: introduce virLXCControllerSetupUserns and lxcContainerSetID LXC: Creating devices for container on host side LXC: Move creating /dev/ptmx to virLXCControllerSetupDevPTS LXC: fuse: Change files owner to the root user of container LXC: controller: change the owner of tty devices to the root user of container LXC: controller: change the owner of /dev to the root user of container LXC: controller: change the owner of devices created on host LXC: controller: change the owner of /dev/pts and ptmx to the root of container LXC: introduce virLXCControllerChown docs/formatdomain.html.in | 23 ++++ docs/schemas/domaincommon.rng | 31 +++++ src/conf/domain_conf.c | 115 ++++++++++++++++++ src/conf/domain_conf.h | 22 ++++ src/lxc/lxc_container.c | 183 ++++++++++++++-------------- src/lxc/lxc_controller.c | 271 +++++++++++++++++++++++++++++++++++++++++- src/lxc/lxc_fuse.c | 6 + 7 files changed, 557 insertions(+), 94 deletions(-) -- 1.8.1.4

This patch introduces new element <idmap> for user namespace. for example <idmap> <uid start='0' target='1000' count='10'/> <gid start='0' target='1000' count='10'/> </idmap> this new element is used for setting proc files /proc/<pid>/{uid_map,gid_map}. This patch also supports multiple uid/gid elements setting in XML configuration. We don't support the semi configuation, user has to configure uid and gid both. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- docs/formatdomain.html.in | 23 +++++++++++ docs/schemas/domaincommon.rng | 31 ++++++++++++++ src/conf/domain_conf.c | 95 +++++++++++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 22 ++++++++++ 4 files changed, 171 insertions(+) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 3a200aa..cc0ae52 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -285,6 +285,29 @@ </pre> + <p> + If you want to enable user namespace,set the <code>idmap</code> element. + the <code>uid</code> and <code>gid</code> elements have three attributes: + </p> + + <dl> + <dt><code>start</code></dt> + <dd>First user id in container.</dd> + <dt><code>target</code></dt> + <dd>The first user id in container will be mapped to this target user + id in host.</dd> + <dt><code>count</code></dt> + <dd>How many users in container being allowed to map to host's user.</dd> + </dl> + + <pre> + <idmap> + <uid start='0' target='1000' count='10'/> + <gid start='0' target='1000' count='10'/> + </idmap> + </pre> + + <h3><a name="elementsSysinfo">SMBIOS System Information</a></h3> <p> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 3cace35..ac07407 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -56,6 +56,9 @@ <ref name="pm"/> </optional> <optional> + <ref name="idmap"/> + </optional> + <optional> <ref name="devices"/> </optional> <zeroOrMore> @@ -465,6 +468,34 @@ </optional> </interleave> </define> + <define name="idmap"> + <zeroOrMore> + <element name="uid"> + <attribute name="start"> + <ref name="unsignedInt"/> + </attribute> + <attribute name="target"> + <ref name="unsignedInt"/> + </attribute> + <attribute name="count"> + <ref name="unsignedInt"/> + </attribute> + </element> + </zeroOrMore> + <zeroOrMore> + <element name="gid"> + <attribute name="start"> + <ref name="unsignedInt"/> + </attribute> + <attribute name="target"> + <ref name="unsignedInt"/> + </attribute> + <attribute name="count"> + <ref name="unsignedInt"/> + </attribute> + </element> + </zeroOrMore> + </define> <!-- Resources usage defines the amount of memory (maximum and possibly current usage) and number of virtual CPUs used by that domain. diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index ad5550c..a3c5c84 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -1939,6 +1939,9 @@ void virDomainDefFree(virDomainDefPtr def) virDomainTPMDefFree(def->tpm); + VIR_FREE(def->idmap.uidmap); + VIR_FREE(def->idmap.gidmap); + VIR_FREE(def->os.type); VIR_FREE(def->os.machine); VIR_FREE(def->os.init); @@ -10057,6 +10060,40 @@ cleanup: return ret; } + +/* Parse the XML definition for user namespace id map. + * + * idmap has the form of + * + * <uid start='0' target='1000' count='10'/> + * <gid start='0' target='1000' count='10'/> + */ +static virDomainIdMapEntryPtr +virDomainIdmapDefParseXML(const xmlNodePtr *node, + xmlXPathContextPtr ctxt, + ssize_t num) +{ + int i; + virDomainIdMapEntryPtr idmap = NULL; + xmlNodePtr save_ctxt = ctxt->node; + + if (VIR_ALLOC_N(idmap, num) < 0) { + virReportOOMError(); + goto error; + } + + for (i = 0; i < num; i++) { + ctxt->node = node[i]; + virXPathUInt("string(./@start)", ctxt, &idmap[i].start); + virXPathUInt("string(./@target)", ctxt, &idmap[i].target); + virXPathUInt("string(./@count)", ctxt, &idmap[i].count); + } + error: + ctxt->node = save_ctxt; + return idmap; +} + + /* Parse the XML definition for a vcpupin or emulatorpin. * * vcpupin has the form of @@ -11804,6 +11841,43 @@ virDomainDefParseXML(xmlDocPtr xml, } VIR_FREE(nodes); + /* analysis of the user namespace mapping */ + def->idmap.nuidmap = 0; + def->idmap.uidmap = NULL; + if ((n = virXPathNodeSet("./idmap/uid", ctxt, &nodes)) < 0) + goto error; + + if (n) { + def->idmap.uidmap = virDomainIdmapDefParseXML(nodes, ctxt, n); + if (!def->idmap.uidmap) + goto error; + + def->idmap.nuidmap = n; + } + VIR_FREE(nodes); + + def->idmap.ngidmap = 0; + def->idmap.gidmap = NULL; + + if ((n = virXPathNodeSet("./idmap/gid", ctxt, &nodes)) < 0) + goto error; + + if (n) { + def->idmap.gidmap = virDomainIdmapDefParseXML(nodes, ctxt, n); + if (!def->idmap.gidmap) + goto error; + + def->idmap.ngidmap = n; + } + VIR_FREE(nodes); + + if ((def->idmap.uidmap && !def->idmap.gidmap) || + (!def->idmap.uidmap && def->idmap.gidmap)) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("uid and gid should be mapped both")); + goto error; + } + /* analysis of cpu handling */ if ((node = virXPathNode("./cpu[1]", ctxt)) != NULL) { xmlNodePtr oldnode = ctxt->node; @@ -16008,6 +16082,27 @@ virDomainDefFormatInternal(virDomainDefPtr def, virBufferAddLit(buf, " </os>\n"); + + if (def->idmap.uidmap) { + virBufferAddLit(buf, " <idmap>\n"); + for (i = 0 ; i < def->idmap.nuidmap; i++) { + virBufferAsprintf(buf, + " <uid start='%u' target='%u' count='%u'/>\n", + def->idmap.uidmap[i].start, + def->idmap.uidmap[i].target, + def->idmap.uidmap[i].count); + } + for (i = 0 ; i < def->idmap.ngidmap; i++) { + virBufferAsprintf(buf, + " <gid start='%u' target='%u' count='%u'/>\n", + def->idmap.gidmap[i].start, + def->idmap.gidmap[i].target, + def->idmap.gidmap[i].count); + } + virBufferAddLit(buf, " </idmap>\n"); + } + + if (def->features) { virBufferAddLit(buf, " <features>\n"); for (i = 0; i < VIR_DOMAIN_FEATURE_LAST; i++) { diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 3a71d6c..29a6d8a 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -120,6 +120,12 @@ typedef virDomainSnapshotObjList *virDomainSnapshotObjListPtr; typedef struct _virDomainRNGDef virDomainRNGDef; typedef virDomainRNGDef *virDomainRNGDefPtr; +typedef struct _virDomainIdMapEntry virDomainIdMapEntry; +typedef virDomainIdMapEntry *virDomainIdMapEntryPtr; + +typedef struct _virDomainIdMapDef virDomainIdMapDef; +typedef virDomainIdMapDef *virDomainIdMapDefPtr; + /* Flags for the 'type' field in virDomainDeviceDef */ typedef enum { VIR_DOMAIN_DEVICE_NONE = 0, @@ -1837,6 +1843,21 @@ struct _virDomainRNGDef { virDomainDeviceInfo info; }; +struct _virDomainIdMapEntry { + unsigned int start; + unsigned int target; + unsigned int count; +}; + +struct _virDomainIdMapDef { + size_t nuidmap; + virDomainIdMapEntryPtr uidmap; + + size_t ngidmap; + virDomainIdMapEntryPtr gidmap; +}; + + void virBlkioDeviceWeightArrayClear(virBlkioDeviceWeightPtr deviceWeights, int ndevices); @@ -1899,6 +1920,7 @@ struct _virDomainDef { virNumaTuneDef numatune; virDomainResourceDefPtr resource; + virDomainIdMapDef idmap; /* These 3 are based on virDomainLifeCycleAction enum flags */ int onReboot; -- 1.8.1.4

On Thu, May 23, 2013 at 12:06:45PM +0800, Gao feng wrote:
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index ad5550c..a3c5c84 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -10057,6 +10060,40 @@ cleanup: return ret; }
+ +/* Parse the XML definition for user namespace id map. + * + * idmap has the form of + * + * <uid start='0' target='1000' count='10'/> + * <gid start='0' target='1000' count='10'/> + */ +static virDomainIdMapEntryPtr +virDomainIdmapDefParseXML(const xmlNodePtr *node, + xmlXPathContextPtr ctxt, + ssize_t num)
s/ssize_t/size_t/ also prefer to call it 'size_t nnodes' and 'const xmlNodePtr *nodes' to make it clear these params are related. So use: (xmlXPathContextPtr ctxt, const xmlNodePtr *nodes, size_t nnodes)
+{ + int i;
s/int/size_t/
+ virDomainIdMapEntryPtr idmap = NULL; + xmlNodePtr save_ctxt = ctxt->node; + + if (VIR_ALLOC_N(idmap, num) < 0) { + virReportOOMError(); + goto error; + } + + for (i = 0; i < num; i++) { + ctxt->node = node[i]; + virXPathUInt("string(./@start)", ctxt, &idmap[i].start); + virXPathUInt("string(./@target)", ctxt, &idmap[i].target); + virXPathUInt("string(./@count)", ctxt, &idmap[i].count); + } + error: + ctxt->node = save_ctxt; + return idmap; +} + + /* Parse the XML definition for a vcpupin or emulatorpin. * * vcpupin has the form of @@ -11804,6 +11841,43 @@ virDomainDefParseXML(xmlDocPtr xml, } VIR_FREE(nodes);
+ /* analysis of the user namespace mapping */ + def->idmap.nuidmap = 0; + def->idmap.uidmap = NULL;
No need for these 2 lines - VIR_ALLOC initializes all memory to 0 by default
+ if ((n = virXPathNodeSet("./idmap/uid", ctxt, &nodes)) < 0) + goto error; + + if (n) { + def->idmap.uidmap = virDomainIdmapDefParseXML(nodes, ctxt, n); + if (!def->idmap.uidmap) + goto error; + + def->idmap.nuidmap = n; + } + VIR_FREE(nodes); + + def->idmap.ngidmap = 0; + def->idmap.gidmap = NULL;
Again no need for these two lines
+ + if ((n = virXPathNodeSet("./idmap/gid", ctxt, &nodes)) < 0) + goto error; + + if (n) { + def->idmap.gidmap = virDomainIdmapDefParseXML(nodes, ctxt, n); + if (!def->idmap.gidmap) + goto error; + + def->idmap.ngidmap = n; + } + VIR_FREE(nodes); + + if ((def->idmap.uidmap && !def->idmap.gidmap) || + (!def->idmap.uidmap && def->idmap.gidmap)) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("uid and gid should be mapped both")); + goto error; + } + /* analysis of cpu handling */ if ((node = virXPathNode("./cpu[1]", ctxt)) != NULL) { xmlNodePtr oldnode = ctxt->node; @@ -16008,6 +16082,27 @@ virDomainDefFormatInternal(virDomainDefPtr def,
virBufferAddLit(buf, " </os>\n");
+ + if (def->idmap.uidmap) { + virBufferAddLit(buf, " <idmap>\n"); + for (i = 0 ; i < def->idmap.nuidmap; i++) { + virBufferAsprintf(buf, + " <uid start='%u' target='%u' count='%u'/>\n", + def->idmap.uidmap[i].start, + def->idmap.uidmap[i].target, + def->idmap.uidmap[i].count); + } + for (i = 0 ; i < def->idmap.ngidmap; i++) { + virBufferAsprintf(buf, + " <gid start='%u' target='%u' count='%u'/>\n", + def->idmap.gidmap[i].start, + def->idmap.gidmap[i].target, + def->idmap.gidmap[i].count); + } + virBufferAddLit(buf, " </idmap>\n"); + } + + if (def->features) { virBufferAddLit(buf, " <features>\n"); for (i = 0; i < VIR_DOMAIN_FEATURE_LAST; i++) {
Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

User namespace will be enabled only when the idmap exist in configuration. If you want disable user namespace,just remove these elements from XML. If kernel doesn't support user namespace and idmap exist in configuration file, libvirt lxc will start failed and return "Kernel doesn't support user namespace" message. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_container.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c index c74e3ca..618252c 100644 --- a/src/lxc/lxc_container.c +++ b/src/lxc/lxc_container.c @@ -2029,14 +2029,12 @@ cleanup: static int userns_supported(void) { -#if 1 - /* - * put off using userns until uid mapping is implemented - */ - return 0; -#else return lxcContainerAvailable(LXC_CONTAINER_FEATURE_USER) == 0; -#endif +} + +static int userns_required(virDomainDefPtr def) +{ + return def->idmap.uidmap && def->idmap.gidmap; } virArch lxcContainerGetAlt32bitArch(virArch arch) @@ -2116,9 +2114,15 @@ int lxcContainerStart(virDomainDefPtr def, cflags = CLONE_NEWPID|CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|SIGCHLD; - if (userns_supported()) { - VIR_DEBUG("Enable user namespaces"); - cflags |= CLONE_NEWUSER; + if (userns_required(def)) { + if (userns_supported()) { + VIR_DEBUG("Enable user namespace"); + cflags |= CLONE_NEWUSER; + } else { + virReportSystemError(VIR_ERR_NO_KERNEL, "%s", + _("Kernel doesn't support user namespace")); + return -1; + } } if (lxcNeedNetworkNamespace(def)) { -- 1.8.1.4

On Thu, May 23, 2013 at 12:06:46PM +0800, Gao feng wrote:
User namespace will be enabled only when the idmap exist in configuration.
If you want disable user namespace,just remove these elements from XML.
If kernel doesn't support user namespace and idmap exist in configuration file, libvirt lxc will start failed and return "Kernel doesn't support user namespace" message.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_container.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c index c74e3ca..618252c 100644 --- a/src/lxc/lxc_container.c +++ b/src/lxc/lxc_container.c @@ -2116,9 +2114,15 @@ int lxcContainerStart(virDomainDefPtr def,
cflags = CLONE_NEWPID|CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|SIGCHLD;
- if (userns_supported()) { - VIR_DEBUG("Enable user namespaces"); - cflags |= CLONE_NEWUSER; + if (userns_required(def)) { + if (userns_supported()) { + VIR_DEBUG("Enable user namespace"); + cflags |= CLONE_NEWUSER; + } else { + virReportSystemError(VIR_ERR_NO_KERNEL, "%s",
Use VIR_ERR_CONFIG_UNSUPPORTED for this error message.
+ _("Kernel doesn't support user namespace")); + return -1; + } }
Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 05/23/2013 12:06 AM, Gao feng wrote:
User namespace will be enabled only when the idmap exist in configuration.
If you want disable user namespace,just remove these elements from XML.
If kernel doesn't support user namespace and idmap exist in configuration file, libvirt lxc will start failed and return "Kernel doesn't support user namespace" message.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_container.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c index c74e3ca..618252c 100644 --- a/src/lxc/lxc_container.c +++ b/src/lxc/lxc_container.c @@ -2029,14 +2029,12 @@ cleanup:
static int userns_supported(void) { -#if 1 - /* - * put off using userns until uid mapping is implemented - */ - return 0; -#else return lxcContainerAvailable(LXC_CONTAINER_FEATURE_USER) == 0; -#endif +} + +static int userns_required(virDomainDefPtr def) +{ + return def->idmap.uidmap && def->idmap.gidmap; }
virArch lxcContainerGetAlt32bitArch(virArch arch) @@ -2116,9 +2114,15 @@ int lxcContainerStart(virDomainDefPtr def,
cflags = CLONE_NEWPID|CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|SIGCHLD;
- if (userns_supported()) { - VIR_DEBUG("Enable user namespaces"); - cflags |= CLONE_NEWUSER; + if (userns_required(def)) { + if (userns_supported()) { + VIR_DEBUG("Enable user namespace"); + cflags |= CLONE_NEWUSER; + } else { + virReportSystemError(VIR_ERR_NO_KERNEL, "%s", + _("Kernel doesn't support user namespace")); + return -1; + }
Since this was pushed yesterday, my overnight Coverity run picked up a problem (resource leak because stack is not VIR_FREE()'d): 2118 /* allocate a stack for the container */ (1) Event alloc_arg: "virAllocN(void *, size_t, size_t)" allocates memory that is stored into "stack". [details] (2) Event cond_false: Condition "virAllocN(&stack, 1UL /* sizeof (*stack) */, stacksize) < 0", taking false branch Also see events: [var_assign][leaked_storage][leaked_storage] 2119 if (VIR_ALLOC_N(stack, stacksize) < 0) { 2120 virReportOOMError(); 2121 return -1; (3) Event if_end: End of if statement 2122 } (4) Event var_assign: Assigning: "stacktop" = "stack". Also see events: [alloc_arg][leaked_storage][leaked_storage] 2123 stacktop = stack + stacksize; 2124 2125 cflags = CLONE_NEWPID|CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|SIGCHLD; 2126 (5) Event cond_true: Condition "userns_required(def)", taking true branch 2127 if (userns_required(def)) { (6) Event cond_false: Condition "userns_supported()", taking false branch 2128 if (userns_supported()) { 2129 VIR_DEBUG("Enable user namespace"); 2130 cflags |= CLONE_NEWUSER; (7) Event else_branch: Reached else branch 2131 } else { 2132 virReportSystemError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", 2133 _("Kernel doesn't support user namespace")); (8) Event leaked_storage: Variable "stacktop" going out of scope leaks the storage it points to. (9) Event leaked_storage: Variable "stack" going out of scope leaks the storage it points to. Also see events: [alloc_arg][var_assign] 2134 return -1; John
}
if (lxcNeedNetworkNamespace(def)) {

We forgot to free the stack when userns configuration is incorrect. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_container.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c index 282c726..c8420db 100644 --- a/src/lxc/lxc_container.c +++ b/src/lxc/lxc_container.c @@ -2131,6 +2131,7 @@ int lxcContainerStart(virDomainDefPtr def, } else { virReportSystemError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Kernel doesn't support user namespace")); + VIR_FREE(stack); return -1; } } -- 1.8.1.4

We forgot to free the stack when Kernel doesn't support user namespace. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_container.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c index 282c726..c8420db 100644 --- a/src/lxc/lxc_container.c +++ b/src/lxc/lxc_container.c @@ -2131,6 +2131,7 @@ int lxcContainerStart(virDomainDefPtr def, } else { virReportSystemError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Kernel doesn't support user namespace")); + VIR_FREE(stack); return -1; } } -- 1.8.1.4

On Wed, Jul 03, 2013 at 07:16:04PM +0800, Gao feng wrote:
We forgot to free the stack when Kernel doesn't support user namespace.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_container.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c index 282c726..c8420db 100644 --- a/src/lxc/lxc_container.c +++ b/src/lxc/lxc_container.c @@ -2131,6 +2131,7 @@ int lxcContainerStart(virDomainDefPtr def, } else { virReportSystemError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Kernel doesn't support user namespace")); + VIR_FREE(stack); return -1; } }
ACK, will push shortly. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Make sure the mapping line contains the root user of container is the first element of idmap array. So we can get the real user id on host for the container easily. This patch also check the map information, User must map the root user of container to any user of host. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/conf/domain_conf.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a3c5c84..e271f5a 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -10061,6 +10061,14 @@ cleanup: } +static int virDomainIdMapEntrySort(const void *a, const void *b) +{ + const virDomainIdMapEntryPtr entrya = (const virDomainIdMapEntryPtr) a; + const virDomainIdMapEntryPtr entryb = (const virDomainIdMapEntryPtr) b; + + return entrya->start > entryb->start; +} + /* Parse the XML definition for user namespace id map. * * idmap has the form of @@ -10088,6 +10096,18 @@ virDomainIdmapDefParseXML(const xmlNodePtr *node, virXPathUInt("string(./@target)", ctxt, &idmap[i].target); virXPathUInt("string(./@count)", ctxt, &idmap[i].count); } + + qsort(idmap, num, sizeof(idmap[0]), virDomainIdMapEntrySort); + + if (idmap[0].start != 0) { + /* Root user of container hasn't been mapped to any user of host, + * return error. */ + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("You must map the root user of container")); + VIR_FREE(idmap); + idmap = NULL; + } + error: ctxt->node = save_ctxt; return idmap; -- 1.8.1.4

On Thu, May 23, 2013 at 12:06:47PM +0800, Gao feng wrote:
Make sure the mapping line contains the root user of container is the first element of idmap array. So we can get the real user id on host for the container easily.
This patch also check the map information, User must map the root user of container to any user of host.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/conf/domain_conf.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a3c5c84..e271f5a 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -10061,6 +10061,14 @@ cleanup: }
+static int virDomainIdMapEntrySort(const void *a, const void *b) +{ + const virDomainIdMapEntryPtr entrya = (const virDomainIdMapEntryPtr) a; + const virDomainIdMapEntryPtr entryb = (const virDomainIdMapEntryPtr) b; + + return entrya->start > entryb->start;
That isn't compliant with qsort requirements. You need to return -ve, 0 or +ve. So if (entrya->start > entryb->start) return 1; else if (entrya->start < entryb->start) return -1; else return 0;
@@ -10088,6 +10096,18 @@ virDomainIdmapDefParseXML(const xmlNodePtr *node, virXPathUInt("string(./@target)", ctxt, &idmap[i].target); virXPathUInt("string(./@count)", ctxt, &idmap[i].count); } + + qsort(idmap, num, sizeof(idmap[0]), virDomainIdMapEntrySort); + + if (idmap[0].start != 0) { + /* Root user of container hasn't been mapped to any user of host, + * return error. */ + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("You must map the root user of container")); + VIR_FREE(idmap); + idmap = NULL; + } +
Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

This patch introduces new helper function virLXCControllerSetupUserns, in this function, we set the files uid_map and gid_map of the init task of container. lxcContainerSetID is used for creating cred for tasks running in container. Since after setuid/setgid, we may be a new user. This patch calls lxcContainerSetUserns at first to make sure the new created files belong to right user. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_container.c | 63 ++++++++++++++++++++++++++++++++-------------- src/lxc/lxc_controller.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 110 insertions(+), 18 deletions(-) diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c index 618252c..52fcf39 100644 --- a/src/lxc/lxc_container.c +++ b/src/lxc/lxc_container.c @@ -335,6 +335,30 @@ int lxcContainerWaitForContinue(int control) /** + * lxcContainerSetID: + * + * This function calls setuid and setgid to create proper + * cred for tasks running in container. + * + * Returns 0 on success or -1 in case of error + */ +static int lxcContainerSetID(virDomainDefPtr def) +{ + /* Only call virSetUIDGID when user namespace is enabled + * for this container. And user namespace is only enabled + * when nuidmap&ngidmap is not zero */ + + if (def->idmap.nuidmap && virSetUIDGID(0, 0) < 0) { + virReportSystemError(errno, "%s", + _("setuid or setgid failed")); + return -1; + } + + return 0; +} + + +/** * lxcContainerRenameAndEnableInterfaces: * @nveths: number of interfaces * @veths: interface names @@ -1937,6 +1961,27 @@ static int lxcContainerChild(void *data) cmd = lxcContainerBuildInitCmd(vmDef); virCommandWriteArgLog(cmd, 1); + if (lxcContainerResolveSymlinks(vmDef) < 0) + goto cleanup; + + if (!virFileExists(vmDef->os.init)) { + virReportSystemError(errno, + _("cannot find init path '%s' relative to container root"), + vmDef->os.init); + goto cleanup; + } + + /* Wait for interface devices to show up */ + if (lxcContainerWaitForContinue(argv->monitor) < 0) { + virReportSystemError(errno, "%s", + _("Failed to read the container continue message")); + goto cleanup; + } + VIR_DEBUG("Received container continue message"); + + if (lxcContainerSetID(vmDef) < 0) + goto cleanup; + root = virDomainGetRootFilesystem(vmDef); if (argv->nttyPaths) { @@ -1962,29 +2007,11 @@ static int lxcContainerChild(void *data) goto cleanup; } - if (lxcContainerResolveSymlinks(vmDef) < 0) - goto cleanup; - if (lxcContainerSetupPivotRoot(vmDef, root, argv->ttyPaths, argv->nttyPaths, argv->securityDriver) < 0) goto cleanup; - if (!virFileExists(vmDef->os.init)) { - virReportSystemError(errno, - _("cannot find init path '%s' relative to container root"), - vmDef->os.init); - goto cleanup; - } - - /* Wait for interface devices to show up */ - if (lxcContainerWaitForContinue(argv->monitor) < 0) { - virReportSystemError(errno, "%s", - _("Failed to read the container continue message")); - goto cleanup; - } - VIR_DEBUG("Received container continue message"); - /* rename and enable interfaces */ if (lxcContainerRenameAndEnableInterfaces(!!(vmDef->features & (1 << VIR_DOMAIN_FEATURE_PRIVNET)), diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index dfe686b..0a2e3ac 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1122,6 +1122,68 @@ cleanup2: } +static int +virLXCControllerSetupUsernsMap(virDomainIdMapEntryPtr map, + char *path, + int num) +{ + virBuffer map_value = VIR_BUFFER_INITIALIZER; + int i, ret = -1; + + for (i = 0; i < num; i++) + virBufferAsprintf(&map_value, "%u %u %u\n", + map[i].start, map[i].target, map[i].count); + + if (virFileWriteStr(path, virBufferCurrentContent(&map_value), 0) < 0) { + virReportSystemError(errno, _("unable write to %s"), path); + goto cleanup; + } + ret = 0; + +cleanup: + virBufferFreeAndReset(&map_value); + return ret; +} + +/** + * virLXCControllerSetupUserns + * + * Set proc files for user namespace + * + * Returns 0 on success or -1 in case of error + */ +static int virLXCControllerSetupUserns(virLXCControllerPtr ctrl) +{ + char *uid_map = NULL; + char *gid_map = NULL; + int ret = -1; + + /* User namespace is disabled for container */ + if (ctrl->def->idmap.nuidmap == 0) + return 0; + + if (virAsprintf(&uid_map, "/proc/%d/uid_map", ctrl->initpid) < 0) + goto cleanup; + + if (virLXCControllerSetupUsernsMap(ctrl->def->idmap.uidmap, uid_map, + ctrl->def->idmap.nuidmap) < 0) + goto cleanup; + + if (virAsprintf(&gid_map, "/proc/%d/gid_map", ctrl->initpid) < 0) + goto cleanup; + + if (virLXCControllerSetupUsernsMap(ctrl->def->idmap.gidmap, gid_map, + ctrl->def->idmap.ngidmap) < 0) + goto cleanup; + + ret = 0; +cleanup: + VIR_FREE(uid_map); + VIR_FREE(gid_map); + return ret; +} + + /** * virLXCControllerMoveInterfaces @@ -1544,6 +1606,9 @@ virLXCControllerRun(virLXCControllerPtr ctrl) VIR_FORCE_CLOSE(control[1]); VIR_FORCE_CLOSE(containerhandshake[1]); + if (virLXCControllerSetupUserns(ctrl) < 0) + goto cleanup; + if (virLXCControllerMoveInterfaces(ctrl) < 0) goto cleanup; -- 1.8.1.4

On Thu, May 23, 2013 at 12:06:48PM +0800, Gao feng wrote:
This patch introduces new helper function virLXCControllerSetupUserns, in this function, we set the files uid_map and gid_map of the init task of container.
lxcContainerSetID is used for creating cred for tasks running in container. Since after setuid/setgid, we may be a new user. This patch calls lxcContainerSetUserns at first to make sure the new created files belong to right user.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_container.c | 63 ++++++++++++++++++++++++++++++++-------------- src/lxc/lxc_controller.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 110 insertions(+), 18 deletions(-)
diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c index 618252c..52fcf39 100644 --- a/src/lxc/lxc_container.c +++ b/src/lxc/lxc_container.c @@ -335,6 +335,30 @@ int lxcContainerWaitForContinue(int control)
/** + * lxcContainerSetID: + * + * This function calls setuid and setgid to create proper + * cred for tasks running in container. + * + * Returns 0 on success or -1 in case of error + */ +static int lxcContainerSetID(virDomainDefPtr def) +{ + /* Only call virSetUIDGID when user namespace is enabled + * for this container. And user namespace is only enabled + * when nuidmap&ngidmap is not zero */ + + if (def->idmap.nuidmap && virSetUIDGID(0, 0) < 0) { + virReportSystemError(errno, "%s", + _("setuid or setgid failed")); + return -1; + } + + return 0; +} + + +/** * lxcContainerRenameAndEnableInterfaces: * @nveths: number of interfaces * @veths: interface names @@ -1937,6 +1961,27 @@ static int lxcContainerChild(void *data) cmd = lxcContainerBuildInitCmd(vmDef); virCommandWriteArgLog(cmd, 1);
+ if (lxcContainerResolveSymlinks(vmDef) < 0) + goto cleanup; + + if (!virFileExists(vmDef->os.init)) { + virReportSystemError(errno, + _("cannot find init path '%s' relative to container root"), + vmDef->os.init); + goto cleanup; + } + + /* Wait for interface devices to show up */ + if (lxcContainerWaitForContinue(argv->monitor) < 0) { + virReportSystemError(errno, "%s", + _("Failed to read the container continue message")); + goto cleanup; + } + VIR_DEBUG("Received container continue message"); + + if (lxcContainerSetID(vmDef) < 0) + goto cleanup; + root = virDomainGetRootFilesystem(vmDef);
if (argv->nttyPaths) { @@ -1962,29 +2007,11 @@ static int lxcContainerChild(void *data) goto cleanup; }
- if (lxcContainerResolveSymlinks(vmDef) < 0) - goto cleanup; - if (lxcContainerSetupPivotRoot(vmDef, root, argv->ttyPaths, argv->nttyPaths, argv->securityDriver) < 0) goto cleanup;
- if (!virFileExists(vmDef->os.init)) { - virReportSystemError(errno, - _("cannot find init path '%s' relative to container root"), - vmDef->os.init); - goto cleanup; - } - - /* Wait for interface devices to show up */ - if (lxcContainerWaitForContinue(argv->monitor) < 0) { - virReportSystemError(errno, "%s", - _("Failed to read the container continue message")); - goto cleanup; - } - VIR_DEBUG("Received container continue message");
Why did you need to move these functions before the lxcContainerSetID call ?
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index dfe686b..0a2e3ac 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1122,6 +1122,68 @@ cleanup2: }
+static int +virLXCControllerSetupUsernsMap(virDomainIdMapEntryPtr map, + char *path, + int num) +{ + virBuffer map_value = VIR_BUFFER_INITIALIZER; + int i, ret = -1; + + for (i = 0; i < num; i++) + virBufferAsprintf(&map_value, "%u %u %u\n", + map[i].start, map[i].target, map[i].count); +
You need to check virBufferError() to see if OOM has occured here.
+ if (virFileWriteStr(path, virBufferCurrentContent(&map_value), 0) < 0) { + virReportSystemError(errno, _("unable write to %s"), path); + goto cleanup; + } + ret = 0; + +cleanup: + virBufferFreeAndReset(&map_value); + return ret; +}
Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 06/04/2013 09:35 PM, Daniel P. Berrange wrote:
On Thu, May 23, 2013 at 12:06:48PM +0800, Gao feng wrote:
This patch introduces new helper function virLXCControllerSetupUserns, in this function, we set the files uid_map and gid_map of the init task of container.
lxcContainerSetID is used for creating cred for tasks running in container. Since after setuid/setgid, we may be a new user. This patch calls lxcContainerSetUserns at first to make sure the new created files belong to right user.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_container.c | 63 ++++++++++++++++++++++++++++++++-------------- src/lxc/lxc_controller.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 110 insertions(+), 18 deletions(-)
diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c index 618252c..52fcf39 100644 --- a/src/lxc/lxc_container.c +++ b/src/lxc/lxc_container.c @@ -335,6 +335,30 @@ int lxcContainerWaitForContinue(int control)
/** + * lxcContainerSetID: + * + * This function calls setuid and setgid to create proper + * cred for tasks running in container. + * + * Returns 0 on success or -1 in case of error + */ +static int lxcContainerSetID(virDomainDefPtr def) +{ + /* Only call virSetUIDGID when user namespace is enabled + * for this container. And user namespace is only enabled + * when nuidmap&ngidmap is not zero */ + + if (def->idmap.nuidmap && virSetUIDGID(0, 0) < 0) { + virReportSystemError(errno, "%s", + _("setuid or setgid failed")); + return -1; + } + + return 0; +} + + +/** * lxcContainerRenameAndEnableInterfaces: * @nveths: number of interfaces * @veths: interface names @@ -1937,6 +1961,27 @@ static int lxcContainerChild(void *data) cmd = lxcContainerBuildInitCmd(vmDef); virCommandWriteArgLog(cmd, 1);
+ if (lxcContainerResolveSymlinks(vmDef) < 0) + goto cleanup; + + if (!virFileExists(vmDef->os.init)) { + virReportSystemError(errno, + _("cannot find init path '%s' relative to container root"), + vmDef->os.init); + goto cleanup; + } + + /* Wait for interface devices to show up */ + if (lxcContainerWaitForContinue(argv->monitor) < 0) { + virReportSystemError(errno, "%s", + _("Failed to read the container continue message")); + goto cleanup; + } + VIR_DEBUG("Received container continue message"); + + if (lxcContainerSetID(vmDef) < 0) + goto cleanup; + root = virDomainGetRootFilesystem(vmDef);
if (argv->nttyPaths) { @@ -1962,29 +2007,11 @@ static int lxcContainerChild(void *data) goto cleanup; }
- if (lxcContainerResolveSymlinks(vmDef) < 0) - goto cleanup; - if (lxcContainerSetupPivotRoot(vmDef, root, argv->ttyPaths, argv->nttyPaths, argv->securityDriver) < 0) goto cleanup;
- if (!virFileExists(vmDef->os.init)) { - virReportSystemError(errno, - _("cannot find init path '%s' relative to container root"), - vmDef->os.init); - goto cleanup; - } - - /* Wait for interface devices to show up */ - if (lxcContainerWaitForContinue(argv->monitor) < 0) { - virReportSystemError(errno, "%s", - _("Failed to read the container continue message")); - goto cleanup; - } - VIR_DEBUG("Received container continue message");
Why did you need to move these functions before the lxcContainerSetID call ?
lxcContainerSetID is only meaningful after we set idmap for user namespace, Actually I just move lxcContainerSetID to the top of lxcContainerChild. It's better to call setuid/setgid as early as we can.
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index dfe686b..0a2e3ac 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1122,6 +1122,68 @@ cleanup2: }
+static int +virLXCControllerSetupUsernsMap(virDomainIdMapEntryPtr map, + char *path, + int num) +{ + virBuffer map_value = VIR_BUFFER_INITIALIZER; + int i, ret = -1; + + for (i = 0; i < num; i++) + virBufferAsprintf(&map_value, "%u %u %u\n", + map[i].start, map[i].target, map[i].count); +
You need to check
virBufferError()
to see if OOM has occured here.
sure, I will wait for your related patches being merged and then update this patchset. Thanks Gao
+ if (virFileWriteStr(path, virBufferCurrentContent(&map_value), 0) < 0) { + virReportSystemError(errno, _("unable write to %s"), path); + goto cleanup; + } + ret = 0; + +cleanup: + virBufferFreeAndReset(&map_value); + return ret; +}
Daniel

On Wed, Jun 05, 2013 at 02:43:41PM +0800, Gao feng wrote:
On 06/04/2013 09:35 PM, Daniel P. Berrange wrote:
On Thu, May 23, 2013 at 12:06:48PM +0800, Gao feng wrote:
This patch introduces new helper function virLXCControllerSetupUserns, in this function, we set the files uid_map and gid_map of the init task of container.
lxcContainerSetID is used for creating cred for tasks running in container. Since after setuid/setgid, we may be a new user. This patch calls lxcContainerSetUserns at first to make sure the new created files belong to right user.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
sure, I will wait for your related patches being merged and then update this patchset.
Both my patches are pushed now. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

user namespace doesn't allow to create devices in uninit userns. We should create devices on host side. We first mount tmpfs on dev directroy under state dir of container. then create devices under this dev dir. Finally in container, mount the dev directroy created on host to the /dev/ directroy of container. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_container.c | 96 +++++++++++++--------------------- src/lxc/lxc_controller.c | 130 +++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 166 insertions(+), 60 deletions(-) diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c index 52fcf39..c647f8e 100644 --- a/src/lxc/lxc_container.c +++ b/src/lxc/lxc_container.c @@ -683,7 +683,7 @@ err: } -static int lxcContainerMountBasicFS(char *sec_mount_options) +static int lxcContainerMountBasicFS(void) { const struct { const char *src; @@ -711,8 +711,6 @@ static int lxcContainerMountBasicFS(char *sec_mount_options) int i, rc = -1; char *opts = NULL; - VIR_DEBUG("Mounting basic filesystems sec_mount_options=%s", sec_mount_options); - for (i = 0; i < ARRAY_CARDINALITY(mnts); i++) { const char *srcpath = NULL; @@ -750,26 +748,6 @@ static int lxcContainerMountBasicFS(char *sec_mount_options) } } - /* - * tmpfs is limited to 64kb, since we only have device nodes in there - * and don't want to DOS the entire OS RAM usage - */ - - if (virAsprintf(&opts, - "mode=755,size=65536%s", sec_mount_options) < 0) { - virReportOOMError(); - goto cleanup; - } - - VIR_DEBUG("Mount devfs on /dev type=tmpfs flags=%x, opts=%s", - MS_NOSUID, opts); - if (mount("devfs", "/dev", "tmpfs", MS_NOSUID, opts) < 0) { - virReportSystemError(errno, - _("Failed to mount %s on %s type %s (%s)"), - "devfs", "/dev", "tmpfs", opts); - goto cleanup; - } - rc = 0; cleanup: @@ -811,6 +789,33 @@ static int lxcContainerMountProcFuse(virDomainDefPtr def ATTRIBUTE_UNUSED, } #endif +static int lxcContainerMountFSDev(virDomainDefPtr def, + const char *stateDir) +{ + int ret; + char *path = NULL; + + VIR_DEBUG("Mount /dev/ stateDir=%s", stateDir); + + if ((ret = virAsprintf(&path, + "/.oldroot/%s/%s.dev", + stateDir, + def->name)) < 0) + return ret; + + VIR_DEBUG("Trying to move %s to /dev", path); + + if ((ret = mount(path, "/dev", + NULL, MS_MOVE, NULL)) < 0) { + virReportSystemError(errno, + _("Failed to mount %s on /dev"), + path); + } + + VIR_FREE(path); + return ret; +} + static int lxcContainerMountFSDevPTS(virDomainDefPtr def, const char *stateDir) { @@ -847,22 +852,10 @@ cleanup: return ret; } -static int lxcContainerPopulateDevices(char **ttyPaths, size_t nttyPaths) +static int lxcContainerSetupDevices(char **ttyPaths, size_t nttyPaths) { size_t i; const struct { - int maj; - int min; - mode_t mode; - const char *path; - } devs[] = { - { LXC_DEV_MAJ_MEMORY, LXC_DEV_MIN_NULL, 0666, "/dev/null" }, - { LXC_DEV_MAJ_MEMORY, LXC_DEV_MIN_ZERO, 0666, "/dev/zero" }, - { LXC_DEV_MAJ_MEMORY, LXC_DEV_MIN_FULL, 0666, "/dev/full" }, - { LXC_DEV_MAJ_MEMORY, LXC_DEV_MIN_RANDOM, 0666, "/dev/random" }, - { LXC_DEV_MAJ_MEMORY, LXC_DEV_MIN_URANDOM, 0666, "/dev/urandom" }, - }; - const struct { const char *src; const char *dst; } links[] = { @@ -872,18 +865,6 @@ static int lxcContainerPopulateDevices(char **ttyPaths, size_t nttyPaths) { "/proc/self/fd", "/dev/fd" }, }; - /* Populate /dev/ with a few important bits */ - for (i = 0; i < ARRAY_CARDINALITY(devs); i++) { - dev_t dev = makedev(devs[i].maj, devs[i].min); - if (mknod(devs[i].path, S_IFCHR, dev) < 0 || - chmod(devs[i].path, devs[i].mode)) { - virReportSystemError(errno, - _("Failed to make device %s"), - devs[i].path); - return -1; - } - } - for (i = 0; i < ARRAY_CARDINALITY(links); i++) { if (symlink(links[i].src, links[i].dst) < 0) { virReportSystemError(errno, @@ -903,15 +884,6 @@ static int lxcContainerPopulateDevices(char **ttyPaths, size_t nttyPaths) _("Failed to bind /dev/pts/ptmx on to /dev/ptmx")); return -1; } - } else { - /* Legacy devpts, so we need to just use shared one */ - dev_t dev = makedev(LXC_DEV_MAJ_TTY, LXC_DEV_MIN_PTMX); - if (mknod("/dev/ptmx", S_IFCHR, dev) < 0 || - chmod("/dev/ptmx", 0666)) { - virReportSystemError(errno, "%s", - _("Failed to make device /dev/ptmx")); - return -1; - } } for (i = 0; i < nttyPaths; i++) { @@ -1813,7 +1785,7 @@ static int lxcContainerSetupPivotRoot(virDomainDefPtr vmDef, goto cleanup; /* Mounts the core /proc, /sys, etc filesystems */ - if (lxcContainerMountBasicFS(sec_mount_options) < 0) + if (lxcContainerMountBasicFS() < 0) goto cleanup; /* Mounts /proc/meminfo etc sysinfo */ @@ -1825,12 +1797,16 @@ static int lxcContainerSetupPivotRoot(virDomainDefPtr vmDef, if (virCgroupIsolateMount(cgroup, "/.oldroot/", sec_mount_options) < 0) goto cleanup; + /* Mounts /dev */ + if (lxcContainerMountFSDev(vmDef, stateDir) < 0) + goto cleanup; + /* Mounts /dev/pts */ if (lxcContainerMountFSDevPTS(vmDef, stateDir) < 0) goto cleanup; - /* Populates device nodes in /dev/ */ - if (lxcContainerPopulateDevices(ttyPaths, nttyPaths) < 0) + /* Setup device nodes in /dev/ */ + if (lxcContainerSetupDevices(ttyPaths, nttyPaths) < 0) goto cleanup; /* Sets up any non-root mounts from guest config */ diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index 0a2e3ac..e9808f3 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1184,6 +1184,132 @@ cleanup: } +static int +virLXCControllerSetupDev(virLXCControllerPtr ctrl) +{ + char *mount_options = NULL; + char *opts = NULL; + char *dev = NULL; + int ret = -1; + + VIR_DEBUG("Setting up /dev/ for container"); + + mount_options = virSecurityManagerGetMountOptions(ctrl->securityManager, + ctrl->def); + + if (virAsprintf(&dev, "/%s/%s.dev", + LXC_STATE_DIR, ctrl->def->name) < 0) { + virReportOOMError(); + goto cleanup; + } + + if (virFileMakePath(dev) < 0) { + virReportSystemError(errno, + _("Failed to make path %s"), + dev); + goto cleanup; + } + + /* + * tmpfs is limited to 64kb, since we only have device nodes in there + * and don't want to DOS the entire OS RAM usage + */ + + if (virAsprintf(&opts, + "mode=755,size=65536%s", mount_options) < 0) { + virReportOOMError(); + goto cleanup; + } + + + VIR_DEBUG("Mount devfs on %s type=tmpfs flags=%x, opts=%s", + dev, MS_NOSUID, opts); + if (mount("devfs", dev, "tmpfs", MS_NOSUID, opts) < 0) { + virReportSystemError(errno, + _("Failed to mount devfs (%s) on %s"), + opts, dev); + goto cleanup; + } + + ret = 0; + +cleanup: + VIR_FREE(opts); + VIR_FREE(mount_options); + VIR_FREE(dev); + return ret; +} + + +static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) +{ + size_t i; + int ret = -1; + char *ptmx = NULL; + char *path = NULL; + const struct { + int maj; + int min; + mode_t mode; + const char *path; + } devs[] = { + { LXC_DEV_MAJ_MEMORY, LXC_DEV_MIN_NULL, 0666, "/null" }, + { LXC_DEV_MAJ_MEMORY, LXC_DEV_MIN_ZERO, 0666, "/zero" }, + { LXC_DEV_MAJ_MEMORY, LXC_DEV_MIN_FULL, 0666, "/full" }, + { LXC_DEV_MAJ_MEMORY, LXC_DEV_MIN_RANDOM, 0666, "/random" }, + { LXC_DEV_MAJ_MEMORY, LXC_DEV_MIN_URANDOM, 0666, "/urandom" }, + }; + + if (virLXCControllerSetupDev(ctrl) < 0) + goto out; + + /* Populate /dev/ with a few important bits */ + for (i = 0 ; i < ARRAY_CARDINALITY(devs) ; i++) { + if (virAsprintf(&path, "/%s/%s.dev/%s", + LXC_STATE_DIR, ctrl->def->name, + devs[i].path) < 0) { + virReportOOMError(); + goto out; + } + + dev_t dev = makedev(devs[i].maj, devs[i].min); + if (mknod(path, S_IFCHR, dev) < 0 || + chmod(path, devs[i].mode)) { + virReportSystemError(errno, + _("Failed to make device %s"), + devs[i].path); + goto out; + } + VIR_FREE(path); + } + + if (virAsprintf(&ptmx, "/%s/%s.devpts/ptmx", + LXC_STATE_DIR, ctrl->def->name) < 0) { + virReportOOMError(); + goto out; + } + + if (access(ptmx, W_OK)) { + if (virAsprintf(&path, "/%s/%s.dev/ptmx", + LXC_STATE_DIR, ctrl->def->name)) { + virReportOOMError(); + goto out; + } + /* Legacy devpts, so we need to just use shared one */ + dev_t dev = makedev(LXC_DEV_MAJ_TTY, LXC_DEV_MIN_PTMX); + if (mknod(path, S_IFCHR, dev) < 0 || + chmod(path, 0666)) { + virReportSystemError(errno, _("Failed to make device %s"), path); + goto out; + } + } + + ret = 0; +out: + VIR_FREE(ptmx); + VIR_FREE(path); + return ret; +} /** * virLXCControllerMoveInterfaces @@ -1585,6 +1711,9 @@ virLXCControllerRun(virLXCControllerPtr ctrl) if (virLXCControllerSetupDevPTS(ctrl) < 0) goto cleanup; + if (virLXCControllerPopulateDevices(ctrl) < 0) + goto cleanup; + if (virLXCControllerSetupFuse(ctrl) < 0) goto cleanup; @@ -1609,6 +1738,7 @@ virLXCControllerRun(virLXCControllerPtr ctrl) if (virLXCControllerSetupUserns(ctrl) < 0) goto cleanup; + if (virLXCControllerMoveInterfaces(ctrl) < 0) goto cleanup; -- 1.8.1.4

On Thu, May 23, 2013 at 12:06:49PM +0800, Gao feng wrote:
user namespace doesn't allow to create devices in uninit userns. We should create devices on host side.
We first mount tmpfs on dev directroy under state dir of container. then create devices under this dev dir.
Finally in container, mount the dev directroy created on host to the /dev/ directroy of container.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_container.c | 96 +++++++++++++--------------------- src/lxc/lxc_controller.c | 130 +++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 166 insertions(+), 60 deletions(-)
@@ -903,15 +884,6 @@ static int lxcContainerPopulateDevices(char **ttyPaths, size_t nttyPaths) _("Failed to bind /dev/pts/ptmx on to /dev/ptmx")); return -1; } - } else { - /* Legacy devpts, so we need to just use shared one */ - dev_t dev = makedev(LXC_DEV_MAJ_TTY, LXC_DEV_MIN_PTMX); - if (mknod("/dev/ptmx", S_IFCHR, dev) < 0 || - chmod("/dev/ptmx", 0666)) { - virReportSystemError(errno, "%s", - _("Failed to make device /dev/ptmx")); - return -1; - } }
Opps, that code should have been deleted already. I've just sent a patch to kill this legacy code....
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index 0a2e3ac..e9808f3 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c +static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) +{ + size_t i; + int ret = -1; + char *ptmx = NULL; + char *path = NULL; + const struct { + int maj; + int min; + mode_t mode; + const char *path; + } devs[] = { + { LXC_DEV_MAJ_MEMORY, LXC_DEV_MIN_NULL, 0666, "/null" }, + { LXC_DEV_MAJ_MEMORY, LXC_DEV_MIN_ZERO, 0666, "/zero" }, + { LXC_DEV_MAJ_MEMORY, LXC_DEV_MIN_FULL, 0666, "/full" }, + { LXC_DEV_MAJ_MEMORY, LXC_DEV_MIN_RANDOM, 0666, "/random" }, + { LXC_DEV_MAJ_MEMORY, LXC_DEV_MIN_URANDOM, 0666, "/urandom" }, + }; + + if (virLXCControllerSetupDev(ctrl) < 0) + goto out; + + /* Populate /dev/ with a few important bits */ + for (i = 0 ; i < ARRAY_CARDINALITY(devs) ; i++) { + if (virAsprintf(&path, "/%s/%s.dev/%s", + LXC_STATE_DIR, ctrl->def->name, + devs[i].path) < 0) { + virReportOOMError(); + goto out; + } + + dev_t dev = makedev(devs[i].maj, devs[i].min); + if (mknod(path, S_IFCHR, dev) < 0 || + chmod(path, devs[i].mode)) { + virReportSystemError(errno, + _("Failed to make device %s"), + devs[i].path); + goto out; + } + VIR_FREE(path); + } + + if (virAsprintf(&ptmx, "/%s/%s.devpts/ptmx", + LXC_STATE_DIR, ctrl->def->name) < 0) { + virReportOOMError(); + goto out; + } + + if (access(ptmx, W_OK)) { + if (virAsprintf(&path, "/%s/%s.dev/ptmx", + LXC_STATE_DIR, ctrl->def->name)) { + virReportOOMError(); + goto out; + } + /* Legacy devpts, so we need to just use shared one */ + dev_t dev = makedev(LXC_DEV_MAJ_TTY, LXC_DEV_MIN_PTMX); + if (mknod(path, S_IFCHR, dev) < 0 || + chmod(path, 0666)) { + virReportSystemError(errno, _("Failed to make device %s"), path); + goto out; + }
So you can avoid this legacy code here too. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Make codes clearer and reduce some virAsprintf. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 44 +++++++++++++++++++------------------------- 1 file changed, 19 insertions(+), 25 deletions(-) diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index e9808f3..7d10660 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1245,7 +1245,6 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) { size_t i; int ret = -1; - char *ptmx = NULL; char *path = NULL; const struct { int maj; @@ -1283,30 +1282,8 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) VIR_FREE(path); } - if (virAsprintf(&ptmx, "/%s/%s.devpts/ptmx", - LXC_STATE_DIR, ctrl->def->name) < 0) { - virReportOOMError(); - goto out; - } - - if (access(ptmx, W_OK)) { - if (virAsprintf(&path, "/%s/%s.dev/ptmx", - LXC_STATE_DIR, ctrl->def->name)) { - virReportOOMError(); - goto out; - } - /* Legacy devpts, so we need to just use shared one */ - dev_t dev = makedev(LXC_DEV_MAJ_TTY, LXC_DEV_MIN_PTMX); - if (mknod(path, S_IFCHR, dev) < 0 || - chmod(path, 0666)) { - virReportSystemError(errno, _("Failed to make device %s"), path); - goto out; - } - } - ret = 0; out: - VIR_FREE(ptmx); VIR_FREE(path); return ret; } @@ -1492,6 +1469,7 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) char *mount_options = NULL; char *opts = NULL; char *devpts = NULL; + char *path = NULL; int ret = -1; VIR_DEBUG("Setting up private /dev/pts"); @@ -1537,9 +1515,25 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) goto cleanup; } + if (access(ctrl->devptmx, W_OK) < 0) { + if (virAsprintf(&path, "/%s/%s.dev/ptmx", + LXC_STATE_DIR, ctrl->def->name)) { + virReportOOMError(); + goto cleanup; + } + /* Legacy devpts, so we need to just use shared one */ + dev_t dev = makedev(LXC_DEV_MAJ_TTY, LXC_DEV_MIN_PTMX); + if (mknod(path, S_IFCHR, dev) < 0 || + chmod(path, 0666)) { + virReportSystemError(errno, _("Failed to make device %s"), path); + goto cleanup; + } + } + ret = 0; cleanup: + VIR_FREE(path); VIR_FREE(opts); VIR_FREE(devpts); VIR_FREE(mount_options); @@ -1708,10 +1702,10 @@ virLXCControllerRun(virLXCControllerPtr ctrl) if (virLXCControllerSetupResourceLimits(ctrl, cgroup) < 0) goto cleanup; - if (virLXCControllerSetupDevPTS(ctrl) < 0) + if (virLXCControllerPopulateDevices(ctrl) < 0) goto cleanup; - if (virLXCControllerPopulateDevices(ctrl) < 0) + if (virLXCControllerSetupDevPTS(ctrl) < 0) goto cleanup; if (virLXCControllerSetupFuse(ctrl) < 0) -- 1.8.1.4

On Thu, May 23, 2013 at 12:06:50PM +0800, Gao feng wrote:
Make codes clearer and reduce some virAsprintf.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 44 +++++++++++++++++++------------------------- 1 file changed, 19 insertions(+), 25 deletions(-)
This patch will probably disappear since I deleted the legacy devpts code. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Otherwise we will fail to mount the meminfo file. This patch also allows any users to access the fuse mount point. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_fuse.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/src/lxc/lxc_fuse.c b/src/lxc/lxc_fuse.c index b6df99c..ddb737c 100644 --- a/src/lxc/lxc_fuse.c +++ b/src/lxc/lxc_fuse.c @@ -49,6 +49,9 @@ static int lxcProcGetattr(const char *path, struct stat *stbuf) char *mempath = NULL; struct stat sb; + struct fuse_context *context = fuse_get_context(); + virDomainDefPtr def = (virDomainDefPtr)context->private_data; + memset(stbuf, 0, sizeof(struct stat)); if (virAsprintf(&mempath, "/proc/%s", path) < 0) { virReportOOMError(); @@ -66,6 +69,8 @@ static int lxcProcGetattr(const char *path, struct stat *stbuf) goto cleanup; } + stbuf->st_uid = def->idmap.uidmap ? def->idmap.uidmap[0].target : 0; + stbuf->st_gid = def->idmap.gidmap ? def->idmap.gidmap[0].target : 0; stbuf->st_mode = sb.st_mode; stbuf->st_nlink = 1; stbuf->st_blksize = sb.st_blksize; @@ -307,6 +312,7 @@ int lxcSetupFuse(virLXCFusePtr *f, virDomainDefPtr def) /* process name is libvirt_lxc */ if (fuse_opt_add_arg(&args, "libvirt_lxc") == -1 || fuse_opt_add_arg(&args, "-odirect_io") == -1 || + fuse_opt_add_arg(&args, "-oallow_other") == -1 || fuse_opt_add_arg(&args, "-ofsname=libvirt") == -1) goto cleanup1; -- 1.8.1.4

On Thu, May 23, 2013 at 12:06:51PM +0800, Gao feng wrote:
Otherwise we will fail to mount the meminfo file. This patch also allows any users to access the fuse mount point.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_fuse.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/src/lxc/lxc_fuse.c b/src/lxc/lxc_fuse.c index b6df99c..ddb737c 100644 --- a/src/lxc/lxc_fuse.c +++ b/src/lxc/lxc_fuse.c @@ -49,6 +49,9 @@ static int lxcProcGetattr(const char *path, struct stat *stbuf) char *mempath = NULL; struct stat sb;
+ struct fuse_context *context = fuse_get_context(); + virDomainDefPtr def = (virDomainDefPtr)context->private_data; + memset(stbuf, 0, sizeof(struct stat)); if (virAsprintf(&mempath, "/proc/%s", path) < 0) { virReportOOMError(); @@ -66,6 +69,8 @@ static int lxcProcGetattr(const char *path, struct stat *stbuf) goto cleanup; }
+ stbuf->st_uid = def->idmap.uidmap ? def->idmap.uidmap[0].target : 0; + stbuf->st_gid = def->idmap.gidmap ? def->idmap.gidmap[0].target : 0; stbuf->st_mode = sb.st_mode; stbuf->st_nlink = 1; stbuf->st_blksize = sb.st_blksize; @@ -307,6 +312,7 @@ int lxcSetupFuse(virLXCFusePtr *f, virDomainDefPtr def) /* process name is libvirt_lxc */ if (fuse_opt_add_arg(&args, "libvirt_lxc") == -1 || fuse_opt_add_arg(&args, "-odirect_io") == -1 || + fuse_opt_add_arg(&args, "-oallow_other") == -1 || fuse_opt_add_arg(&args, "-ofsname=libvirt") == -1) goto cleanup1;
ACK Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Thu, May 23, 2013 at 12:06:51PM +0800, Gao feng wrote:
Otherwise we will fail to mount the meminfo file. This patch also allows any users to access the fuse mount point.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
We actually need the following fix even without userns work:
@@ -307,6 +312,7 @@ int lxcSetupFuse(virLXCFusePtr *f, virDomainDefPtr def) /* process name is libvirt_lxc */ if (fuse_opt_add_arg(&args, "libvirt_lxc") == -1 || fuse_opt_add_arg(&args, "-odirect_io") == -1 || + fuse_opt_add_arg(&args, "-oallow_other") == -1 || fuse_opt_add_arg(&args, "-ofsname=libvirt") == -1) goto cleanup1;
So I've sent a patch to add that separately. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Since these tty devices will be used by container, the owner of them should be the root user of container. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 43 +++++++++++++++++++++++++++++++++++++------ 1 file changed, 37 insertions(+), 6 deletions(-) diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index 7d10660..4660f25 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1380,13 +1380,14 @@ static int lxcSetPersonality(virDomainDefPtr def) * *TTYNAME. Heavily borrowed from glibc, but doesn't require that * devpts == "/dev/pts" */ static int -lxcCreateTty(char *ptmx, int *ttymaster, char **ttyName) +lxcCreateTty(virLXCControllerPtr ctrl, int *ttymaster, + char **ttyName, char **ttyHostPath) { int ret = -1; int ptyno; int unlock = 0; - if ((*ttymaster = open(ptmx, O_RDWR|O_NOCTTY|O_NONBLOCK)) < 0) + if ((*ttymaster = open(ctrl->devptmx, O_RDWR|O_NOCTTY|O_NONBLOCK)) < 0) goto cleanup; if (ioctl(*ttymaster, TIOCSPTLCK, &unlock) < 0) @@ -1407,6 +1408,13 @@ lxcCreateTty(char *ptmx, int *ttymaster, char **ttyName) goto cleanup; } + if (virAsprintf(ttyHostPath, "/%s/%s.devpts/%d", LXC_STATE_DIR, + ctrl->def->name, ptyno) < 0) { + virReportOOMError(); + errno = ENOMEM; + goto cleanup; + } + ret = 0; cleanup: @@ -1552,18 +1560,41 @@ virLXCControllerSetupConsoles(virLXCControllerPtr ctrl, char **containerTTYPaths) { size_t i; + int ret = -1; + uid_t uid = (uid_t)-1; + gid_t gid = (gid_t)-1; + char *ttyHostPath = NULL; + + if (ctrl->def->idmap.uidmap) { + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + } for (i = 0; i < ctrl->nconsoles; i++) { VIR_DEBUG("Opening tty on private %s", ctrl->devptmx); - if (lxcCreateTty(ctrl->devptmx, + if (lxcCreateTty(ctrl, &ctrl->consoles[i].contFd, - &containerTTYPaths[i]) < 0) { + &containerTTYPaths[i], &ttyHostPath) < 0) { virReportSystemError(errno, "%s", _("Failed to allocate tty")); - return -1; + goto out; } + + /* Change the owner of tty device to the root user of container */ + if (chown(ttyHostPath, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change owner of tty" + " %s to %u:%u"), + ttyHostPath, uid, gid); + goto out; + } + VIR_FREE(ttyHostPath); } - return 0; + + ret = 0; +out: + VIR_FREE(ttyHostPath); + return ret; } -- 1.8.1.4

Hi! ----- Ursprüngliche Mail -----
Since these tty devices will be used by container, the owner of them should be the root user of container.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 43 +++++++++++++++++++++++++++++++++++++------ 1 file changed, 37 insertions(+), 6 deletions(-)
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index 7d10660..4660f25 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1380,13 +1380,14 @@ static int lxcSetPersonality(virDomainDefPtr def) * *TTYNAME. Heavily borrowed from glibc, but doesn't require that * devpts == "/dev/pts" */ static int -lxcCreateTty(char *ptmx, int *ttymaster, char **ttyName) +lxcCreateTty(virLXCControllerPtr ctrl, int *ttymaster, + char **ttyName, char **ttyHostPath) { int ret = -1; int ptyno; int unlock = 0;
- if ((*ttymaster = open(ptmx, O_RDWR|O_NOCTTY|O_NONBLOCK)) < 0) + if ((*ttymaster = open(ctrl->devptmx, O_RDWR|O_NOCTTY|O_NONBLOCK)) < 0) goto cleanup;
if (ioctl(*ttymaster, TIOCSPTLCK, &unlock) < 0) @@ -1407,6 +1408,13 @@ lxcCreateTty(char *ptmx, int *ttymaster, char **ttyName) goto cleanup; }
+ if (virAsprintf(ttyHostPath, "/%s/%s.devpts/%d", LXC_STATE_DIR, + ctrl->def->name, ptyno) < 0) { + virReportOOMError(); + errno = ENOMEM; + goto cleanup; + } + ret = 0;
cleanup: @@ -1552,18 +1560,41 @@ virLXCControllerSetupConsoles(virLXCControllerPtr ctrl, char **containerTTYPaths) { size_t i; + int ret = -1; + uid_t uid = (uid_t)-1; + gid_t gid = (gid_t)-1; + char *ttyHostPath = NULL; + + if (ctrl->def->idmap.uidmap) { + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + }
for (i = 0; i < ctrl->nconsoles; i++) { VIR_DEBUG("Opening tty on private %s", ctrl->devptmx); - if (lxcCreateTty(ctrl->devptmx, + if (lxcCreateTty(ctrl, &ctrl->consoles[i].contFd, - &containerTTYPaths[i]) < 0) { + &containerTTYPaths[i], &ttyHostPath) < 0) { virReportSystemError(errno, "%s", _("Failed to allocate tty")); - return -1; + goto out; } + + /* Change the owner of tty device to the root user of container */ + if (chown(ttyHostPath, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change owner of tty" + " %s to %u:%u"), + ttyHostPath, uid, gid); + goto out; + } + VIR_FREE(ttyHostPath);
Why do you free ttyHostPath here? You already do it in the exit path.
} - return 0; + + ret = 0; +out: + VIR_FREE(ttyHostPath);
Double free?
+ return ret; }
Thanks, //richard

On 05/23/2013 01:52 PM, Richard RW. Weinberger wrote:
Hi!
----- Ursprüngliche Mail -----
Since these tty devices will be used by container, the owner of them should be the root user of container.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 43 +++++++++++++++++++++++++++++++++++++------ 1 file changed, 37 insertions(+), 6 deletions(-)
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index 7d10660..4660f25 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1380,13 +1380,14 @@ static int lxcSetPersonality(virDomainDefPtr def) * *TTYNAME. Heavily borrowed from glibc, but doesn't require that * devpts == "/dev/pts" */ static int -lxcCreateTty(char *ptmx, int *ttymaster, char **ttyName) +lxcCreateTty(virLXCControllerPtr ctrl, int *ttymaster, + char **ttyName, char **ttyHostPath) { int ret = -1; int ptyno; int unlock = 0;
- if ((*ttymaster = open(ptmx, O_RDWR|O_NOCTTY|O_NONBLOCK)) < 0) + if ((*ttymaster = open(ctrl->devptmx, O_RDWR|O_NOCTTY|O_NONBLOCK)) < 0) goto cleanup;
if (ioctl(*ttymaster, TIOCSPTLCK, &unlock) < 0) @@ -1407,6 +1408,13 @@ lxcCreateTty(char *ptmx, int *ttymaster, char **ttyName) goto cleanup; }
+ if (virAsprintf(ttyHostPath, "/%s/%s.devpts/%d", LXC_STATE_DIR, + ctrl->def->name, ptyno) < 0) { + virReportOOMError(); + errno = ENOMEM; + goto cleanup; + } + ret = 0;
cleanup: @@ -1552,18 +1560,41 @@ virLXCControllerSetupConsoles(virLXCControllerPtr ctrl, char **containerTTYPaths) { size_t i; + int ret = -1; + uid_t uid = (uid_t)-1; + gid_t gid = (gid_t)-1; + char *ttyHostPath = NULL; + + if (ctrl->def->idmap.uidmap) { + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + }
for (i = 0; i < ctrl->nconsoles; i++) { VIR_DEBUG("Opening tty on private %s", ctrl->devptmx); - if (lxcCreateTty(ctrl->devptmx, + if (lxcCreateTty(ctrl, &ctrl->consoles[i].contFd, - &containerTTYPaths[i]) < 0) { + &containerTTYPaths[i], &ttyHostPath) < 0) { virReportSystemError(errno, "%s", _("Failed to allocate tty")); - return -1; + goto out; } + + /* Change the owner of tty device to the root user of container */ + if (chown(ttyHostPath, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change owner of tty" + " %s to %u:%u"), + ttyHostPath, uid, gid); + goto out; + } + VIR_FREE(ttyHostPath);
Why do you free ttyHostPath here? You already do it in the exit path.
It has a cycle here, we need free the ttyHostPath since we allocate it in lxcCreateTty every cycle.
} - return 0; + + ret = 0; +out: + VIR_FREE(ttyHostPath);
Double free?
Don't worry about it, VIR_FREE does some extra jobs for us. ;)
+ return ret; }
Thanks, //richard

----- Ursprüngliche Mail -----
On 05/23/2013 01:52 PM, Richard RW. Weinberger wrote:
Hi!
----- Ursprüngliche Mail -----
Since these tty devices will be used by container, the owner of them should be the root user of container.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 43 +++++++++++++++++++++++++++++++++++++------ 1 file changed, 37 insertions(+), 6 deletions(-)
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index 7d10660..4660f25 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1380,13 +1380,14 @@ static int lxcSetPersonality(virDomainDefPtr def) * *TTYNAME. Heavily borrowed from glibc, but doesn't require that * devpts == "/dev/pts" */ static int -lxcCreateTty(char *ptmx, int *ttymaster, char **ttyName) +lxcCreateTty(virLXCControllerPtr ctrl, int *ttymaster, + char **ttyName, char **ttyHostPath) { int ret = -1; int ptyno; int unlock = 0;
- if ((*ttymaster = open(ptmx, O_RDWR|O_NOCTTY|O_NONBLOCK)) < 0) + if ((*ttymaster = open(ctrl->devptmx, O_RDWR|O_NOCTTY|O_NONBLOCK)) < 0) goto cleanup;
if (ioctl(*ttymaster, TIOCSPTLCK, &unlock) < 0) @@ -1407,6 +1408,13 @@ lxcCreateTty(char *ptmx, int *ttymaster, char **ttyName) goto cleanup; }
+ if (virAsprintf(ttyHostPath, "/%s/%s.devpts/%d", LXC_STATE_DIR, + ctrl->def->name, ptyno) < 0) { + virReportOOMError(); + errno = ENOMEM; + goto cleanup; + } + ret = 0;
cleanup: @@ -1552,18 +1560,41 @@ virLXCControllerSetupConsoles(virLXCControllerPtr ctrl, char **containerTTYPaths) { size_t i; + int ret = -1; + uid_t uid = (uid_t)-1; + gid_t gid = (gid_t)-1; + char *ttyHostPath = NULL; + + if (ctrl->def->idmap.uidmap) { + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + }
for (i = 0; i < ctrl->nconsoles; i++) { VIR_DEBUG("Opening tty on private %s", ctrl->devptmx); - if (lxcCreateTty(ctrl->devptmx, + if (lxcCreateTty(ctrl, &ctrl->consoles[i].contFd, - &containerTTYPaths[i]) < 0) { + &containerTTYPaths[i], &ttyHostPath) < 0) { virReportSystemError(errno, "%s", _("Failed to allocate tty")); - return -1; + goto out; } + + /* Change the owner of tty device to the root user of container */ + if (chown(ttyHostPath, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change owner of tty" + " %s to %u:%u"), + ttyHostPath, uid, gid); + goto out; + } + VIR_FREE(ttyHostPath);
Why do you free ttyHostPath here? You already do it in the exit path.
It has a cycle here, we need free the ttyHostPath since we allocate it in lxcCreateTty every cycle.
} - return 0; + + ret = 0; +out: + VIR_FREE(ttyHostPath);
Double free?
Don't worry about it, VIR_FREE does some extra jobs for us. ;)
Ahhh, there is some hidden magic. Now it makes sense. :D Thanks, //richard

On Thu, May 23, 2013 at 12:06:52PM +0800, Gao feng wrote:
Since these tty devices will be used by container, the owner of them should be the root user of container.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 43 +++++++++++++++++++++++++++++++++++++------ 1 file changed, 37 insertions(+), 6 deletions(-)
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index 7d10660..4660f25 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1552,18 +1560,41 @@ virLXCControllerSetupConsoles(virLXCControllerPtr ctrl, char **containerTTYPaths) { size_t i; + int ret = -1; + uid_t uid = (uid_t)-1; + gid_t gid = (gid_t)-1; + char *ttyHostPath = NULL; + + if (ctrl->def->idmap.uidmap) { + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + }
for (i = 0; i < ctrl->nconsoles; i++) { VIR_DEBUG("Opening tty on private %s", ctrl->devptmx); - if (lxcCreateTty(ctrl->devptmx, + if (lxcCreateTty(ctrl, &ctrl->consoles[i].contFd, - &containerTTYPaths[i]) < 0) { + &containerTTYPaths[i], &ttyHostPath) < 0) { virReportSystemError(errno, "%s", _("Failed to allocate tty")); - return -1; + goto out; } + + /* Change the owner of tty device to the root user of container */ + if (chown(ttyHostPath, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change owner of tty" + " %s to %u:%u"), + ttyHostPath, uid, gid); + goto out; + } + VIR_FREE(ttyHostPath); } - return 0; + + ret = 0; +out:
Replace 'out' with 'cleanup' to follow normal naming conventions in libvirt.
+ VIR_FREE(ttyHostPath); + return ret; }
daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

container will create /dev/pts directory in /dev. the owner of /dev should be the root user of container. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index 4660f25..f892ce3 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1191,6 +1191,13 @@ virLXCControllerSetupDev(virLXCControllerPtr ctrl) char *opts = NULL; char *dev = NULL; int ret = -1; + uid_t uid = (uid_t)-1; + gid_t gid = (gid_t)-1; + + if (ctrl->def->idmap.uidmap) { + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + } VIR_DEBUG("Setting up /dev/ for container"); @@ -1231,6 +1238,13 @@ virLXCControllerSetupDev(virLXCControllerPtr ctrl) goto cleanup; } + if (chown(dev, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change owner of %s to %d:%d"), + dev, uid, gid); + goto cleanup; + } + ret = 0; cleanup: -- 1.8.1.4

On Thu, May 23, 2013 at 12:06:53PM +0800, Gao feng wrote:
container will create /dev/pts directory in /dev. the owner of /dev should be the root user of container.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index 4660f25..f892ce3 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1191,6 +1191,13 @@ virLXCControllerSetupDev(virLXCControllerPtr ctrl) char *opts = NULL; char *dev = NULL; int ret = -1; + uid_t uid = (uid_t)-1; + gid_t gid = (gid_t)-1; + + if (ctrl->def->idmap.uidmap) { + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + }
VIR_DEBUG("Setting up /dev/ for container");
@@ -1231,6 +1238,13 @@ virLXCControllerSetupDev(virLXCControllerPtr ctrl) goto cleanup; }
+ if (chown(dev, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change owner of %s to %d:%d"), + dev, uid, gid); + goto cleanup; + } + ret = 0;
cleanup:
ACK Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Since these devices are created for the container. the owner should be the root user of the container. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index f892ce3..b2ace20 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1260,6 +1260,8 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) size_t i; int ret = -1; char *path = NULL; + uid_t uid = (uid_t)-1; + gid_t gid = (gid_t)-1; const struct { int maj; int min; @@ -1276,6 +1278,11 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) if (virLXCControllerSetupDev(ctrl) < 0) goto out; + if (ctrl->def->idmap.uidmap) { + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + } + /* Populate /dev/ with a few important bits */ for (i = 0 ; i < ARRAY_CARDINALITY(devs) ; i++) { if (virAsprintf(&path, "/%s/%s.dev/%s", @@ -1293,6 +1300,13 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) devs[i].path); goto out; } + + if (chown(path, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change owner of %s to %u:%u"), + devs[i].path, uid, gid); + goto out; + } VIR_FREE(path); } -- 1.8.1.4

Hi! ----- Ursprüngliche Mail -----
Since these devices are created for the container. the owner should be the root user of the container.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index f892ce3..b2ace20 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1260,6 +1260,8 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) size_t i; int ret = -1; char *path = NULL; + uid_t uid = (uid_t)-1; + gid_t gid = (gid_t)-1; const struct { int maj; int min; @@ -1276,6 +1278,11 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) if (virLXCControllerSetupDev(ctrl) < 0) goto out;
+ if (ctrl->def->idmap.uidmap) { + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + } + /* Populate /dev/ with a few important bits */ for (i = 0 ; i < ARRAY_CARDINALITY(devs) ; i++) { if (virAsprintf(&path, "/%s/%s.dev/%s", @@ -1293,6 +1300,13 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) devs[i].path); goto out; } + + if (chown(path, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change owner of %s to %u:%u"), + devs[i].path, uid, gid); + goto out; + } VIR_FREE(path);
This looks suspicious. If you free path in the exit path you end up with a double free. If not you may leak memory if chown() fails.
}
Thanks, //richard

On Thu, May 23, 2013 at 12:06:54PM +0800, Gao feng wrote:
Since these devices are created for the container. the owner should be the root user of the container.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index f892ce3..b2ace20 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1260,6 +1260,8 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) size_t i; int ret = -1; char *path = NULL; + uid_t uid = (uid_t)-1; + gid_t gid = (gid_t)-1; const struct { int maj; int min; @@ -1276,6 +1278,11 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) if (virLXCControllerSetupDev(ctrl) < 0) goto out;
+ if (ctrl->def->idmap.uidmap) { + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + } + /* Populate /dev/ with a few important bits */ for (i = 0 ; i < ARRAY_CARDINALITY(devs) ; i++) { if (virAsprintf(&path, "/%s/%s.dev/%s", @@ -1293,6 +1300,13 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) devs[i].path); goto out; } + + if (chown(path, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change owner of %s to %u:%u"), + devs[i].path, uid, gid); + goto out; + } VIR_FREE(path); }
ACK Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

This two files are created for container, the owner should be the root user of container. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index b2ace20..7d27135 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1506,8 +1506,15 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) char *opts = NULL; char *devpts = NULL; char *path = NULL; + uid_t uid = (uid_t)-1; + gid_t gid = (gid_t)-1; int ret = -1; + if (ctrl->def->idmap.uidmap) { + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + } + VIR_DEBUG("Setting up private /dev/pts"); mount_options = virSecurityManagerGetMountOptions(ctrl->securityManager, @@ -1551,6 +1558,21 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) goto cleanup; } + if (chown(ctrl->devptmx, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change the owner of" + "%s to %u:%u"), + path, uid, gid); + goto cleanup; + } + if (chown(devpts, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change the owner of" + "%s to %u:%u"), + devpts, uid, gid); + goto cleanup; + } + if (access(ctrl->devptmx, W_OK) < 0) { if (virAsprintf(&path, "/%s/%s.dev/ptmx", LXC_STATE_DIR, ctrl->def->name)) { @@ -1564,8 +1586,16 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) virReportSystemError(errno, _("Failed to make device %s"), path); goto cleanup; } + if (chown(path, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change the owner of" + "%s to %u:%u"), + path, uid, gid); + goto cleanup; + } } + ret = 0; cleanup: -- 1.8.1.4

Hi! ----- Ursprüngliche Mail -----
This two files are created for container, the owner should be the root user of container.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index b2ace20..7d27135 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1506,8 +1506,15 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) char *opts = NULL; char *devpts = NULL; char *path = NULL; + uid_t uid = (uid_t)-1; + gid_t gid = (gid_t)-1; int ret = -1;
+ if (ctrl->def->idmap.uidmap) { + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + } +
You're using this pattern a few times. Maybe it's worth a helper function. Thanks, //richard

On 05/23/2013 01:56 PM, Richard RW. Weinberger wrote:
Hi!
----- Ursprüngliche Mail -----
This two files are created for container, the owner should be the root user of container.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index b2ace20..7d27135 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1506,8 +1506,15 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) char *opts = NULL; char *devpts = NULL; char *path = NULL; + uid_t uid = (uid_t)-1; + gid_t gid = (gid_t)-1; int ret = -1;
+ if (ctrl->def->idmap.uidmap) { + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + } +
You're using this pattern a few times. Maybe it's worth a helper function.
Yep, it's what we did in PATCH 12/12.

On Thu, May 23, 2013 at 12:06:55PM +0800, Gao feng wrote:
This two files are created for container, the owner should be the root user of container.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index b2ace20..7d27135 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1506,8 +1506,15 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) char *opts = NULL; char *devpts = NULL; char *path = NULL; + uid_t uid = (uid_t)-1; + gid_t gid = (gid_t)-1; int ret = -1;
+ if (ctrl->def->idmap.uidmap) { + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + } + VIR_DEBUG("Setting up private /dev/pts");
mount_options = virSecurityManagerGetMountOptions(ctrl->securityManager, @@ -1551,6 +1558,21 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) goto cleanup; }
+ if (chown(ctrl->devptmx, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change the owner of" + "%s to %u:%u"), + path, uid, gid); + goto cleanup; + } + if (chown(devpts, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change the owner of" + "%s to %u:%u"), + devpts, uid, gid); + goto cleanup; + } + if (access(ctrl->devptmx, W_OK) < 0) { if (virAsprintf(&path, "/%s/%s.dev/ptmx", LXC_STATE_DIR, ctrl->def->name)) { @@ -1564,8 +1586,16 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) virReportSystemError(errno, _("Failed to make device %s"), path); goto cleanup; } + if (chown(path, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change the owner of" + "%s to %u:%u"), + path, uid, gid); + goto cleanup; + } }
ACK Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

use virLXCControllerChown to make codes clearer. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 81 ++++++++++++++++++------------------------------ 1 file changed, 31 insertions(+), 50 deletions(-) diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index 7d27135..1cceecf 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1121,6 +1121,29 @@ cleanup2: return rc; } +static int +virLXCControllerChown(virLXCControllerPtr ctrl, + char *path) +{ + uid_t uid = -1; + gid_t gid = -1; + + if (!ctrl->def->idmap.uidmap) + return 0; + + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + + if (chown(path, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change owner of %s to %u:%u"), + path, uid, gid); + return -1; + } + + return 0; +} + static int virLXCControllerSetupUsernsMap(virDomainIdMapEntryPtr map, @@ -1260,8 +1283,6 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) size_t i; int ret = -1; char *path = NULL; - uid_t uid = (uid_t)-1; - gid_t gid = (gid_t)-1; const struct { int maj; int min; @@ -1278,11 +1299,6 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) if (virLXCControllerSetupDev(ctrl) < 0) goto out; - if (ctrl->def->idmap.uidmap) { - uid = ctrl->def->idmap.uidmap[0].target; - gid = ctrl->def->idmap.gidmap[0].target; - } - /* Populate /dev/ with a few important bits */ for (i = 0 ; i < ARRAY_CARDINALITY(devs) ; i++) { if (virAsprintf(&path, "/%s/%s.dev/%s", @@ -1301,12 +1317,9 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) goto out; } - if (chown(path, uid, gid) < 0) { - virReportSystemError(errno, - _("Failed to change owner of %s to %u:%u"), - devs[i].path, uid, gid); + if (virLXCControllerChown(ctrl, path) < 0) goto out; - } + VIR_FREE(path); } @@ -1506,15 +1519,8 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) char *opts = NULL; char *devpts = NULL; char *path = NULL; - uid_t uid = (uid_t)-1; - gid_t gid = (gid_t)-1; int ret = -1; - if (ctrl->def->idmap.uidmap) { - uid = ctrl->def->idmap.uidmap[0].target; - gid = ctrl->def->idmap.gidmap[0].target; - } - VIR_DEBUG("Setting up private /dev/pts"); mount_options = virSecurityManagerGetMountOptions(ctrl->securityManager, @@ -1558,20 +1564,11 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) goto cleanup; } - if (chown(ctrl->devptmx, uid, gid) < 0) { - virReportSystemError(errno, - _("Failed to change the owner of" - "%s to %u:%u"), - path, uid, gid); + if (virLXCControllerChown(ctrl, ctrl->devptmx) < 0) goto cleanup; - } - if (chown(devpts, uid, gid) < 0) { - virReportSystemError(errno, - _("Failed to change the owner of" - "%s to %u:%u"), - devpts, uid, gid); + + if (virLXCControllerChown(ctrl, devpts) < 0) goto cleanup; - } if (access(ctrl->devptmx, W_OK) < 0) { if (virAsprintf(&path, "/%s/%s.dev/ptmx", @@ -1586,13 +1583,8 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) virReportSystemError(errno, _("Failed to make device %s"), path); goto cleanup; } - if (chown(path, uid, gid) < 0) { - virReportSystemError(errno, - _("Failed to change the owner of" - "%s to %u:%u"), - path, uid, gid); + if (virLXCControllerChown(ctrl, path) < 0) goto cleanup; - } } @@ -1619,15 +1611,8 @@ virLXCControllerSetupConsoles(virLXCControllerPtr ctrl, { size_t i; int ret = -1; - uid_t uid = (uid_t)-1; - gid_t gid = (gid_t)-1; char *ttyHostPath = NULL; - if (ctrl->def->idmap.uidmap) { - uid = ctrl->def->idmap.uidmap[0].target; - gid = ctrl->def->idmap.gidmap[0].target; - } - for (i = 0; i < ctrl->nconsoles; i++) { VIR_DEBUG("Opening tty on private %s", ctrl->devptmx); if (lxcCreateTty(ctrl, @@ -1639,13 +1624,9 @@ virLXCControllerSetupConsoles(virLXCControllerPtr ctrl, } /* Change the owner of tty device to the root user of container */ - if (chown(ttyHostPath, uid, gid) < 0) { - virReportSystemError(errno, - _("Failed to change owner of tty" - " %s to %u:%u"), - ttyHostPath, uid, gid); + if (virLXCControllerChown(ctrl, ttyHostPath) < 0) goto out; - } + VIR_FREE(ttyHostPath); } -- 1.8.1.4

On Thu, May 23, 2013 at 12:06:56PM +0800, Gao feng wrote:
use virLXCControllerChown to make codes clearer.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> --- src/lxc/lxc_controller.c | 81 ++++++++++++++++++------------------------------ 1 file changed, 31 insertions(+), 50 deletions(-)
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index 7d27135..1cceecf 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1121,6 +1121,29 @@ cleanup2: return rc; }
+static int +virLXCControllerChown(virLXCControllerPtr ctrl, + char *path) +{ + uid_t uid = -1; + gid_t gid = -1; + + if (!ctrl->def->idmap.uidmap) + return 0; + + uid = ctrl->def->idmap.uidmap[0].target; + gid = ctrl->def->idmap.gidmap[0].target; + + if (chown(path, uid, gid) < 0) { + virReportSystemError(errno, + _("Failed to change owner of %s to %u:%u"), + path, uid, gid); + return -1; + } + + return 0; +}
I suggest you move this patch to be before current #7. That way patches 7-11 can use this function right away, avoiding all the code below
+
static int virLXCControllerSetupUsernsMap(virDomainIdMapEntryPtr map, @@ -1260,8 +1283,6 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) size_t i; int ret = -1; char *path = NULL; - uid_t uid = (uid_t)-1; - gid_t gid = (gid_t)-1; const struct { int maj; int min; @@ -1278,11 +1299,6 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) if (virLXCControllerSetupDev(ctrl) < 0) goto out;
- if (ctrl->def->idmap.uidmap) { - uid = ctrl->def->idmap.uidmap[0].target; - gid = ctrl->def->idmap.gidmap[0].target; - } - /* Populate /dev/ with a few important bits */ for (i = 0 ; i < ARRAY_CARDINALITY(devs) ; i++) { if (virAsprintf(&path, "/%s/%s.dev/%s", @@ -1301,12 +1317,9 @@ static int virLXCControllerPopulateDevices(virLXCControllerPtr ctrl) goto out; }
- if (chown(path, uid, gid) < 0) { - virReportSystemError(errno, - _("Failed to change owner of %s to %u:%u"), - devs[i].path, uid, gid); + if (virLXCControllerChown(ctrl, path) < 0) goto out; - } + VIR_FREE(path); }
@@ -1506,15 +1519,8 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) char *opts = NULL; char *devpts = NULL; char *path = NULL; - uid_t uid = (uid_t)-1; - gid_t gid = (gid_t)-1; int ret = -1;
- if (ctrl->def->idmap.uidmap) { - uid = ctrl->def->idmap.uidmap[0].target; - gid = ctrl->def->idmap.gidmap[0].target; - } - VIR_DEBUG("Setting up private /dev/pts");
mount_options = virSecurityManagerGetMountOptions(ctrl->securityManager, @@ -1558,20 +1564,11 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) goto cleanup; }
- if (chown(ctrl->devptmx, uid, gid) < 0) { - virReportSystemError(errno, - _("Failed to change the owner of" - "%s to %u:%u"), - path, uid, gid); + if (virLXCControllerChown(ctrl, ctrl->devptmx) < 0) goto cleanup; - } - if (chown(devpts, uid, gid) < 0) { - virReportSystemError(errno, - _("Failed to change the owner of" - "%s to %u:%u"), - devpts, uid, gid); + + if (virLXCControllerChown(ctrl, devpts) < 0) goto cleanup; - }
if (access(ctrl->devptmx, W_OK) < 0) { if (virAsprintf(&path, "/%s/%s.dev/ptmx", @@ -1586,13 +1583,8 @@ virLXCControllerSetupDevPTS(virLXCControllerPtr ctrl) virReportSystemError(errno, _("Failed to make device %s"), path); goto cleanup; } - if (chown(path, uid, gid) < 0) { - virReportSystemError(errno, - _("Failed to change the owner of" - "%s to %u:%u"), - path, uid, gid); + if (virLXCControllerChown(ctrl, path) < 0) goto cleanup; - } }
@@ -1619,15 +1611,8 @@ virLXCControllerSetupConsoles(virLXCControllerPtr ctrl, { size_t i; int ret = -1; - uid_t uid = (uid_t)-1; - gid_t gid = (gid_t)-1; char *ttyHostPath = NULL;
- if (ctrl->def->idmap.uidmap) { - uid = ctrl->def->idmap.uidmap[0].target; - gid = ctrl->def->idmap.gidmap[0].target; - } - for (i = 0; i < ctrl->nconsoles; i++) { VIR_DEBUG("Opening tty on private %s", ctrl->devptmx); if (lxcCreateTty(ctrl, @@ -1639,13 +1624,9 @@ virLXCControllerSetupConsoles(virLXCControllerPtr ctrl, }
/* Change the owner of tty device to the root user of container */ - if (chown(ttyHostPath, uid, gid) < 0) { - virReportSystemError(errno, - _("Failed to change owner of tty" - " %s to %u:%u"), - ttyHostPath, uid, gid); + if (virLXCControllerChown(ctrl, ttyHostPath) < 0) goto out; - } + VIR_FREE(ttyHostPath); }
Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Thu, May 23, 2013 at 6:06 AM, Gao feng <gaofeng@cn.fujitsu.com> wrote:
This patchset try to add userns support for libvirt lxc. Since userns is nearly completed in linux-3.9, the old kernel doesn't support userns, I add some New XML elements to let people decide if enable userns.The userns is enabled only when user configure the XML.
The format of user namespace related XML file like below: <idmap> <uid start='0' target='1000' count='10'> <gid start='0' target='1000' count='10'> </idmap> it means the user in container (which uid:gid is 0:0) will be mapped to the user in host (uid:gid is 1000:1000), count is used to form an u/gid range: The users in container which uid in [start, start + count -1] will be mapped.
You can have multiple lines to map differnet id ranges, caution, you must make sure the root user of container has been mapped.
This patchset also does the below jobs.
1, Because the uninit userns has no right to create devices, we should create devices for container on host. 2, Changes the owner of fuse and tty device.
Change from v2: 1, Mount tmpfs on /stateDir/domain.dev 2, Create devices under /stateDir/doamin.dev/ 3, Mount Move the /.oldroot/stateDir/doamin.dev/ on the /dev/ of container 4, Enhance the configuration, disallow the semi configuration
Gao feng (12): LXC: Introduce New XML element for user namespace LXC: enable user namespace only when user set the uidmap LXC: sort the uidmap/gidmap of domain LXC: introduce virLXCControllerSetupUserns and lxcContainerSetID LXC: Creating devices for container on host side LXC: Move creating /dev/ptmx to virLXCControllerSetupDevPTS LXC: fuse: Change files owner to the root user of container LXC: controller: change the owner of tty devices to the root user of container LXC: controller: change the owner of /dev to the root user of container LXC: controller: change the owner of devices created on host LXC: controller: change the owner of /dev/pts and ptmx to the root of container LXC: introduce virLXCControllerChown
docs/formatdomain.html.in | 23 ++++ docs/schemas/domaincommon.rng | 31 +++++ src/conf/domain_conf.c | 115 ++++++++++++++++++ src/conf/domain_conf.h | 22 ++++ src/lxc/lxc_container.c | 183 ++++++++++++++-------------- src/lxc/lxc_controller.c | 271 +++++++++++++++++++++++++++++++++++++++++- src/lxc/lxc_fuse.c | 6 + 7 files changed, 557 insertions(+), 94 deletions(-)
I'm wondering what the state of this patch set is. I'd really like to see it mainline. :-) -- Thanks, //richard

On 06/04/2013 06:41 PM, richard -rw- weinberger wrote:
On Thu, May 23, 2013 at 6:06 AM, Gao feng <gaofeng@cn.fujitsu.com> wrote:
This patchset try to add userns support for libvirt lxc. Since userns is nearly completed in linux-3.9, the old kernel doesn't support userns, I add some New XML elements to let people decide if enable userns.The userns is enabled only when user configure the XML.
The format of user namespace related XML file like below: <idmap> <uid start='0' target='1000' count='10'> <gid start='0' target='1000' count='10'> </idmap> it means the user in container (which uid:gid is 0:0) will be mapped to the user in host (uid:gid is 1000:1000), count is used to form an u/gid range: The users in container which uid in [start, start + count -1] will be mapped.
You can have multiple lines to map differnet id ranges, caution, you must make sure the root user of container has been mapped.
This patchset also does the below jobs.
1, Because the uninit userns has no right to create devices, we should create devices for container on host. 2, Changes the owner of fuse and tty device.
Change from v2: 1, Mount tmpfs on /stateDir/domain.dev 2, Create devices under /stateDir/doamin.dev/ 3, Mount Move the /.oldroot/stateDir/doamin.dev/ on the /dev/ of container 4, Enhance the configuration, disallow the semi configuration
Gao feng (12): LXC: Introduce New XML element for user namespace LXC: enable user namespace only when user set the uidmap LXC: sort the uidmap/gidmap of domain LXC: introduce virLXCControllerSetupUserns and lxcContainerSetID LXC: Creating devices for container on host side LXC: Move creating /dev/ptmx to virLXCControllerSetupDevPTS LXC: fuse: Change files owner to the root user of container LXC: controller: change the owner of tty devices to the root user of container LXC: controller: change the owner of /dev to the root user of container LXC: controller: change the owner of devices created on host LXC: controller: change the owner of /dev/pts and ptmx to the root of container LXC: introduce virLXCControllerChown
docs/formatdomain.html.in | 23 ++++ docs/schemas/domaincommon.rng | 31 +++++ src/conf/domain_conf.c | 115 ++++++++++++++++++ src/conf/domain_conf.h | 22 ++++ src/lxc/lxc_container.c | 183 ++++++++++++++-------------- src/lxc/lxc_controller.c | 271 +++++++++++++++++++++++++++++++++++++++++- src/lxc/lxc_fuse.c | 6 + 7 files changed, 557 insertions(+), 94 deletions(-)
I'm wondering what the state of this patch set is. I'd really like to see it mainline. :-)
It's still under review. needs some ACK. If you can help to test or ACK this patchset, it will be very helpful. :) Actually, I just want to ping... Thanks, Gao
-- Thanks, //richard

On Tue, Jun 04, 2013 at 06:54:10PM +0800, Gao feng wrote:
On 06/04/2013 06:41 PM, richard -rw- weinberger wrote:
On Thu, May 23, 2013 at 6:06 AM, Gao feng <gaofeng@cn.fujitsu.com> wrote:
This patchset try to add userns support for libvirt lxc. Since userns is nearly completed in linux-3.9, the old kernel doesn't support userns, I add some New XML elements to let people decide if enable userns.The userns is enabled only when user configure the XML.
The format of user namespace related XML file like below: <idmap> <uid start='0' target='1000' count='10'> <gid start='0' target='1000' count='10'> </idmap> it means the user in container (which uid:gid is 0:0) will be mapped to the user in host (uid:gid is 1000:1000), count is used to form an u/gid range: The users in container which uid in [start, start + count -1] will be mapped.
You can have multiple lines to map differnet id ranges, caution, you must make sure the root user of container has been mapped.
This patchset also does the below jobs.
1, Because the uninit userns has no right to create devices, we should create devices for container on host. 2, Changes the owner of fuse and tty device.
Change from v2: 1, Mount tmpfs on /stateDir/domain.dev 2, Create devices under /stateDir/doamin.dev/ 3, Mount Move the /.oldroot/stateDir/doamin.dev/ on the /dev/ of container 4, Enhance the configuration, disallow the semi configuration
Gao feng (12): LXC: Introduce New XML element for user namespace LXC: enable user namespace only when user set the uidmap LXC: sort the uidmap/gidmap of domain LXC: introduce virLXCControllerSetupUserns and lxcContainerSetID LXC: Creating devices for container on host side LXC: Move creating /dev/ptmx to virLXCControllerSetupDevPTS LXC: fuse: Change files owner to the root user of container LXC: controller: change the owner of tty devices to the root user of container LXC: controller: change the owner of /dev to the root user of container LXC: controller: change the owner of devices created on host LXC: controller: change the owner of /dev/pts and ptmx to the root of container LXC: introduce virLXCControllerChown
docs/formatdomain.html.in | 23 ++++ docs/schemas/domaincommon.rng | 31 +++++ src/conf/domain_conf.c | 115 ++++++++++++++++++ src/conf/domain_conf.h | 22 ++++ src/lxc/lxc_container.c | 183 ++++++++++++++-------------- src/lxc/lxc_controller.c | 271 +++++++++++++++++++++++++++++++++++++++++- src/lxc/lxc_fuse.c | 6 + 7 files changed, 557 insertions(+), 94 deletions(-)
I'm wondering what the state of this patch set is. I'd really like to see it mainline. :-)
It's still under review. needs some ACK. If you can help to test or ACK this patchset, it will be very helpful. :)
Actually, I just want to ping...
I've been away on holiday for 2 weeks, so not had a chance to review it yet. I'll get to it this week. I hope we'll get this in the 1.0.6 release this month. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Hi! Am 04.06.2013 13:03, schrieb Daniel P. Berrange:
It's still under review. needs some ACK. If you can help to test or ACK this patchset, it will be very helpful. :)
Actually, I just want to ping...
I've been away on holiday for 2 weeks, so not had a chance to review it yet. I'll get to it this week. I hope we'll get this in the 1.0.6 release this month.
Finally I've found some time to test version 4 of the userns patch set. But I'm unable to create a container. ---cut--- linux:~ # LANG=C /opt/libvirt/bin/virsh -c lxc:/// create c1.conf error: Failed to create domain from c1.conf error: Interner Fehler guest failed to start: PATH=/bin:/sbin TERM=linux container=lxc-libvirt container_uuid=3f86c48b-b027-4838-ba17-6202a1d7398b LIBVIRT_LXC_UUID=3f86c48b-b027-4838-ba17-6202a1d7398b LIBVIRT_LXC_NAME=c1 /bin/bash error receiving signal from container: Input/output error ---cut--- lxcContainerWaitForContinue() in src/lxc/lxc_controller.c fails with EIO. Maybe because the clone()'ed child dies and the file descriptor used for synchronization becomes invalid. Here my container config: ---cut--- <domain type='lxc'> <name>c1</name> <memory>102400</memory> <os> <type>exe</type> <init>/bin/bash</init> </os> <idmap> <uid start='0' target='100000' count='100000'/> <gid start='0' target='100000' count='100000'/> </idmap> <devices> <console type='pty'/> <filesystem type='mount'> <source dir='/root/c1/rootfs'/> <target dir='/'/> </filesystem> </devices> </domain> ---cut--- Any ideas how to debug this further? This is Linux 3.9.0 with all namespaces enabled. Thanks, //richard

Am 10.06.2013 21:17, schrieb Richard Weinberger:
Hi!
Am 04.06.2013 13:03, schrieb Daniel P. Berrange:
It's still under review. needs some ACK. If you can help to test or ACK this patchset, it will be very helpful. :)
Actually, I just want to ping...
I've been away on holiday for 2 weeks, so not had a chance to review it yet. I'll get to it this week. I hope we'll get this in the 1.0.6 release this month.
Finally I've found some time to test version 4 of the userns patch set. But I'm unable to create a container.
---cut--- linux:~ # LANG=C /opt/libvirt/bin/virsh -c lxc:/// create c1.conf error: Failed to create domain from c1.conf error: Interner Fehler guest failed to start: PATH=/bin:/sbin TERM=linux container=lxc-libvirt container_uuid=3f86c48b-b027-4838-ba17-6202a1d7398b LIBVIRT_LXC_UUID=3f86c48b-b027-4838-ba17-6202a1d7398b LIBVIRT_LXC_NAME=c1 /bin/bash error receiving signal from container: Input/output error ---cut---
lxcContainerWaitForContinue() in src/lxc/lxc_controller.c fails with EIO. Maybe because the clone()'ed child dies and the file descriptor used for synchronization becomes invalid.
Here my container config: ---cut--- <domain type='lxc'> <name>c1</name> <memory>102400</memory> <os> <type>exe</type> <init>/bin/bash</init> </os> <idmap> <uid start='0' target='100000' count='100000'/> <gid start='0' target='100000' count='100000'/> </idmap> <devices> <console type='pty'/> <filesystem type='mount'> <source dir='/root/c1/rootfs'/> <target dir='/'/> </filesystem> </devices> </domain> ---cut---
Any ideas how to debug this further? This is Linux 3.9.0 with all namespaces enabled.
Whoops, forgot to add the libvirtd debug output: ---cut--- 2013-06-10 19:41:24.661+0000: 29211: debug : virCommandRunAsync:2241 : About to run PATH=/usr/lib64/mpi/gcc/openmpi/bin:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/usr/games LIBVIRT_DEBUG=1 LIBVIRT_LOG_OUTPUTS=1:stderr /opt/libvirt/lib/libvirt_lxc --name c1 --console 20 --security=none --handshake 23 --background 2013-06-10 19:41:24.663+0000: 29211: debug : virFileClose:90 : Closed fd 24 2013-06-10 19:41:24.663+0000: 29211: debug : virCommandRunAsync:2246 : Command result 0, with PID 29303 2013-06-10 19:41:24.664+0000: 29303: debug : virFileClose:90 : Closed fd 3 2013-06-10 19:41:24.665+0000: 29303: debug : virFileClose:90 : Closed fd 4 2013-06-10 19:41:24.666+0000: 29303: debug : virFileClose:90 : Closed fd 5 2013-06-10 19:41:24.666+0000: 29303: debug : virFileClose:90 : Closed fd 6 2013-06-10 19:41:24.667+0000: 29303: debug : virFileClose:90 : Closed fd 7 2013-06-10 19:41:24.667+0000: 29303: debug : virFileClose:90 : Closed fd 8 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 9 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 10 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 11 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 12 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 13 2013-06-10 19:41:24.669+0000: 29303: debug : virFileClose:90 : Closed fd 14 2013-06-10 19:41:24.669+0000: 29303: debug : virFileClose:90 : Closed fd 15 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 16 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 17 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 18 2013-06-10 19:41:24.671+0000: 29303: debug : virFileClose:90 : Closed fd 19 2013-06-10 19:41:24.671+0000: 29303: debug : virFileClose:90 : Closed fd 22 2013-06-10 19:41:24.790+0000: 29211: debug : virCommandRun:2115 : Result status 0, stdout: '(null)' stderr: '(null)' ---cut--- Looks like libvirt_lxc was executed and died silently. Thanks, //richard

Am 10.06.2013 21:53, schrieb Richard Weinberger:
Am 10.06.2013 21:17, schrieb Richard Weinberger:
Hi!
Am 04.06.2013 13:03, schrieb Daniel P. Berrange:
It's still under review. needs some ACK. If you can help to test or ACK this patchset, it will be very helpful. :)
Actually, I just want to ping...
I've been away on holiday for 2 weeks, so not had a chance to review it yet. I'll get to it this week. I hope we'll get this in the 1.0.6 release this month.
Finally I've found some time to test version 4 of the userns patch set. But I'm unable to create a container.
---cut--- linux:~ # LANG=C /opt/libvirt/bin/virsh -c lxc:/// create c1.conf error: Failed to create domain from c1.conf error: Interner Fehler guest failed to start: PATH=/bin:/sbin TERM=linux container=lxc-libvirt container_uuid=3f86c48b-b027-4838-ba17-6202a1d7398b LIBVIRT_LXC_UUID=3f86c48b-b027-4838-ba17-6202a1d7398b LIBVIRT_LXC_NAME=c1 /bin/bash error receiving signal from container: Input/output error ---cut---
lxcContainerWaitForContinue() in src/lxc/lxc_controller.c fails with EIO. Maybe because the clone()'ed child dies and the file descriptor used for synchronization becomes invalid.
Here my container config: ---cut--- <domain type='lxc'> <name>c1</name> <memory>102400</memory> <os> <type>exe</type> <init>/bin/bash</init> </os> <idmap> <uid start='0' target='100000' count='100000'/> <gid start='0' target='100000' count='100000'/> </idmap> <devices> <console type='pty'/> <filesystem type='mount'> <source dir='/root/c1/rootfs'/> <target dir='/'/> </filesystem> </devices> </domain> ---cut---
Any ideas how to debug this further? This is Linux 3.9.0 with all namespaces enabled.
Whoops, forgot to add the libvirtd debug output:
---cut--- 2013-06-10 19:41:24.661+0000: 29211: debug : virCommandRunAsync:2241 : About to run PATH=/usr/lib64/mpi/gcc/openmpi/bin:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/usr/games LIBVIRT_DEBUG=1 LIBVIRT_LOG_OUTPUTS=1:stderr /opt/libvirt/lib/libvirt_lxc --name c1 --console 20 --security=none --handshake 23 --background 2013-06-10 19:41:24.663+0000: 29211: debug : virFileClose:90 : Closed fd 24 2013-06-10 19:41:24.663+0000: 29211: debug : virCommandRunAsync:2246 : Command result 0, with PID 29303 2013-06-10 19:41:24.664+0000: 29303: debug : virFileClose:90 : Closed fd 3 2013-06-10 19:41:24.665+0000: 29303: debug : virFileClose:90 : Closed fd 4 2013-06-10 19:41:24.666+0000: 29303: debug : virFileClose:90 : Closed fd 5 2013-06-10 19:41:24.666+0000: 29303: debug : virFileClose:90 : Closed fd 6 2013-06-10 19:41:24.667+0000: 29303: debug : virFileClose:90 : Closed fd 7 2013-06-10 19:41:24.667+0000: 29303: debug : virFileClose:90 : Closed fd 8 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 9 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 10 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 11 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 12 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 13 2013-06-10 19:41:24.669+0000: 29303: debug : virFileClose:90 : Closed fd 14 2013-06-10 19:41:24.669+0000: 29303: debug : virFileClose:90 : Closed fd 15 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 16 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 17 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 18 2013-06-10 19:41:24.671+0000: 29303: debug : virFileClose:90 : Closed fd 19 2013-06-10 19:41:24.671+0000: 29303: debug : virFileClose:90 : Closed fd 22 2013-06-10 19:41:24.790+0000: 29211: debug : virCommandRun:2115 : Result status 0, stdout: '(null)' stderr: '(null)' ---cut---
Looks like libvirt_lxc was executed and died silently.
Found the problem. /opt/libvirt/var/log/libvirt/lxc/c1.log contained the info I needed. Search permissions for /root were missing. m( Would be nice if virsh would be able to tell one this... Thanks, //richard

On 06/11/2013 04:51 AM, Richard Weinberger wrote:
Am 10.06.2013 21:53, schrieb Richard Weinberger:
Am 10.06.2013 21:17, schrieb Richard Weinberger:
Hi!
Am 04.06.2013 13:03, schrieb Daniel P. Berrange:
It's still under review. needs some ACK. If you can help to test or ACK this patchset, it will be very helpful. :)
Actually, I just want to ping...
I've been away on holiday for 2 weeks, so not had a chance to review it yet. I'll get to it this week. I hope we'll get this in the 1.0.6 release this month.
Finally I've found some time to test version 4 of the userns patch set. But I'm unable to create a container.
---cut--- linux:~ # LANG=C /opt/libvirt/bin/virsh -c lxc:/// create c1.conf error: Failed to create domain from c1.conf error: Interner Fehler guest failed to start: PATH=/bin:/sbin TERM=linux container=lxc-libvirt container_uuid=3f86c48b-b027-4838-ba17-6202a1d7398b LIBVIRT_LXC_UUID=3f86c48b-b027-4838-ba17-6202a1d7398b LIBVIRT_LXC_NAME=c1 /bin/bash error receiving signal from container: Input/output error ---cut---
lxcContainerWaitForContinue() in src/lxc/lxc_controller.c fails with EIO. Maybe because the clone()'ed child dies and the file descriptor used for synchronization becomes invalid.
Here my container config: ---cut--- <domain type='lxc'> <name>c1</name> <memory>102400</memory> <os> <type>exe</type> <init>/bin/bash</init> </os> <idmap> <uid start='0' target='100000' count='100000'/> <gid start='0' target='100000' count='100000'/> </idmap> <devices> <console type='pty'/> <filesystem type='mount'> <source dir='/root/c1/rootfs'/> <target dir='/'/> </filesystem> </devices> </domain> ---cut---
Any ideas how to debug this further? This is Linux 3.9.0 with all namespaces enabled.
Whoops, forgot to add the libvirtd debug output:
---cut--- 2013-06-10 19:41:24.661+0000: 29211: debug : virCommandRunAsync:2241 : About to run PATH=/usr/lib64/mpi/gcc/openmpi/bin:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/usr/games LIBVIRT_DEBUG=1 LIBVIRT_LOG_OUTPUTS=1:stderr /opt/libvirt/lib/libvirt_lxc --name c1 --console 20 --security=none --handshake 23 --background 2013-06-10 19:41:24.663+0000: 29211: debug : virFileClose:90 : Closed fd 24 2013-06-10 19:41:24.663+0000: 29211: debug : virCommandRunAsync:2246 : Command result 0, with PID 29303 2013-06-10 19:41:24.664+0000: 29303: debug : virFileClose:90 : Closed fd 3 2013-06-10 19:41:24.665+0000: 29303: debug : virFileClose:90 : Closed fd 4 2013-06-10 19:41:24.666+0000: 29303: debug : virFileClose:90 : Closed fd 5 2013-06-10 19:41:24.666+0000: 29303: debug : virFileClose:90 : Closed fd 6 2013-06-10 19:41:24.667+0000: 29303: debug : virFileClose:90 : Closed fd 7 2013-06-10 19:41:24.667+0000: 29303: debug : virFileClose:90 : Closed fd 8 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 9 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 10 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 11 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 12 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 13 2013-06-10 19:41:24.669+0000: 29303: debug : virFileClose:90 : Closed fd 14 2013-06-10 19:41:24.669+0000: 29303: debug : virFileClose:90 : Closed fd 15 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 16 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 17 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 18 2013-06-10 19:41:24.671+0000: 29303: debug : virFileClose:90 : Closed fd 19 2013-06-10 19:41:24.671+0000: 29303: debug : virFileClose:90 : Closed fd 22 2013-06-10 19:41:24.790+0000: 29211: debug : virCommandRun:2115 : Result status 0, stdout: '(null)' stderr: '(null)' ---cut---
Looks like libvirt_lxc was executed and died silently.
Found the problem. /opt/libvirt/var/log/libvirt/lxc/c1.log contained the info I needed. Search permissions for /root were missing. m( Would be nice if virsh would be able to tell one this...
:) have fun with user namespace & libvirt. And thanks for your test. Thanks, Gao
Thanks, //richard

Am 11.06.2013 05:12, schrieb Gao feng:
On 06/11/2013 04:51 AM, Richard Weinberger wrote:
Am 10.06.2013 21:53, schrieb Richard Weinberger:
Am 10.06.2013 21:17, schrieb Richard Weinberger:
Hi!
Am 04.06.2013 13:03, schrieb Daniel P. Berrange:
It's still under review. needs some ACK. If you can help to test or ACK this patchset, it will be very helpful. :)
Actually, I just want to ping...
I've been away on holiday for 2 weeks, so not had a chance to review it yet. I'll get to it this week. I hope we'll get this in the 1.0.6 release this month.
Finally I've found some time to test version 4 of the userns patch set. But I'm unable to create a container.
---cut--- linux:~ # LANG=C /opt/libvirt/bin/virsh -c lxc:/// create c1.conf error: Failed to create domain from c1.conf error: Interner Fehler guest failed to start: PATH=/bin:/sbin TERM=linux container=lxc-libvirt container_uuid=3f86c48b-b027-4838-ba17-6202a1d7398b LIBVIRT_LXC_UUID=3f86c48b-b027-4838-ba17-6202a1d7398b LIBVIRT_LXC_NAME=c1 /bin/bash error receiving signal from container: Input/output error ---cut---
lxcContainerWaitForContinue() in src/lxc/lxc_controller.c fails with EIO. Maybe because the clone()'ed child dies and the file descriptor used for synchronization becomes invalid.
Here my container config: ---cut--- <domain type='lxc'> <name>c1</name> <memory>102400</memory> <os> <type>exe</type> <init>/bin/bash</init> </os> <idmap> <uid start='0' target='100000' count='100000'/> <gid start='0' target='100000' count='100000'/> </idmap> <devices> <console type='pty'/> <filesystem type='mount'> <source dir='/root/c1/rootfs'/> <target dir='/'/> </filesystem> </devices> </domain> ---cut---
Any ideas how to debug this further? This is Linux 3.9.0 with all namespaces enabled.
Whoops, forgot to add the libvirtd debug output:
---cut--- 2013-06-10 19:41:24.661+0000: 29211: debug : virCommandRunAsync:2241 : About to run PATH=/usr/lib64/mpi/gcc/openmpi/bin:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/usr/games LIBVIRT_DEBUG=1 LIBVIRT_LOG_OUTPUTS=1:stderr /opt/libvirt/lib/libvirt_lxc --name c1 --console 20 --security=none --handshake 23 --background 2013-06-10 19:41:24.663+0000: 29211: debug : virFileClose:90 : Closed fd 24 2013-06-10 19:41:24.663+0000: 29211: debug : virCommandRunAsync:2246 : Command result 0, with PID 29303 2013-06-10 19:41:24.664+0000: 29303: debug : virFileClose:90 : Closed fd 3 2013-06-10 19:41:24.665+0000: 29303: debug : virFileClose:90 : Closed fd 4 2013-06-10 19:41:24.666+0000: 29303: debug : virFileClose:90 : Closed fd 5 2013-06-10 19:41:24.666+0000: 29303: debug : virFileClose:90 : Closed fd 6 2013-06-10 19:41:24.667+0000: 29303: debug : virFileClose:90 : Closed fd 7 2013-06-10 19:41:24.667+0000: 29303: debug : virFileClose:90 : Closed fd 8 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 9 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 10 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 11 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 12 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 13 2013-06-10 19:41:24.669+0000: 29303: debug : virFileClose:90 : Closed fd 14 2013-06-10 19:41:24.669+0000: 29303: debug : virFileClose:90 : Closed fd 15 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 16 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 17 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 18 2013-06-10 19:41:24.671+0000: 29303: debug : virFileClose:90 : Closed fd 19 2013-06-10 19:41:24.671+0000: 29303: debug : virFileClose:90 : Closed fd 22 2013-06-10 19:41:24.790+0000: 29211: debug : virCommandRun:2115 : Result status 0, stdout: '(null)' stderr: '(null)' ---cut---
Looks like libvirt_lxc was executed and died silently.
Found the problem. /opt/libvirt/var/log/libvirt/lxc/c1.log contained the info I needed. Search permissions for /root were missing. m( Would be nice if virsh would be able to tell one this...
:) have fun with user namespace & libvirt. And thanks for your test.
Yeah. So far it looks very good. I was able to convert my containers from my custom lxc/userns setup to libvirt+userns. One more question, is it by design that virsh lxc-enter-namespace does not setup uid/gid mappings? Thanks, //richard

On 06/11/2013 02:02 PM, Richard Weinberger wrote:
Am 11.06.2013 05:12, schrieb Gao feng:
On 06/11/2013 04:51 AM, Richard Weinberger wrote:
Am 10.06.2013 21:53, schrieb Richard Weinberger:
Am 10.06.2013 21:17, schrieb Richard Weinberger:
Hi!
Am 04.06.2013 13:03, schrieb Daniel P. Berrange:
> It's still under review. needs some ACK. > If you can help to test or ACK this patchset, it will be very helpful. :) > > Actually, I just want to ping...
I've been away on holiday for 2 weeks, so not had a chance to review it yet. I'll get to it this week. I hope we'll get this in the 1.0.6 release this month.
Finally I've found some time to test version 4 of the userns patch set. But I'm unable to create a container.
---cut--- linux:~ # LANG=C /opt/libvirt/bin/virsh -c lxc:/// create c1.conf error: Failed to create domain from c1.conf error: Interner Fehler guest failed to start: PATH=/bin:/sbin TERM=linux container=lxc-libvirt container_uuid=3f86c48b-b027-4838-ba17-6202a1d7398b LIBVIRT_LXC_UUID=3f86c48b-b027-4838-ba17-6202a1d7398b LIBVIRT_LXC_NAME=c1 /bin/bash error receiving signal from container: Input/output error ---cut---
lxcContainerWaitForContinue() in src/lxc/lxc_controller.c fails with EIO. Maybe because the clone()'ed child dies and the file descriptor used for synchronization becomes invalid.
Here my container config: ---cut--- <domain type='lxc'> <name>c1</name> <memory>102400</memory> <os> <type>exe</type> <init>/bin/bash</init> </os> <idmap> <uid start='0' target='100000' count='100000'/> <gid start='0' target='100000' count='100000'/> </idmap> <devices> <console type='pty'/> <filesystem type='mount'> <source dir='/root/c1/rootfs'/> <target dir='/'/> </filesystem> </devices> </domain> ---cut---
Any ideas how to debug this further? This is Linux 3.9.0 with all namespaces enabled.
Whoops, forgot to add the libvirtd debug output:
---cut--- 2013-06-10 19:41:24.661+0000: 29211: debug : virCommandRunAsync:2241 : About to run PATH=/usr/lib64/mpi/gcc/openmpi/bin:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/usr/games LIBVIRT_DEBUG=1 LIBVIRT_LOG_OUTPUTS=1:stderr /opt/libvirt/lib/libvirt_lxc --name c1 --console 20 --security=none --handshake 23 --background 2013-06-10 19:41:24.663+0000: 29211: debug : virFileClose:90 : Closed fd 24 2013-06-10 19:41:24.663+0000: 29211: debug : virCommandRunAsync:2246 : Command result 0, with PID 29303 2013-06-10 19:41:24.664+0000: 29303: debug : virFileClose:90 : Closed fd 3 2013-06-10 19:41:24.665+0000: 29303: debug : virFileClose:90 : Closed fd 4 2013-06-10 19:41:24.666+0000: 29303: debug : virFileClose:90 : Closed fd 5 2013-06-10 19:41:24.666+0000: 29303: debug : virFileClose:90 : Closed fd 6 2013-06-10 19:41:24.667+0000: 29303: debug : virFileClose:90 : Closed fd 7 2013-06-10 19:41:24.667+0000: 29303: debug : virFileClose:90 : Closed fd 8 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 9 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 10 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 11 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 12 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 13 2013-06-10 19:41:24.669+0000: 29303: debug : virFileClose:90 : Closed fd 14 2013-06-10 19:41:24.669+0000: 29303: debug : virFileClose:90 : Closed fd 15 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 16 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 17 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 18 2013-06-10 19:41:24.671+0000: 29303: debug : virFileClose:90 : Closed fd 19 2013-06-10 19:41:24.671+0000: 29303: debug : virFileClose:90 : Closed fd 22 2013-06-10 19:41:24.790+0000: 29211: debug : virCommandRun:2115 : Result status 0, stdout: '(null)' stderr: '(null)' ---cut---
Looks like libvirt_lxc was executed and died silently.
Found the problem. /opt/libvirt/var/log/libvirt/lxc/c1.log contained the info I needed. Search permissions for /root were missing. m( Would be nice if virsh would be able to tell one this...
:) have fun with user namespace & libvirt. And thanks for your test.
Yeah. So far it looks very good. I was able to convert my containers from my custom lxc/userns setup to libvirt+userns.
One more question, is it by design that virsh lxc-enter-namespace does not setup uid/gid mappings?
lxc-enter-namespace doesn't have the need to setup uid/gid mappings, Since lxc-enter-namespace is running on the host side, the uid/gid mappings already exist, But we should call setid for the child task of lxc-enter-namespace, this child task running in the container. I will improve lxc-enter-namespace after this patchset being accepted. Thanks, Gao
Thanks, //richard

Am 11.06.2013 08:17, schrieb Gao feng:
On 06/11/2013 02:02 PM, Richard Weinberger wrote:
Am 11.06.2013 05:12, schrieb Gao feng:
On 06/11/2013 04:51 AM, Richard Weinberger wrote:
Am 10.06.2013 21:53, schrieb Richard Weinberger:
Am 10.06.2013 21:17, schrieb Richard Weinberger:
Hi!
Am 04.06.2013 13:03, schrieb Daniel P. Berrange: >> It's still under review. needs some ACK. >> If you can help to test or ACK this patchset, it will be very helpful. :) >> >> Actually, I just want to ping... > > I've been away on holiday for 2 weeks, so not had a chance to review > it yet. I'll get to it this week. I hope we'll get this in the 1.0.6 > release this month.
Finally I've found some time to test version 4 of the userns patch set. But I'm unable to create a container.
---cut--- linux:~ # LANG=C /opt/libvirt/bin/virsh -c lxc:/// create c1.conf error: Failed to create domain from c1.conf error: Interner Fehler guest failed to start: PATH=/bin:/sbin TERM=linux container=lxc-libvirt container_uuid=3f86c48b-b027-4838-ba17-6202a1d7398b LIBVIRT_LXC_UUID=3f86c48b-b027-4838-ba17-6202a1d7398b LIBVIRT_LXC_NAME=c1 /bin/bash error receiving signal from container: Input/output error ---cut---
lxcContainerWaitForContinue() in src/lxc/lxc_controller.c fails with EIO. Maybe because the clone()'ed child dies and the file descriptor used for synchronization becomes invalid.
Here my container config: ---cut--- <domain type='lxc'> <name>c1</name> <memory>102400</memory> <os> <type>exe</type> <init>/bin/bash</init> </os> <idmap> <uid start='0' target='100000' count='100000'/> <gid start='0' target='100000' count='100000'/> </idmap> <devices> <console type='pty'/> <filesystem type='mount'> <source dir='/root/c1/rootfs'/> <target dir='/'/> </filesystem> </devices> </domain> ---cut---
Any ideas how to debug this further? This is Linux 3.9.0 with all namespaces enabled.
Whoops, forgot to add the libvirtd debug output:
---cut--- 2013-06-10 19:41:24.661+0000: 29211: debug : virCommandRunAsync:2241 : About to run PATH=/usr/lib64/mpi/gcc/openmpi/bin:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/usr/games LIBVIRT_DEBUG=1 LIBVIRT_LOG_OUTPUTS=1:stderr /opt/libvirt/lib/libvirt_lxc --name c1 --console 20 --security=none --handshake 23 --background 2013-06-10 19:41:24.663+0000: 29211: debug : virFileClose:90 : Closed fd 24 2013-06-10 19:41:24.663+0000: 29211: debug : virCommandRunAsync:2246 : Command result 0, with PID 29303 2013-06-10 19:41:24.664+0000: 29303: debug : virFileClose:90 : Closed fd 3 2013-06-10 19:41:24.665+0000: 29303: debug : virFileClose:90 : Closed fd 4 2013-06-10 19:41:24.666+0000: 29303: debug : virFileClose:90 : Closed fd 5 2013-06-10 19:41:24.666+0000: 29303: debug : virFileClose:90 : Closed fd 6 2013-06-10 19:41:24.667+0000: 29303: debug : virFileClose:90 : Closed fd 7 2013-06-10 19:41:24.667+0000: 29303: debug : virFileClose:90 : Closed fd 8 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 9 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 10 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 11 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 12 2013-06-10 19:41:24.668+0000: 29303: debug : virFileClose:90 : Closed fd 13 2013-06-10 19:41:24.669+0000: 29303: debug : virFileClose:90 : Closed fd 14 2013-06-10 19:41:24.669+0000: 29303: debug : virFileClose:90 : Closed fd 15 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 16 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 17 2013-06-10 19:41:24.670+0000: 29303: debug : virFileClose:90 : Closed fd 18 2013-06-10 19:41:24.671+0000: 29303: debug : virFileClose:90 : Closed fd 19 2013-06-10 19:41:24.671+0000: 29303: debug : virFileClose:90 : Closed fd 22 2013-06-10 19:41:24.790+0000: 29211: debug : virCommandRun:2115 : Result status 0, stdout: '(null)' stderr: '(null)' ---cut---
Looks like libvirt_lxc was executed and died silently.
Found the problem. /opt/libvirt/var/log/libvirt/lxc/c1.log contained the info I needed. Search permissions for /root were missing. m( Would be nice if virsh would be able to tell one this...
:) have fun with user namespace & libvirt. And thanks for your test.
Yeah. So far it looks very good. I was able to convert my containers from my custom lxc/userns setup to libvirt+userns.
One more question, is it by design that virsh lxc-enter-namespace does not setup uid/gid mappings?
lxc-enter-namespace doesn't have the need to setup uid/gid mappings, Since lxc-enter-namespace is running on the host side, the uid/gid mappings already exist, But we should call setid for the child task of lxc-enter-namespace, this child task running in the container.
I will improve lxc-enter-namespace after this patchset being accepted.
This makes sense. As of now I'm using su to become uid 0. Thanks, //richard

Am 11.06.2013 08:17, schrieb Gao feng:
:) have fun with user namespace & libvirt. And thanks for your test.
Found an nasty issue. It looks like libvirt execs the lxc init within the wrong rootfs context. My container's rootfs contains the script named /xxx. If I try to use it as init, libvirt fails. 2013-06-13 13:18:04.499+0000: 1: error : lxcContainerChild:1941 : cannot find init path '/xxx' relative to container root: No such file or directory It fails because it looks in the rootfs of the host. If I create /xxx within my hostfs it works. Nobody noticed so far because in 99.9% of all case you have /bin/bash, /sbin/init and friends in both filesystems. ---cut--- <domain type='lxc'> <name>c_test1</name> <memory>102400</memory> <os> <type>exe</type> <init>/xxx</init> </os> <idmap> <uid start='0' target='100000' count='100000'/> <gid start='0' target='100000' count='100000'/> </idmap> <devices> <console type='pty'/> <filesystem type='mount'> <source dir='/home/container/c_test1/rootfs/'/> <target dir='/'/> </filesystem> <filesystem type='ram'> <source usage='1024' /> <target dir='/sys/fs/cgroup/systemd'/> </filesystem> <interface type='bridge'> <source bridge='br0'/> <mac address='52:54:00:be:49:be'/> </interface> </devices> </domain> ---cut--- Thanks, //richard

On 06/13/2013 09:30 PM, Richard Weinberger wrote:
Am 11.06.2013 08:17, schrieb Gao feng:
:) have fun with user namespace & libvirt. And thanks for your test.
Found an nasty issue. It looks like libvirt execs the lxc init within the wrong rootfs context.
My container's rootfs contains the script named /xxx. If I try to use it as init, libvirt fails.
2013-06-13 13:18:04.499+0000: 1: error : lxcContainerChild:1941 : cannot find init path '/xxx' relative to container root: No such file or directory
It fails because it looks in the rootfs of the host. If I create /xxx within my hostfs it works.
Nobody noticed so far because in 99.9% of all case you have /bin/bash, /sbin/init and friends in both filesystems.
Interesting.. I will cook a patch to fix this problem, thanks for your report. But this is not a bug of this patchset, right? Thanks, Gao
---cut--- <domain type='lxc'> <name>c_test1</name> <memory>102400</memory> <os> <type>exe</type> <init>/xxx</init> </os> <idmap> <uid start='0' target='100000' count='100000'/> <gid start='0' target='100000' count='100000'/> </idmap> <devices> <console type='pty'/> <filesystem type='mount'> <source dir='/home/container/c_test1/rootfs/'/> <target dir='/'/> </filesystem> <filesystem type='ram'> <source usage='1024' /> <target dir='/sys/fs/cgroup/systemd'/> </filesystem> <interface type='bridge'> <source bridge='br0'/> <mac address='52:54:00:be:49:be'/> </interface> </devices> </domain> ---cut---
Thanks, //richard
participants (6)
-
Daniel P. Berrange
-
Gao feng
-
John Ferlan
-
richard -rw- weinberger
-
Richard RW. Weinberger
-
Richard Weinberger