[libvirt] [PATCH 0/7] backport of getGroupList to v0.10.2-maint

https://bugzilla.redhat.com/show_bug.cgi?id=964358 Since it was on Fedora 18 that I first noticed the deadlock possible when a child process calls getpwuid_r while the parent owned the lock in a different thread, I'm interested in backporting my recent work on virGetGroupList to v0.10.2-maint. However, I hit enough conflict resolution that I'd like a review of my backport decisions before pushing this to the stable branch. Daniel P. Berrange (1): Fix potential deadlock across fork() in QEMU driver Eric Blake (6): util: improve user lookup helper util: add virGetGroupList util: make virSetUIDGID async-signal-safe security: framework for driver PreFork handler security_dac: compute supplemental groups before fork security: fix deadlock with prefork configure.ac | 6 +- src/libvirt_private.syms | 3 + src/lxc/lxc_container.c | 2 +- src/qemu/qemu_process.c | 8 ++ src/security/security_dac.c | 56 ++++++++-- src/security/security_driver.h | 4 + src/security/security_manager.c | 30 +++++ src/security/security_manager.h | 3 + src/security/security_stack.c | 28 +++++ src/storage/storage_backend.c | 16 ++- src/util/util.c | 237 ++++++++++++++++++++++++---------------- src/util/util.h | 4 +- 12 files changed, 285 insertions(+), 112 deletions(-) -- 1.8.3.1

https://bugzilla.redhat.com/show_bug.cgi?id=964358 A future patch needs to look up pw_gid; but it is wasteful to crawl through getpwuid_r twice for two separate pieces of information, and annoying to copy that much boilerplate code for doing the crawl. The current internal-only virGetUserEnt is also a rather awkward interface; it's easier to just design it to let callers request multiple pieces of data as needed from one traversal. And while at it, I noticed that virGetXDGDirectory could deref NULL if the getpwuid_r lookup fails. * src/util/virutil.c (virGetUserEnt): Alter signature. (virGetUserDirectory, virGetXDGDirectory, virGetUserName): Adjust callers. Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit c1983ba4e3902308054e961fcae75cece73ef4ba) Conflicts: src/util/virutil.c - oom reporting/strdup changes not backported --- src/util/util.c | 60 ++++++++++++++++++++++++++++++++++----------------------- 1 file changed, 36 insertions(+), 24 deletions(-) diff --git a/src/util/util.c b/src/util/util.c index 8315515..dd4e4ea 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -2290,21 +2290,23 @@ check_and_return: } #ifdef HAVE_GETPWUID_R -enum { - VIR_USER_ENT_DIRECTORY, - VIR_USER_ENT_NAME, -}; - -static char *virGetUserEnt(uid_t uid, - int field) +/* Look up fields from the user database for the given user. On + * error, set errno, report the error, and return -1. */ +static int +virGetUserEnt(uid_t uid, char **name, gid_t *group, char **dir) { char *strbuf; - char *ret; struct passwd pwbuf; struct passwd *pw = NULL; long val = sysconf(_SC_GETPW_R_SIZE_MAX); size_t strbuflen = val; int rc; + int ret = -1; + + if (name) + *name = NULL; + if (dir) + *dir = NULL; /* sysconf is a hint; if it fails, fall back to a reasonable size */ if (val < 0) @@ -2312,7 +2314,7 @@ static char *virGetUserEnt(uid_t uid, if (VIR_ALLOC_N(strbuf, strbuflen) < 0) { virReportOOMError(); - return NULL; + return -1; } /* @@ -2325,28 +2327,33 @@ static char *virGetUserEnt(uid_t uid, while ((rc = getpwuid_r(uid, &pwbuf, strbuf, strbuflen, &pw)) == ERANGE) { if (VIR_RESIZE_N(strbuf, strbuflen, strbuflen, strbuflen) < 0) { virReportOOMError(); - VIR_FREE(strbuf); - return NULL; + goto cleanup; } } if (rc != 0 || pw == NULL) { virReportSystemError(rc, _("Failed to find user record for uid '%u'"), (unsigned int) uid); - VIR_FREE(strbuf); - return NULL; + goto cleanup; } - if (field == VIR_USER_ENT_DIRECTORY) - ret = strdup(pw->pw_dir); - else - ret = strdup(pw->pw_name); + if (name && !(*name = strdup(pw->pw_name))) + goto no_memory; + if (group) + *group = pw->pw_gid; + if (dir && !(*dir = strdup(pw->pw_dir))) { + if (name) + VIR_FREE(*name); + goto no_memory; + } + ret = 0; +cleanup: VIR_FREE(strbuf); - if (!ret) - virReportOOMError(); - return ret; +no_memory: + virReportOOMError(); + goto cleanup; } static char *virGetGroupEnt(gid_t gid) @@ -2401,20 +2408,23 @@ static char *virGetGroupEnt(gid_t gid) char *virGetUserDirectory(void) { - return virGetUserEnt(geteuid(), VIR_USER_ENT_DIRECTORY); + char *ret; + virGetUserEnt(geteuid(), NULL, NULL, &ret); + return ret; } static char *virGetXDGDirectory(const char *xdgenvname, const char *xdgdefdir) { const char *path = getenv(xdgenvname); char *ret = NULL; - char *home = virGetUserEnt(geteuid(), VIR_USER_ENT_DIRECTORY); + char *home = virGetUserDirectory(); if (path && path[0]) { if (virAsprintf(&ret, "%s/libvirt", path) < 0) goto no_memory; } else { - if (virAsprintf(&ret, "%s/%s/libvirt", home, xdgdefdir) < 0) + if (home && + virAsprintf(&ret, "%s/%s/libvirt", home, xdgdefdir) < 0) goto no_memory; } @@ -2456,7 +2466,9 @@ char *virGetUserRuntimeDirectory(void) char *virGetUserName(uid_t uid) { - return virGetUserEnt(uid, VIR_USER_ENT_NAME); + char *ret; + virGetUserEnt(uid, &ret, NULL, NULL); + return ret; } char *virGetGroupName(gid_t gid) -- 1.8.3.1

On 07/23/2013 11:03 AM, Eric Blake wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=964358
A future patch needs to look up pw_gid; but it is wasteful to crawl through getpwuid_r twice for two separate pieces of information, and annoying to copy that much boilerplate code for doing the crawl. The current internal-only virGetUserEnt is also a rather awkward interface; it's easier to just design it to let callers request multiple pieces of data as needed from one traversal.
And while at it, I noticed that virGetXDGDirectory could deref NULL if the getpwuid_r lookup fails.
* src/util/virutil.c (virGetUserEnt): Alter signature. (virGetUserDirectory, virGetXDGDirectory, virGetUserName): Adjust callers.
Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit c1983ba4e3902308054e961fcae75cece73ef4ba)
Conflicts: src/util/virutil.c - oom reporting/strdup changes not backported --- src/util/util.c | 60 ++++++++++++++++++++++++++++++++++----------------------- 1 file changed, 36 insertions(+), 24 deletions(-)
ACK - Cole

https://bugzilla.redhat.com/show_bug.cgi?id=964358 Since neither getpwuid_r() nor initgroups() are safe to call in between fork and exec (they obtain a mutex, but if some other thread in the parent also held the mutex at the time of the fork, the child will deadlock), we have to split out the functionality that is unsafe. At least glibc's initgroups() uses getgrouplist under the hood, so the ideal split is to expose getgrouplist for use before a fork. Gnulib already gives us a nice wrapper via mgetgroups; we wrap it once more to look up by uid instead of name. * bootstrap.conf (gnulib_modules): Add mgetgroups. * src/util/virutil.h (virGetGroupList): New declaration. * src/util/virutil.c (virGetGroupList): New function. * src/libvirt_private.syms (virutil.h): Export it. Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit 75c125641ac73473ba4b0542524d67a184769c8e) Conflicts: bootstrap.conf - not updating gnulib submodule... configure.ac - ...so checking for getgrouplist by hand... src/util/virutil.c - ...and copying only the getgrouplist implementation rather than calling the gnulib function; also, file still named util.c src/libvirt_private.syms - context --- configure.ac | 2 +- src/libvirt_private.syms | 1 + src/util/util.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/util.h | 2 ++ 4 files changed, 64 insertions(+), 1 deletion(-) diff --git a/configure.ac b/configure.ac index 5f1a273..aeff11d 100644 --- a/configure.ac +++ b/configure.ac @@ -171,7 +171,7 @@ AC_CHECK_SIZEOF([long]) dnl Availability of various common functions (non-fatal if missing), dnl and various less common threadsafe functions -AC_CHECK_FUNCS_ONCE([cfmakeraw geteuid getgid getgrnam_r getmntent_r \ +AC_CHECK_FUNCS_ONCE([cfmakeraw geteuid getgid getgrnam_r getgrouplist getmntent_r \ getpwuid_r getuid initgroups kill mmap newlocale posix_fallocate \ posix_memalign regexec sched_getaffinity]) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 45cdace..24f7047 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1243,6 +1243,7 @@ virFileWaitForDevices; virFileWriteStr; virFindFileInPath; virGetGroupID; +virGetGroupList; virGetGroupName; virGetHostname; virGetUserDirectory; diff --git a/src/util/util.c b/src/util/util.c index dd4e4ea..5ca5034 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -2575,6 +2575,66 @@ int virGetGroupID(const char *name, } +/* Compute the list of supplementary groups associated with @uid, and + * including @gid in the list (unless it is -1), storing a malloc'd + * result into @list. Return the size of the list on success, or -1 + * on failure with error reported and errno set. May not be called + * between fork and exec. */ +int +virGetGroupList(uid_t uid, gid_t gid, gid_t **list) +{ + int ret = -1; + char *user = NULL; + + *list = NULL; + if (uid == (uid_t)-1) + return 0; + + if (virGetUserEnt(uid, &user, + gid == (gid_t)-1 ? &gid : NULL, NULL) < 0) + return -1; + +# if HAVE_GETGROUPLIST + /* Borrowing from gnulib's LGPLv2+ mgetgroups.c as of July 2013. */ + /* Avoid a bug in older glibc with size 0, by pre-allocating a + * list size and then enlarging if needed. */ + int max = 10; + if (VIR_ALLOC_N(*list, max) < 0) + goto no_memory; + while (1) + { + int ngroups; + int last = max; + + ngroups = getgrouplist(user, gid, *list, &max); + + /* Avoid a bug in Darwin where max is not increased. */ + if (ngroups < 0 && last == max) + max *= 2; + if (VIR_REALLOC_N(*list, max) < 0) { + VIR_FREE(*list); + goto no_memory; + } + if (0 <= ngroups) { + ret = ngroups; + break; + } + } +# else + if (VIR_ALLOC_N(*list, 1) < 0) + goto no_memory; + (*list)[0] = gid; + ret = 1; +# endif + +cleanup: + VIR_FREE(user); + return ret; +no_memory: + virReportOOMError(); + goto cleanup; +} + /* Set the real and effective uid and gid to the given values, and call * initgroups so that the process has all the assumed group membership of * that uid. return 0 on success, -1 on failure (the original system error diff --git a/src/util/util.h b/src/util/util.h index 5ab36ed..98964b2 100644 --- a/src/util/util.h +++ b/src/util/util.h @@ -261,6 +261,8 @@ char *virGetUserCacheDirectory(void); char *virGetUserRuntimeDirectory(void); char *virGetUserName(uid_t uid); char *virGetGroupName(gid_t gid); +int virGetGroupList(uid_t uid, gid_t group, gid_t **groups) + ATTRIBUTE_NONNULL(3); int virGetUserID(const char *name, uid_t *uid) ATTRIBUTE_RETURN_CHECK; int virGetGroupID(const char *name, -- 1.8.3.1

On 07/23/2013 11:03 AM, Eric Blake wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=964358
Since neither getpwuid_r() nor initgroups() are safe to call in between fork and exec (they obtain a mutex, but if some other thread in the parent also held the mutex at the time of the fork, the child will deadlock), we have to split out the functionality that is unsafe. At least glibc's initgroups() uses getgrouplist under the hood, so the ideal split is to expose getgrouplist for use before a fork. Gnulib already gives us a nice wrapper via mgetgroups; we wrap it once more to look up by uid instead of name.
* bootstrap.conf (gnulib_modules): Add mgetgroups. * src/util/virutil.h (virGetGroupList): New declaration. * src/util/virutil.c (virGetGroupList): New function. * src/libvirt_private.syms (virutil.h): Export it.
Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit 75c125641ac73473ba4b0542524d67a184769c8e)
Conflicts: bootstrap.conf - not updating gnulib submodule... configure.ac - ...so checking for getgrouplist by hand... src/util/virutil.c - ...and copying only the getgrouplist implementation rather than calling the gnulib function; also, file still named util.c src/libvirt_private.syms - context
ACK - Cole

https://bugzilla.redhat.com/show_bug.cgi?id=964358 POSIX states that multi-threaded apps should not use functions that are not async-signal-safe between fork and exec, yet we were using getpwuid_r and initgroups. Although rare, it is possible to hit deadlock in the child, when it tries to grab a mutex that was already held by another thread in the parent. I actually hit this deadlock when testing multiple domains being started in parallel with a command hook, with the following backtrace in the child: Thread 1 (Thread 0x7fd56bbf2700 (LWP 3212)): #0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136 #1 0x00007fd5761e7388 in _L_lock_854 () from /lib64/libpthread.so.0 #2 0x00007fd5761e7257 in __pthread_mutex_lock (mutex=0x7fd56be00360) at pthread_mutex_lock.c:61 #3 0x00007fd56bbf9fc5 in _nss_files_getpwuid_r (uid=0, result=0x7fd56bbf0c70, buffer=0x7fd55c2a65f0 "", buflen=1024, errnop=0x7fd56bbf25b8) at nss_files/files-pwd.c:40 #4 0x00007fd575aeff1d in __getpwuid_r (uid=0, resbuf=0x7fd56bbf0c70, buffer=0x7fd55c2a65f0 "", buflen=1024, result=0x7fd56bbf0cb0) at ../nss/getXXbyYY_r.c:253 #5 0x00007fd578aebafc in virSetUIDGID (uid=0, gid=0) at util/virutil.c:1031 #6 0x00007fd578aebf43 in virSetUIDGIDWithCaps (uid=0, gid=0, capBits=0, clearExistingCaps=true) at util/virutil.c:1388 #7 0x00007fd578a9a20b in virExec (cmd=0x7fd55c231f10) at util/vircommand.c:654 #8 0x00007fd578a9dfa2 in virCommandRunAsync (cmd=0x7fd55c231f10, pid=0x0) at util/vircommand.c:2247 #9 0x00007fd578a9d74e in virCommandRun (cmd=0x7fd55c231f10, exitstatus=0x0) at util/vircommand.c:2100 #10 0x00007fd56326fde5 in qemuProcessStart (conn=0x7fd53c000df0, driver=0x7fd55c0dc4f0, vm=0x7fd54800b100, migrateFrom=0x0, stdin_fd=-1, stdin_path=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=1) at qemu/qemu_process.c:3694 ... The solution is to split the work of getpwuid_r/initgroups into the unsafe portions (getgrouplist, called pre-fork) and safe portions (setgroups, called post-fork). * src/util/virutil.h (virSetUIDGID, virSetUIDGIDWithCaps): Adjust signature. * src/util/virutil.c (virSetUIDGID): Add parameters. (virSetUIDGIDWithCaps): Adjust clients. * src/util/vircommand.c (virExec): Likewise. * src/util/virfile.c (virFileAccessibleAs, virFileOpenForked) (virDirCreate): Likewise. * src/security/security_dac.c (virSecurityDACSetProcessLabel): Likewise. * src/lxc/lxc_container.c (lxcContainerSetID): Likewise. * configure.ac (AC_CHECK_FUNCS_ONCE): Check for setgroups, not initgroups. Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit ee777e994927ed5f2d427fbc5a53cbe8b5969bda) Conflicts: src/lxc/lxc_container.c - did not use setUIDGID before 1.1.0 src/util/virutil.c - oom handling changes not backported; no virSetUIDGIDWithCaps src/util/virfile.c - functions still lived in virutil.c this far back configure.ac - context with previous commit src/util/command.c - no UID/GID handling in vircommand.c... src/storage/storage_backend.c - ...so do it in the one hook user instead --- configure.ac | 4 +- src/lxc/lxc_container.c | 2 +- src/security/security_dac.c | 16 ++++-- src/storage/storage_backend.c | 16 +++++- src/util/util.c | 129 +++++++++++++++++------------------------- src/util/util.h | 2 +- 6 files changed, 83 insertions(+), 86 deletions(-) diff --git a/configure.ac b/configure.ac index aeff11d..ca34ac1 100644 --- a/configure.ac +++ b/configure.ac @@ -172,8 +172,8 @@ AC_CHECK_SIZEOF([long]) dnl Availability of various common functions (non-fatal if missing), dnl and various less common threadsafe functions AC_CHECK_FUNCS_ONCE([cfmakeraw geteuid getgid getgrnam_r getgrouplist getmntent_r \ - getpwuid_r getuid initgroups kill mmap newlocale posix_fallocate \ - posix_memalign regexec sched_getaffinity]) + getpwuid_r getuid kill mmap newlocale posix_fallocate posix_memalign \ + prlimit regexec sched_getaffinity setgroups]) dnl Availability of pthread functions (if missing, win32 threading is dnl assumed). Because of $LIB_PTHREAD, we cannot use AC_CHECK_FUNCS_ONCE. diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c index 8faa664..9d94042 100644 --- a/src/lxc/lxc_container.c +++ b/src/lxc/lxc_container.c @@ -1,5 +1,5 @@ /* - * Copyright (C) 2008-2012 Red Hat, Inc. + * Copyright (C) 2008-2013 Red Hat, Inc. * Copyright (C) 2008 IBM Corp. * * lxc_container.c: file description diff --git a/src/security/security_dac.c b/src/security/security_dac.c index ae5d8aa..61b705c 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -839,16 +839,24 @@ virSecurityDACSetProcessLabel(virSecurityManagerPtr mgr, uid_t user; gid_t group; virSecurityDACDataPtr priv = virSecurityManagerGetPrivateData(mgr); + gid_t *groups; + int ngroups; + int ret = -1; if (virSecurityDACGetIds(def, priv, &user, &group)) return -1; + ngroups = virGetGroupList(user, group, &groups); + if (ngroups < 0) + return -1; VIR_DEBUG("Dropping privileges of DEF to %u:%u", user, group); - if (virSetUIDGID(user, group) < 0) - return -1; - - return 0; + if (virSetUIDGID(user, group, groups, ngroups) < 0) + goto cleanup; + ret = 0; +cleanup: + VIR_FREE(groups); + return ret; } diff --git a/src/storage/storage_backend.c b/src/storage/storage_backend.c index 75a3667..c820d74 100644 --- a/src/storage/storage_backend.c +++ b/src/storage/storage_backend.c @@ -538,6 +538,8 @@ cleanup: struct hookdata { virStorageVolDefPtr vol; bool skip; + gid_t *groups; + int ngroups; }; static int virStorageBuildSetUIDHook(void *data) { @@ -547,9 +549,13 @@ static int virStorageBuildSetUIDHook(void *data) { if (tmp->skip) return 0; - if (virSetUIDGID(vol->target.perms.uid, vol->target.perms.gid) < 0) + if (virSetUIDGID(vol->target.perms.uid, vol->target.perms.gid, + tmp->groups, tmp->ngroups) < 0) { + VIR_FREE(tmp->groups); return -1; + } + VIR_FREE(tmp->groups); return 0; } @@ -560,7 +566,7 @@ static int virStorageBackendCreateExecCommand(virStoragePoolObjPtr pool, gid_t gid; uid_t uid; int filecreated = 0; - struct hookdata data = {vol, false}; + struct hookdata data = {vol, false, NULL, 0}; if ((pool->def->type == VIR_STORAGE_POOL_NETFS) && (((getuid() == 0) @@ -569,6 +575,11 @@ static int virStorageBackendCreateExecCommand(virStoragePoolObjPtr pool, || ((vol->target.perms.gid != -1) && (vol->target.perms.gid != getgid())))) { + if ((data.ngroups = virGetGroupList(vol->target.perms.uid, + vol->target.perms.gid, + &data.groups)) < 0) + return -1; + virCommandSetPreExecHook(cmd, virStorageBuildSetUIDHook, &data); if (virCommandRun(cmd, NULL) == 0) { @@ -576,6 +587,7 @@ static int virStorageBackendCreateExecCommand(virStoragePoolObjPtr pool, if (stat(vol->target.path, &st) >=0) filecreated = 1; } + VIR_FREE(data.groups); } data.skip = true; diff --git a/src/util/util.c b/src/util/util.c index 5ca5034..f0ccaf4 100644 --- a/src/util/util.c +++ b/src/util/util.c @@ -718,18 +718,26 @@ virFileAccessibleAs(const char *path, int mode, pid_t pid = 0; int status, ret = 0; int forkRet = 0; + gid_t *groups; + int ngroups; if (uid == getuid() && gid == getgid()) return access(path, mode); + ngroups = virGetGroupList(uid, gid, &groups); + if (ngroups < 0) + return -1; + forkRet = virFork(&pid); if (pid < 0) { + VIR_FREE(groups); return -1; } if (pid) { /* parent */ + VIR_FREE(groups); if (virProcessWait(pid, &status) < 0) { /* virProcessWait() already * reported error */ @@ -758,7 +766,7 @@ virFileAccessibleAs(const char *path, int mode, goto childerror; } - if (virSetUIDGID(uid, gid) < 0) { + if (virSetUIDGID(uid, gid, groups, ngroups) < 0) { ret = errno; goto childerror; } @@ -834,17 +842,24 @@ virFileOpenForked(const char *path, int openflags, mode_t mode, int fd = -1; int pair[2] = { -1, -1 }; int forkRet; + gid_t *groups; + int ngroups; /* parent is running as root, but caller requested that the * file be opened as some other user and/or group). The * following dance avoids problems caused by root-squashing * NFS servers. */ + ngroups = virGetGroupList(uid, gid, &groups); + if (ngroups < 0) + return -errno; + if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) < 0) { ret = -errno; virReportSystemError(errno, _("failed to create socket needed for '%s'"), path); + VIR_FREE(groups); return ret; } @@ -865,7 +880,7 @@ virFileOpenForked(const char *path, int openflags, mode_t mode, /* set desired uid/gid, then attempt to create the file */ - if (virSetUIDGID(uid, gid) < 0) { + if (virSetUIDGID(uid, gid, groups, ngroups) < 0) { ret = -errno; goto childerror; } @@ -913,6 +928,7 @@ virFileOpenForked(const char *path, int openflags, mode_t mode, /* parent */ + VIR_FREE(groups); VIR_FORCE_CLOSE(pair[1]); do { @@ -1123,6 +1139,8 @@ int virDirCreate(const char *path, mode_t mode, pid_t pid; int waitret; int status, ret = 0; + gid_t *groups; + int ngroups; /* allow using -1 to mean "current value" */ if (uid == (uid_t) -1) @@ -1137,15 +1155,21 @@ int virDirCreate(const char *path, mode_t mode, return virDirCreateNoFork(path, mode, uid, gid, flags); } + ngroups = virGetGroupList(uid, gid, &groups); + if (ngroups < 0) + return -errno; + int forkRet = virFork(&pid); if (pid < 0) { ret = -errno; + VIR_FREE(groups); return ret; } if (pid) { /* parent */ /* wait for child to complete, and retrieve its exit code */ + VIR_FREE(groups); while ((waitret = waitpid(pid, &status, 0) == -1) && (errno == EINTR)); if (waitret == -1) { ret = -errno; @@ -1172,7 +1196,7 @@ parenterror: /* set desired uid/gid, then attempt to create the directory */ - if (virSetUIDGID(uid, gid) < 0) { + if (virSetUIDGID(uid, gid, groups, ngroups) < 0) { ret = -errno; goto childerror; } @@ -2635,87 +2659,38 @@ no_memory: goto cleanup; } -/* Set the real and effective uid and gid to the given values, and call - * initgroups so that the process has all the assumed group membership of - * that uid. return 0 on success, -1 on failure (the original system error - * remains in errno). +/* Set the real and effective uid and gid to the given values, as well + * as all the supplementary groups, so that the process has all the + * assumed group membership of that uid. Return 0 on success, -1 on + * failure (the original system error remains in errno). */ int -virSetUIDGID(uid_t uid, gid_t gid) +virSetUIDGID(uid_t uid, gid_t gid, gid_t *groups ATTRIBUTE_UNUSED, + int ngroups ATTRIBUTE_UNUSED) { - int err; - char *buf = NULL; - - if (gid > 0) { - if (setregid(gid, gid) < 0) { - virReportSystemError(err = errno, - _("cannot change to '%d' group"), - (unsigned int) gid); - goto error; - } + if (gid != (gid_t)-1 && setregid(gid, gid) < 0) { + virReportSystemError(errno, + _("cannot change to '%u' group"), + (unsigned int) gid); + return -1; } - if (uid > 0) { -# ifdef HAVE_INITGROUPS - struct passwd pwd, *pwd_result; - size_t bufsize; - int rc; - - bufsize = sysconf(_SC_GETPW_R_SIZE_MAX); - if (bufsize == -1) - bufsize = 16384; - - if (VIR_ALLOC_N(buf, bufsize) < 0) { - virReportOOMError(); - err = ENOMEM; - goto error; - } - while ((rc = getpwuid_r(uid, &pwd, buf, bufsize, - &pwd_result)) == ERANGE) { - if (VIR_RESIZE_N(buf, bufsize, bufsize, bufsize) < 0) { - virReportOOMError(); - err = ENOMEM; - goto error; - } - } - - if (rc) { - virReportSystemError(err = rc, _("cannot getpwuid_r(%d)"), - (unsigned int) uid); - goto error; - } - - if (!pwd_result) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("getpwuid_r failed to retrieve data " - "for uid '%d'"), - (unsigned int) uid); - err = EINVAL; - goto error; - } - - if (initgroups(pwd.pw_name, pwd.pw_gid) < 0) { - virReportSystemError(err = errno, - _("cannot initgroups(\"%s\", %d)"), - pwd.pw_name, (unsigned int) pwd.pw_gid); - goto error; - } +# if HAVE_SETGROUPS + if (ngroups && setgroups(ngroups, groups) < 0) { + virReportSystemError(errno, "%s", + _("cannot set supplemental groups")); + return -1; + } # endif - if (setreuid(uid, uid) < 0) { - virReportSystemError(err = errno, - _("cannot change to uid to '%d'"), - (unsigned int) uid); - goto error; - } + + if (uid != (uid_t)-1 && setreuid(uid, uid) < 0) { + virReportSystemError(errno, + _("cannot change to uid to '%u'"), + (unsigned int) uid); + return -1; } - VIR_FREE(buf); return 0; - -error: - VIR_FREE(buf); - errno = err; - return -1; } #else /* ! HAVE_GETPWUID_R */ @@ -2932,7 +2907,9 @@ int virGetGroupID(const char *name ATTRIBUTE_UNUSED, int virSetUIDGID(uid_t uid ATTRIBUTE_UNUSED, - gid_t gid ATTRIBUTE_UNUSED) + gid_t gid ATTRIBUTE_UNUSED, + gid_t *groups ATTRIBUTE_UNUSED, + int ngroups ATTRIBUTE_UNUSED) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("virSetUIDGID is not available")); diff --git a/src/util/util.h b/src/util/util.h index 98964b2..0f904c3 100644 --- a/src/util/util.h +++ b/src/util/util.h @@ -53,7 +53,7 @@ int virSetCloseExec(int fd) ATTRIBUTE_RETURN_CHECK; int virPipeReadUntilEOF(int outfd, int errfd, char **outbuf, char **errbuf); -int virSetUIDGID(uid_t uid, gid_t gid); +int virSetUIDGID(uid_t uid, gid_t gid, gid_t *groups, int ngroups); int virFileReadLimFD(int fd, int maxlen, char **buf) ATTRIBUTE_RETURN_CHECK; -- 1.8.3.1

On 07/23/2013 11:03 AM, Eric Blake wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=964358
POSIX states that multi-threaded apps should not use functions that are not async-signal-safe between fork and exec, yet we were using getpwuid_r and initgroups. Although rare, it is possible to hit deadlock in the child, when it tries to grab a mutex that was already held by another thread in the parent. I actually hit this deadlock when testing multiple domains being started in parallel with a command hook, with the following backtrace in the child:
Thread 1 (Thread 0x7fd56bbf2700 (LWP 3212)): #0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136 #1 0x00007fd5761e7388 in _L_lock_854 () from /lib64/libpthread.so.0 #2 0x00007fd5761e7257 in __pthread_mutex_lock (mutex=0x7fd56be00360) at pthread_mutex_lock.c:61 #3 0x00007fd56bbf9fc5 in _nss_files_getpwuid_r (uid=0, result=0x7fd56bbf0c70, buffer=0x7fd55c2a65f0 "", buflen=1024, errnop=0x7fd56bbf25b8) at nss_files/files-pwd.c:40 #4 0x00007fd575aeff1d in __getpwuid_r (uid=0, resbuf=0x7fd56bbf0c70, buffer=0x7fd55c2a65f0 "", buflen=1024, result=0x7fd56bbf0cb0) at ../nss/getXXbyYY_r.c:253 #5 0x00007fd578aebafc in virSetUIDGID (uid=0, gid=0) at util/virutil.c:1031 #6 0x00007fd578aebf43 in virSetUIDGIDWithCaps (uid=0, gid=0, capBits=0, clearExistingCaps=true) at util/virutil.c:1388 #7 0x00007fd578a9a20b in virExec (cmd=0x7fd55c231f10) at util/vircommand.c:654 #8 0x00007fd578a9dfa2 in virCommandRunAsync (cmd=0x7fd55c231f10, pid=0x0) at util/vircommand.c:2247 #9 0x00007fd578a9d74e in virCommandRun (cmd=0x7fd55c231f10, exitstatus=0x0) at util/vircommand.c:2100 #10 0x00007fd56326fde5 in qemuProcessStart (conn=0x7fd53c000df0, driver=0x7fd55c0dc4f0, vm=0x7fd54800b100, migrateFrom=0x0, stdin_fd=-1, stdin_path=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=1) at qemu/qemu_process.c:3694 ...
The solution is to split the work of getpwuid_r/initgroups into the unsafe portions (getgrouplist, called pre-fork) and safe portions (setgroups, called post-fork).
* src/util/virutil.h (virSetUIDGID, virSetUIDGIDWithCaps): Adjust signature. * src/util/virutil.c (virSetUIDGID): Add parameters. (virSetUIDGIDWithCaps): Adjust clients. * src/util/vircommand.c (virExec): Likewise. * src/util/virfile.c (virFileAccessibleAs, virFileOpenForked) (virDirCreate): Likewise. * src/security/security_dac.c (virSecurityDACSetProcessLabel): Likewise. * src/lxc/lxc_container.c (lxcContainerSetID): Likewise. * configure.ac (AC_CHECK_FUNCS_ONCE): Check for setgroups, not initgroups.
Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit ee777e994927ed5f2d427fbc5a53cbe8b5969bda)
Conflicts: src/lxc/lxc_container.c - did not use setUIDGID before 1.1.0 src/util/virutil.c - oom handling changes not backported; no virSetUIDGIDWithCaps src/util/virfile.c - functions still lived in virutil.c this far back configure.ac - context with previous commit src/util/command.c - no UID/GID handling in vircommand.c... src/storage/storage_backend.c - ...so do it in the one hook user instead
ACK - Cole

From: "Daniel P. Berrange" <berrange@redhat.com> https://bugzilla.redhat.com/show_bug.cgi?id=964358 The hook scripts used by virCommand must be careful wrt accessing any mutexes that may have been held by other threads in the parent process. With the recent refactoring there are 2 potential flaws lurking, which will become real deadlock bugs once the global QEMU driver lock is removed. Remove use of the QEMU driver lock from the hook function by passing in the 'virQEMUDriverConfigPtr' instance directly. Add functions to the virSecurityManager to be invoked before and after fork, to ensure the mutex is held by the current thread. This allows it to be safely used in the hook script in the child process. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> (cherry picked from commit 61b52d2e3813cc8c9ff3ab67f232bd0c65f7318d) Conflicts: src/libvirt_private.syms - context src/qemu/qemu_process.c - no backport of qemud_driver struct rename src/security/security_manager.c - no backport of making the security driver self-locking; just expose the interface --- src/libvirt_private.syms | 2 ++ src/qemu/qemu_process.c | 7 +++++++ src/security/security_manager.c | 20 ++++++++++++++++++++ src/security/security_manager.h | 3 +++ 4 files changed, 32 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 24f7047..17e5eff 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1047,6 +1047,8 @@ virSecurityManagerGetProcessLabel; virSecurityManagerNew; virSecurityManagerNewStack; virSecurityManagerNewDAC; +virSecurityManagerPostFork; +virSecurityManagerPreFork; virSecurityManagerReleaseLabel; virSecurityManagerReserveLabel; virSecurityManagerRestoreImageLabel; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index cfadc2c..01c6880 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -2579,6 +2579,11 @@ static int qemuProcessHook(void *data) struct qemuProcessHookData *h = data; int ret = -1; int fd; + /* This method cannot use any mutexes, which are not + * protected across fork() + */ + + virSecurityManagerPostFork(h->driver->securityManager); /* Some later calls want pid present */ h->vm->pid = getpid(); @@ -3640,7 +3645,9 @@ int qemuProcessStart(virConnectPtr conn, virCommandDaemonize(cmd); virCommandRequireHandshake(cmd); + virSecurityManagerPreFork(driver->securityManager); ret = virCommandRun(cmd, NULL); + virSecurityManagerPostFork(driver->securityManager); /* wait for qemu process to show up */ if (ret == 0) { diff --git a/src/security/security_manager.c b/src/security/security_manager.c index d446607..fc7acff 100644 --- a/src/security/security_manager.c +++ b/src/security/security_manager.c @@ -161,6 +161,26 @@ virSecurityManagerPtr virSecurityManagerNew(const char *name, requireConfined); } + +/* + * Must be called before fork()'ing to ensure mutex state + * is sane for the child to use + */ +void virSecurityManagerPreFork(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED) +{ + /* XXX Grab our own mutex here instead of relying on caller's mutex */ +} + + +/* + * Must be called after fork()'ing in both parent and child + * to ensure mutex state is sane for the child to use + */ +void virSecurityManagerPostFork(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED) +{ + /* XXX Release our own mutex here instead of relying on caller's mutex */ +} + void *virSecurityManagerGetPrivateData(virSecurityManagerPtr mgr) { /* This accesses the memory just beyond mgr, which was allocated diff --git a/src/security/security_manager.h b/src/security/security_manager.h index 1fdaf8e..d1a5997 100644 --- a/src/security/security_manager.h +++ b/src/security/security_manager.h @@ -46,6 +46,9 @@ virSecurityManagerPtr virSecurityManagerNewDAC(const char *virtDriver, bool requireConfined, bool dynamicOwnership); +void virSecurityManagerPreFork(virSecurityManagerPtr mgr); +void virSecurityManagerPostFork(virSecurityManagerPtr mgr); + void *virSecurityManagerGetPrivateData(virSecurityManagerPtr mgr); void virSecurityManagerFree(virSecurityManagerPtr mgr); -- 1.8.3.1

On 07/23/2013 11:03 AM, Eric Blake wrote:
From: "Daniel P. Berrange" <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=964358
The hook scripts used by virCommand must be careful wrt accessing any mutexes that may have been held by other threads in the parent process. With the recent refactoring there are 2 potential flaws lurking, which will become real deadlock bugs once the global QEMU driver lock is removed.
Remove use of the QEMU driver lock from the hook function by passing in the 'virQEMUDriverConfigPtr' instance directly.
Add functions to the virSecurityManager to be invoked before and after fork, to ensure the mutex is held by the current thread. This allows it to be safely used in the hook script in the child process.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> (cherry picked from commit 61b52d2e3813cc8c9ff3ab67f232bd0c65f7318d)
Conflicts: src/libvirt_private.syms - context src/qemu/qemu_process.c - no backport of qemud_driver struct rename src/security/security_manager.c - no backport of making the security driver self-locking; just expose the interface
ACK - Cole

https://bugzilla.redhat.com/show_bug.cgi?id=964358 A future patch wants the DAC security manager to be able to safely get the supplemental group list for a given uid, but at the time of a fork rather than during initialization so as to pick up on live changes to the system's group database. This patch adds the framework, including the possibility of a pre-fork callback failing. For now, any driver that implements a prefork callback must be robust against the possibility of being part of a security stack where a later element in the chain fails prefork. This means that drivers cannot do any action that requires a call to postfork for proper cleanup (no grabbing a mutex, for example). If this is too prohibitive in the future, we would have to switch to a transactioning sequence, where each driver has (up to) 3 callbacks: PreForkPrepare, PreForkCommit, and PreForkAbort, to either clean up or commit changes made during prepare. * src/security/security_driver.h (virSecurityDriverPreFork): New callback. * src/security/security_manager.h (virSecurityManagerPreFork): Change signature. * src/security/security_manager.c (virSecurityManagerPreFork): Optionally call into driver, and allow returning failure. * src/security/security_stack.c (virSecurityDriverStack): Wrap the handler for the stack driver. * src/qemu/qemu_process.c (qemuProcessStart): Adjust caller. Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit fdb3bde31ccf8ff172abf00ef5aa974b87af2794) Conflicts: src/security/security_manager.c - context from previous backport differences --- src/qemu/qemu_process.c | 3 ++- src/security/security_driver.h | 4 ++++ src/security/security_manager.c | 14 ++++++++++++-- src/security/security_manager.h | 2 +- src/security/security_stack.c | 23 +++++++++++++++++++++++ 5 files changed, 42 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 01c6880..4fdad6a 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -3645,7 +3645,8 @@ int qemuProcessStart(virConnectPtr conn, virCommandDaemonize(cmd); virCommandRequireHandshake(cmd); - virSecurityManagerPreFork(driver->securityManager); + if (virSecurityManagerPreFork(driver->securityManager) < 0) + goto cleanup; ret = virCommandRun(cmd, NULL); virSecurityManagerPostFork(driver->securityManager); diff --git a/src/security/security_driver.h b/src/security/security_driver.h index d49b401..92bed3f 100644 --- a/src/security/security_driver.h +++ b/src/security/security_driver.h @@ -47,6 +47,8 @@ typedef int (*virSecurityDriverClose) (virSecurityManagerPtr mgr); typedef const char *(*virSecurityDriverGetModel) (virSecurityManagerPtr mgr); typedef const char *(*virSecurityDriverGetDOI) (virSecurityManagerPtr mgr); +typedef int (*virSecurityDriverPreFork) (virSecurityManagerPtr mgr); + typedef int (*virSecurityDomainRestoreImageLabel) (virSecurityManagerPtr mgr, virDomainDefPtr def, virDomainDiskDefPtr disk); @@ -111,6 +113,8 @@ struct _virSecurityDriver { virSecurityDriverGetModel getModel; virSecurityDriverGetDOI getDOI; + virSecurityDriverPreFork preFork; + virSecurityDomainSecurityVerify domainSecurityVerify; virSecurityDomainSetImageLabel domainSetSecurityImageLabel; diff --git a/src/security/security_manager.c b/src/security/security_manager.c index fc7acff..b138cc1 100644 --- a/src/security/security_manager.c +++ b/src/security/security_manager.c @@ -164,11 +164,21 @@ virSecurityManagerPtr virSecurityManagerNew(const char *name, /* * Must be called before fork()'ing to ensure mutex state - * is sane for the child to use + * is sane for the child to use. A negative return means the + * child must not be forked; a successful return must be + * followed by a call to virSecurityManagerPostFork() in both + * parent and child. */ -void virSecurityManagerPreFork(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED) +int virSecurityManagerPreFork(virSecurityManagerPtr mgr) { + int ret = 0; + /* XXX Grab our own mutex here instead of relying on caller's mutex */ + if (mgr->drv->preFork) { + ret = mgr->drv->preFork(mgr); + } + + return ret; } diff --git a/src/security/security_manager.h b/src/security/security_manager.h index d1a5997..be8774d 100644 --- a/src/security/security_manager.h +++ b/src/security/security_manager.h @@ -46,7 +46,7 @@ virSecurityManagerPtr virSecurityManagerNewDAC(const char *virtDriver, bool requireConfined, bool dynamicOwnership); -void virSecurityManagerPreFork(virSecurityManagerPtr mgr); +int virSecurityManagerPreFork(virSecurityManagerPtr mgr); void virSecurityManagerPostFork(virSecurityManagerPtr mgr); void *virSecurityManagerGetPrivateData(virSecurityManagerPtr mgr); diff --git a/src/security/security_stack.c b/src/security/security_stack.c index 91401c6..e8133c4 100644 --- a/src/security/security_stack.c +++ b/src/security/security_stack.c @@ -114,6 +114,27 @@ virSecurityStackGetDOI(virSecurityManagerPtr mgr) } static int +virSecurityStackPreFork(virSecurityManagerPtr mgr) +{ + virSecurityStackDataPtr priv = virSecurityManagerGetPrivateData(mgr); + virSecurityStackItemPtr item = priv->itemsHead; + int rc = 0; + + /* XXX For now, we rely on no driver having any state that requires + * rollback if a later driver in the stack fails; if this changes, + * we'd need to split this into transaction semantics by dividing + * the work into prepare/commit/abort. */ + for (; item; item = item->next) { + if (virSecurityManagerPreFork(item->securityManager) < 0) { + rc = -1; + break; + } + } + + return rc; +} + +static int virSecurityStackVerify(virSecurityManagerPtr mgr, virDomainDefPtr def) { @@ -500,6 +521,8 @@ virSecurityDriver virSecurityDriverStack = { .getModel = virSecurityStackGetModel, .getDOI = virSecurityStackGetDOI, + .preFork = virSecurityStackPreFork, + .domainSecurityVerify = virSecurityStackVerify, .domainSetSecurityImageLabel = virSecurityStackSetSecurityImageLabel, -- 1.8.3.1

On 07/23/2013 11:03 AM, Eric Blake wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=964358
A future patch wants the DAC security manager to be able to safely get the supplemental group list for a given uid, but at the time of a fork rather than during initialization so as to pick up on live changes to the system's group database. This patch adds the framework, including the possibility of a pre-fork callback failing.
For now, any driver that implements a prefork callback must be robust against the possibility of being part of a security stack where a later element in the chain fails prefork. This means that drivers cannot do any action that requires a call to postfork for proper cleanup (no grabbing a mutex, for example). If this is too prohibitive in the future, we would have to switch to a transactioning sequence, where each driver has (up to) 3 callbacks: PreForkPrepare, PreForkCommit, and PreForkAbort, to either clean up or commit changes made during prepare.
* src/security/security_driver.h (virSecurityDriverPreFork): New callback. * src/security/security_manager.h (virSecurityManagerPreFork): Change signature. * src/security/security_manager.c (virSecurityManagerPreFork): Optionally call into driver, and allow returning failure. * src/security/security_stack.c (virSecurityDriverStack): Wrap the handler for the stack driver. * src/qemu/qemu_process.c (qemuProcessStart): Adjust caller.
Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit fdb3bde31ccf8ff172abf00ef5aa974b87af2794)
Conflicts: src/security/security_manager.c - context from previous backport differences
ACK - Cole

https://bugzilla.redhat.com/show_bug.cgi?id=964358 Commit 75c1256 states that virGetGroupList must not be called between fork and exec, then commit ee777e99 promptly violated that for lxc's use of virSecurityManagerSetProcessLabel. Hoist the supplemental group detection to the time that the security manager needs to fork. Qemu is safe, as it uses virSecurityManagerSetChildProcessLabel which in turn uses virCommand to determine supplemental groups. This does not fix the fact that virSecurityManagerSetProcessLabel calls virSecurityDACParseIds calls parseIds which eventually calls getpwnam_r, which also violates fork/exec async-signal-safe safety rules, but so far no one has complained of hitting deadlock in that case. * src/security/security_dac.c (_virSecurityDACData): Track groups in private data. (virSecurityDACPreFork): New function, to set them. (virSecurityDACClose): Clean up new fields. (virSecurityDACGetIds): Alter signature. (virSecurityDACSetSecurityHostdevLabelHelper) (virSecurityDACSetChardevLabel, virSecurityDACSetProcessLabel) (virSecurityDACSetChildProcessLabel): Update callers. Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit 29fe5d745fbe207ec2415441d4807ae76be05974) Conflicts: src/security/security_dac.c - virSecurityDACSetSecurityUSBLabel needed similar treatment; no virSecurityDACSetChildPrcessLabel --- src/security/security_dac.c | 66 +++++++++++++++++++++++++++++++-------------- 1 file changed, 46 insertions(+), 20 deletions(-) diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 61b705c..4975773 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -41,6 +41,8 @@ typedef virSecurityDACData *virSecurityDACDataPtr; struct _virSecurityDACData { uid_t user; gid_t group; + gid_t *groups; + int ngroups; bool dynamicOwnership; }; @@ -122,9 +124,10 @@ int virSecurityDACParseIds(virDomainDefPtr def, uid_t *uidPtr, gid_t *gidPtr) return 0; } -static -int virSecurityDACGetIds(virDomainDefPtr def, virSecurityDACDataPtr priv, - uid_t *uidPtr, gid_t *gidPtr) +static int +virSecurityDACGetIds(virDomainDefPtr def, virSecurityDACDataPtr priv, + uid_t *uidPtr, gid_t *gidPtr, + gid_t **groups, int *ngroups) { int ret; @@ -135,8 +138,13 @@ int virSecurityDACGetIds(virDomainDefPtr def, virSecurityDACDataPtr priv, return -1; } - if ((ret = virSecurityDACParseIds(def, uidPtr, gidPtr)) <= 0) + if ((ret = virSecurityDACParseIds(def, uidPtr, gidPtr)) <= 0) { + if (groups) + *groups = NULL; + if (ngroups) + *ngroups = 0; return ret; + } if (!priv) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -149,6 +157,10 @@ int virSecurityDACGetIds(virDomainDefPtr def, virSecurityDACDataPtr priv, *uidPtr = priv->user; if (gidPtr) *gidPtr = priv->group; + if (groups) + *groups = priv->groups; + if (ngroups) + *ngroups = priv->ngroups; return 0; } @@ -233,8 +245,10 @@ virSecurityDACOpen(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED) } static int -virSecurityDACClose(virSecurityManagerPtr mgr ATTRIBUTE_UNUSED) +virSecurityDACClose(virSecurityManagerPtr mgr) { + virSecurityDACDataPtr priv = virSecurityManagerGetPrivateData(mgr); + VIR_FREE(priv->groups); return 0; } @@ -250,6 +264,21 @@ static const char * virSecurityDACGetDOI(virSecurityManagerPtr mgr ATTRIBUTE_UNU } static int +virSecurityDACPreFork(virSecurityManagerPtr mgr) +{ + virSecurityDACDataPtr priv = virSecurityManagerGetPrivateData(mgr); + int ngroups; + + VIR_FREE(priv->groups); + priv->ngroups = 0; + if ((ngroups = virGetGroupList(priv->user, priv->group, + &priv->groups)) < 0) + return -1; + priv->ngroups = ngroups; + return 0; +} + +static int virSecurityDACSetOwnership(const char *path, uid_t uid, gid_t gid) { VIR_INFO("Setting DAC user and group on '%s' to '%ld:%ld'", @@ -437,7 +466,7 @@ virSecurityDACSetSecurityPCILabel(pciDevice *dev ATTRIBUTE_UNUSED, uid_t user; gid_t group; - if (virSecurityDACGetIds(def, priv, &user, &group)) + if (virSecurityDACGetIds(def, priv, &user, &group, NULL, NULL)) return -1; return virSecurityDACSetOwnership(file, user, group); @@ -456,7 +485,7 @@ virSecurityDACSetSecurityUSBLabel(usbDevice *dev ATTRIBUTE_UNUSED, uid_t user; gid_t group; - if (virSecurityDACGetIds(def, priv, &user, &group)) + if (virSecurityDACGetIds(def, priv, &user, &group, NULL, NULL)) return -1; return virSecurityDACSetOwnership(file, user, group); @@ -602,7 +631,7 @@ virSecurityDACSetChardevLabel(virSecurityManagerPtr mgr, uid_t user; gid_t group; - if (virSecurityDACGetIds(def, priv, &user, &group)) + if (virSecurityDACGetIds(def, priv, &user, &group, NULL, NULL)) return -1; switch (dev->type) { @@ -838,25 +867,20 @@ virSecurityDACSetProcessLabel(virSecurityManagerPtr mgr, { uid_t user; gid_t group; - virSecurityDACDataPtr priv = virSecurityManagerGetPrivateData(mgr); gid_t *groups; int ngroups; - int ret = -1; + virSecurityDACDataPtr priv = virSecurityManagerGetPrivateData(mgr); - if (virSecurityDACGetIds(def, priv, &user, &group)) - return -1; - ngroups = virGetGroupList(user, group, &groups); - if (ngroups < 0) + if (virSecurityDACGetIds(def, priv, &user, &group, &groups, &ngroups)) return -1; - VIR_DEBUG("Dropping privileges of DEF to %u:%u", user, group); + VIR_DEBUG("Dropping privileges of DEF to %u:%u, %d supplemental groups", + (unsigned int) user, (unsigned int) group, ngroups); if (virSetUIDGID(user, group, groups, ngroups) < 0) - goto cleanup; - ret = 0; -cleanup: - VIR_FREE(groups); - return ret; + return -1; + + return 0; } @@ -1037,6 +1061,8 @@ virSecurityDriver virSecurityDriverDAC = { .getModel = virSecurityDACGetModel, .getDOI = virSecurityDACGetDOI, + .preFork = virSecurityDACPreFork, + .domainSecurityVerify = virSecurityDACVerify, .domainSetSecurityImageLabel = virSecurityDACSetSecurityImageLabel, -- 1.8.3.1

On 07/23/2013 11:03 AM, Eric Blake wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=964358
Commit 75c1256 states that virGetGroupList must not be called between fork and exec, then commit ee777e99 promptly violated that for lxc's use of virSecurityManagerSetProcessLabel. Hoist the supplemental group detection to the time that the security manager needs to fork. Qemu is safe, as it uses virSecurityManagerSetChildProcessLabel which in turn uses virCommand to determine supplemental groups.
This does not fix the fact that virSecurityManagerSetProcessLabel calls virSecurityDACParseIds calls parseIds which eventually calls getpwnam_r, which also violates fork/exec async-signal-safe safety rules, but so far no one has complained of hitting deadlock in that case.
* src/security/security_dac.c (_virSecurityDACData): Track groups in private data. (virSecurityDACPreFork): New function, to set them. (virSecurityDACClose): Clean up new fields. (virSecurityDACGetIds): Alter signature. (virSecurityDACSetSecurityHostdevLabelHelper) (virSecurityDACSetChardevLabel, virSecurityDACSetProcessLabel) (virSecurityDACSetChildProcessLabel): Update callers.
Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit 29fe5d745fbe207ec2415441d4807ae76be05974)
Conflicts: src/security/security_dac.c - virSecurityDACSetSecurityUSBLabel needed similar treatment; no virSecurityDACSetChildPrcessLabel
ACK - Cole

https://bugzilla.redhat.com/show_bug.cgi?id=964358 Attempts to start a domain with both SELinux and DAC security modules loaded will deadlock; latent problem introduced in commit fdb3bde and exposed in commit 29fe5d7. Basically, when recursing into the security manager for other driver's prefork, we have to undo the asymmetric lock taken at the manager level. Reported by Jiri Denemark, with diagnosis help from Dan Berrange. * src/security/security_stack.c (virSecurityStackPreFork): Undo extra lock grabbed during recursion. Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit bfc183c1e377b24cebf5cede4c00f3dc0d1b3486) --- src/security/security_stack.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/src/security/security_stack.c b/src/security/security_stack.c index e8133c4..38fe8b5 100644 --- a/src/security/security_stack.c +++ b/src/security/security_stack.c @@ -129,6 +129,11 @@ virSecurityStackPreFork(virSecurityManagerPtr mgr) rc = -1; break; } + /* Undo the unbalanced locking left behind after recursion; if + * PostFork ever delegates to driver callbacks, we'd instead + * need to recurse to an internal method that does not regrab + * a lock. */ + virSecurityManagerPostFork(item->securityManager); } return rc; -- 1.8.3.1

On 07/23/2013 11:04 AM, Eric Blake wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=964358
Attempts to start a domain with both SELinux and DAC security modules loaded will deadlock; latent problem introduced in commit fdb3bde and exposed in commit 29fe5d7. Basically, when recursing into the security manager for other driver's prefork, we have to undo the asymmetric lock taken at the manager level.
Reported by Jiri Denemark, with diagnosis help from Dan Berrange.
* src/security/security_stack.c (virSecurityStackPreFork): Undo extra lock grabbed during recursion.
Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit bfc183c1e377b24cebf5cede4c00f3dc0d1b3486) --- src/security/security_stack.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/src/security/security_stack.c b/src/security/security_stack.c index e8133c4..38fe8b5 100644 --- a/src/security/security_stack.c +++ b/src/security/security_stack.c @@ -129,6 +129,11 @@ virSecurityStackPreFork(virSecurityManagerPtr mgr) rc = -1; break; } + /* Undo the unbalanced locking left behind after recursion; if + * PostFork ever delegates to driver callbacks, we'd instead + * need to recurse to an internal method that does not regrab + * a lock. */ + virSecurityManagerPostFork(item->securityManager); }
return rc;
ACK - Cole

On 07/23/2013 11:03 AM, Eric Blake wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=964358
Since it was on Fedora 18 that I first noticed the deadlock possible when a child process calls getpwuid_r while the parent owned the lock in a different thread, I'm interested in backporting my recent work on virGetGroupList to v0.10.2-maint. However, I hit enough conflict resolution that I'd like a review of my backport decisions before pushing this to the stable branch.
Thanks for doing this Eric! I've ACK'd each patch. - Cole

On 07/25/2013 04:36 PM, Cole Robinson wrote:
On 07/23/2013 11:03 AM, Eric Blake wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=964358
Since it was on Fedora 18 that I first noticed the deadlock possible when a child process calls getpwuid_r while the parent owned the lock in a different thread, I'm interested in backporting my recent work on virGetGroupList to v0.10.2-maint. However, I hit enough conflict resolution that I'd like a review of my backport decisions before pushing this to the stable branch.
Thanks for doing this Eric! I've ACK'd each patch.
I've now pushed to v0.10.2-maint and v1.0.5-maint, if you'd like to cut new releases for Fedora 18 and 19. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
participants (2)
-
Cole Robinson
-
Eric Blake