[PATCH 0/2] Fix a few deadlocks with musl libc

Fix a couple of deadlocks due to use of async-unsafe calls (malloc/free) after fork() and before exec(). They don't fix all theoretical problems but at least they make libvirt usable again with musl 1.2 on Alpine Linux. Natanael Copa (2): util: avoid free() when reset log after fork util: command: improve generic mass close of fds src/util/vircommand.c | 80 ++++++++++++++++++++++++++++++++----------- src/util/virlog.c | 44 ++++++++++++++++++------ src/util/virlog.h | 1 + 3 files changed, 95 insertions(+), 30 deletions(-) -- 2.28.0

Doing malloc/free after fork is techincally not allowed in POSIX and deadlocks[1] with musl libc. [1]: https://gitlab.com/libvirt/libvirt/-/issues/52 Signed-off-by: Natanael Copa <ncopa@alpinelinux.org> --- src/util/vircommand.c | 4 ++-- src/util/virlog.c | 44 +++++++++++++++++++++++++++++++++---------- src/util/virlog.h | 1 + 3 files changed, 37 insertions(+), 12 deletions(-) diff --git a/src/util/vircommand.c b/src/util/vircommand.c index 76f7eb9a3d..17e5bb00d3 100644 --- a/src/util/vircommand.c +++ b/src/util/vircommand.c @@ -304,7 +304,7 @@ virFork(void) /* Make sure any hook logging is sent to stderr, since child * process may close the logfile FDs */ logprio = virLogGetDefaultPriority(); - virLogReset(); + virLogResetWithoutFree(); virLogSetDefaultPriority(logprio); /* Clear out all signal handlers from parent so nothing @@ -861,7 +861,7 @@ virExec(virCommandPtr cmd) goto fork_error; /* Close logging again to ensure no FDs leak to child */ - virLogReset(); + virLogResetWithoutFree(); if (cmd->env) execve(binary, cmd->args, cmd->env); diff --git a/src/util/virlog.c b/src/util/virlog.c index 3217e5eb73..3959de5ca7 100644 --- a/src/util/virlog.c +++ b/src/util/virlog.c @@ -108,8 +108,8 @@ static size_t virLogNbOutputs; */ static virLogPriority virLogDefaultPriority = VIR_LOG_DEFAULT; -static void virLogResetFilters(void); -static void virLogResetOutputs(void); +static void virLogResetFilters(bool freemem); +static void virLogResetOutputs(bool freemem); static void virLogOutputToFd(virLogSourcePtr src, virLogPriority priority, const char *filename, @@ -284,8 +284,30 @@ virLogReset(void) return -1; virLogLock(); - virLogResetFilters(); - virLogResetOutputs(); + virLogResetFilters(true); + virLogResetOutputs(true); + virLogDefaultPriority = VIR_LOG_DEFAULT; + virLogUnlock(); + return 0; +} + +/** + * virLogResetWithoutFree: + * + * Reset the logging module to its default initial state, but avoid doing + * free() so it can be used after fork and before exec. + * + * Returns 0 if successful, and -1 in case or error + */ +int +virLogResetWithoutFree(void) +{ + if (virLogInitialize() < 0) + return -1; + + virLogLock(); + virLogResetFilters(false); + virLogResetOutputs(false); virLogDefaultPriority = VIR_LOG_DEFAULT; virLogUnlock(); return 0; @@ -324,9 +346,10 @@ virLogSetDefaultPriority(virLogPriority priority) * Removes the set of logging filters defined. */ static void -virLogResetFilters(void) +virLogResetFilters(bool freemem) { - virLogFilterListFree(virLogFilters, virLogNbFilters); + if (freemem) + virLogFilterListFree(virLogFilters, virLogNbFilters); virLogFilters = NULL; virLogNbFilters = 0; virLogFiltersSerial++; @@ -371,9 +394,10 @@ virLogFilterListFree(virLogFilterPtr *list, int count) * Removes the set of logging output defined. */ static void -virLogResetOutputs(void) +virLogResetOutputs(bool freemem) { - virLogOutputListFree(virLogOutputs, virLogNbOutputs); + if (freemem) + virLogOutputListFree(virLogOutputs, virLogNbOutputs); virLogOutputs = NULL; virLogNbOutputs = 0; } @@ -1392,7 +1416,7 @@ virLogDefineOutputs(virLogOutputPtr *outputs, size_t noutputs) return -1; virLogLock(); - virLogResetOutputs(); + virLogResetOutputs(true); #if HAVE_SYSLOG_H /* syslog needs to be special-cased, since it keeps the fd in private */ @@ -1435,7 +1459,7 @@ virLogDefineFilters(virLogFilterPtr *filters, size_t nfilters) return -1; virLogLock(); - virLogResetFilters(); + virLogResetFilters(true); virLogFilters = filters; virLogNbFilters = nfilters; virLogUnlock(); diff --git a/src/util/virlog.h b/src/util/virlog.h index 984a9d5a43..69f7b1ef94 100644 --- a/src/util/virlog.h +++ b/src/util/virlog.h @@ -168,6 +168,7 @@ void virLogSetDefaultOutput(const char *fname, bool godaemon, bool privileged); void virLogLock(void); void virLogUnlock(void); int virLogReset(void); +int virLogResetWithoutFree(void); int virLogParseDefaultPriority(const char *priority); int virLogPriorityFromSyslog(int priority); void virLogMessage(virLogSourcePtr source, -- 2.28.0

On Wed, Aug 19, 2020 at 12:03:40PM +0200, Natanael Copa wrote:
Doing malloc/free after fork is techincally not allowed in POSIX and deadlocks[1] with musl libc.
Removing one free() call isn't going to make the logging code work with musl. It merely hides one isolated instance that happens to have been hit in the reset code. The log setup code will malloc, as well actually emitting a log message which triggers a allocation during printf formatting of the message.
[1]: https://gitlab.com/libvirt/libvirt/-/issues/52
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org> --- src/util/vircommand.c | 4 ++-- src/util/virlog.c | 44 +++++++++++++++++++++++++++++++++---------- src/util/virlog.h | 1 + 3 files changed, 37 insertions(+), 12 deletions(-)
diff --git a/src/util/vircommand.c b/src/util/vircommand.c index 76f7eb9a3d..17e5bb00d3 100644 --- a/src/util/vircommand.c +++ b/src/util/vircommand.c @@ -304,7 +304,7 @@ virFork(void) /* Make sure any hook logging is sent to stderr, since child * process may close the logfile FDs */ logprio = virLogGetDefaultPriority(); - virLogReset(); + virLogResetWithoutFree(); virLogSetDefaultPriority(logprio);
/* Clear out all signal handlers from parent so nothing @@ -861,7 +861,7 @@ virExec(virCommandPtr cmd) goto fork_error;
/* Close logging again to ensure no FDs leak to child */ - virLogReset(); + virLogResetWithoutFree();
if (cmd->env) execve(binary, cmd->args, cmd->env); diff --git a/src/util/virlog.c b/src/util/virlog.c index 3217e5eb73..3959de5ca7 100644 --- a/src/util/virlog.c +++ b/src/util/virlog.c @@ -108,8 +108,8 @@ static size_t virLogNbOutputs; */ static virLogPriority virLogDefaultPriority = VIR_LOG_DEFAULT;
-static void virLogResetFilters(void); -static void virLogResetOutputs(void); +static void virLogResetFilters(bool freemem); +static void virLogResetOutputs(bool freemem); static void virLogOutputToFd(virLogSourcePtr src, virLogPriority priority, const char *filename, @@ -284,8 +284,30 @@ virLogReset(void) return -1;
virLogLock(); - virLogResetFilters(); - virLogResetOutputs(); + virLogResetFilters(true); + virLogResetOutputs(true); + virLogDefaultPriority = VIR_LOG_DEFAULT; + virLogUnlock(); + return 0; +} + +/** + * virLogResetWithoutFree: + * + * Reset the logging module to its default initial state, but avoid doing + * free() so it can be used after fork and before exec. + * + * Returns 0 if successful, and -1 in case or error + */ +int +virLogResetWithoutFree(void) +{ + if (virLogInitialize() < 0) + return -1; + + virLogLock(); + virLogResetFilters(false); + virLogResetOutputs(false); virLogDefaultPriority = VIR_LOG_DEFAULT; virLogUnlock(); return 0; @@ -324,9 +346,10 @@ virLogSetDefaultPriority(virLogPriority priority) * Removes the set of logging filters defined. */ static void -virLogResetFilters(void) +virLogResetFilters(bool freemem) { - virLogFilterListFree(virLogFilters, virLogNbFilters); + if (freemem) + virLogFilterListFree(virLogFilters, virLogNbFilters); virLogFilters = NULL; virLogNbFilters = 0; virLogFiltersSerial++; @@ -371,9 +394,10 @@ virLogFilterListFree(virLogFilterPtr *list, int count) * Removes the set of logging output defined. */ static void -virLogResetOutputs(void) +virLogResetOutputs(bool freemem) { - virLogOutputListFree(virLogOutputs, virLogNbOutputs); + if (freemem) + virLogOutputListFree(virLogOutputs, virLogNbOutputs); virLogOutputs = NULL; virLogNbOutputs = 0; } @@ -1392,7 +1416,7 @@ virLogDefineOutputs(virLogOutputPtr *outputs, size_t noutputs) return -1;
virLogLock(); - virLogResetOutputs(); + virLogResetOutputs(true);
#if HAVE_SYSLOG_H /* syslog needs to be special-cased, since it keeps the fd in private */ @@ -1435,7 +1459,7 @@ virLogDefineFilters(virLogFilterPtr *filters, size_t nfilters) return -1;
virLogLock(); - virLogResetFilters(); + virLogResetFilters(true); virLogFilters = filters; virLogNbFilters = nfilters; virLogUnlock(); diff --git a/src/util/virlog.h b/src/util/virlog.h index 984a9d5a43..69f7b1ef94 100644 --- a/src/util/virlog.h +++ b/src/util/virlog.h @@ -168,6 +168,7 @@ void virLogSetDefaultOutput(const char *fname, bool godaemon, bool privileged); void virLogLock(void); void virLogUnlock(void); int virLogReset(void); +int virLogResetWithoutFree(void); int virLogParseDefaultPriority(const char *priority); int virLogPriorityFromSyslog(int priority); void virLogMessage(virLogSourcePtr source, -- 2.28.0
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Add a portable generic implementation of virMassClose as fallback on non-FreeBSD and non-glibc. This implementation uses poll(2) to look for open files to keep performance reasonable while not using any mallocs. This solves a deadlock with musl libc. Signed-off-by: Natanael Copa <ncopa@alpinelinux.org> --- src/util/vircommand.c | 76 +++++++++++++++++++++++++++++++++---------- 1 file changed, 58 insertions(+), 18 deletions(-) diff --git a/src/util/vircommand.c b/src/util/vircommand.c index 17e5bb00d3..06579cfb44 100644 --- a/src/util/vircommand.c +++ b/src/util/vircommand.c @@ -443,7 +443,7 @@ virExecCommon(virCommandPtr cmd, gid_t *groups, int ngroups) return 0; } -# ifdef __linux__ +# if defined(__linux__) && defined(__GLIBC__) /* On Linux, we can utilize procfs and read the table of opened * FDs and selectively close only those FDs we don't want to pass * onto child process (well, the one we will exec soon since this @@ -482,17 +482,7 @@ virCommandMassCloseGetFDsLinux(virCommandPtr cmd G_GNUC_UNUSED, VIR_DIR_CLOSE(dp); return ret; } - -# else /* !__linux__ */ - -static int -virCommandMassCloseGetFDsGeneric(virCommandPtr cmd G_GNUC_UNUSED, - virBitmapPtr fds) -{ - virBitmapSetAll(fds); - return 0; -} -# endif /* !__linux__ */ +# endif /* __linux__ && __GLIBC__ */ # ifdef __FreeBSD__ @@ -546,7 +536,7 @@ virCommandMassClose(virCommandPtr cmd, return 0; } -# else /* ! __FreeBSD__ */ +# elif defined(__GLIBC_) /* ! __FreeBSD__ */ static int virCommandMassClose(virCommandPtr cmd, @@ -574,13 +564,8 @@ virCommandMassClose(virCommandPtr cmd, if (!(fds = virBitmapNew(openmax))) return -1; -# ifdef __linux__ if (virCommandMassCloseGetFDsLinux(cmd, fds) < 0) return -1; -# else - if (virCommandMassCloseGetFDsGeneric(cmd, fds) < 0) - return -1; -# endif fd = virBitmapNextSetBit(fds, 2); for (; fd >= 0; fd = virBitmapNextSetBit(fds, fd)) { @@ -598,6 +583,61 @@ virCommandMassClose(virCommandPtr cmd, return 0; } +#else /* ! __FreeBSD__ && ! __GLIBC__ */ +static int +virCommandMassClose(virCommandPtr cmd, + int childin, + int childout, + int childerr) +{ + static struct pollfd pfds[1024]; + int fd = 0; + int i, total; + int max_fd = sysconf(_SC_OPEN_MAX); + + if (max_fd < 0) { + virReportSystemError(errno, "%s", _("sysconf(_SC_OPEN_MAX) failed")); + return -1; + } + + total = max_fd - fd; + for (i = 0; i < (total < 1024 ? total : 1024); i++) + pfds[i].events = 0; + + while (fd < max_fd) { + int nfds, r = 0; + + total = max_fd - fd; + nfds = total < 1024 ? total : 1024; + + for (i = 0; i < nfds; i++) + pfds[i].fd = fd + i; + + do { + r = poll(pfds, nfds, 0); + } while (r == -1 && errno == EINTR); + + if (r < 0) { + virReportSystemError(errno, "%s", _("poll() failed")); + return -1; + } + + for (i = 0; i < nfds; i++) + if (pfds[i].revents != POLLNVAL) { + if (pfds[i].fd == childin || pfds[i].fd == childout || pfds[i].fd == childerr) + continue; + if (!virCommandFDIsSet(cmd, pfds[i].fd)) { + VIR_MASS_CLOSE(pfds[i].fd); + } else if (virSetInherit(pfds[i].fd, true) < 0) { + virReportSystemError(errno, _("failed to preserve fd %d"), pfds[i].fd); + return -1; + } + } + fd += nfds; + } + return 0; +} + # endif /* ! __FreeBSD__ */ /* -- 2.28.0

On Wed, Aug 19, 2020 at 12:03:41PM +0200, Natanael Copa wrote:
Add a portable generic implementation of virMassClose as fallback on non-FreeBSD and non-glibc.
This implementation uses poll(2) to look for open files to keep performance reasonable while not using any mallocs.
The patch isn't avoiding malloc - the virReportSystemError calls all trigger mallocs in the error reporting code, as well as triggering the logging code which mallocs. The idea of using poll() as a fallback still makes sense as a general feature though.
This solves a deadlock with musl libc.
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org> --- src/util/vircommand.c | 76 +++++++++++++++++++++++++++++++++---------- 1 file changed, 58 insertions(+), 18 deletions(-)
diff --git a/src/util/vircommand.c b/src/util/vircommand.c index 17e5bb00d3..06579cfb44 100644 --- a/src/util/vircommand.c +++ b/src/util/vircommand.c @@ -443,7 +443,7 @@ virExecCommon(virCommandPtr cmd, gid_t *groups, int ngroups) return 0; }
-# ifdef __linux__ +# if defined(__linux__) && defined(__GLIBC__) /* On Linux, we can utilize procfs and read the table of opened * FDs and selectively close only those FDs we don't want to pass * onto child process (well, the one we will exec soon since this @@ -482,17 +482,7 @@ virCommandMassCloseGetFDsLinux(virCommandPtr cmd G_GNUC_UNUSED, VIR_DIR_CLOSE(dp); return ret; } - -# else /* !__linux__ */ - -static int -virCommandMassCloseGetFDsGeneric(virCommandPtr cmd G_GNUC_UNUSED, - virBitmapPtr fds) -{ - virBitmapSetAll(fds); - return 0; -} -# endif /* !__linux__ */ +# endif /* __linux__ && __GLIBC__ */
# ifdef __FreeBSD__
@@ -546,7 +536,7 @@ virCommandMassClose(virCommandPtr cmd, return 0; }
-# else /* ! __FreeBSD__ */ +# elif defined(__GLIBC_) /* ! __FreeBSD__ */
static int virCommandMassClose(virCommandPtr cmd, @@ -574,13 +564,8 @@ virCommandMassClose(virCommandPtr cmd, if (!(fds = virBitmapNew(openmax))) return -1;
-# ifdef __linux__ if (virCommandMassCloseGetFDsLinux(cmd, fds) < 0) return -1; -# else - if (virCommandMassCloseGetFDsGeneric(cmd, fds) < 0) - return -1; -# endif
fd = virBitmapNextSetBit(fds, 2); for (; fd >= 0; fd = virBitmapNextSetBit(fds, fd)) { @@ -598,6 +583,61 @@ virCommandMassClose(virCommandPtr cmd, return 0; }
+#else /* ! __FreeBSD__ && ! __GLIBC__ */ +static int +virCommandMassClose(virCommandPtr cmd, + int childin, + int childout, + int childerr) +{ + static struct pollfd pfds[1024]; + int fd = 0; + int i, total; + int max_fd = sysconf(_SC_OPEN_MAX); + + if (max_fd < 0) { + virReportSystemError(errno, "%s", _("sysconf(_SC_OPEN_MAX) failed")); + return -1; + } + + total = max_fd - fd; + for (i = 0; i < (total < 1024 ? total : 1024); i++) + pfds[i].events = 0; + + while (fd < max_fd) { + int nfds, r = 0; + + total = max_fd - fd; + nfds = total < 1024 ? total : 1024; + + for (i = 0; i < nfds; i++) + pfds[i].fd = fd + i; + + do { + r = poll(pfds, nfds, 0); + } while (r == -1 && errno == EINTR); + + if (r < 0) { + virReportSystemError(errno, "%s", _("poll() failed")); + return -1; + } + + for (i = 0; i < nfds; i++) + if (pfds[i].revents != POLLNVAL) { + if (pfds[i].fd == childin || pfds[i].fd == childout || pfds[i].fd == childerr) + continue; + if (!virCommandFDIsSet(cmd, pfds[i].fd)) { + VIR_MASS_CLOSE(pfds[i].fd); + } else if (virSetInherit(pfds[i].fd, true) < 0) { + virReportSystemError(errno, _("failed to preserve fd %d"), pfds[i].fd); + return -1; + } + } + fd += nfds; + } + return 0; +} + # endif /* ! __FreeBSD__ */
/* -- 2.28.0
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, 19 Aug 2020 11:55:06 +0100 Daniel P. Berrangé <berrange@redhat.com> wrote:
On Wed, Aug 19, 2020 at 12:03:41PM +0200, Natanael Copa wrote:
Add a portable generic implementation of virMassClose as fallback on non-FreeBSD and non-glibc.
This implementation uses poll(2) to look for open files to keep performance reasonable while not using any mallocs.
The patch isn't avoiding malloc - the virReportSystemError calls all trigger mallocs in the error reporting code, as well as triggering the logging code which mallocs.
Even if it does not avoid all malloc's it does avoid the allocation and use of virBitmap and made my system usable again, probably because there was nothing to report. I also consider use of malloc in virReportSystemError a different problem and should probably be dealth with separately. (how would you report out-of-memory if the error reporting needs allocate memory?) I suppose the commit message needs to be improved.
The idea of using poll() as a fallback still makes sense as a general feature though.
Credit for the idea goes to Rich Felker. On a side note, it is theoretically possible to open a large number of files, then set max fds to a lower number and you'll have open file handles above sysconf(_SC_OPEN_MAX) that will not be closed. The current fallback implementation also has this problem.
This solves a deadlock with musl libc.
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org> --- src/util/vircommand.c | 76 +++++++++++++++++++++++++++++++++---------- 1 file changed, 58 insertions(+), 18 deletions(-)
diff --git a/src/util/vircommand.c b/src/util/vircommand.c index 17e5bb00d3..06579cfb44 100644 --- a/src/util/vircommand.c +++ b/src/util/vircommand.c
...
@@ -546,7 +536,7 @@ virCommandMassClose(virCommandPtr cmd, return 0; }
-# else /* ! __FreeBSD__ */ +# elif defined(__GLIBC_) /* ! __FreeBSD__ */
I did a typo here. __GLIBC_ instead of __GLIBC__ -nc
participants (2)
-
Daniel P. Berrangé
-
Natanael Copa