[PATCH 0/4] vircommand: Make FD closing more robust on __APPLE__

Found these on an old branch. Might as well post them. Michal Prívozník (4): vircommand: Drop unused arguments from virCommandMassCloseGetFDs*() vircommand: Isolate FD dir parsing into a separate function vircommand: Make sysconf(_SC_OPEN_MAX) failure non-fatal vircommand: Parse /dev/fd on *BSD-like systems when looking for opened FDs src/util/vircommand.c | 43 ++++++++++++++++++------------------------- 1 file changed, 18 insertions(+), 25 deletions(-) -- 2.44.2

Both virCommandMassCloseGetFDsLinux() and virCommandMassCloseGetFDsGeneric() take @cmd argument only to mark it as unused. Drop it from both. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/vircommand.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/src/util/vircommand.c b/src/util/vircommand.c index a03fcc37ae..8bd5e5e771 100644 --- a/src/util/vircommand.c +++ b/src/util/vircommand.c @@ -478,8 +478,7 @@ virExecCommon(virCommand *cmd, gid_t *groups, int ngroups) * onto child process (well, the one we will exec soon since this * is called from the child). */ static int -virCommandMassCloseGetFDsLinux(virCommand *cmd G_GNUC_UNUSED, - virBitmap *fds) +virCommandMassCloseGetFDsLinux(virBitmap *fds) { g_autoptr(DIR) dp = NULL; struct dirent *entry; @@ -511,8 +510,7 @@ virCommandMassCloseGetFDsLinux(virCommand *cmd G_GNUC_UNUSED, # else /* !__linux__ */ static int -virCommandMassCloseGetFDsGeneric(virCommand *cmd G_GNUC_UNUSED, - virBitmap *fds) +virCommandMassCloseGetFDsGeneric(virBitmap *fds) { virBitmapSetAll(fds); return 0; @@ -547,10 +545,10 @@ virCommandMassCloseFrom(virCommand *cmd, fds = virBitmapNew(openmax); # ifdef __linux__ - if (virCommandMassCloseGetFDsLinux(cmd, fds) < 0) + if (virCommandMassCloseGetFDsLinux(fds) < 0) return -1; # else - if (virCommandMassCloseGetFDsGeneric(cmd, fds) < 0) + if (virCommandMassCloseGetFDsGeneric(fds) < 0) return -1; # endif -- 2.44.2

On Wed, Aug 28, 2024 at 02:16:14PM +0200, Michal Privoznik wrote:
Both virCommandMassCloseGetFDsLinux() and virCommandMassCloseGetFDsGeneric() take @cmd argument only to mark it as unused. Drop it from both.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>

So far, virCommandMassCloseGetFDsLinux() opens "/proc/self/fd", iterates over it marking opened FDs in @fds bitmap. Well, we can do the same on other systems (with altered path), like MacOS or FreeBSD. Therefore, isolate dir iteration into a separate function that accepts dir path as an argument. Unfortunately, this function might be unused on some systems (e.g. mingw), therefore mark it as such. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/vircommand.c | 31 +++++++++++++------------------ 1 file changed, 13 insertions(+), 18 deletions(-) diff --git a/src/util/vircommand.c b/src/util/vircommand.c index 8bd5e5e771..c0aab85c53 100644 --- a/src/util/vircommand.c +++ b/src/util/vircommand.c @@ -472,17 +472,12 @@ virExecCommon(virCommand *cmd, gid_t *groups, int ngroups) return 0; } -# ifdef __linux__ -/* On Linux, we can utilize procfs and read the table of opened - * FDs and selectively close only those FDs we don't want to pass - * onto child process (well, the one we will exec soon since this - * is called from the child). */ -static int -virCommandMassCloseGetFDsLinux(virBitmap *fds) +static int G_GNUC_UNUSED +virCommandMassCloseGetFDsDir(virBitmap *fds, + const char *dirName) { g_autoptr(DIR) dp = NULL; struct dirent *entry; - const char *dirName = "/proc/self/fd"; int rc; if (virDirOpen(&dp, dirName) < 0) @@ -507,15 +502,20 @@ virCommandMassCloseGetFDsLinux(virBitmap *fds) return 0; } -# else /* !__linux__ */ - static int -virCommandMassCloseGetFDsGeneric(virBitmap *fds) +virCommandMassCloseGetFDs(virBitmap *fds) { +# ifdef __linux__ + /* On Linux, we can utilize procfs and read the table of opened + * FDs and selectively close only those FDs we don't want to pass + * onto child process (well, the one we will exec soon since this + * is called from the child). */ + return virCommandMassCloseGetFDsDir(fds, "/proc/self/fd"); +# else virBitmapSetAll(fds); return 0; +# endif } -# endif /* !__linux__ */ static int virCommandMassCloseFrom(virCommand *cmd, @@ -544,13 +544,8 @@ virCommandMassCloseFrom(virCommand *cmd, fds = virBitmapNew(openmax); -# ifdef __linux__ - if (virCommandMassCloseGetFDsLinux(fds) < 0) + if (virCommandMassCloseGetFDs(fds) < 0) return -1; -# else - if (virCommandMassCloseGetFDsGeneric(fds) < 0) - return -1; -# endif lastfd = MAX(lastfd, childin); lastfd = MAX(lastfd, childout); -- 2.44.2

On Wed, Aug 28, 2024 at 02:16:15PM +0200, Michal Privoznik wrote:
So far, virCommandMassCloseGetFDsLinux() opens "/proc/self/fd", iterates over it marking opened FDs in @fds bitmap. Well, we can do the same on other systems (with altered path), like MacOS or FreeBSD. Therefore, isolate dir iteration into a separate function that accepts dir path as an argument.
Unfortunately, this function might be unused on some systems (e.g. mingw), therefore mark it as such.
Not in this patch. This patch should leave it in without marking it as unused. With that Reviewed-by: Martin Kletzander <mkletzan@redhat.com>

On 9/12/24 15:19, Martin Kletzander wrote:
On Wed, Aug 28, 2024 at 02:16:15PM +0200, Michal Privoznik wrote:
So far, virCommandMassCloseGetFDsLinux() opens "/proc/self/fd", iterates over it marking opened FDs in @fds bitmap. Well, we can do the same on other systems (with altered path), like MacOS or FreeBSD. Therefore, isolate dir iteration into a separate function that accepts dir path as an argument.
Unfortunately, this function might be unused on some systems (e.g. mingw), therefore mark it as such.
Not in this patch. This patch should leave it in without marking it as unused. With that
Why not? A new function is introduced, outside of any #ifdef block, but it's used solely in #ifdef __linux__ (for now). Or am I missing something?
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Michal

On Fri, Sep 13, 2024 at 09:37:30AM +0200, Michal Prívozník wrote:
On 9/12/24 15:19, Martin Kletzander wrote:
On Wed, Aug 28, 2024 at 02:16:15PM +0200, Michal Privoznik wrote:
So far, virCommandMassCloseGetFDsLinux() opens "/proc/self/fd", iterates over it marking opened FDs in @fds bitmap. Well, we can do the same on other systems (with altered path), like MacOS or FreeBSD. Therefore, isolate dir iteration into a separate function that accepts dir path as an argument.
Unfortunately, this function might be unused on some systems (e.g. mingw), therefore mark it as such.
Not in this patch. This patch should leave it in without marking it as unused. With that
Why not? A new function is introduced, outside of any #ifdef block, but it's used solely in #ifdef __linux__ (for now). Or am I missing something?
Yes, it is used only if __linux__ so it should be marked as such. Later patch should add the other systems. The issue I have with marking it as unused is that it is then there forever and we won't notice it being completely unused at some point. And it does not feel too clean.
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Michal

The point of calling sysconf(_SC_OPEN_MAX) is to allocate big enough bitmap so that subsequent call to virCommandMassCloseGetFDsDir() can just set the bit instead of expanding memory (this code runs in a forked off child and thus using async-signal-unsafe functions like malloc() is a bit tricky). But on some systems the limit for opened FDs is virtually non-existent (typically macOS Ventura started reporting EINVAL). But with both glibc and musl using malloc() after fork() is safe. And with sufficiently new glib too, as it's using malloc() with newer releases instead of their own allocator. Therefore, pick a sufficiently large value (glibc falls back to 256, [1], so 1024 should be good enough) to fall back to and make the error non-fatal. 1: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/get... Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/vircommand.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/src/util/vircommand.c b/src/util/vircommand.c index c0aab85c53..e48f361114 100644 --- a/src/util/vircommand.c +++ b/src/util/vircommand.c @@ -493,7 +493,7 @@ virCommandMassCloseGetFDsDir(virBitmap *fds, return -1; } - ignore_value(virBitmapSetBit(fds, fd)); + virBitmapSetBitExpand(fds, fd); } if (rc < 0) @@ -537,10 +537,8 @@ virCommandMassCloseFrom(virCommand *cmd, * Therefore we can safely allocate memory here (and transitively call * opendir/readdir) without a deadlock. */ - if (openmax < 0) { - virReportSystemError(errno, "%s", _("sysconf(_SC_OPEN_MAX) failed")); - return -1; - } + if (openmax <= 0) + openmax = 1024; fds = virBitmapNew(openmax); -- 2.44.2

On Wed, Aug 28, 2024 at 02:16:16PM +0200, Michal Privoznik wrote:
The point of calling sysconf(_SC_OPEN_MAX) is to allocate big enough bitmap so that subsequent call to virCommandMassCloseGetFDsDir() can just set the bit instead of expanding memory (this code runs in a forked off child and thus using async-signal-unsafe functions like malloc() is a bit tricky).
But on some systems the limit for opened FDs is virtually non-existent (typically macOS Ventura started reporting EINVAL).
But with both glibc and musl using malloc() after fork() is safe. And with sufficiently new glib too, as it's using malloc() with newer releases instead of their own allocator.
Therefore, pick a sufficiently large value (glibc falls back to 256, [1], so 1024 should be good enough) to fall back to and make the error non-fatal.
1: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/get... Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/vircommand.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/src/util/vircommand.c b/src/util/vircommand.c index c0aab85c53..e48f361114 100644 --- a/src/util/vircommand.c +++ b/src/util/vircommand.c @@ -493,7 +493,7 @@ virCommandMassCloseGetFDsDir(virBitmap *fds, return -1; }
- ignore_value(virBitmapSetBit(fds, fd)); + virBitmapSetBitExpand(fds, fd); }
if (rc < 0) @@ -537,10 +537,8 @@ virCommandMassCloseFrom(virCommand *cmd, * Therefore we can safely allocate memory here (and transitively call * opendir/readdir) without a deadlock. */
t - if (openmax < 0) { - virReportSystemError(errno, "%s", _("sysconf(_SC_OPEN_MAX) failed")); - return -1; - } + if (openmax <= 0) + openmax = 1024;
Darwin has a OPEN_MAX constant that is x10 bigger than this: https://github.com/apple/darwin-xnu/blob/main/bsd/sys/syslimits.h#L104 IMHO we should be using that instead With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 8/28/24 14:39, Daniel P. Berrangé wrote:
On Wed, Aug 28, 2024 at 02:16:16PM +0200, Michal Privoznik wrote:
The point of calling sysconf(_SC_OPEN_MAX) is to allocate big enough bitmap so that subsequent call to virCommandMassCloseGetFDsDir() can just set the bit instead of expanding memory (this code runs in a forked off child and thus using async-signal-unsafe functions like malloc() is a bit tricky).
But on some systems the limit for opened FDs is virtually non-existent (typically macOS Ventura started reporting EINVAL).
But with both glibc and musl using malloc() after fork() is safe. And with sufficiently new glib too, as it's using malloc() with newer releases instead of their own allocator.
Therefore, pick a sufficiently large value (glibc falls back to 256, [1], so 1024 should be good enough) to fall back to and make the error non-fatal.
1: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/get... Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/vircommand.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/src/util/vircommand.c b/src/util/vircommand.c index c0aab85c53..e48f361114 100644 --- a/src/util/vircommand.c +++ b/src/util/vircommand.c @@ -493,7 +493,7 @@ virCommandMassCloseGetFDsDir(virBitmap *fds, return -1; }
- ignore_value(virBitmapSetBit(fds, fd)); + virBitmapSetBitExpand(fds, fd); }
if (rc < 0) @@ -537,10 +537,8 @@ virCommandMassCloseFrom(virCommand *cmd, * Therefore we can safely allocate memory here (and transitively call * opendir/readdir) without a deadlock. */
t - if (openmax < 0) { - virReportSystemError(errno, "%s", _("sysconf(_SC_OPEN_MAX) failed")); - return -1; - } + if (openmax <= 0) + openmax = 1024;
Darwin has a OPEN_MAX constant that is x10 bigger than this:
https://github.com/apple/darwin-xnu/blob/main/bsd/sys/syslimits.h#L104
IMHO we should be using that instead
Fair enough. virBitmapSetBitExpand() would cause realloc of the bitmap, but we can start with a generous value. Consider the following squashed in: diff --git i/src/util/vircommand.c w/src/util/vircommand.c index e48f361114..0f2f87c80c 100644 --- i/src/util/vircommand.c +++ w/src/util/vircommand.c @@ -537,8 +537,12 @@ virCommandMassCloseFrom(virCommand *cmd, * Therefore we can safely allocate memory here (and transitively call * opendir/readdir) without a deadlock. */ - if (openmax <= 0) - openmax = 1024; + if (openmax <= 0) { + /* Darwin defaults to 10240. Start with a generous value. + * virCommandMassCloseGetFDsDir() uses virBitmapSetBitExpand() anyways. + */ + openmax = 10240; + } fds = virBitmapNew(openmax); Michal

On Wed, Aug 28, 2024 at 02:59:09PM +0200, Michal Prívozník wrote:
On 8/28/24 14:39, Daniel P. Berrangé wrote:
On Wed, Aug 28, 2024 at 02:16:16PM +0200, Michal Privoznik wrote:
The point of calling sysconf(_SC_OPEN_MAX) is to allocate big enough bitmap so that subsequent call to virCommandMassCloseGetFDsDir() can just set the bit instead of expanding memory (this code runs in a forked off child and thus using async-signal-unsafe functions like malloc() is a bit tricky).
But on some systems the limit for opened FDs is virtually non-existent (typically macOS Ventura started reporting EINVAL).
But with both glibc and musl using malloc() after fork() is safe. And with sufficiently new glib too, as it's using malloc() with newer releases instead of their own allocator.
I guess I'm out of the loop lately, could you point the possible readers to the proof of the above? I'm hesitant to say this is fine due to the virBitmapSetBitExpand() call.
Therefore, pick a sufficiently large value (glibc falls back to 256, [1], so 1024 should be good enough) to fall back to and make the error non-fatal.
1: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/get... Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/vircommand.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/src/util/vircommand.c b/src/util/vircommand.c index c0aab85c53..e48f361114 100644 --- a/src/util/vircommand.c +++ b/src/util/vircommand.c @@ -493,7 +493,7 @@ virCommandMassCloseGetFDsDir(virBitmap *fds, return -1; }
- ignore_value(virBitmapSetBit(fds, fd)); + virBitmapSetBitExpand(fds, fd); }
if (rc < 0) @@ -537,10 +537,8 @@ virCommandMassCloseFrom(virCommand *cmd, * Therefore we can safely allocate memory here (and transitively call * opendir/readdir) without a deadlock. */
t - if (openmax < 0) { - virReportSystemError(errno, "%s", _("sysconf(_SC_OPEN_MAX) failed")); - return -1; - } + if (openmax <= 0) + openmax = 1024;
Darwin has a OPEN_MAX constant that is x10 bigger than this:
https://github.com/apple/darwin-xnu/blob/main/bsd/sys/syslimits.h#L104
IMHO we should be using that instead
Fair enough. virBitmapSetBitExpand() would cause realloc of the bitmap, but we can start with a generous value. Consider the following squashed in:
And don't forget to adjust the commit message to go with the below change as well.
diff --git i/src/util/vircommand.c w/src/util/vircommand.c index e48f361114..0f2f87c80c 100644 --- i/src/util/vircommand.c +++ w/src/util/vircommand.c @@ -537,8 +537,12 @@ virCommandMassCloseFrom(virCommand *cmd, * Therefore we can safely allocate memory here (and transitively call * opendir/readdir) without a deadlock. */
- if (openmax <= 0) - openmax = 1024; + if (openmax <= 0) { + /* Darwin defaults to 10240. Start with a generous value. + * virCommandMassCloseGetFDsDir() uses virBitmapSetBitExpand() anyways. + */ + openmax = 10240; + }
fds = virBitmapNew(openmax);
Michal

On Thu, Sep 12, 2024 at 03:31:38PM +0200, Martin Kletzander wrote:
On Wed, Aug 28, 2024 at 02:59:09PM +0200, Michal Prívozník wrote:
On 8/28/24 14:39, Daniel P. Berrangé wrote:
On Wed, Aug 28, 2024 at 02:16:16PM +0200, Michal Privoznik wrote:
The point of calling sysconf(_SC_OPEN_MAX) is to allocate big enough bitmap so that subsequent call to virCommandMassCloseGetFDsDir() can just set the bit instead of expanding memory (this code runs in a forked off child and thus using async-signal-unsafe functions like malloc() is a bit tricky).
But on some systems the limit for opened FDs is virtually non-existent (typically macOS Ventura started reporting EINVAL).
But with both glibc and musl using malloc() after fork() is safe. And with sufficiently new glib too, as it's using malloc() with newer releases instead of their own allocator.
I guess I'm out of the loop lately, could you point the possible readers to the proof of the above? I'm hesitant to say this is fine due to the virBitmapSetBitExpand() call.
Libvirt has relied on malloc-after-fork being safe, essentially, forever. For a while this caused us to have problems on musl, but musl maintainers eventually relented and made it safe too. glib is hardcoded to use system malloc since before our minimum glib version, and again we've relied on that ever since we adopted use of glib. So there's nothing new in what Michal writes above. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Thu, Sep 12, 2024 at 02:36:24PM +0100, Daniel P. Berrangé wrote:
On Thu, Sep 12, 2024 at 03:31:38PM +0200, Martin Kletzander wrote:
On Wed, Aug 28, 2024 at 02:59:09PM +0200, Michal Prívozník wrote:
On 8/28/24 14:39, Daniel P. Berrangé wrote:
On Wed, Aug 28, 2024 at 02:16:16PM +0200, Michal Privoznik wrote:
The point of calling sysconf(_SC_OPEN_MAX) is to allocate big enough bitmap so that subsequent call to virCommandMassCloseGetFDsDir() can just set the bit instead of expanding memory (this code runs in a forked off child and thus using async-signal-unsafe functions like malloc() is a bit tricky).
But on some systems the limit for opened FDs is virtually non-existent (typically macOS Ventura started reporting EINVAL).
But with both glibc and musl using malloc() after fork() is safe. And with sufficiently new glib too, as it's using malloc() with newer releases instead of their own allocator.
I guess I'm out of the loop lately, could you point the possible readers to the proof of the above? I'm hesitant to say this is fine due to the virBitmapSetBitExpand() call.
Libvirt has relied on malloc-after-fork being safe, essentially, forever. For a while this caused us to have problems on musl, but musl maintainers eventually relented and made it safe too. glib is hardcoded to use system malloc since before our minimum glib version, and again we've relied on that ever since we adopted use of glib.
So there's nothing new in what Michal writes above.
OK, I thought it might not be that obvious, but clearly I'm just out of as I said. So in that case, the fix incorporated and the commit message adjusted it's Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On BSD-like systems "/dev/fd" serves the same purpose as "/proc/self/fd". And since procfs is usually not mounted, on such systems we can use "/dev/fd" instead. Resolves: https://gitlab.com/libvirt/libvirt/-/issues/518 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/vircommand.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/util/vircommand.c b/src/util/vircommand.c index e48f361114..d48bf8122d 100644 --- a/src/util/vircommand.c +++ b/src/util/vircommand.c @@ -511,6 +511,8 @@ virCommandMassCloseGetFDs(virBitmap *fds) * onto child process (well, the one we will exec soon since this * is called from the child). */ return virCommandMassCloseGetFDsDir(fds, "/proc/self/fd"); +# elif defined(__APPLE__) || defined(__FreeBSD__) + return virCommandMassCloseGetFDsDir(fds, "/dev/fd"); # else virBitmapSetAll(fds); return 0; -- 2.44.2

On Wed, Aug 28, 2024 at 02:16:17PM +0200, Michal Privoznik wrote:
On BSD-like systems "/dev/fd" serves the same purpose as "/proc/self/fd". And since procfs is usually not mounted, on such systems we can use "/dev/fd" instead.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/518 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/vircommand.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/src/util/vircommand.c b/src/util/vircommand.c index e48f361114..d48bf8122d 100644 --- a/src/util/vircommand.c +++ b/src/util/vircommand.c @@ -511,6 +511,8 @@ virCommandMassCloseGetFDs(virBitmap *fds) * onto child process (well, the one we will exec soon since this * is called from the child). */ return virCommandMassCloseGetFDsDir(fds, "/proc/self/fd"); +# elif defined(__APPLE__) || defined(__FreeBSD__) + return virCommandMassCloseGetFDsDir(fds, "/dev/fd");
And in this patch the similar condition should be used for virCommandMassCloseGetFDsDir() to avoid unneeded G_GNUC_UNUSED. With that fixed: Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
# else virBitmapSetAll(fds); return 0; -- 2.44.2
participants (4)
-
Daniel P. Berrangé
-
Martin Kletzander
-
Michal Privoznik
-
Michal Prívozník