[libvirt RFCv4] virfile: set pipe size in virFileWrapperFdNew to improve throughput

currently the only user of virFileWrapperFdNew is the qemu driver; virsh save is very slow with a default pipe size. This change improves throughput by ~400% on fast nvme or ramdisk. Best value currently measured is 1MB, which happens to be also the kernel default for the pipe-max-size. Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/util/virfile.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) see v2 at https://listman.redhat.com/archives/libvir-list/2022-March/229423.html Changes v3 -> v4: * changed INFO and WARN messages to DEBUG (Daniel) Changes v2 -> v3: * removed reading of max-pipe-size from procfs, instead make multiple attempts on EPERM with smaller sizes. In the regular case, this should succeed on the first try. (Daniel) Changes v1 -> v2: * removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing unconditional (Michal) * moved code to separate functions (Michal) * removed ternary op, disliked in libvirt (Michal) * added #ifdef __linux__ (Ani Sinha) * try smallest value between currently best measured value (1MB) and the pipe-max-size setting. If pipe-max-size cannot be read, try kernel default max (1MB). (Daniel) diff --git a/src/util/virfile.c b/src/util/virfile.c index a04f888e06..87539be0b9 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -201,6 +201,50 @@ struct _virFileWrapperFd { }; #ifndef WIN32 + +#ifdef __linux__ + +/** + * virFileWrapperSetPipeSize: + * @fd: the fd of the pipe + * + * Set best pipe size on the passed file descriptor for bulk transfers of data. + * + * default pipe size (usually 64K) is generally not suited for large transfers + * to fast devices. A value of 1MB has been measured to improve virsh save + * by 400% in ideal conditions. We retry multiple times with smaller sizes + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size. + * + * OS note: only for linux, on other OS this is a no-op. + */ +static void +virFileWrapperSetPipeSize(int fd) +{ + int sz; + + for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) { + int rv = fcntl(fd, F_SETPIPE_SZ, sz); + if (rv < 0 && errno == EPERM) { + VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz); + continue; /* retry with half the size */ + } + if (rv < 0) { + break; + } + VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz); + return; + } + virReportSystemError(errno, "%s", _("unable to set pipe size, data transfer might be slow")); +} + +#else /* !__linux__ */ +static void virFileWrapperSetPipeSize(int fd) +{ + return; +} +#endif /* !__linux__ */ + + /** * virFileWrapperFdNew: * @fd: pointer to fd to wrap @@ -282,6 +326,8 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags) ret->cmd = virCommandNewArgList(iohelper_path, name, NULL); + virFileWrapperSetPipeSize(pipefd[!output]); + if (output) { virCommandSetInputFD(ret->cmd, pipefd[0]); virCommandSetOutputFD(ret->cmd, fd); -- 2.35.1

On 3/25/22 16:10, Claudio Fontana wrote:
currently the only user of virFileWrapperFdNew is the qemu driver; virsh save is very slow with a default pipe size. This change improves throughput by ~400% on fast nvme or ramdisk.
Best value currently measured is 1MB, which happens to be also the kernel default for the pipe-max-size.
Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/util/virfile.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+)
see v2 at https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
Changes v3 -> v4:
* changed INFO and WARN messages to DEBUG (Daniel)
Changes v2 -> v3:
* removed reading of max-pipe-size from procfs, instead make multiple attempts on EPERM with smaller sizes. In the regular case, this should succeed on the first try. (Daniel)
Changes v1 -> v2:
* removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing unconditional (Michal)
* moved code to separate functions (Michal)
* removed ternary op, disliked in libvirt (Michal)
* added #ifdef __linux__ (Ani Sinha)
* try smallest value between currently best measured value (1MB) and the pipe-max-size setting. If pipe-max-size cannot be read, try kernel default max (1MB). (Daniel)
diff --git a/src/util/virfile.c b/src/util/virfile.c index a04f888e06..87539be0b9 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -201,6 +201,50 @@ struct _virFileWrapperFd { };
#ifndef WIN32 + +#ifdef __linux__ + +/** + * virFileWrapperSetPipeSize: + * @fd: the fd of the pipe + * + * Set best pipe size on the passed file descriptor for bulk transfers of data. + * + * default pipe size (usually 64K) is generally not suited for large transfers + * to fast devices. A value of 1MB has been measured to improve virsh save + * by 400% in ideal conditions. We retry multiple times with smaller sizes + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size. + * + * OS note: only for linux, on other OS this is a no-op. + */ +static void +virFileWrapperSetPipeSize(int fd) +{ + int sz; + + for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) { + int rv = fcntl(fd, F_SETPIPE_SZ, sz); + if (rv < 0 && errno == EPERM) { + VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz); + continue; /* retry with half the size */ + } + if (rv < 0) { + break; + } + VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz); + return; + } + virReportSystemError(errno, "%s", _("unable to set pipe size, data transfer might be slow"));
This should have been VIR_WARN(). It's weird to report an error when the function returns void.
+} + +#else /* !__linux__ */ +static void virFileWrapperSetPipeSize(int fd) The @fd argument is unused and thus has to be marked as such.
+{ + return; +} +#endif /* !__linux__ */ + + /** * virFileWrapperFdNew: * @fd: pointer to fd to wrap @@ -282,6 +326,8 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags)
ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
+ virFileWrapperSetPipeSize(pipefd[!output]);
This feels weird, because just a few lines below the pipefd[!output]) is closed. As I said earlier, it doesn't matter what end of the pipe we set the size on, therefore, let's switch over to pipefd[output].
+ if (output) { virCommandSetInputFD(ret->cmd, pipefd[0]); virCommandSetOutputFD(ret->cmd, fd);
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> and pushed. Michal

On 3/28/22 1:06 PM, Michal Prívozník wrote:
On 3/25/22 16:10, Claudio Fontana wrote:
currently the only user of virFileWrapperFdNew is the qemu driver; virsh save is very slow with a default pipe size. This change improves throughput by ~400% on fast nvme or ramdisk.
Best value currently measured is 1MB, which happens to be also the kernel default for the pipe-max-size.
Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/util/virfile.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+)
see v2 at https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
Changes v3 -> v4:
* changed INFO and WARN messages to DEBUG (Daniel)
Changes v2 -> v3:
* removed reading of max-pipe-size from procfs, instead make multiple attempts on EPERM with smaller sizes. In the regular case, this should succeed on the first try. (Daniel)
Changes v1 -> v2:
* removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing unconditional (Michal)
* moved code to separate functions (Michal)
* removed ternary op, disliked in libvirt (Michal)
* added #ifdef __linux__ (Ani Sinha)
* try smallest value between currently best measured value (1MB) and the pipe-max-size setting. If pipe-max-size cannot be read, try kernel default max (1MB). (Daniel)
diff --git a/src/util/virfile.c b/src/util/virfile.c index a04f888e06..87539be0b9 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -201,6 +201,50 @@ struct _virFileWrapperFd { };
#ifndef WIN32 + +#ifdef __linux__ + +/** + * virFileWrapperSetPipeSize: + * @fd: the fd of the pipe + * + * Set best pipe size on the passed file descriptor for bulk transfers of data. + * + * default pipe size (usually 64K) is generally not suited for large transfers + * to fast devices. A value of 1MB has been measured to improve virsh save + * by 400% in ideal conditions. We retry multiple times with smaller sizes + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size. + * + * OS note: only for linux, on other OS this is a no-op. + */ +static void +virFileWrapperSetPipeSize(int fd) +{ + int sz; + + for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) { + int rv = fcntl(fd, F_SETPIPE_SZ, sz); + if (rv < 0 && errno == EPERM) { + VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz); + continue; /* retry with half the size */ + } + if (rv < 0) { + break; + } + VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz); + return; + } + virReportSystemError(errno, "%s", _("unable to set pipe size, data transfer might be slow"));
This should have been VIR_WARN(). It's weird to report an error when the function returns void.
+} + +#else /* !__linux__ */ +static void virFileWrapperSetPipeSize(int fd) The @fd argument is unused and thus has to be marked as such.
+{ + return; +} +#endif /* !__linux__ */ + + /** * virFileWrapperFdNew: * @fd: pointer to fd to wrap @@ -282,6 +326,8 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags)
ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
+ virFileWrapperSetPipeSize(pipefd[!output]);
This feels weird, because just a few lines below the pipefd[!output]) is closed. As I said earlier, it doesn't matter what end of the pipe we set the size on, therefore, let's switch over to pipefd[output].
+ if (output) { virCommandSetInputFD(ret->cmd, pipefd[0]); virCommandSetOutputFD(ret->cmd, fd);
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
and pushed.
Michal
Thanks a lot for fixing up the remaining few pieces! Ciao, Claudio

On Mon, Mar 28, 2022 at 01:06:03PM +0200, Michal Prívozník wrote:
On 3/25/22 16:10, Claudio Fontana wrote:
currently the only user of virFileWrapperFdNew is the qemu driver; virsh save is very slow with a default pipe size. This change improves throughput by ~400% on fast nvme or ramdisk.
Best value currently measured is 1MB, which happens to be also the kernel default for the pipe-max-size.
Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/util/virfile.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+)
see v2 at https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
Changes v3 -> v4:
* changed INFO and WARN messages to DEBUG (Daniel)
Changes v2 -> v3:
* removed reading of max-pipe-size from procfs, instead make multiple attempts on EPERM with smaller sizes. In the regular case, this should succeed on the first try. (Daniel)
Changes v1 -> v2:
* removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing unconditional (Michal)
* moved code to separate functions (Michal)
* removed ternary op, disliked in libvirt (Michal)
* added #ifdef __linux__ (Ani Sinha)
* try smallest value between currently best measured value (1MB) and the pipe-max-size setting. If pipe-max-size cannot be read, try kernel default max (1MB). (Daniel)
diff --git a/src/util/virfile.c b/src/util/virfile.c index a04f888e06..87539be0b9 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -201,6 +201,50 @@ struct _virFileWrapperFd { };
#ifndef WIN32 + +#ifdef __linux__ + +/** + * virFileWrapperSetPipeSize: + * @fd: the fd of the pipe + * + * Set best pipe size on the passed file descriptor for bulk transfers of data. + * + * default pipe size (usually 64K) is generally not suited for large transfers + * to fast devices. A value of 1MB has been measured to improve virsh save + * by 400% in ideal conditions. We retry multiple times with smaller sizes + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size. + * + * OS note: only for linux, on other OS this is a no-op. + */ +static void +virFileWrapperSetPipeSize(int fd) +{ + int sz; + + for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) { + int rv = fcntl(fd, F_SETPIPE_SZ, sz); + if (rv < 0 && errno == EPERM) { + VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz); + continue; /* retry with half the size */ + } + if (rv < 0) { + break; + } + VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz); + return; + } + virReportSystemError(errno, "%s", _("unable to set pipe size, data transfer might be slow"));
This should have been VIR_WARN(). It's weird to report an error when the function returns void.
Actually I said to report an error in prvious version, as I figured we were handling the expect EPERM, but I guess we could even fail the last 64 KB iteration and stick with the default. So we need a slight tweak: static void virFileWrapperSetPipeSize(int fd) { int sz; for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) { int rv = fcntl(fd, F_SETPIPE_SZ, sz); if (rv < 0 && errno == EPERM) { VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz); continue; /* retry with half the size */ } if (rv < 0) { virReportSystemError(errno, "%s", _("unable to set pipe size")); return -1; } VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz); return 0; } VIR_WARN("Could set pipe size to 64 KB, leaving on default size"); return 0; } then the caller can treat -1 as fatal With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 3/28/22 13:10, Daniel P. Berrangé wrote:
On Mon, Mar 28, 2022 at 01:06:03PM +0200, Michal Prívozník wrote:
On 3/25/22 16:10, Claudio Fontana wrote:
currently the only user of virFileWrapperFdNew is the qemu driver; virsh save is very slow with a default pipe size. This change improves throughput by ~400% on fast nvme or ramdisk.
Best value currently measured is 1MB, which happens to be also the kernel default for the pipe-max-size.
Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/util/virfile.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+)
see v2 at https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
Changes v3 -> v4:
* changed INFO and WARN messages to DEBUG (Daniel)
Changes v2 -> v3:
* removed reading of max-pipe-size from procfs, instead make multiple attempts on EPERM with smaller sizes. In the regular case, this should succeed on the first try. (Daniel)
Changes v1 -> v2:
* removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing unconditional (Michal)
* moved code to separate functions (Michal)
* removed ternary op, disliked in libvirt (Michal)
* added #ifdef __linux__ (Ani Sinha)
* try smallest value between currently best measured value (1MB) and the pipe-max-size setting. If pipe-max-size cannot be read, try kernel default max (1MB). (Daniel)
diff --git a/src/util/virfile.c b/src/util/virfile.c index a04f888e06..87539be0b9 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -201,6 +201,50 @@ struct _virFileWrapperFd { };
#ifndef WIN32 + +#ifdef __linux__ + +/** + * virFileWrapperSetPipeSize: + * @fd: the fd of the pipe + * + * Set best pipe size on the passed file descriptor for bulk transfers of data. + * + * default pipe size (usually 64K) is generally not suited for large transfers + * to fast devices. A value of 1MB has been measured to improve virsh save + * by 400% in ideal conditions. We retry multiple times with smaller sizes + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size. + * + * OS note: only for linux, on other OS this is a no-op. + */ +static void +virFileWrapperSetPipeSize(int fd) +{ + int sz; + + for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) { + int rv = fcntl(fd, F_SETPIPE_SZ, sz); + if (rv < 0 && errno == EPERM) { + VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz); + continue; /* retry with half the size */ + } + if (rv < 0) { + break; + } + VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz); + return; + } + virReportSystemError(errno, "%s", _("unable to set pipe size, data transfer might be slow"));
This should have been VIR_WARN(). It's weird to report an error when the function returns void.
Actually I said to report an error in prvious version, as I figured we were handling the expect EPERM, but I guess we could even fail the last 64 KB iteration and stick with the default. So we need a slight tweak:
static void virFileWrapperSetPipeSize(int fd) { int sz;
for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) { int rv = fcntl(fd, F_SETPIPE_SZ, sz); if (rv < 0 && errno == EPERM) { VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz); continue; /* retry with half the size */ } if (rv < 0) { virReportSystemError(errno, "%s", _("unable to set pipe size")); return -1; } VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz); return 0; } VIR_WARN("Could set pipe size to 64 KB, leaving on default size"); return 0; }
then the caller can treat -1 as fatal
Yes, in that case we could call virReportSystemError(), but the way the code is currently written doesn't make much sense. Anyway, let me post a follow up patch that does report error. Michal
participants (3)
-
Claudio Fontana
-
Daniel P. Berrangé
-
Michal Prívozník