[libvirt RFCv3] virfile: set pipe size in virFileWrapperFdNew to improve throughput

currently the only user of virFileWrapperFdNew is the qemu driver; virsh save is very slow with a default pipe size. This change improves throughput by ~400% on fast nvme or ramdisk. Best value currently measured is 1MB, which happens to be also the kernel default for the pipe-max-size. Signed-off-by: Claudio Fontana <cfontana@suse.de> --- see v2 at https://listman.redhat.com/archives/libvir-list/2022-March/229423.html Changes v2 -> v3: * removed reading of max-pipe-size from procfs, instead make multiple attempts on EPERM with smaller sizes. In the regular case, this should succeed on the first try. (Daniel) Changes v1 -> v2: * removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing unconditional (Michal) * moved code to separate functions (Michal) * removed ternary op, disliked in libvirt (Michal) * added #ifdef __linux__ (Ani Sinha) * try smallest value between currently best measured value (1MB) and the pipe-max-size setting. If pipe-max-size cannot be read, try kernel default max (1MB). (Daniel) src/util/virfile.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) diff --git a/src/util/virfile.c b/src/util/virfile.c index a04f888e06..876b865974 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -201,6 +201,51 @@ struct _virFileWrapperFd { }; #ifndef WIN32 + +#ifdef __linux__ + +/** + * virFileWrapperSetPipeSize: + * @fd: the fd of the pipe + * + * Set best pipe size on the passed file descriptor for bulk transfers of data. + * + * default pipe size (usually 64K) is generally not suited for large transfers + * to fast devices. A value of 1MB has been measured to improve virsh save + * by 400% in ideal conditions. We retry multiple times with smaller sizes + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size. + * + * Return value is 0 on success, -1 and errno set on error. + * OS note: only for linux, on other OS this is a no-op. + */ +static int +virFileWrapperSetPipeSize(int fd) +{ + int sz; + + for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) { + int rv = fcntl(fd, F_SETPIPE_SZ, sz); + if (rv < 0 && errno == EPERM) { + continue; /* retry with half the size */ + } + if (rv < 0) { + break; + } + VIR_INFO("fd %d pipe size adjusted to %d", fd, sz); + return 0; + } + VIR_WARN("failed to set pipe size to %d (errno=%d)", sz, errno); + return -1; +} + +#else /* !__linux__ */ +static int virFileWrapperSetPipeSize(int fd) +{ + return 0; +} +#endif /* !__linux__ */ + + /** * virFileWrapperFdNew: * @fd: pointer to fd to wrap @@ -282,6 +327,10 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags) ret->cmd = virCommandNewArgList(iohelper_path, name, NULL); + if (virFileWrapperSetPipeSize(pipefd[!output]) < 0) { + virReportError(VIR_ERR_SYSTEM_ERROR, "%s", _("unable to set pipe size, data transfer might be slow")); + } + if (output) { virCommandSetInputFD(ret->cmd, pipefd[0]); virCommandSetOutputFD(ret->cmd, fd); -- 2.35.1

On Fri, Mar 25, 2022 at 01:54:51PM +0100, Claudio Fontana wrote:
currently the only user of virFileWrapperFdNew is the qemu driver; virsh save is very slow with a default pipe size. This change improves throughput by ~400% on fast nvme or ramdisk.
Best value currently measured is 1MB, which happens to be also the kernel default for the pipe-max-size.
Signed-off-by: Claudio Fontana <cfontana@suse.de> ---
see v2 at https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
Changes v2 -> v3:
* removed reading of max-pipe-size from procfs, instead make multiple attempts on EPERM with smaller sizes. In the regular case, this should succeed on the first try. (Daniel)
Changes v1 -> v2:
* removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing unconditional (Michal)
* moved code to separate functions (Michal)
* removed ternary op, disliked in libvirt (Michal)
* added #ifdef __linux__ (Ani Sinha)
* try smallest value between currently best measured value (1MB) and the pipe-max-size setting. If pipe-max-size cannot be read, try kernel default max (1MB). (Daniel)
src/util/virfile.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+)
diff --git a/src/util/virfile.c b/src/util/virfile.c index a04f888e06..876b865974 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -201,6 +201,51 @@ struct _virFileWrapperFd { };
#ifndef WIN32 + +#ifdef __linux__ + +/** + * virFileWrapperSetPipeSize: + * @fd: the fd of the pipe + * + * Set best pipe size on the passed file descriptor for bulk transfers of data. + * + * default pipe size (usually 64K) is generally not suited for large transfers + * to fast devices. A value of 1MB has been measured to improve virsh save + * by 400% in ideal conditions. We retry multiple times with smaller sizes + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size. + * + * Return value is 0 on success, -1 and errno set on error. + * OS note: only for linux, on other OS this is a no-op. + */ +static int +virFileWrapperSetPipeSize(int fd) +{ + int sz; + + for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) { + int rv = fcntl(fd, F_SETPIPE_SZ, sz); + if (rv < 0 && errno == EPERM) { + continue; /* retry with half the size */ + } + if (rv < 0) { + break; + } + VIR_INFO("fd %d pipe size adjusted to %d", fd, sz); + return 0; + } + VIR_WARN("failed to set pipe size to %d (errno=%d)", sz, errno); + return -1; +} + +#else /* !__linux__ */ +static int virFileWrapperSetPipeSize(int fd) +{ + return 0; +} +#endif /* !__linux__ */ + + /** * virFileWrapperFdNew: * @fd: pointer to fd to wrap @@ -282,6 +327,10 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags)
ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
+ if (virFileWrapperSetPipeSize(pipefd[!output]) < 0) { + virReportError(VIR_ERR_SYSTEM_ERROR, "%s", _("unable to set pipe size, data transfer might be slow"));
Push this into virFileWrapperSetPipeSize instead of the VIR_WARN there, and use virReportSystemError passing in the errno value too.
+ } + if (output) { virCommandSetInputFD(ret->cmd, pipefd[0]); virCommandSetOutputFD(ret->cmd, fd);
With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 3/25/22 2:13 PM, Daniel P. Berrangé wrote:
On Fri, Mar 25, 2022 at 01:54:51PM +0100, Claudio Fontana wrote:
currently the only user of virFileWrapperFdNew is the qemu driver; virsh save is very slow with a default pipe size. This change improves throughput by ~400% on fast nvme or ramdisk.
Best value currently measured is 1MB, which happens to be also the kernel default for the pipe-max-size.
Signed-off-by: Claudio Fontana <cfontana@suse.de> ---
see v2 at https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
Changes v2 -> v3:
* removed reading of max-pipe-size from procfs, instead make multiple attempts on EPERM with smaller sizes. In the regular case, this should succeed on the first try. (Daniel)
Changes v1 -> v2:
* removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing unconditional (Michal)
* moved code to separate functions (Michal)
* removed ternary op, disliked in libvirt (Michal)
* added #ifdef __linux__ (Ani Sinha)
* try smallest value between currently best measured value (1MB) and the pipe-max-size setting. If pipe-max-size cannot be read, try kernel default max (1MB). (Daniel)
src/util/virfile.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+)
diff --git a/src/util/virfile.c b/src/util/virfile.c index a04f888e06..876b865974 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -201,6 +201,51 @@ struct _virFileWrapperFd { };
#ifndef WIN32 + +#ifdef __linux__ + +/** + * virFileWrapperSetPipeSize: + * @fd: the fd of the pipe + * + * Set best pipe size on the passed file descriptor for bulk transfers of data. + * + * default pipe size (usually 64K) is generally not suited for large transfers + * to fast devices. A value of 1MB has been measured to improve virsh save + * by 400% in ideal conditions. We retry multiple times with smaller sizes + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size. + * + * Return value is 0 on success, -1 and errno set on error. + * OS note: only for linux, on other OS this is a no-op. + */ +static int +virFileWrapperSetPipeSize(int fd) +{ + int sz; + + for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) { + int rv = fcntl(fd, F_SETPIPE_SZ, sz); + if (rv < 0 && errno == EPERM) { + continue; /* retry with half the size */ + } + if (rv < 0) { + break; + } + VIR_INFO("fd %d pipe size adjusted to %d", fd, sz); + return 0; + } + VIR_WARN("failed to set pipe size to %d (errno=%d)", sz, errno); + return -1; +} + +#else /* !__linux__ */ +static int virFileWrapperSetPipeSize(int fd) +{ + return 0; +} +#endif /* !__linux__ */ + + /** * virFileWrapperFdNew: * @fd: pointer to fd to wrap @@ -282,6 +327,10 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags)
ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
+ if (virFileWrapperSetPipeSize(pipefd[!output]) < 0) { + virReportError(VIR_ERR_SYSTEM_ERROR, "%s", _("unable to set pipe size, data transfer might be slow"));
Push this into virFileWrapperSetPipeSize instead of the VIR_WARN there, and use virReportSystemError passing in the errno value too.
ok, what about also warning on EPERM? In the normal case we should succeed on the first try I think. +static void +virFileWrapperSetPipeSize(int fd) +{ + int sz; + + for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) { + int rv = fcntl(fd, F_SETPIPE_SZ, sz); + if (rv < 0 && errno == EPERM) { + VIR_WARN("EPERM trying to set fd %d pipe size to %d", fd, sz); + continue; /* retry with half the size */ + } + if (rv < 0) { + break; + } + VIR_INFO("fd %d pipe size adjusted to %d", fd, sz); + return; + } + virReportSystemError(errno, "%s", _("unable to set pipe size, data transfer might be slow")); +} + +#else /* !__linux__ */ +static void virFileWrapperSetPipeSize(int fd) +{ + return; +} +#endif /* !__linux__ */ ? Claudio
+ } + if (output) { virCommandSetInputFD(ret->cmd, pipefd[0]); virCommandSetOutputFD(ret->cmd, fd);
With regards, Daniel

On Fri, Mar 25, 2022 at 02:52:05PM +0100, Claudio Fontana wrote:
On 3/25/22 2:13 PM, Daniel P. Berrangé wrote:
On Fri, Mar 25, 2022 at 01:54:51PM +0100, Claudio Fontana wrote:
currently the only user of virFileWrapperFdNew is the qemu driver; virsh save is very slow with a default pipe size. This change improves throughput by ~400% on fast nvme or ramdisk.
Best value currently measured is 1MB, which happens to be also the kernel default for the pipe-max-size.
Signed-off-by: Claudio Fontana <cfontana@suse.de> ---
see v2 at https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
Changes v2 -> v3:
* removed reading of max-pipe-size from procfs, instead make multiple attempts on EPERM with smaller sizes. In the regular case, this should succeed on the first try. (Daniel)
Changes v1 -> v2:
* removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing unconditional (Michal)
* moved code to separate functions (Michal)
* removed ternary op, disliked in libvirt (Michal)
* added #ifdef __linux__ (Ani Sinha)
* try smallest value between currently best measured value (1MB) and the pipe-max-size setting. If pipe-max-size cannot be read, try kernel default max (1MB). (Daniel)
src/util/virfile.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+)
diff --git a/src/util/virfile.c b/src/util/virfile.c index a04f888e06..876b865974 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -201,6 +201,51 @@ struct _virFileWrapperFd { };
#ifndef WIN32 + +#ifdef __linux__ + +/** + * virFileWrapperSetPipeSize: + * @fd: the fd of the pipe + * + * Set best pipe size on the passed file descriptor for bulk transfers of data. + * + * default pipe size (usually 64K) is generally not suited for large transfers + * to fast devices. A value of 1MB has been measured to improve virsh save + * by 400% in ideal conditions. We retry multiple times with smaller sizes + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size. + * + * Return value is 0 on success, -1 and errno set on error. + * OS note: only for linux, on other OS this is a no-op. + */ +static int +virFileWrapperSetPipeSize(int fd) +{ + int sz; + + for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) { + int rv = fcntl(fd, F_SETPIPE_SZ, sz); + if (rv < 0 && errno == EPERM) { + continue; /* retry with half the size */ + } + if (rv < 0) { + break; + } + VIR_INFO("fd %d pipe size adjusted to %d", fd, sz); + return 0; + } + VIR_WARN("failed to set pipe size to %d (errno=%d)", sz, errno); + return -1; +} + +#else /* !__linux__ */ +static int virFileWrapperSetPipeSize(int fd) +{ + return 0; +} +#endif /* !__linux__ */ + + /** * virFileWrapperFdNew: * @fd: pointer to fd to wrap @@ -282,6 +327,10 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags)
ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
+ if (virFileWrapperSetPipeSize(pipefd[!output]) < 0) { + virReportError(VIR_ERR_SYSTEM_ERROR, "%s", _("unable to set pipe size, data transfer might be slow"));
Push this into virFileWrapperSetPipeSize instead of the VIR_WARN there, and use virReportSystemError passing in the errno value too.
ok, what about also warning on EPERM? In the normal case we should succeed on the first try I think.
We generally try to avoid any VIR_WARN in cases that we expect to be still functional. Users tend to complain when they get warnings for these kind of things. I think coping with smaller max size is a normal situation, so its merely a perf factor, not a functional problem. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 3/25/22 2:54 PM, Daniel P. Berrangé wrote:
On Fri, Mar 25, 2022 at 02:52:05PM +0100, Claudio Fontana wrote:
On 3/25/22 2:13 PM, Daniel P. Berrangé wrote:
On Fri, Mar 25, 2022 at 01:54:51PM +0100, Claudio Fontana wrote:
currently the only user of virFileWrapperFdNew is the qemu driver; virsh save is very slow with a default pipe size. This change improves throughput by ~400% on fast nvme or ramdisk.
Best value currently measured is 1MB, which happens to be also the kernel default for the pipe-max-size.
Signed-off-by: Claudio Fontana <cfontana@suse.de> ---
see v2 at https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
Changes v2 -> v3:
* removed reading of max-pipe-size from procfs, instead make multiple attempts on EPERM with smaller sizes. In the regular case, this should succeed on the first try. (Daniel)
Changes v1 -> v2:
* removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing unconditional (Michal)
* moved code to separate functions (Michal)
* removed ternary op, disliked in libvirt (Michal)
* added #ifdef __linux__ (Ani Sinha)
* try smallest value between currently best measured value (1MB) and the pipe-max-size setting. If pipe-max-size cannot be read, try kernel default max (1MB). (Daniel)
src/util/virfile.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+)
diff --git a/src/util/virfile.c b/src/util/virfile.c index a04f888e06..876b865974 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -201,6 +201,51 @@ struct _virFileWrapperFd { };
#ifndef WIN32 + +#ifdef __linux__ + +/** + * virFileWrapperSetPipeSize: + * @fd: the fd of the pipe + * + * Set best pipe size on the passed file descriptor for bulk transfers of data. + * + * default pipe size (usually 64K) is generally not suited for large transfers + * to fast devices. A value of 1MB has been measured to improve virsh save + * by 400% in ideal conditions. We retry multiple times with smaller sizes + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size. + * + * Return value is 0 on success, -1 and errno set on error. + * OS note: only for linux, on other OS this is a no-op. + */ +static int +virFileWrapperSetPipeSize(int fd) +{ + int sz; + + for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) { + int rv = fcntl(fd, F_SETPIPE_SZ, sz); + if (rv < 0 && errno == EPERM) { + continue; /* retry with half the size */ + } + if (rv < 0) { + break; + } + VIR_INFO("fd %d pipe size adjusted to %d", fd, sz); + return 0; + } + VIR_WARN("failed to set pipe size to %d (errno=%d)", sz, errno); + return -1; +} + +#else /* !__linux__ */ +static int virFileWrapperSetPipeSize(int fd) +{ + return 0; +} +#endif /* !__linux__ */ + + /** * virFileWrapperFdNew: * @fd: pointer to fd to wrap @@ -282,6 +327,10 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags)
ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
+ if (virFileWrapperSetPipeSize(pipefd[!output]) < 0) { + virReportError(VIR_ERR_SYSTEM_ERROR, "%s", _("unable to set pipe size, data transfer might be slow"));
Push this into virFileWrapperSetPipeSize instead of the VIR_WARN there, and use virReportSystemError passing in the errno value too.
ok, what about also warning on EPERM? In the normal case we should succeed on the first try I think.
We generally try to avoid any VIR_WARN in cases that we expect to be still functional. Users tend to complain when they get warnings for
INFO? DEBUG? Or nothing at all? Thanks again
these kind of things. I think coping with smaller max size is a normal situation, so its merely a perf factor, not a functional problem.
With regards, Daniel

On Fri, Mar 25, 2022 at 02:57:14PM +0100, Claudio Fontana wrote:
On 3/25/22 2:54 PM, Daniel P. Berrangé wrote:
On Fri, Mar 25, 2022 at 02:52:05PM +0100, Claudio Fontana wrote:
On 3/25/22 2:13 PM, Daniel P. Berrangé wrote:
On Fri, Mar 25, 2022 at 01:54:51PM +0100, Claudio Fontana wrote:
currently the only user of virFileWrapperFdNew is the qemu driver; virsh save is very slow with a default pipe size. This change improves throughput by ~400% on fast nvme or ramdisk.
Best value currently measured is 1MB, which happens to be also the kernel default for the pipe-max-size.
Signed-off-by: Claudio Fontana <cfontana@suse.de> ---
see v2 at https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
Changes v2 -> v3:
* removed reading of max-pipe-size from procfs, instead make multiple attempts on EPERM with smaller sizes. In the regular case, this should succeed on the first try. (Daniel)
Changes v1 -> v2:
* removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing unconditional (Michal)
* moved code to separate functions (Michal)
* removed ternary op, disliked in libvirt (Michal)
* added #ifdef __linux__ (Ani Sinha)
* try smallest value between currently best measured value (1MB) and the pipe-max-size setting. If pipe-max-size cannot be read, try kernel default max (1MB). (Daniel)
src/util/virfile.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+)
diff --git a/src/util/virfile.c b/src/util/virfile.c index a04f888e06..876b865974 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -201,6 +201,51 @@ struct _virFileWrapperFd { };
#ifndef WIN32 + +#ifdef __linux__ + +/** + * virFileWrapperSetPipeSize: + * @fd: the fd of the pipe + * + * Set best pipe size on the passed file descriptor for bulk transfers of data. + * + * default pipe size (usually 64K) is generally not suited for large transfers + * to fast devices. A value of 1MB has been measured to improve virsh save + * by 400% in ideal conditions. We retry multiple times with smaller sizes + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size. + * + * Return value is 0 on success, -1 and errno set on error. + * OS note: only for linux, on other OS this is a no-op. + */ +static int +virFileWrapperSetPipeSize(int fd) +{ + int sz; + + for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) { + int rv = fcntl(fd, F_SETPIPE_SZ, sz); + if (rv < 0 && errno == EPERM) { + continue; /* retry with half the size */ + } + if (rv < 0) { + break; + } + VIR_INFO("fd %d pipe size adjusted to %d", fd, sz); + return 0; + } + VIR_WARN("failed to set pipe size to %d (errno=%d)", sz, errno); + return -1; +} + +#else /* !__linux__ */ +static int virFileWrapperSetPipeSize(int fd) +{ + return 0; +} +#endif /* !__linux__ */ + + /** * virFileWrapperFdNew: * @fd: pointer to fd to wrap @@ -282,6 +327,10 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags)
ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
+ if (virFileWrapperSetPipeSize(pipefd[!output]) < 0) { + virReportError(VIR_ERR_SYSTEM_ERROR, "%s", _("unable to set pipe size, data transfer might be slow"));
Push this into virFileWrapperSetPipeSize instead of the VIR_WARN there, and use virReportSystemError passing in the errno value too.
ok, what about also warning on EPERM? In the normal case we should succeed on the first try I think.
We generally try to avoid any VIR_WARN in cases that we expect to be still functional. Users tend to complain when they get warnings for
INFO? DEBUG? Or nothing at all? Thanks again
I think DEBUG is sufficient for any logging in this code. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
participants (2)
-
Claudio Fontana
-
Daniel P. Berrangé