[libvirt RFC v2] virfile: set pipe size in virFileWrapperFdNew to improve throughput

virsh save is very slow with a default pipe size, so set a larger one.

This change improves throughput by ~400% on fast nvme or ramdisk,
for the current only user of virFileWrapperFdNew: the qemu driver.

Best value currently measured is 1MB, which happens to be also
the kernel default for the pipe-max-size.

We do not try to use a pipe buffer larger than what the setting
of /proc/sys/fs/pipe-max-size currently allows.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 src/util/virfile.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

see v1 at https://listman.redhat.com/archives/libvir-list/2022-March/229252.html

Changes v1 -> v2:

* removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing
  unconditional (Michal)

* moved code to separate functions (Michal)

* removed ternary op, disliked in libvirt (Michal)

* added #ifdef __linux__ (Ani Sinha)

* try smallest value between currently best measured value (1MB)
  and the pipe-max-size setting. If pipe-max-size cannot be read,
  try kernel default max (1MB). (Daniel)

diff --git a/src/util/virfile.c b/src/util/virfile.c
index a04f888e06..13bdd42c68 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -201,6 +201,71 @@ struct _virFileWrapperFd {
 };
 
 #ifndef WIN32
+
+#ifdef __linux__
+/**
+ * virFileWrapperGetBestPipeSize:
+ *
+ * get the best pipe size to use with virFileWrapper.
+ *
+ * We first check the maximum we are allowed by the system pipe-max-size,
+ * and then use the minimum between that and our tested best value.
+ * This is because a request beyond pipe-max-size may fail with EPERM.
+ * If we are unable to read pipe-max-size, use the kernel default (1MB).
+ *
+ * Return value is the pipe size to use.
+ */
+
+static int virFileWrapperGetBestPipeSize(void)
+{
+    const char path[] = "/proc/sys/fs/pipe-max-size";
+    int best_sz = 1024 * 1024; /* good virsh save results with this size */
+    int max_sz;
+
+    if (virFileReadValueInt(&max_sz, path) < 0) {
+        max_sz = 1024 * 1024; /* this is the kernel default pipe-max-size */
+        VIR_WARN("failed to read %s, trying default %d", path, max_sz);
+    } else if (max_sz > best_sz) {
+        max_sz = best_sz;
+    }
+    return max_sz;
+}
+
+/**
+ * virFileWrapperSetPipeSize:
+ * @fd: the fd of the pipe
+ *
+ * Set best pipe size on the passed file descriptor for bulk transfers of data.
+ *
+ * default pipe size (usually 64K) is generally not suited for large transfers
+ * to fast devices. This has been measured to improve virsh save by 400%
+ * in ideal conditions.
+ *
+ * Return value is 0 on success, -1 and errno set on error.
+ * OS note: only for linux, on other OS this is a no-op.
+ */
+static int
+virFileWrapperSetPipeSize(int fd)
+{
+    int pipe_sz = virFileWrapperGetBestPipeSize();
+    int rv = fcntl(fd, F_SETPIPE_SZ, pipe_sz);
+
+    if (rv < 0) {
+        VIR_ERROR(_("failed to set pipe size to %d (errno=%d)"), pipe_sz, errno);
+        return -1;
+    }
+    VIR_INFO("fd %d pipe size adjusted to %d", fd, rv);
+    return 0;
+}
+
+#else /* !__linux__ */
+static int virFileWrapperSetPipeSize(int fd)
+{
+    return 0;
+}
+#endif /* !__linux__ */
+
+
 /**
  * virFileWrapperFdNew:
  * @fd: pointer to fd to wrap
@@ -282,6 +347,10 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags)
 
     ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
 
+    if (virFileWrapperSetPipeSize(pipefd[!output]) < 0) {
+        virReportError(VIR_ERR_SYSTEM_ERROR, "%s", _("unable to set pipe size, data transfer might be slow"));
+    }
+
     if (output) {
         virCommandSetInputFD(ret->cmd, pipefd[0]);
         virCommandSetOutputFD(ret->cmd, fd);
-- 
2.35.1
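The clamping described in the commit message above can be tried outside libvirt. The following is only a minimal sketch: read_pipe_max_size, clamp_pipe_size and resize_pipe are hypothetical names that are not part of the patch, and plain stdio stands in for libvirt's virFileReadValueInt.

```c
#define _GNU_SOURCE /* exposes F_SETPIPE_SZ on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: system-wide pipe size limit, or -1 if unreadable. */
static long read_pipe_max_size(void)
{
    long val = -1;
    FILE *f = fopen("/proc/sys/fs/pipe-max-size", "r");

    if (!f) {
        return -1;
    }
    if (fscanf(f, "%ld", &val) != 1) {
        val = -1;
    }
    fclose(f);
    return val;
}

/* Hypothetical helper: use the limit only when it is known and smaller. */
static long clamp_pipe_size(long wanted, long max_sz)
{
    if (max_sz > 0 && max_sz < wanted) {
        return max_sz;
    }
    return wanted;
}

/* Returns the capacity now in effect, or -1 with errno set.
 * Outside Linux this is a no-op that returns 0. */
static int resize_pipe(int fd, long wanted)
{
    long sz = clamp_pipe_size(wanted, read_pipe_max_size());

    assert(fd >= 0);
#ifdef __linux__
    return fcntl(fd, F_SETPIPE_SZ, (int)sz);
#else
    (void)sz;
    return 0;
#endif
}
```

On Linux, fcntl(F_SETPIPE_SZ) returns the capacity actually set, which may be rounded up from the requested value, so a caller would log the return value rather than the request.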

Hi,

a bit of an early ping on this, just wondering if the general direction of the changes seems ok.

Thanks,

Claudio

On Mon, Mar 21, 2022 at 09:13:20AM +0100, Claudio Fontana wrote:
 #ifndef WIN32
+
+#ifdef __linux__
+/**
+ * virFileWrapperGetBestPipeSize:
+ *
+ * get the best pipe size to use with virFileWrapper.
+ *
+ * We first check the maximum we are allowed by the system pipe-max-size,
+ * and then use the minimum between that and our tested best value.
+ * This is because a request beyond pipe-max-size may fail with EPERM.
+ * If we are unable to read pipe-max-size, use the kernel default (1MB).
+ *
+ * Return value is the pipe size to use.
+ */
+
+static int virFileWrapperGetBestPipeSize(void)
+{
+    const char path[] = "/proc/sys/fs/pipe-max-size";
+    int best_sz = 1024 * 1024; /* good virsh save results with this size */
+    int max_sz;
+
+    if (virFileReadValueInt(&max_sz, path) < 0) {
+        max_sz = 1024 * 1024; /* this is the kernel default pipe-max-size */
+        VIR_WARN("failed to read %s, trying default %d", path, max_sz);
+    } else if (max_sz > best_sz) {
+        max_sz = best_sz;
+    }
+    return max_sz;
+}
+
+/**
+ * virFileWrapperSetPipeSize:
+ * @fd: the fd of the pipe
+ *
+ * Set best pipe size on the passed file descriptor for bulk transfers of data.
+ *
+ * default pipe size (usually 64K) is generally not suited for large transfers
+ * to fast devices. This has been measured to improve virsh save by 400%
+ * in ideal conditions.
+ *
+ * Return value is 0 on success, -1 and errno set on error.
+ * OS note: only for linux, on other OS this is a no-op.
+ */
+static int
+virFileWrapperSetPipeSize(int fd)
+{
+    int pipe_sz = virFileWrapperGetBestPipeSize();

I wonder if we shouldn't just ignore the proc setting and instead

    for (sz = 1024 * 1024 ; sz >= 64 * 1024; sz /= 2) {
        int rv = fcntl(fd, F_SETPIPE_SZ, sz);
        if (rv < 0 && errno == EPERM) {
            continue;
        }
        if (rv < 0) {
            virReportError(...)
            return -1;
        }
        VIR_INFO("fd %d pipe size adjusted to %d", fd, sz);
        return 0;
    }

We'll only have 1 loop iteration in the default case, and 5 iterations
in the worst case, and gracefully leave it on the default if the last
iteration fails.

With regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
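The fallback loop suggested above can be written as a standalone function for experimenting outside libvirt. This is a sketch under assumptions, not the code that was merged: set_pipe_size_with_fallback is a hypothetical name, fprintf stands in for virReportError and VIR_INFO, and returning the resulting capacity (rather than 0) is a choice made here so the caller can see what the kernel granted.

```c
#define _GNU_SOURCE /* exposes F_SETPIPE_SZ on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical standalone variant of the suggested loop.
 * Returns the capacity that was set, 0 if every attempt failed with
 * EPERM (the pipe keeps its default size), or -1 on any other error.
 * Outside Linux it does nothing and returns 0.
 */
static int set_pipe_size_with_fallback(int fd)
{
    assert(fd >= 0);
#ifdef __linux__
    int sz;

    /* Tries 1 MiB, 512 KiB, 256 KiB, 128 KiB, 64 KiB in that order. */
    for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) {
        int rv = fcntl(fd, F_SETPIPE_SZ, sz);

        if (rv < 0 && errno == EPERM) {
            continue; /* above the allowed limit, try a smaller size */
        }
        if (rv < 0) {
            fprintf(stderr, "F_SETPIPE_SZ(%d) failed: errno=%d\n", sz, errno);
            return -1;
        }
        return rv; /* capacity actually set by the kernel */
    }
#endif
    return 0;
}
```

Compared with reading /proc/sys/fs/pipe-max-size first, this variant needs no access to procfs and simply lets the kernel reject sizes that are too large.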

On 3/25/22 11:41 AM, Daniel P. Berrangé wrote:
I wonder if we shouldn't just ignore the proc setting and instead

    for (sz = 1024 * 1024 ; sz >= 64 * 1024; sz /= 2) {
        int rv = fcntl(fd, F_SETPIPE_SZ, sz);
        if (rv < 0 && errno == EPERM) {
            continue;
        }
        if (rv < 0) {
            virReportError(...)
            return -1;
        }
        VIR_INFO("fd %d pipe size adjusted to %d", fd, sz);
        return 0;
    }

We'll only have 1 loop iteration in the default case, and 5 iterations
in the worst case, and gracefully leave it on the default if the last
iteration fails.

With regards,
Daniel

Yes, seems better to me,

Claudio
participants (2)
- Claudio Fontana
- Daniel P. Berrangé