On 3/25/22 2:13 PM, Daniel P. Berrangé wrote:
On Fri, Mar 25, 2022 at 01:54:51PM +0100, Claudio Fontana wrote:
> currently the only user of virFileWrapperFdNew is the qemu driver;
> virsh save is very slow with a default pipe size.
> This change improves throughput by ~400% on fast nvme or ramdisk.
>
> Best value currently measured is 1MB, which happens to be also
> the kernel default for the pipe-max-size.
>
> Signed-off-by: Claudio Fontana <cfontana(a)suse.de>
> ---
>
> see v2 at
>
https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
>
> Changes v2 -> v3:
>
> * removed reading of max-pipe-size from procfs,
> instead make multiple attempts on EPERM with smaller sizes.
> In the regular case, this should succeed on the first try.
> (Daniel)
>
> Changes v1 -> v2:
>
> * removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing
> unconditional (Michal)
>
> * moved code to separate functions (Michal)
>
> * removed ternary op, disliked in libvirt (Michal)
>
> * added #ifdef __linux__ (Ani Sinha)
>
> * try smallest value between currently best measured value (1MB)
> and the pipe-max-size setting. If pipe-max-size cannot be read,
> try kernel default max (1MB). (Daniel)
>
>
> src/util/virfile.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 49 insertions(+)
>
> diff --git a/src/util/virfile.c b/src/util/virfile.c
> index a04f888e06..876b865974 100644
> --- a/src/util/virfile.c
> +++ b/src/util/virfile.c
> @@ -201,6 +201,51 @@ struct _virFileWrapperFd {
> };
>
> #ifndef WIN32
> +
> +#ifdef __linux__
> +
> +/**
> + * virFileWrapperSetPipeSize:
> + * @fd: the fd of the pipe
> + *
> + * Set best pipe size on the passed file descriptor for bulk transfers of data.
> + *
> + * default pipe size (usually 64K) is generally not suited for large transfers
> + * to fast devices. A value of 1MB has been measured to improve virsh save
> + * by 400% in ideal conditions. We retry multiple times with smaller sizes
> + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size.
> + *
> + * Return value is 0 on success, -1 and errno set on error.
> + * OS note: only for linux, on other OS this is a no-op.
> + */
> +static int
> +virFileWrapperSetPipeSize(int fd)
> +{
> + int sz;
> +
> + for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) {
> + int rv = fcntl(fd, F_SETPIPE_SZ, sz);
> + if (rv < 0 && errno == EPERM) {
> + continue; /* retry with half the size */
> + }
> + if (rv < 0) {
> + break;
> + }
> + VIR_INFO("fd %d pipe size adjusted to %d", fd, sz);
> + return 0;
> + }
> + VIR_WARN("failed to set pipe size to %d (errno=%d)", sz, errno);
> + return -1;
> +}
> +
> +#else /* !__linux__ */
> +static int virFileWrapperSetPipeSize(int fd)
> +{
> + return 0;
> +}
> +#endif /* !__linux__ */
> +
> +
> /**
> * virFileWrapperFdNew:
> * @fd: pointer to fd to wrap
> @@ -282,6 +327,10 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int
flags)
>
> ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
>
> + if (virFileWrapperSetPipeSize(pipefd[!output]) < 0) {
> + virReportError(VIR_ERR_SYSTEM_ERROR, "%s", _("unable to set
pipe size, data transfer might be slow"));
Push this into virFileWrapperSetPipeSize instead of the VIR_WARN
there, and use virReportSystemError passing in the errno value too.
ok, what about also warning on EPERM? In the normal case we should succeed on the first
try I think.
+static void
+virFileWrapperSetPipeSize(int fd)
+{
+ int sz;
+
+ for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) {
+ int rv = fcntl(fd, F_SETPIPE_SZ, sz);
+ if (rv < 0 && errno == EPERM) {
+ VIR_WARN("EPERM trying to set fd %d pipe size to %d", fd, sz);
+ continue; /* retry with half the size */
+ }
+ if (rv < 0) {
+ break;
+ }
+ VIR_INFO("fd %d pipe size adjusted to %d", fd, sz);
+ return;
+ }
+ virReportSystemError(errno, "%s", _("unable to set pipe size, data
transfer might be slow"));
+}
+
+#else /* !__linux__ */
+static void virFileWrapperSetPipeSize(int fd)
+{
+ return;
+}
+#endif /* !__linux__ */
?
Claudio
> + }
> +
> if (output) {
> virCommandSetInputFD(ret->cmd, pipefd[0]);
> virCommandSetOutputFD(ret->cmd, fd);
With regards,
Daniel