Hello Michal,
thanks for your answer,
On 3/14/22 12:36 PM, Michal Prívozník wrote:
On 3/12/22 17:30, Claudio Fontana wrote:
> the first user is the qemu driver,
>
> virsh save/resume would slow to a crawl with a default pipe size (64k).
>
> This improves the situation by 400%.
>
> Going through io_helper still seems to incur in some penalty (~15%-ish)
> compared with direct qemu migration to a nc socket to a file.
>
> Signed-off-by: Claudio Fontana <cfontana(a)suse.de>
> ---
> src/qemu/qemu_driver.c | 6 +++---
> src/qemu/qemu_saveimage.c | 11 ++++++-----
> src/util/virfile.c | 12 ++++++++++++
> src/util/virfile.h | 1 +
> 4 files changed, 22 insertions(+), 8 deletions(-)
>
> Hello, I initially thought this to be a qemu performance issue,
> so you can find the discussion about this in qemu-devel:
>
> "Re: bad virsh save /dev/null performance (600 MiB/s max)"
>
>
https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg03142.html
>
> RFC since need to validate idea, and it is only lightly tested:
>
> save - about 400% benefit in throughput, getting around 20 Gbps to /dev/null,
> and around 13 Gbps to a ramdisk.
> By comparison, direct qemu migration to a nc socket is around 24Gbps.
>
> restore - not tested, _should_ also benefit in the "bypass_cache" case
> coredump - not tested, _should_ also benefit like for save
>
> Thanks for your comments and review,
>
> Claudio
Hey, I like this idea, but couple of points below.
>
>
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index c1b3bd8536..be248c1e92 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -3044,7 +3044,7 @@ doCoreDump(virQEMUDriver *driver,
> virFileWrapperFd *wrapperFd = NULL;
> int directFlag = 0;
> bool needUnlink = false;
> - unsigned int flags = VIR_FILE_WRAPPER_NON_BLOCKING;
> + unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING |
VIR_FILE_WRAPPER_BIG_PIPE;
> const char *memory_dump_format = NULL;
> g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver);
> g_autoptr(virCommand) compressor = NULL;
> @@ -3059,7 +3059,7 @@ doCoreDump(virQEMUDriver *driver,
>
> /* Create an empty file with appropriate ownership. */
> if (dump_flags & VIR_DUMP_BYPASS_CACHE) {
> - flags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
> + wrapperFlags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
> directFlag = virFileDirectFdFlag();
> if (directFlag < 0) {
> virReportError(VIR_ERR_OPERATION_FAILED, "%s",
> @@ -3072,7 +3072,7 @@ doCoreDump(virQEMUDriver *driver,
> &needUnlink)) < 0)
> goto cleanup;
>
> - if (!(wrapperFd = virFileWrapperFdNew(&fd, path, flags)))
> + if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> goto cleanup;
>
> if (dump_flags & VIR_DUMP_MEMORY_ONLY) {
> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
> index c0139041eb..1b522a1542 100644
> --- a/src/qemu/qemu_saveimage.c
> +++ b/src/qemu/qemu_saveimage.c
> @@ -267,7 +267,7 @@ qemuSaveImageCreate(virQEMUDriver *driver,
> int fd = -1;
> int directFlag = 0;
> virFileWrapperFd *wrapperFd = NULL;
> - unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING;
> + unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING |
VIR_FILE_WRAPPER_BIG_PIPE;
>
> /* Obtain the file handle. */
> if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
> @@ -463,10 +463,11 @@ qemuSaveImageOpen(virQEMUDriver *driver,
> if ((fd = qemuDomainOpenFile(cfg, NULL, path, oflags, NULL)) < 0)
> return -1;
>
> - if (bypass_cache &&
> - !(*wrapperFd = virFileWrapperFdNew(&fd, path,
> - VIR_FILE_WRAPPER_BYPASS_CACHE)))
> - return -1;
> + if (bypass_cache) {
> + unsigned int wrapperFlags = VIR_FILE_WRAPPER_BYPASS_CACHE |
VIR_FILE_WRAPPER_BIG_PIPE;
> + if (!(*wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> + return -1;
> + }
>
> data = g_new0(virQEMUSaveData, 1);
>
> diff --git a/src/util/virfile.c b/src/util/virfile.c
> index a04f888e06..fdacd17890 100644
> --- a/src/util/virfile.c
> +++ b/src/util/virfile.c
> @@ -282,6 +282,18 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int
flags)
>
> ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
>
> + if (flags & VIR_FILE_WRAPPER_BIG_PIPE) {
I believe we don't need this flag. I mean, the plain fact that
virFileWrapper is used means that caller wants to avoid VFS because it's
interested in speed. Therefore, this code could be done unconditionally.
right, I see now this is called only by the qemu driver for those specific operations.
> + /*
> + * virsh save/resume would slow to a crawl with a default pipe size (usually
64k).
> + * This improves the situation by 400%, although going through io_helper
still incurs
> + * in a performance penalty compared with a direct qemu migration to a
socket.
> + */
This belongs into the commit message. This code has no knowledge about
qemu. What you can mention here is the performance benefit.
Ok. Just FTR, this code seems used by qemu only, but I see how this might change in the
future potentially.
Also, QEMU migrating straight to a socket is going to have
performance
benefit but only in a few cases, because if it's a migration into a file
then VFS (and thus caching) is involved. Thus, if you migrate into a
file and have enough free RAM for caches then yes, it's going to be
faster. But if you don't have free RAM then it's going to be way slower.
Yes, this is from a very specific requirement from the field.
> + int pipe_sz, rv = virFileReadValueInt(&pipe_sz,
"/proc/sys/fs/pipe-max-size");
> + if (rv != 0) {
> + pipe_sz = 1024 * 1024; /* common default for pipe-max-size */
> + }
> + fcntl(pipefd[output ? 0 : 1], F_SETPIPE_SZ, pipe_sz);
Alternative implementation would be to call fcntl() only if we know
we've succeeded in reading /proc/.../pipe-max-size, like this:
My implementation fails gracefully, ie if for any reason we cannot read pipe-max-size,
we could still succeed (maybe with a warning), with a reasonable default.
No need to fail hard on this.
int pipe_sz;
int rv = virFileReadValueInt(&pipe_size, "/proc/sys/fs/pipe-max-size");
if (rv >= 0)
fcntl(pipefd[output ? 0 : 1], F_SETPIPE_SZ, pipe_sz);
(notice I've declared variables on separate lines, we like it that way)
Now, what can we do about that ternary operator? It doesn't look nice.
But I guess we can hardcode just one end of the pipe, because it's the
actual pipe and not FD we are modifying here.
Lastly, let's add some error checking:
int pipe_sz;
int rv = virFileReadValueInt(&pipe_size, "/proc/sys/fs/pipe-max-size");
if (rv == -2) {
/* file doesn't exist */
} else if (rv < 0) {
/* Unable to read/parse file */
goto error;
} else {
if (fcntl(pipefd[0], F_SETPIPE_SZ, pipe_sz) < 0) {
virReportSystemError(errno, _("Unable to set pipe size to %d"),
pipe_sze);
goto error;
}
Even better, move this fcntl() or even the whole code into a separate
function (virPipe() lives under src/util/virutil.c) so that it can be
reused.
Michal
hmm I doubt you want this for _every_ pipe, only for the ones.. you need a wrapper for as
per your initial comment above?
Thanks,
Claudio