[libvirt RFCv8 00/27] multifd save restore prototype

This is v8 of the multifd save prototype, which fixes a few bugs, adds a few
more code splits, and records the number of channels as well as the
compression algorithm, so the restore command is more user-friendly.

It is now possible to just say:

  virsh save mydomain /mnt/saves/mysave --parallel
  virsh restore /mnt/saves/mysave --parallel

and things work with the default of 2 channels, no compression.
It is also possible to say of course:

  virsh save mydomain /mnt/saves/mysave --parallel \
      --parallel-connections 16 --parallel-compression zstd
  virsh restore /mnt/saves/mysave --parallel

and things also work fine, since the number of channels and the compression
algorithm are stored in the main save file.

---

changes from v7:

* [ base params API and iohelper refactoring upstreamed ]

* extended the QEMU save image format further, to record the number of
  multifd channels on save. Made the data header struct packed.

* removed --parallel-connections from the restore command, as it is now
  unnecessary thanks to the QEMU save image format extension.

* separated out patches to expose migration_params APIs to saveimage,
  including qemuMigrationParamsSetString, SetCap, SetInt.

* fixed bugs in the ImageOpen patch (missing saveFd init), removed some
  whitespace, and fixed some convoluted code paths for return value -3.

Claudio Fontana (27):
  multifd-helper: new helper for parallel save/restore
  tools: prepare doSave to use parameters
  tools: prepare cmdRestore to use parameters
  libvirt: add new VIR_DOMAIN_SAVE_PARALLEL flag and parameter
  qemu: add stub support for VIR_DOMAIN_SAVE_PARALLEL in save
  qemu: add stub support for VIR_DOMAIN_SAVE_PARALLEL in restore
  qemu: saveimage: introduce virQEMUSaveFd
  qemu: saveimage: convert qemuSaveImageCreate to use virQEMUSaveFd
  qemu: saveimage: convert qemuSaveImageOpen to use virQEMUSaveFd
  qemu: saveimage: add virQEMUSaveFd APIs for multifd
  qemu: saveimage: wire up saveimage code with the multifd helper
  qemu: capabilities: add multifd to the probed migration capabilities
  qemu: saveimage: add multifd related fields to save format
  qemu: migration_params: add APIs to set Int and Cap
  qemu: migration: implement qemuMigrationSrcToFilesMultiFd
  qemu: add parameter to qemuMigrationDstRun to skip waiting
  qemu: implement qemuSaveImageLoadMultiFd
  tools: add parallel parameter to virsh save command
  tools: add parallel parameter to virsh restore command
  qemu: add migration parameter multifd-compression
  libvirt: add new VIR_SAVE_PARAM_PARALLEL_COMPRESSION
  qemu: saveimage: add parallel compression argument to ImageCreate
  qemu: saveimage: add stub support for multifd compression parameter
  qemu: migration: expose qemuMigrationParamsSetString
  qemu: saveimage: implement multifd-compression in parallel save
  qemu: saveimage: restore compressed parallel images
  tools: add parallel-compression parameter to virsh save command

 docs/manpages/virsh.rst                       |  39 +-
 include/libvirt/libvirt-domain.h              |  29 +
 po/POTFILES.in                                |   1 +
 src/libvirt_private.syms                      |   1 +
 src/qemu/qemu_capabilities.c                  |   6 +
 src/qemu/qemu_capabilities.h                  |   4 +
 src/qemu/qemu_driver.c                        | 140 +++--
 src/qemu/qemu_migration.c                     | 160 +++--
 src/qemu/qemu_migration.h                     |  16 +-
 src/qemu/qemu_migration_params.c              |  71 ++-
 src/qemu/qemu_migration_params.h              |  15 +
 src/qemu/qemu_process.c                       |   3 +-
 src/qemu/qemu_process.h                       |   5 +-
 src/qemu/qemu_saveimage.c                     | 557 ++++++++++++++----
 src/qemu/qemu_saveimage.h                     |  57 +-
 src/qemu/qemu_snapshot.c                      |   6 +-
 src/util/meson.build                          |  16 +
 src/util/multifd-helper.c                     | 249 ++++++++
 src/util/virthread.c                          |   5 +
 src/util/virthread.h                          |   1 +
 .../caps_4.0.0.aarch64.xml                    |   1 +
 .../qemucapabilitiesdata/caps_4.0.0.ppc64.xml |   1 +
 .../caps_4.0.0.riscv32.xml                    |   1 +
 .../caps_4.0.0.riscv64.xml                    |   1 +
 .../qemucapabilitiesdata/caps_4.0.0.s390x.xml |   1 +
 .../caps_4.0.0.x86_64.xml                     |   1 +
 .../caps_4.1.0.x86_64.xml                     |   1 +
 .../caps_4.2.0.aarch64.xml                    |   1 +
 .../qemucapabilitiesdata/caps_4.2.0.ppc64.xml |   1 +
 .../qemucapabilitiesdata/caps_4.2.0.s390x.xml |   1 +
 .../caps_4.2.0.x86_64.xml                     |   1 +
 .../caps_5.0.0.aarch64.xml                    |   2 +
 .../qemucapabilitiesdata/caps_5.0.0.ppc64.xml |   2 +
 .../caps_5.0.0.riscv64.xml                    |   2 +
 .../caps_5.0.0.x86_64.xml                     |   2 +
 .../qemucapabilitiesdata/caps_5.1.0.sparc.xml |   2 +
 .../caps_5.1.0.x86_64.xml                     |   2 +
 .../caps_5.2.0.aarch64.xml                    |   2 +
 .../qemucapabilitiesdata/caps_5.2.0.ppc64.xml |   2 +
 .../caps_5.2.0.riscv64.xml                    |   2 +
 .../qemucapabilitiesdata/caps_5.2.0.s390x.xml |   2 +
 .../caps_5.2.0.x86_64.xml                     |   2 +
 .../caps_6.0.0.aarch64.xml                    |   2 +
 .../qemucapabilitiesdata/caps_6.0.0.s390x.xml |   2 +
 .../caps_6.0.0.x86_64.xml                     |   2 +
 .../caps_6.1.0.x86_64.xml                     |   2 +
 .../caps_6.2.0.aarch64.xml                    |   2 +
 .../qemucapabilitiesdata/caps_6.2.0.ppc64.xml |   2 +
 .../caps_6.2.0.x86_64.xml                     |   2 +
 .../caps_7.0.0.aarch64.xml                    |   2 +
 .../qemucapabilitiesdata/caps_7.0.0.ppc64.xml |   2 +
 .../caps_7.0.0.x86_64.xml                     |   2 +
 tools/virsh-domain.c                          | 101 +++-
 53 files changed, 1254 insertions(+), 281 deletions(-)
 create mode 100644 src/util/multifd-helper.c

-- 
2.35.3

For the save direction, this helper listens on a unix socket
which QEMU connects to for multifd migration to files.

For the restore direction, this helper connects to a unix socket
QEMU listens at for multifd migration from files.

The file descriptors are passed as command line parameters.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 po/POTFILES.in            |   1 +
 src/libvirt_private.syms  |   1 +
 src/util/meson.build      |  16 +++
 src/util/multifd-helper.c | 249 ++++++++++++++++++++++++++++++++++++++
 src/util/virthread.c      |   5 +
 src/util/virthread.h      |   1 +
 6 files changed, 273 insertions(+)
 create mode 100644 src/util/multifd-helper.c

diff --git a/po/POTFILES.in b/po/POTFILES.in
index 0d9adb0758..4efb330262 100644
--- a/po/POTFILES.in
+++ b/po/POTFILES.in
@@ -241,6 +241,7 @@
 @SRCDIR@src/storage_file/storage_source_backingstore.c
 @SRCDIR@src/test/test_driver.c
 @SRCDIR@src/util/iohelper.c
+@SRCDIR@src/util/multifd-helper.c
 @SRCDIR@src/util/viralloc.c
 @SRCDIR@src/util/virarptable.c
 @SRCDIR@src/util/viraudit.c
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 97bfca906b..5f2bee985e 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -3427,6 +3427,7 @@ virThreadCreateFull;
 virThreadID;
 virThreadIsSelf;
 virThreadJoin;
+virThreadJoinRet;
 virThreadMaxName;
 virThreadSelf;
 virThreadSelfID;
diff --git a/src/util/meson.build b/src/util/meson.build
index 17755373c8..337e454137 100644
--- a/src/util/meson.build
+++ b/src/util/meson.build
@@ -178,6 +178,11 @@ io_helper_sources = [
   'virfile.c',
 ]
 
+multifd_helper_sources = [
+  'multifd-helper.c',
+  'virfile.c',
+]
+
 virt_util_lib = static_library(
   'virt_util',
   [
@@ -219,6 +224,17 @@ if conf.has('WITH_LIBVIRTD')
       libutil_dep,
     ],
   }
+  virt_helpers += {
+    'name': 'libvirt_multifd_helper',
+    'sources': [
+      files(multifd_helper_sources),
+      dtrace_gen_headers,
+    ],
+    'deps': [
+      acl_dep,
+      libutil_dep,
+    ],
+  }
 endif
 
 util_inc_dir = include_directories('.')
diff --git a/src/util/multifd-helper.c b/src/util/multifd-helper.c
new file mode 100644
index 0000000000..59657578e2
--- /dev/null
+++ b/src/util/multifd-helper.c
@@ -0,0 +1,249 @@
+/*
+ * multifd-helper.c: listens on Unix socket to perform I/O to multiple files
+ *
+ * Copyright (C) 2022 SUSE LLC
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library. If not, see
+ * <http://www.gnu.org/licenses/>.
+ *
+ * This has been written to support QEMU multifd migration to file,
+ * allowing better use of cpu resources to speed up the save/restore.
+ */
+
+#include <config.h>
+
+#include <unistd.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+
+#include "virthread.h"
+#include "virfile.h"
+#include "virerror.h"
+#include "virstring.h"
+#include "virgettext.h"
+
+#define VIR_FROM_THIS VIR_FROM_STORAGE
+
+typedef struct _multiFdConnData multiFdConnData;
+struct _multiFdConnData {
+    int clientfd;
+    int filefd;
+    int oflags;
+    const char *path;
+    virThread tid;
+
+    off_t total;
+};
+
+typedef struct _multiFdThreadArgs multiFdThreadArgs;
+struct _multiFdThreadArgs {
+    int nchannels;
+    multiFdConnData *conn;      /* contains main fd + nchannels */
+    const char *sun_path;       /* unix socket name to use for the server */
+    struct sockaddr_un serv_addr;
+
+    off_t total;
+};
+
+static void clientThreadFunc(void *a)
+{
+    multiFdConnData *c = a;
+    c->total = virFileDiskCopy(c->filefd, c->path, c->clientfd, "socket");
+}
+
+static off_t waitClientThreads(multiFdConnData *conn, int n)
+{
+    int idx;
+    off_t total = 0;
+    for (idx = 0; idx < n; idx++) {
+        multiFdConnData *c = &conn[idx];
+        if (virThreadJoinRet(&c->tid) < 0) {
+            total = -1;
+        } else if (total >= 0) {
+            total += c->total;
+        }
+        if (VIR_CLOSE(c->clientfd) < 0) {
+            total = -1;
+        }
+    }
+    return total;
+}
+
+static void loadThreadFunc(void *a)
+{
+    multiFdThreadArgs *args = a;
+    int idx;
+    args->total = -1;
+
+    for (idx = 0; idx < args->nchannels + 1; idx++) {
+        /* Perform outgoing connections */
+        multiFdConnData *c = &args->conn[idx];
+        c->clientfd = socket(AF_UNIX, SOCK_STREAM, 0);
+        if (c->clientfd < 0) {
+            virReportSystemError(errno, "%s", _("loadThread: socket() failed"));
+            goto cleanup;
+        }
+        if (connect(c->clientfd, (const struct sockaddr *)&args->serv_addr,
+                    sizeof(struct sockaddr_un)) < 0) {
+            virReportSystemError(errno, "%s", _("loadThread: connect() failed"));
+            goto cleanup;
+        }
+        if (virThreadCreate(&c->tid, true, &clientThreadFunc, c) < 0) {
+            virReportSystemError(errno, "%s", _("loadThread: client thread creation failed"));
+            goto cleanup;
+        }
+    }
+    args->total = waitClientThreads(args->conn, args->nchannels + 1);
+
+ cleanup:
+    for (idx = 0; idx < args->nchannels + 1; idx++) {
+        multiFdConnData *c = &args->conn[idx];
+        VIR_FORCE_CLOSE(c->clientfd);
+    }
+}
+
+static void saveThreadFunc(void *a)
+{
+    multiFdThreadArgs *args = a;
+    int idx;
+    const char buf[1] = {'R'};
+    int sockfd;
+
+    if ((sockfd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
+        virReportSystemError(errno, "%s", _("saveThread: socket() failed"));
+        return;
+    }
+    unlink(args->sun_path);
+    if (bind(sockfd, (struct sockaddr *)&args->serv_addr, sizeof(args->serv_addr)) < 0) {
+        virReportSystemError(errno, "%s", _("saveThread: bind() failed"));
+        goto cleanup;
+    }
+    if (listen(sockfd, args->nchannels + 1) < 0) {
+        virReportSystemError(errno, "%s", _("saveThread: listen() failed"));
+        goto cleanup;
+    }
+
+    /* signal that the server is ready */
+    if (safewrite(STDOUT_FILENO, &buf, 1) != 1) {
+        virReportSystemError(errno, "%s", _("saveThread: safewrite failed"));
+        goto cleanup;
+    }
+
+    for (idx = 0; idx < args->nchannels + 1; idx++) {
+        /* Wait for incoming connection. */
+        multiFdConnData *c = &args->conn[idx];
+        if ((c->clientfd = accept(sockfd, NULL, NULL)) < 0) {
+            virReportSystemError(errno, "%s", _("saveThread: accept() failed"));
+            goto cleanup;
+        }
+        if (virThreadCreate(&c->tid, true, &clientThreadFunc, c) < 0) {
+            virReportSystemError(errno, "%s", _("saveThread: client thread creation failed"));
+            goto cleanup;
+        }
+    }
+
+    args->total = waitClientThreads(args->conn, args->nchannels + 1);
+
+ cleanup:
+    for (idx = 0; idx < args->nchannels + 1; idx++) {
+        multiFdConnData *c = &args->conn[idx];
+        VIR_FORCE_CLOSE(c->clientfd);
+    }
+    if (VIR_CLOSE(sockfd) < 0)
+        args->total = -1;
+}
+
+static const char *program_name;
+
+G_GNUC_NORETURN static void
+usage(int status)
+{
+    if (status) {
+        fprintf(stderr, _("%s: try --help for more details"), program_name);
+    } else {
+        fprintf(stderr, _("Usage: %s UNIX_SOCNAME N MAINFD FD0 FD1 ... FDn"), program_name);
+    }
+    exit(status);
+}
+
+int
+main(int argc, char **argv)
+{
+    virThread tid;
+    virThreadFunc func = saveThreadFunc;
+    multiFdThreadArgs args = { 0 };
+    int idx;
+
+    program_name = argv[0];
+
+    if (virGettextInitialize() < 0 ||
+        virErrorInitialize() < 0) {
+        fprintf(stderr, _("%s: initialization failed"), program_name);
+        exit(EXIT_FAILURE);
+    }
+
+    if (argc > 1 && STREQ(argv[1], "--help"))
+        usage(EXIT_SUCCESS);
+    if (argc < 4)
+        usage(EXIT_FAILURE);
+
+    args.sun_path = argv[1];
+    if (virStrToLong_i(argv[2], NULL, 10, &args.nchannels) < 0) {
+        fprintf(stderr, _("%s: malformed number of channels N %s"), program_name, argv[2]);
+        usage(EXIT_FAILURE);
+    }
+
+    if (argc < 4 + args.nchannels)
+        usage(EXIT_FAILURE);
+
+    args.conn = g_new0(multiFdConnData, args.nchannels + 1);
+
+    for (idx = 3; idx < 3 + args.nchannels + 1; idx++) {
+        multiFdConnData *c = &args.conn[idx - 3];
+
+        if (virStrToLong_i(argv[idx], NULL, 10, &c->filefd) < 0) {
+            fprintf(stderr, _("%s: malformed FD %s"), program_name, argv[idx]);
+            usage(EXIT_FAILURE);
+        }
+#ifndef F_GETFL
+#error "multifd-helper requires F_GETFL parameter of fcntl"
+#endif
+        c->oflags = fcntl(c->filefd, F_GETFL);
+        if ((c->oflags & O_ACCMODE) == O_RDONLY) {
+            func = loadThreadFunc;
+        }
+    }
+
+    /* initialize server address structure */
+    memset(&args.serv_addr, 0, sizeof(args.serv_addr));
+    args.serv_addr.sun_family = AF_UNIX;
+    virStrcpyStatic(args.serv_addr.sun_path, args.sun_path);
+
+    if (virThreadCreate(&tid, true, func, &args) < 0) {
+        virReportSystemError(errno, _("%s: failed to create server thread"), program_name);
+        exit(EXIT_FAILURE);
+    }
+
+    if (virThreadJoinRet(&tid) < 0)
+        exit(EXIT_FAILURE);
+
+    if (args.total < 0)
+        exit(EXIT_FAILURE);
+
+    exit(EXIT_SUCCESS);
+}
diff --git a/src/util/virthread.c b/src/util/virthread.c
index 5422bb74fd..0f6c6a68fa 100644
--- a/src/util/virthread.c
+++ b/src/util/virthread.c
@@ -348,6 +348,11 @@ void virThreadJoin(virThread *thread)
     pthread_join(thread->thread, NULL);
 }
 
+int virThreadJoinRet(virThread *thread)
+{
+    return pthread_join(thread->thread, NULL);
+}
+
 void virThreadCancel(virThread *thread)
 {
     pthread_cancel(thread->thread);
diff --git a/src/util/virthread.h b/src/util/virthread.h
index 23abe0b6c9..5cecb9bd8a 100644
--- a/src/util/virthread.h
+++ b/src/util/virthread.h
@@ -89,6 +89,7 @@ int virThreadCreateFull(virThread *thread,
 void virThreadSelf(virThread *thread);
 bool virThreadIsSelf(virThread *thread);
 void virThreadJoin(virThread *thread);
+int virThreadJoinRet(virThread *thread);
 
 size_t virThreadMaxName(void);

-- 
2.35.3
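Note: a hypothetical invocation matching the usage string above, for a save
with two extra channels (the socket path and fd numbers are purely
illustrative; in practice the fds are pre-opened by the libvirt daemon and
the socket lives under the domain's private directory):

  libvirt_multifd_helper /run/libvirt/qemu/dom-save-multifd.sock 2 13 14 15

Here 13 is MAINFD and 14/15 are the channel fds; the helper derives the
transfer direction from the descriptors' O_ACCMODE, selecting
loadThreadFunc when they are O_RDONLY and saveThreadFunc otherwise.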

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 tools/virsh-domain.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
index ba492e807e..a3b0749fa9 100644
--- a/tools/virsh-domain.c
+++ b/tools/virsh-domain.c
@@ -4203,6 +4203,9 @@ doSave(void *opaque)
     g_autoptr(virshDomain) dom = NULL;
     const char *name = NULL;
     const char *to = NULL;
+    virTypedParameterPtr params = NULL;
+    int nparams = 0;
+    int maxparams = 0;
     unsigned int flags = 0;
     const char *xmlfile = NULL;
     g_autofree char *xml = NULL;
@@ -4216,9 +4219,12 @@ doSave(void *opaque)
         goto out_sig;
 #endif /* !WIN32 */
 
-    if (vshCommandOptStringReq(ctl, cmd, "file", &to) < 0)
+    if (vshCommandOptStringReq(ctl, cmd, "file", &to) < 0) {
         goto out;
-
+    } else if (virTypedParamsAddString(&params, &nparams, &maxparams,
+                                       VIR_SAVE_PARAM_FILE, to) < 0) {
+        goto out;
+    }
     if (vshCommandOptBool(cmd, "bypass-cache"))
         flags |= VIR_DOMAIN_SAVE_BYPASS_CACHE;
     if (vshCommandOptBool(cmd, "running"))
@@ -4232,10 +4238,14 @@ doSave(void *opaque)
     if (!(dom = virshCommandOptDomain(ctl, cmd, &name)))
         goto out;
 
-    if (xmlfile &&
-        virFileReadAll(xmlfile, VSH_MAX_XML_FILE, &xml) < 0) {
-        vshReportError(ctl);
-        goto out;
+    if (xmlfile) {
+        if (virFileReadAll(xmlfile, VSH_MAX_XML_FILE, &xml) < 0) {
+            vshReportError(ctl);
+            goto out;
+        } else if (virTypedParamsAddString(&params, &nparams, &maxparams,
+                                           VIR_SAVE_PARAM_DXML, xml) < 0) {
+            goto out;
+        }
     }
 
     if (flags || xml) {
@@ -4252,6 +4262,7 @@ doSave(void *opaque)
     data->ret = 0;
 
  out:
+    virTypedParamsFree(params, nparams);
 #ifndef WIN32
     pthread_sigmask(SIG_SETMASK, &oldsigmask, NULL);
  out_sig:
-- 
2.35.3
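Note: this patch only assembles the typed-params array; doSave still calls
the pre-existing entry points. A minimal sketch of the consumer this
prepares for (assuming a later patch switches doSave to the params-based
API when parameters are present):

    if (nparams > 0)
        rc = virDomainSaveParams(dom, params, nparams, flags);
    else if (flags || xml)
        rc = virDomainSaveFlags(dom, to, xml, flags);
    else
        rc = virDomainSave(dom, to);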

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 tools/virsh-domain.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
index a3b0749fa9..2d90fba9b7 100644
--- a/tools/virsh-domain.c
+++ b/tools/virsh-domain.c
@@ -5316,15 +5316,21 @@ static bool
 cmdRestore(vshControl *ctl, const vshCmd *cmd)
 {
     const char *from = NULL;
+    virTypedParameterPtr params = NULL;
+    int nparams = 0;
+    int maxparams = 0;
     unsigned int flags = 0;
     const char *xmlfile = NULL;
     g_autofree char *xml = NULL;
     virshControl *priv = ctl->privData;
-    int rc;
-
-    if (vshCommandOptStringReq(ctl, cmd, "file", &from) < 0)
-        return false;
+    int rc = -1;
 
+    if (vshCommandOptStringReq(ctl, cmd, "file", &from) < 0) {
+        goto out;
+    } else if (virTypedParamsAddString(&params, &nparams, &maxparams,
+                                       VIR_SAVE_PARAM_FILE, from) < 0) {
+        goto out;
+    }
     if (vshCommandOptBool(cmd, "bypass-cache"))
         flags |= VIR_DOMAIN_SAVE_BYPASS_CACHE;
     if (vshCommandOptBool(cmd, "running"))
@@ -5335,11 +5341,15 @@ cmdRestore(vshControl *ctl, const vshCmd *cmd)
         flags |= VIR_DOMAIN_SAVE_RESET_NVRAM;
 
     if (vshCommandOptStringReq(ctl, cmd, "xml", &xmlfile) < 0)
-        return false;
+        goto out;
 
-    if (xmlfile &&
-        virFileReadAll(xmlfile, VSH_MAX_XML_FILE, &xml) < 0)
-        return false;
+    if (xmlfile) {
+        if (virFileReadAll(xmlfile, VSH_MAX_XML_FILE, &xml) < 0)
+            goto out;
+        else if (virTypedParamsAddString(&params, &nparams, &maxparams,
+                                         VIR_SAVE_PARAM_DXML, xml) < 0)
+            goto out;
+    }
 
     if (flags || xml) {
         rc = virDomainRestoreFlags(priv->conn, from, xml, flags);
@@ -5349,11 +5359,13 @@ cmdRestore(vshControl *ctl, const vshCmd *cmd)
 
     if (rc < 0) {
         vshError(ctl, _("Failed to restore domain from %s"), from);
-        return false;
+        goto out;
     }
 
     vshPrintExtra(ctl, _("Domain restored from %s\n"), from);
-    return true;
+ out:
+    virTypedParamsFree(params, nparams);
+    return rc >= 0;
 }
 
 /*
-- 
2.35.3

To enable parallel save functionality, we need a new flag and a parameter
to specify the number of extra connections to use.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 include/libvirt/libvirt-domain.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index cf9d9efd51..8bedeaff30 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -1555,6 +1555,7 @@ typedef enum {
     VIR_DOMAIN_SAVE_RUNNING     = 1 << 1, /* Favor running over paused (Since: 0.9.5) */
     VIR_DOMAIN_SAVE_PAUSED      = 1 << 2, /* Favor paused over running (Since: 0.9.5) */
     VIR_DOMAIN_SAVE_RESET_NVRAM = 1 << 3, /* Re-initialize NVRAM from template (Since: 8.1.0) */
+    VIR_DOMAIN_SAVE_PARALLEL    = 1 << 4, /* Parallel Save/Restore to multiple files (Since: 8.4.0) */
 } virDomainSaveRestoreFlags;
 
 int virDomainSave (virDomainPtr domain,
@@ -1582,6 +1583,8 @@ int virDomainRestoreParams (virConnectPtr conn,
  * VIR_SAVE_PARAM_FILE:
  *
  * the parameter used to specify the savestate file to save to or restore from.
+ * For parallel saves, this is the main file, with the extra connections adding suffix
+ * .1 .2 .3 ... up to VIR_SAVE_PARAM_PARALLEL_CONNECTIONS.
  *
  * Since: 8.4.0
  */
@@ -1600,6 +1603,21 @@ int virDomainRestoreParams (virConnectPtr conn,
  */
 # define VIR_SAVE_PARAM_DXML "dxml"
 
+/**
+ * VIR_SAVE_PARAM_PARALLEL_CONNECTIONS:
+ *
+ * this optional parameter mirrors the migration parameter
+ * VIR_MIGRATE_PARAM_PARALLEL_CONNECTIONS.
+ *
+ * This parameter is used when saving state files in parallel
+ * using the flag VIR_DOMAIN_SAVE_PARALLEL.
+ * It specifies the number of extra files to save to using parallel
+ * connections.
+ *
+ * Since: 8.4.0
+ */
+# define VIR_SAVE_PARAM_PARALLEL_CONNECTIONS "parallel.connections"
+
 /* See below for virDomainSaveImageXMLFlags */
 char *          virDomainSaveImageGetXMLDesc (virConnectPtr conn,
                                               const char *file,
-- 
2.35.3
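Note: to make the intended API usage concrete, a client requesting a
parallel save with 16 extra connections would do something along these
lines (a sketch only; error handling omitted, dom is an existing
virDomainPtr):

    virTypedParameterPtr params = NULL;
    int nparams = 0;
    int maxparams = 0;

    virTypedParamsAddString(&params, &nparams, &maxparams,
                            VIR_SAVE_PARAM_FILE, "/mnt/saves/mysave");
    virTypedParamsAddInt(&params, &nparams, &maxparams,
                         VIR_SAVE_PARAM_PARALLEL_CONNECTIONS, 16);
    virDomainSaveParams(dom, params, nparams, VIR_DOMAIN_SAVE_PARALLEL);
    virTypedParamsFree(params, nparams);

which would produce /mnt/saves/mysave plus /mnt/saves/mysave.1 through
.16, per the VIR_SAVE_PARAM_FILE documentation above.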

and its companion param VIR_SAVE_PARAM_PARALLEL_CONNECTIONS

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 src/qemu/qemu_driver.c    | 17 +++++++++++------
 src/qemu/qemu_saveimage.c |  1 +
 src/qemu/qemu_saveimage.h |  1 +
 src/qemu/qemu_snapshot.c  |  2 +-
 4 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index e3582f62a7..78cc0cef4f 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -2641,7 +2641,7 @@ static int
 qemuDomainSaveInternal(virQEMUDriver *driver,
                        virDomainObj *vm, const char *path,
                        int compressed, virCommand *compressor,
-                       const char *xmlin, unsigned int flags)
+                       const char *xmlin, int nconn, unsigned int flags)
 {
     g_autofree char *xml = NULL;
     bool was_running = false;
@@ -2722,7 +2722,7 @@ qemuDomainSaveInternal(virQEMUDriver *driver,
     xml = NULL;
 
     ret = qemuSaveImageCreate(driver, vm, path, data, compressor,
-                              flags, VIR_ASYNC_JOB_SAVE);
+                              nconn, flags, VIR_ASYNC_JOB_SAVE);
     if (ret < 0)
         goto endjob;
@@ -2791,7 +2791,7 @@ qemuDomainSaveFlags(virDomainPtr dom, const char *path, const char *dxml,
         goto cleanup;
 
     ret = qemuDomainSaveInternal(driver, vm, path, compressed,
-                                 compressor, dxml, flags);
+                                 compressor, dxml, -1, flags);
 
 cleanup:
     virDomainObjEndAPI(&vm);
@@ -2815,16 +2815,19 @@ qemuDomainSaveParams(virDomainPtr dom,
     int compressed;
     g_autoptr(virCommand) compressor = NULL;
     int ret = -1;
+    int nconn = 2;
     virDomainObj *vm = NULL;
     g_autoptr(virQEMUDriverConfig) cfg = NULL;
 
     virCheckFlags(VIR_DOMAIN_SAVE_BYPASS_CACHE |
                   VIR_DOMAIN_SAVE_RUNNING |
-                  VIR_DOMAIN_SAVE_PAUSED, -1);
+                  VIR_DOMAIN_SAVE_PAUSED |
+                  VIR_DOMAIN_SAVE_PARALLEL, -1);
 
     if (virTypedParamsValidate(params, nparams,
                                VIR_SAVE_PARAM_FILE, VIR_TYPED_PARAM_STRING,
                                VIR_SAVE_PARAM_DXML, VIR_TYPED_PARAM_STRING,
+                               VIR_SAVE_PARAM_PARALLEL_CONNECTIONS, VIR_TYPED_PARAM_INT,
                                NULL) < 0)
         return -1;
 
@@ -2832,6 +2835,8 @@ qemuDomainSaveParams(virDomainPtr dom,
         return -1;
     if (virTypedParamsGetString(params, nparams, VIR_SAVE_PARAM_DXML, &dxml) < 0)
         return -1;
+    if (virTypedParamsGetInt(params, nparams, VIR_SAVE_PARAM_PARALLEL_CONNECTIONS, &nconn) < 0)
+        return -1;
 
     cfg = virQEMUDriverGetConfig(driver);
     if ((compressed = qemuSaveImageGetCompressionProgram(cfg->saveImageFormat,
@@ -2849,7 +2854,7 @@ qemuDomainSaveParams(virDomainPtr dom,
         goto cleanup;
 
     ret = qemuDomainSaveInternal(driver, vm, to, compressed,
-                                 compressor, dxml, flags);
+                                 compressor, dxml, nconn, flags);
 
 cleanup:
     virDomainObjEndAPI(&vm);
@@ -2906,7 +2911,7 @@ qemuDomainManagedSave(virDomainPtr dom, unsigned int flags)
     VIR_INFO("Saving state of domain '%s' to '%s'", vm->def->name, name);
 
     ret = qemuDomainSaveInternal(driver, vm, name, compressed,
-                                 compressor, NULL, flags);
+                                 compressor, NULL, -1, flags);
     if (ret == 0)
         vm->hasManagedSave = true;
 
diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
index 4fd4c5cfcd..7c76db359e 100644
--- a/src/qemu/qemu_saveimage.c
+++ b/src/qemu/qemu_saveimage.c
@@ -258,6 +258,7 @@ qemuSaveImageCreate(virQEMUDriver *driver,
                     const char *path,
                     virQEMUSaveData *data,
                     virCommand *compressor,
+                    int nconn G_GNUC_UNUSED,
                     unsigned int flags,
                     virDomainAsyncJob asyncJob)
 {
diff --git a/src/qemu/qemu_saveimage.h b/src/qemu/qemu_saveimage.h
index 391cd55ed0..b3d5c02fd6 100644
--- a/src/qemu/qemu_saveimage.h
+++ b/src/qemu/qemu_saveimage.h
@@ -96,6 +96,7 @@ qemuSaveImageCreate(virQEMUDriver *driver,
                     const char *path,
                     virQEMUSaveData *data,
                     virCommand *compressor,
+                    int nconn,
                     unsigned int flags,
                     virDomainAsyncJob asyncJob);
 
diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c
index b62fab7bb3..2e445e8296 100644
--- a/src/qemu/qemu_snapshot.c
+++ b/src/qemu/qemu_snapshot.c
@@ -1457,7 +1457,7 @@ qemuSnapshotCreateActiveExternal(virQEMUDriver *driver,
         memory_existing = virFileExists(snapdef->memorysnapshotfile);
 
         if ((ret = qemuSaveImageCreate(driver, vm, snapdef->memorysnapshotfile,
-                                       data, compressor, 0,
+                                       data, compressor, -1, 0,
                                        VIR_ASYNC_JOB_SNAPSHOT)) < 0)
             goto cleanup;
 
-- 
2.35.3

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 src/qemu/qemu_driver.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 78cc0cef4f..ce399cd197 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -5834,7 +5834,8 @@ qemuDomainRestoreInternal(virConnectPtr conn,
     virCheckFlags(VIR_DOMAIN_SAVE_BYPASS_CACHE |
                   VIR_DOMAIN_SAVE_RUNNING |
                   VIR_DOMAIN_SAVE_PAUSED |
-                  VIR_DOMAIN_SAVE_RESET_NVRAM, -1);
+                  VIR_DOMAIN_SAVE_RESET_NVRAM |
+                  VIR_DOMAIN_SAVE_PARALLEL, -1);
 
     if (flags & VIR_DOMAIN_SAVE_RESET_NVRAM)
         reset_nvram = true;
-- 
2.35.3

Use this data type to encapsulate the pathname, file descriptor, wrapper,
and unlink-on-error state. This makes managing the resources associated
with an FD used for QEMU save/restore much easier, reducing the amount of
explicit cleanup required.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 src/qemu/qemu_saveimage.c | 117 ++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_saveimage.h |  18 ++++++
 2 files changed, 135 insertions(+)

diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
index 7c76db359e..67c93e3865 100644
--- a/src/qemu/qemu_saveimage.c
+++ b/src/qemu/qemu_saveimage.c
@@ -248,6 +248,123 @@ qemuSaveImageGetCompressionCommand(virQEMUSaveFormat compression)
     return ret;
 }
 
+/*
+ * virQEMUSaveFdInit: initialize a virQEMUSaveFd
+ *
+ * @saveFd: the structure to initialize
+ * @base: the main file name
+ * @idx: 0 for the main file, >0 for multifd channels.
+ * @oflags: the file descriptor open flags
+ * @cfg: the driver config
+ *
+ * Returns -1 on error, 0 on success;
+ * in both cases virQEMUSaveFdFini must be called to free resources.
+ */
+int virQEMUSaveFdInit(virQEMUSaveFd *saveFd, const char *base, int idx,
+                      int oflags, virQEMUDriverConfig *cfg)
+{
+    unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING;
+    bool isCreat = oflags & O_CREAT;
+    bool isDirect = O_DIRECT && (oflags & O_DIRECT);
+
+    if (isDirect)
+        wrapperFlags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
+    if (idx > 0) {
+        saveFd->path = g_strdup_printf("%s.%d", base, idx);
+    } else {
+        saveFd->path = g_strdup(base);
+    }
+    saveFd->wrapper = NULL;
+    if (isCreat) {
+        saveFd->fd = virQEMUFileOpenAs(cfg->user, cfg->group, false, saveFd->path,
+                                       oflags, &saveFd->need_unlink);
+    } else {
+        saveFd->fd = qemuDomainOpenFile(cfg, NULL, saveFd->path, oflags, NULL);
+    }
+    if (saveFd->fd < 0)
+        return -1;
+    /*
+     * No wrapper is required for the multifd channels.
+     * For O_CREAT, we always add the wrapper for the main file.
+     * For !O_CREAT, we only add the wrapper if using O_DIRECT.
+     */
+    if (idx == 0 && (isDirect || isCreat)) {
+        saveFd->wrapper = virFileWrapperFdNew(&saveFd->fd, saveFd->path, wrapperFlags);
+        if (!saveFd->wrapper)
+            return -1;
+    }
+    return 0;
+}
+
+/*
+ * virQEMUSaveFdClose: close a virQEMUSaveFd descriptor with a normal close.
+ *
+ * @saveFd: the saveFd structure with the file descriptors to close.
+ * @vm: the virDomainObj (necessary to release the lock), or NULL.
+ *
+ * If saveFd is NULL, the function returns success.
+ *
+ * Returns -1 on error, 0 on success.
+ */
+int virQEMUSaveFdClose(virQEMUSaveFd *saveFd, virDomainObj *vm)
+{
+    if (!saveFd)
+        return 0;
+
+    if (VIR_CLOSE(saveFd->fd) < 0) {
+        virReportSystemError(errno, _("unable to close %s"), saveFd->path);
+        return -1;
+    }
+    if (vm) {
+        if (qemuDomainFileWrapperFDClose(vm, saveFd->wrapper) < 0)
+            return -1;
+    } else {
+        if (virFileWrapperFdClose(saveFd->wrapper) < 0)
+            return -1;
+    }
+    return 0;
+}
+
+/*
+ * virQEMUSaveFdFini: finalize a virQEMUSaveFd
+ *
+ * @saveFd: the saveFd structure containing the resources to free.
+ * @vm: the virDomainObj (necessary to release the lock for long close
+ *      operations), or NULL.
+ * @ret: the current operation result (< 0 is failure)
+ *
+ * If saveFd is NULL, the return value is unchanged.
+ *
+ * Returns ret, or -1 if an error is detected.
+ */
+int virQEMUSaveFdFini(virQEMUSaveFd *saveFd, virDomainObj *vm, int ret)
+{
+    if (!saveFd)
+        return ret;
+    VIR_FORCE_CLOSE(saveFd->fd);
+    if (vm) {
+        if (qemuDomainFileWrapperFDClose(vm, saveFd->wrapper) < 0)
+            ret = -1;
+    } else {
+        if (virFileWrapperFdClose(saveFd->wrapper) < 0)
+            ret = -1;
+    }
+
+    if (ret < 0 && saveFd->need_unlink && saveFd->path) {
+        if (unlink(saveFd->path) < 0) {
+            virReportSystemError(errno, _("cannot remove file: %s"),
+                                 saveFd->path);
+        }
+    }
+    if (saveFd->wrapper) {
+        virFileWrapperFdFree(saveFd->wrapper);
+        saveFd->wrapper = NULL;
+    }
+
+    g_free(saveFd->path);
+    saveFd->path = NULL;
+    return ret;
+}
+
 
 /* Helper function to execute a migration to file with a correct save header
  * the caller needs to make sure that the processors are stopped and do all other
diff --git a/src/qemu/qemu_saveimage.h b/src/qemu/qemu_saveimage.h
index b3d5c02fd6..41937e5eb5 100644
--- a/src/qemu/qemu_saveimage.h
+++ b/src/qemu/qemu_saveimage.h
@@ -54,6 +54,24 @@ struct _virQEMUSaveData {
 };
 
 
+typedef struct _virQEMUSaveFd virQEMUSaveFd;
+struct _virQEMUSaveFd {
+    char *path;
+    int fd;
+    bool need_unlink;
+    virFileWrapperFd *wrapper;
+};
+
+#define QEMU_SAVEFD_INVALID (virQEMUSaveFd) { .path = NULL, .fd = -1, .need_unlink = false, .wrapper = NULL }
+
+int virQEMUSaveFdInit(virQEMUSaveFd *saveFd, const char *base, int idx,
+                      int oflags, virQEMUDriverConfig *cfg)
+    ATTRIBUTE_NONNULL(5);
+
+int virQEMUSaveFdClose(virQEMUSaveFd *saveFd, virDomainObj *vm);
+
+int virQEMUSaveFdFini(virQEMUSaveFd *saveFd, virDomainObj *vm, int ret);
+
 virDomainDef *
 qemuSaveImageUpdateDef(virQEMUDriver *driver,
                        virDomainDef *def,
-- 
2.35.3
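Note: the intended lifecycle of the new type, as exercised by the
conversions later in this series, is roughly:

    virQEMUSaveFd saveFd = QEMU_SAVEFD_INVALID;
    int ret = -1;

    if (virQEMUSaveFdInit(&saveFd, path, 0, O_RDONLY, cfg) < 0)
        goto cleanup;
    /* ... read from or write to saveFd.fd ... */
    ret = 0;
 cleanup:
    /* also unlinks a file created with O_CREAT when ret < 0 */
    ret = virQEMUSaveFdFini(&saveFd, vm, ret);

i.e. once Init has been called, a single Fini call both releases the
resources and decides, based on the final result, whether a newly
created file must be removed.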

Now that virQEMUSaveFd has been introduced, use it in the creation of a
new save image.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 src/qemu/qemu_saveimage.c | 54 +++++++++++----------------------------
 1 file changed, 15 insertions(+), 39 deletions(-)

diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
index 67c93e3865..30e9a16307 100644
--- a/src/qemu/qemu_saveimage.c
+++ b/src/qemu/qemu_saveimage.c
@@ -380,41 +380,31 @@ qemuSaveImageCreate(virQEMUDriver *driver,
                     virDomainAsyncJob asyncJob)
 {
     g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver);
-    bool needUnlink = false;
+    virQEMUSaveFd saveFd = QEMU_SAVEFD_INVALID;
+    unsigned int oflags = O_WRONLY | O_TRUNC | O_CREAT;
     int ret = -1;
-    int fd = -1;
-    int directFlag = 0;
-    virFileWrapperFd *wrapperFd = NULL;
-    unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING;
 
-    /* Obtain the file handle.  */
-    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
-        wrapperFlags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
-        directFlag = virFileDirectFdFlag();
-        if (directFlag < 0) {
+    if (flags & VIR_DOMAIN_SAVE_BYPASS_CACHE) {
+        if (virFileDirectFdFlag() < 0) {
             virReportError(VIR_ERR_OPERATION_FAILED, "%s",
                            _("bypass cache unsupported by this system"));
-            goto cleanup;
+            return -1;
         }
+        oflags |= O_DIRECT;
     }
 
-    fd = virQEMUFileOpenAs(cfg->user, cfg->group, false, path,
-                           O_WRONLY | O_TRUNC | O_CREAT | directFlag,
-                           &needUnlink);
-    if (fd < 0)
+    if (virQEMUSaveFdInit(&saveFd, path, 0, oflags, cfg) < 0)
         goto cleanup;
-
-    if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, fd) < 0)
+    if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, saveFd.fd) < 0)
         goto cleanup;
-
-    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
+    if (virQEMUSaveDataWrite(data, saveFd.fd, saveFd.path) < 0)
         goto cleanup;
 
-    if (virQEMUSaveDataWrite(data, fd, path) < 0)
+    /* Perform the migration */
+    if (qemuMigrationSrcToFile(driver, vm, saveFd.fd, compressor, asyncJob) < 0)
         goto cleanup;
 
-    /* Perform the migration */
-    if (qemuMigrationSrcToFile(driver, vm, fd, compressor, asyncJob) < 0)
+    if (virQEMUSaveFdClose(&saveFd, vm) < 0)
        goto cleanup;
 
     /* Touch up file header to mark image complete. */
@@ -423,29 +413,15 @@ qemuSaveImageCreate(virQEMUDriver *driver,
      * up to seek backwards on wrapperFd.  The reopened fd will
      * trigger a single page of file system cache pollution, but
      * that's acceptable.  */
-    if (VIR_CLOSE(fd) < 0) {
-        virReportSystemError(errno, _("unable to close %s"), path);
-        goto cleanup;
-    }
-
-    if (qemuDomainFileWrapperFDClose(vm, wrapperFd) < 0)
-        goto cleanup;
-
-    if ((fd = qemuDomainOpenFile(cfg, vm->def, path, O_WRONLY, NULL)) < 0 ||
-        virQEMUSaveDataFinish(data, &fd, path) < 0)
+    if ((saveFd.fd = qemuDomainOpenFile(cfg, vm->def, saveFd.path, O_WRONLY, NULL)) < 0 ||
+        virQEMUSaveDataFinish(data, &saveFd.fd, saveFd.path) < 0)
         goto cleanup;
 
     ret = 0;
 
  cleanup:
-    VIR_FORCE_CLOSE(fd);
-    if (qemuDomainFileWrapperFDClose(vm, wrapperFd) < 0)
-        ret = -1;
-    virFileWrapperFdFree(wrapperFd);
-
-    if (ret < 0 && needUnlink)
-        unlink(path);
-
+    ret = virQEMUSaveFdFini(&saveFd, vm, ret);
     return ret;
 }
-- 
2.35.3

all the logic to open a fd, create a wrapper etc, is boilerplate code that is best reused, so change the Open function to take an existing already initialized virQEMUSaveFd. Adapt all callers of qemuSaveImageOpen. Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/qemu/qemu_driver.c | 101 ++++++++++++++++++++++---------------- src/qemu/qemu_saveimage.c | 86 ++++++++------------------------ src/qemu/qemu_saveimage.h | 9 ++-- 3 files changed, 83 insertions(+), 113 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index ce399cd197..f3d5f3937d 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -5824,12 +5824,13 @@ qemuDomainRestoreInternal(virConnectPtr conn, virDomainObj *vm = NULL; g_autofree char *xmlout = NULL; const char *newxml = dxml; - int fd = -1; int ret = -1; virQEMUSaveData *data = NULL; - virFileWrapperFd *wrapperFd = NULL; + virQEMUSaveFd saveFd = QEMU_SAVEFD_INVALID; bool hook_taint = false; bool reset_nvram = false; + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + int oflags = O_RDONLY; virCheckFlags(VIR_DOMAIN_SAVE_BYPASS_CACHE | VIR_DOMAIN_SAVE_RUNNING | @@ -5840,10 +5841,17 @@ qemuDomainRestoreInternal(virConnectPtr conn, if (flags & VIR_DOMAIN_SAVE_RESET_NVRAM) reset_nvram = true; - fd = qemuSaveImageOpen(driver, NULL, path, &def, &data, - (flags & VIR_DOMAIN_SAVE_BYPASS_CACHE) != 0, - &wrapperFd, false, false); - if (fd < 0) + if (flags & VIR_DOMAIN_SAVE_BYPASS_CACHE) { + if (virFileDirectFdFlag() < 0) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + _("bypass cache unsupported by this system")); + return -1; + } + oflags |= O_DIRECT; + } + if (virQEMUSaveFdInit(&saveFd, path, 0, oflags, cfg) < 0) + return -1; + if (qemuSaveImageOpen(driver, NULL, &def, &data, false, &saveFd) < 0) goto cleanup; if (ensureACL(conn, def) < 0) @@ -5897,16 +5905,13 @@ qemuDomainRestoreInternal(virConnectPtr conn, flags) < 0) goto cleanup; - ret = qemuSaveImageStartVM(conn, driver, vm, &fd, data, path, + ret = qemuSaveImageStartVM(conn, driver, vm, &saveFd.fd, data, path, false, reset_nvram, VIR_ASYNC_JOB_START); qemuProcessEndJob(vm); cleanup: - VIR_FORCE_CLOSE(fd); - if (virFileWrapperFdClose(wrapperFd) < 0) - ret = -1; - virFileWrapperFdFree(wrapperFd); + ret = virQEMUSaveFdFini(&saveFd, vm, ret); virQEMUSaveDataFree(data); if (vm && ret < 0) qemuDomainRemoveInactive(driver, vm); @@ -5964,15 +5969,15 @@ qemuDomainSaveImageGetXMLDesc(virConnectPtr conn, const char *path, virQEMUDriver *driver = conn->privateData; char *ret = NULL; g_autoptr(virDomainDef) def = NULL; - int fd = -1; virQEMUSaveData *data = NULL; + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + virQEMUSaveFd saveFd = QEMU_SAVEFD_INVALID; virCheckFlags(VIR_DOMAIN_SAVE_IMAGE_XML_SECURE, NULL); - fd = qemuSaveImageOpen(driver, NULL, path, &def, &data, - false, NULL, false, false); - - if (fd < 0) + if (virQEMUSaveFdInit(&saveFd, path, 0, O_RDONLY, cfg) < 0) + return NULL; + if (qemuSaveImageOpen(driver, NULL, &def, &data, false, &saveFd) < 0) goto cleanup; if (virDomainSaveImageGetXMLDescEnsureACL(conn, def) < 0) @@ -5982,7 +5987,8 @@ qemuDomainSaveImageGetXMLDesc(virConnectPtr conn, const char *path, cleanup: virQEMUSaveDataFree(data); - VIR_FORCE_CLOSE(fd); + if (virQEMUSaveFdFini(&saveFd, NULL, ret ? 
0 : -1) < 0) + ret = NULL; return ret; } @@ -5994,8 +6000,9 @@ qemuDomainSaveImageDefineXML(virConnectPtr conn, const char *path, int ret = -1; g_autoptr(virDomainDef) def = NULL; g_autoptr(virDomainDef) newdef = NULL; - int fd = -1; virQEMUSaveData *data = NULL; + virQEMUSaveFd saveFd = QEMU_SAVEFD_INVALID; + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); int state = -1; virCheckFlags(VIR_DOMAIN_SAVE_RUNNING | @@ -6006,10 +6013,9 @@ qemuDomainSaveImageDefineXML(virConnectPtr conn, const char *path, else if (flags & VIR_DOMAIN_SAVE_PAUSED) state = 0; - fd = qemuSaveImageOpen(driver, NULL, path, &def, &data, - false, NULL, true, false); - - if (fd < 0) + if (virQEMUSaveFdInit(&saveFd, path, 0, O_RDWR, cfg) < 0) + return -1; + if (qemuSaveImageOpen(driver, NULL, &def, &data, false, &saveFd) < 0) goto cleanup; if (virDomainSaveImageDefineXMLEnsureACL(conn, def) < 0) @@ -6036,15 +6042,15 @@ qemuDomainSaveImageDefineXML(virConnectPtr conn, const char *path, VIR_DOMAIN_XML_MIGRATABLE))) goto cleanup; - if (lseek(fd, 0, SEEK_SET) != 0) { + if (lseek(saveFd.fd, 0, SEEK_SET) != 0) { virReportSystemError(errno, _("cannot seek in '%s'"), path); goto cleanup; } - if (virQEMUSaveDataWrite(data, fd, path) < 0) + if (virQEMUSaveDataWrite(data, saveFd.fd, path) < 0) goto cleanup; - if (VIR_CLOSE(fd) < 0) { + if (virQEMUSaveFdClose(&saveFd, NULL) < 0) { virReportSystemError(errno, _("failed to write header data to '%s'"), path); goto cleanup; } @@ -6052,8 +6058,8 @@ qemuDomainSaveImageDefineXML(virConnectPtr conn, const char *path, ret = 0; cleanup: - VIR_FORCE_CLOSE(fd); virQEMUSaveDataFree(data); + ret = virQEMUSaveFdFini(&saveFd, NULL, ret); return ret; } @@ -6065,8 +6071,9 @@ qemuDomainManagedSaveGetXMLDesc(virDomainPtr dom, unsigned int flags) g_autofree char *path = NULL; char *ret = NULL; g_autoptr(virDomainDef) def = NULL; - int fd = -1; virQEMUSaveData *data = NULL; + virQEMUSaveFd saveFd = QEMU_SAVEFD_INVALID; + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); qemuDomainObjPrivate *priv; virCheckFlags(VIR_DOMAIN_SAVE_IMAGE_XML_SECURE, NULL); @@ -6088,15 +6095,18 @@ qemuDomainManagedSaveGetXMLDesc(virDomainPtr dom, unsigned int flags) goto cleanup; } - if ((fd = qemuSaveImageOpen(driver, priv->qemuCaps, path, &def, &data, - false, NULL, false, false)) < 0) + if (virQEMUSaveFdInit(&saveFd, path, 0, O_RDONLY, cfg) < 0) + goto cleanup; + if (qemuSaveImageOpen(driver, priv->qemuCaps, &def, &data, false, + &saveFd) < 0) goto cleanup; ret = qemuDomainDefFormatXML(driver, priv->qemuCaps, def, flags); cleanup: virQEMUSaveDataFree(data); - VIR_FORCE_CLOSE(fd); + if (virQEMUSaveFdFini(&saveFd, vm, ret ? 
0 : -1) < 0) + ret = NULL; virDomainObjEndAPI(&vm); return ret; } @@ -6147,20 +6157,30 @@ qemuDomainObjRestore(virConnectPtr conn, { g_autoptr(virDomainDef) def = NULL; qemuDomainObjPrivate *priv = vm->privateData; - int fd = -1; int ret = -1; g_autofree char *xmlout = NULL; virQEMUSaveData *data = NULL; - virFileWrapperFd *wrapperFd = NULL; + virQEMUSaveFd saveFd = QEMU_SAVEFD_INVALID; + int oflags = O_RDONLY; + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); - fd = qemuSaveImageOpen(driver, NULL, path, &def, &data, - bypass_cache, &wrapperFd, false, true); - if (fd < 0) { - if (fd == -3) + if (bypass_cache) { + if (virFileDirectFdFlag() < 0) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + _("bypass cache unsupported by this system")); + return -1; + } + oflags |= O_DIRECT; + } + if (virQEMUSaveFdInit(&saveFd, path, 0, oflags, cfg) < 0) + goto cleanup; + ret = qemuSaveImageOpen(driver, NULL, &def, &data, true, &saveFd); + if (ret < 0) { + if (ret == -3) ret = 1; goto cleanup; } - + ret = -1; if (virHookPresent(VIR_HOOK_DRIVER_QEMU)) { int hookret; @@ -6200,15 +6220,12 @@ qemuDomainObjRestore(virConnectPtr conn, virDomainObjAssignDef(vm, &def, true, NULL); - ret = qemuSaveImageStartVM(conn, driver, vm, &fd, data, path, + ret = qemuSaveImageStartVM(conn, driver, vm, &saveFd.fd, data, path, start_paused, reset_nvram, asyncJob); cleanup: virQEMUSaveDataFree(data); - VIR_FORCE_CLOSE(fd); - if (virFileWrapperFdClose(wrapperFd) < 0) - ret = -1; - virFileWrapperFdFree(wrapperFd); + ret = virQEMUSaveFdFini(&saveFd, vm, ret); return ret; } diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c index 30e9a16307..33807fe373 100644 --- a/src/qemu/qemu_saveimage.c +++ b/src/qemu/qemu_saveimage.c @@ -514,94 +514,53 @@ qemuSaveImageGetCompressionProgram(const char *imageFormat, * @path: path of the save image * @ret_def: returns domain definition created from the XML stored in the image * @ret_data: returns structure filled with data from the image header - * @bypass_cache: bypass cache when opening the file - * @wrapperFd: returns the file wrapper structure - * @open_write: open the file for writing (for updates) - * @unlink_corrupt: remove the image file if it is corrupted + * @unlink_corrupt: mark the image file for removal if it is corrupted + * @saveFd: the save file * - * Returns the opened fd of the save image file and fills the appropriate fields - * on success. On error returns -1 on most failures, -3 if corrupt image was - * unlinked (no error raised). + * Returns 0 on success or -1 on failure. + * -3 is a special failure in which the saveFd has been marked for unlinking. + * On success, the appropriate fields are filled. */ int qemuSaveImageOpen(virQEMUDriver *driver, virQEMUCaps *qemuCaps, - const char *path, virDomainDef **ret_def, virQEMUSaveData **ret_data, - bool bypass_cache, - virFileWrapperFd **wrapperFd, - bool open_write, - bool unlink_corrupt) + bool unlink_corrupt, + virQEMUSaveFd *saveFd) { - g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); - VIR_AUTOCLOSE fd = -1; - int ret = -1; g_autoptr(virQEMUSaveData) data = NULL; virQEMUSaveHeader *header; g_autoptr(virDomainDef) def = NULL; - int oflags = open_write ? 
O_RDWR : O_RDONLY; size_t xml_len; size_t cookie_len; - if (bypass_cache) { - int directFlag = virFileDirectFdFlag(); - if (directFlag < 0) { - virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("bypass cache unsupported by this system")); - return -1; - } - oflags |= directFlag; - } - - if ((fd = qemuDomainOpenFile(cfg, NULL, path, oflags, NULL)) < 0) - return -1; - - if (bypass_cache && - !(*wrapperFd = virFileWrapperFdNew(&fd, path, - VIR_FILE_WRAPPER_BYPASS_CACHE))) - return -1; data = g_new0(virQEMUSaveData, 1); header = &data->header; - if (saferead(fd, header, sizeof(*header)) != sizeof(*header)) { - if (unlink_corrupt) { - if (unlink(path) < 0) { - virReportSystemError(errno, - _("cannot remove corrupt file: %s"), - path); - return -1; - } else { - return -3; - } - } - + if (saferead(saveFd->fd, header, sizeof(*header)) != sizeof(*header)) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", _("failed to read qemu header")); + if (unlink_corrupt) { + saveFd->need_unlink = true; + return -3; + } return -1; } if (memcmp(header->magic, QEMU_SAVE_MAGIC, sizeof(header->magic)) != 0) { if (memcmp(header->magic, QEMU_SAVE_PARTIAL, sizeof(header->magic)) == 0) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + _("save image is incomplete")); if (unlink_corrupt) { - if (unlink(path) < 0) { - virReportSystemError(errno, - _("cannot remove corrupt file: %s"), - path); - return -1; - } else { - return -3; - } + saveFd->need_unlink = true; + return -3; } - + } else { virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("save image is incomplete")); - return -1; + _("image magic is incorrect")); } - - virReportError(VIR_ERR_OPERATION_FAILED, "%s", - _("image magic is incorrect")); return -1; } @@ -632,7 +591,7 @@ qemuSaveImageOpen(virQEMUDriver *driver, data->xml = g_new0(char, xml_len); - if (saferead(fd, data->xml, xml_len) != xml_len) { + if (saferead(saveFd->fd, data->xml, xml_len) != xml_len) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", _("failed to read domain XML")); return -1; @@ -641,7 +600,7 @@ qemuSaveImageOpen(virQEMUDriver *driver, if (cookie_len > 0) { data->cookie = g_new0(char, cookie_len); - if (saferead(fd, data->cookie, cookie_len) != cookie_len) { + if (saferead(saveFd->fd, data->cookie, cookie_len) != cookie_len) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", _("failed to read cookie")); return -1; @@ -657,10 +616,7 @@ qemuSaveImageOpen(virQEMUDriver *driver, *ret_def = g_steal_pointer(&def); *ret_data = g_steal_pointer(&data); - ret = fd; - fd = -1; - - return ret; + return 0; } int diff --git a/src/qemu/qemu_saveimage.h b/src/qemu/qemu_saveimage.h index 41937e5eb5..5dc63f3661 100644 --- a/src/qemu/qemu_saveimage.h +++ b/src/qemu/qemu_saveimage.h @@ -92,14 +92,11 @@ qemuSaveImageStartVM(virConnectPtr conn, int qemuSaveImageOpen(virQEMUDriver *driver, virQEMUCaps *qemuCaps, - const char *path, virDomainDef **ret_def, virQEMUSaveData **ret_data, - bool bypass_cache, - virFileWrapperFd **wrapperFd, - bool open_write, - bool unlink_corrupt) - ATTRIBUTE_NONNULL(3) ATTRIBUTE_NONNULL(4); + bool unlink_corrupt, + virQEMUSaveFd *saveFd) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3) ATTRIBUTE_NONNULL(6); int qemuSaveImageGetCompressionProgram(const char *imageFormat, -- 2.35.3
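Note on the changed -3 contract: qemuSaveImageOpen() no longer unlinks a
corrupt image itself; it only sets saveFd->need_unlink and returns -3,
deferring the removal decision to the virQEMUSaveFd teardown path.
Schematically, the managed-save caller above now does:

    ret = qemuSaveImageOpen(driver, NULL, &def, &data, true, &saveFd);
    if (ret == -3)          /* corrupt image, no error raised */
        ret = 1;            /* treated as: no save image to load */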

Add APIs to create, close and free the multifd virQEMUSaveFds to
associate with the multifd channels.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 src/qemu/qemu_saveimage.c | 97 +++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_saveimage.h | 11 +++++
 2 files changed, 108 insertions(+)

diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
index 33807fe373..264868b5a4 100644
--- a/src/qemu/qemu_saveimage.c
+++ b/src/qemu/qemu_saveimage.c
@@ -365,6 +365,103 @@ int virQEMUSaveFdFini(virQEMUSaveFd *saveFd, virDomainObj *vm, int ret)
     return ret;
 }
 
+/*
+ * qemuSaveImageFreeMultiFd: free all multifd virQEMUSaveFds.
+ *
+ * @multiFd: the array of saveFds
+ * @vm: the virDomainObj, to release the lock
+ * @nconn: number of multifd channels
+ * @ret: the current operation result (< 0 is failure)
+ *
+ * If multiFd is NULL, the return value is unchanged.
+ *
+ * Returns ret, or -1 if an error is detected.
+ */
+int qemuSaveImageFreeMultiFd(virQEMUSaveFd *multiFd, virDomainObj *vm, int nconn, int ret)
+{
+    int idx;
+
+    if (!multiFd)
+        return ret;
+
+    for (idx = 0; idx < nconn; idx++) {
+        ret = virQEMUSaveFdFini(&multiFd[idx], vm, ret);
+    }
+    /*
+     * do it again to unlink all in the error case,
+     * if error happened in the middle of previous loop.
+     */
+    for (idx = 0; idx < nconn; idx++) {
+        ret = virQEMUSaveFdFini(&multiFd[idx], vm, ret);
+    }
+    g_free(multiFd);
+    return ret;
+}
+
+/*
+ * qemuSaveImageCloseMultiFd: perform a normal close on all multifd virQEMUSaveFds.
+ *
+ * @multiFd: the array of saveFds
+ * @nconn: number of multifd channels
+ * @vm: the virDomainObj, to release the lock
+ *
+ * If multiFd is NULL, the function returns success.
+ *
+ * Returns -1 on error, 0 on success.
+ */
+int qemuSaveImageCloseMultiFd(virQEMUSaveFd *multiFd, int nconn, virDomainObj *vm)
+{
+    int idx;
+
+    if (!multiFd)
+        return 0;
+
+    for (idx = 0; idx < nconn; idx++) {
+        if (virQEMUSaveFdClose(&multiFd[idx], vm) < 0) {
+            return -1;
+        }
+    }
+    return 0;
+}
+
+/*
+ * qemuSaveImageCreateMultiFd: allocate and initialize all multifd virQEMUSaveFds.
+ *
+ * @driver: qemu driver data
+ * @vm: the virDomainObj
+ * @cmd: the existing multifd helper command, to pass each fd as an argument.
+ * @path: pathname of the main file.
+ * @oflags: the open flags desired, to be passed to virQEMUSaveFdInit.
+ * @cfg: the driver config
+ * @nconn: number of channel files to create or open, depending on oflags.
+ *
+ * Returns the new array of virQEMUSaveFds, or NULL on error.
+ */
+virQEMUSaveFd *
+qemuSaveImageCreateMultiFd(virQEMUDriver *driver, virDomainObj *vm,
+                           virCommand *cmd, const char *path,
+                           int oflags, virQEMUDriverConfig *cfg,
+                           int nconn)
+{
+    virQEMUSaveFd *multiFd = g_new0(virQEMUSaveFd, nconn);
+    int idx;
+
+    for (idx = 0; idx < nconn; idx++) {
+        virQEMUSaveFd *m = &multiFd[idx];
+        if (virQEMUSaveFdInit(m, path, idx + 1, oflags, cfg) < 0 ||
+            qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, m->fd) < 0) {
+
+            virQEMUSaveFdFini(m, vm, -1);
+            goto error;
+        }
+        virCommandAddArgFormat(cmd, "%d", m->fd);
+        virCommandPassFD(cmd, m->fd, 0);
+    }
+    return multiFd;
+
+ error:
+    qemuSaveImageFreeMultiFd(multiFd, vm, nconn, -1);
+    return NULL;
+}
+
 
 /* Helper function to execute a migration to file with a correct save header
  * the caller needs to make sure that the processors are stopped and do all other
diff --git a/src/qemu/qemu_saveimage.h b/src/qemu/qemu_saveimage.h
index 5dc63f3661..b775c5eb08 100644
--- a/src/qemu/qemu_saveimage.h
+++ b/src/qemu/qemu_saveimage.h
@@ -72,6 +72,17 @@ int virQEMUSaveFdClose(virQEMUSaveFd *saveFd, virDomainObj *vm);
 
 int virQEMUSaveFdFini(virQEMUSaveFd *saveFd, virDomainObj *vm, int ret);
 
+virQEMUSaveFd *
+qemuSaveImageCreateMultiFd(virQEMUDriver *driver, virDomainObj *vm,
+                           virCommand *cmd, const char *path,
+                           int oflags, virQEMUDriverConfig *cfg,
+                           int nconn)
+    ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3)
+    ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(6);
+
+int qemuSaveImageCloseMultiFd(virQEMUSaveFd *multiFd, int nconn, virDomainObj *vm);
+
+int qemuSaveImageFreeMultiFd(virQEMUSaveFd *multiFd, virDomainObj *vm, int nconn, int ret);
+
 virDomainDef *
 qemuSaveImageUpdateDef(virQEMUDriver *driver,
                        virDomainDef *def,
-- 
2.35.3
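Note: taken together, the three functions give the following lifecycle
for the channel files (a condensed sketch; the real caller follows in the
next patch):

    virQEMUSaveFd *multiFd = NULL;
    int ret = -1;

    if (!(multiFd = qemuSaveImageCreateMultiFd(driver, vm, cmd, path,
                                               oflags, cfg, nconn)))
        goto cleanup;
    /* ... run the helper command, perform the migration ... */
    if (qemuSaveImageCloseMultiFd(multiFd, nconn, vm) < 0)
        goto cleanup;
    ret = 0;
 cleanup:
    ret = qemuSaveImageFreeMultiFd(multiFd, vm, nconn, ret);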

Use the multifd helper and the new virQEMUSaveFd multifd APIs to perform
the parallel save.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 src/qemu/qemu_saveimage.c | 42 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
index 264868b5a4..2a9ef622be 100644
--- a/src/qemu/qemu_saveimage.c
+++ b/src/qemu/qemu_saveimage.c
@@ -17,6 +17,7 @@
  */
 
 #include <config.h>
+#include <configmake.h>
 
 #include "qemu_saveimage.h"
 #include "qemu_domain.h"
@@ -478,6 +479,7 @@ qemuSaveImageCreate(virQEMUDriver *driver,
 {
     g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver);
     virQEMUSaveFd saveFd = QEMU_SAVEFD_INVALID;
+    virQEMUSaveFd *multiFd = NULL;
     unsigned int oflags = O_WRONLY | O_TRUNC | O_CREAT;
     int ret = -1;
 
@@ -497,9 +499,42 @@ qemuSaveImageCreate(virQEMUDriver *driver,
     if (virQEMUSaveDataWrite(data, saveFd.fd, saveFd.path) < 0)
         goto cleanup;
 
-    /* Perform the migration */
-    if (qemuMigrationSrcToFile(driver, vm, saveFd.fd, compressor, asyncJob) < 0)
-        goto cleanup;
+    if (flags & VIR_DOMAIN_SAVE_PARALLEL) {
+        g_autoptr(virCommand) cmd = NULL;
+        g_autofree char *helper_path = NULL;
+        qemuDomainObjPrivate *priv = vm->privateData;
+        g_autofree char *sun_path = g_strdup_printf("%s/save-multifd.sock", priv->libDir);
+        char buf[1];
+        int helper_out = -1;
+        if (!(helper_path = virFileFindResource("libvirt_multifd_helper",
+                                                abs_top_builddir "/src",
+                                                LIBEXECDIR)))
+            goto cleanup;
+        cmd = virCommandNewArgList(helper_path, sun_path, NULL);
+        virCommandAddArgFormat(cmd, "%d", nconn);
+        virCommandAddArgFormat(cmd, "%d", saveFd.fd);
+        virCommandPassFD(cmd, saveFd.fd, 0);
+        virCommandSetOutputFD(cmd, &helper_out); /* should create the pipe automagically */
+
+        /* Perform parallel multifd migration to files (main fd + channels) */
+        if (!(multiFd = qemuSaveImageCreateMultiFd(driver, vm, cmd, saveFd.path, oflags, cfg, nconn)))
+            goto cleanup;
+        if (virCommandRunAsync(cmd, NULL) < 0)
+            goto cleanup;
+        if (saferead(helper_out, &buf, 1) != 1 || buf[0] != 'R')
+            goto cleanup;
+        if (chown(sun_path, cfg->user, cfg->group) < 0)
+            goto cleanup;
+        /* still using single fd migration for now */
+        if (qemuMigrationSrcToFile(driver, vm, saveFd.fd, compressor, asyncJob) < 0)
+            goto cleanup;
+        if (qemuSaveImageCloseMultiFd(multiFd, nconn, vm) < 0)
+            goto cleanup;
+    } else {
+        /* Perform non-parallel migration to file */
+        if (qemuMigrationSrcToFile(driver, vm, saveFd.fd, compressor, asyncJob) < 0)
+            goto cleanup;
+    }
 
     if (virQEMUSaveFdClose(&saveFd, vm) < 0)
         goto cleanup;
@@ -518,6 +553,7 @@ qemuSaveImageCreate(virQEMUDriver *driver,
     ret = 0;
 
  cleanup:
+    ret = qemuSaveImageFreeMultiFd(multiFd, vm, nconn, ret);
     ret = virQEMUSaveFdFini(&saveFd, vm, ret);
     return ret;
 }
-- 
2.35.3
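Note: for a parallel save of /mnt/saves/mysave with nconn == 2, the
result on disk is the main file plus one suffixed file per channel,
matching the naming implemented by virQEMUSaveFdInit():

    /mnt/saves/mysave        (save header + main migration stream)
    /mnt/saves/mysave.1      (multifd channel 1)
    /mnt/saves/mysave.2      (multifd channel 2)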

Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/qemu/qemu_capabilities.c | 4 ++++ src/qemu/qemu_capabilities.h | 3 +++ tests/qemucapabilitiesdata/caps_4.0.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_4.0.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_4.0.0.riscv32.xml | 1 + tests/qemucapabilitiesdata/caps_4.0.0.riscv64.xml | 1 + tests/qemucapabilitiesdata/caps_4.0.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_4.0.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_4.1.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_4.2.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_4.2.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_4.2.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_4.2.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_5.0.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_5.0.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_5.0.0.riscv64.xml | 1 + tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_5.1.0.sparc.xml | 1 + tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0.riscv64.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_6.0.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_6.0.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_6.0.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_6.1.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_6.2.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_6.2.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_6.2.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_7.0.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_7.0.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_7.0.0.x86_64.xml | 1 + 34 files changed, 39 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 1ed4cda7f0..581b6a40df 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -672,6 +672,9 @@ VIR_ENUM_IMPL(virQEMUCaps, "virtio-iommu-pci", /* QEMU_CAPS_DEVICE_VIRTIO_IOMMU_PCI */ "virtio-iommu.boot-bypass", /* QEMU_CAPS_VIRTIO_IOMMU_BOOT_BYPASS */ "virtio-net.rss", /* QEMU_CAPS_VIRTIO_NET_RSS */ + + /* 430 */ + "migrate-multifd", /* QEMU_CAPS_MIGRATE_MULTIFD */ ); @@ -1230,6 +1233,7 @@ struct virQEMUCapsStringFlags virQEMUCapsCommands[] = { struct virQEMUCapsStringFlags virQEMUCapsMigration[] = { { "rdma-pin-all", QEMU_CAPS_MIGRATE_RDMA }, + { "multifd", QEMU_CAPS_MIGRATE_MULTIFD }, }; /* Use virQEMUCapsQMPSchemaQueries for querying parameters of events */ diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index 9b240e47fb..b089f83da1 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -648,6 +648,9 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */ QEMU_CAPS_VIRTIO_IOMMU_BOOT_BYPASS, /* virtio-iommu.boot-bypass */ QEMU_CAPS_VIRTIO_NET_RSS, /* virtio-net rss feature */ + /* 430 */ + QEMU_CAPS_MIGRATE_MULTIFD, /* migrate can set multifd parameter */ + QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/tests/qemucapabilitiesdata/caps_4.0.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_4.0.0.aarch64.xml index 5adf904fc4..4ca2cfa81c 100644 --- a/tests/qemucapabilitiesdata/caps_4.0.0.aarch64.xml +++ b/tests/qemucapabilitiesdata/caps_4.0.0.aarch64.xml @@ -148,6 +148,7 @@ <flag 
name='input-linux'/>
   <flag name='query-display-options'/>
   <flag name='virtio-blk.queue-size'/>
+  <flag name='migrate-multifd'/>
   <version>4000000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>61700240</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_4.0.0.ppc64.xml b/tests/qemucapabilitiesdata/caps_4.0.0.ppc64.xml
index a84adc2610..1db978eb4c 100644
--- a/tests/qemucapabilitiesdata/caps_4.0.0.ppc64.xml
+++ b/tests/qemucapabilitiesdata/caps_4.0.0.ppc64.xml
@@ -153,6 +153,7 @@
   <flag name='input-linux'/>
   <flag name='query-display-options'/>
   <flag name='virtio-blk.queue-size'/>
+  <flag name='migrate-multifd'/>
   <version>4000000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>42900240</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_4.0.0.riscv32.xml b/tests/qemucapabilitiesdata/caps_4.0.0.riscv32.xml
index c494254c4d..251d4dfd29 100644
--- a/tests/qemucapabilitiesdata/caps_4.0.0.riscv32.xml
+++ b/tests/qemucapabilitiesdata/caps_4.0.0.riscv32.xml
@@ -145,6 +145,7 @@
   <flag name='input-linux'/>
   <flag name='query-display-options'/>
   <flag name='virtio-blk.queue-size'/>
+  <flag name='migrate-multifd'/>
   <version>4000000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>0</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_4.0.0.riscv64.xml b/tests/qemucapabilitiesdata/caps_4.0.0.riscv64.xml
index d2582fa297..a4af47c6a4 100644
--- a/tests/qemucapabilitiesdata/caps_4.0.0.riscv64.xml
+++ b/tests/qemucapabilitiesdata/caps_4.0.0.riscv64.xml
@@ -145,6 +145,7 @@
   <flag name='input-linux'/>
   <flag name='query-display-options'/>
   <flag name='virtio-blk.queue-size'/>
+  <flag name='migrate-multifd'/>
   <version>4000000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>0</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_4.0.0.s390x.xml b/tests/qemucapabilitiesdata/caps_4.0.0.s390x.xml
index 4f36186044..2bab764867 100644
--- a/tests/qemucapabilitiesdata/caps_4.0.0.s390x.xml
+++ b/tests/qemucapabilitiesdata/caps_4.0.0.s390x.xml
@@ -115,6 +115,7 @@
   <flag name='input-linux'/>
   <flag name='query-display-options'/>
   <flag name='virtio-blk.queue-size'/>
+  <flag name='migrate-multifd'/>
   <version>4000000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>39100240</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_4.0.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_4.0.0.x86_64.xml
index 18e5ebd4f4..aa8a9812e5 100644
--- a/tests/qemucapabilitiesdata/caps_4.0.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_4.0.0.x86_64.xml
@@ -188,6 +188,7 @@
   <flag name='input-linux'/>
   <flag name='query-display-options'/>
   <flag name='virtio-blk.queue-size'/>
+  <flag name='migrate-multifd'/>
   <version>4000000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>43100240</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_4.1.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_4.1.0.x86_64.xml
index 12c5ebe6f3..bd89f0c6b2 100644
--- a/tests/qemucapabilitiesdata/caps_4.1.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_4.1.0.x86_64.xml
@@ -195,6 +195,7 @@
   <flag name='input-linux'/>
   <flag name='query-display-options'/>
   <flag name='virtio-blk.queue-size'/>
+  <flag name='migrate-multifd'/>
   <version>4001000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>43100241</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_4.2.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_4.2.0.aarch64.xml
index ee536b7b63..369ef707b9 100644
--- a/tests/qemucapabilitiesdata/caps_4.2.0.aarch64.xml
+++ b/tests/qemucapabilitiesdata/caps_4.2.0.aarch64.xml
@@ -163,6 +163,7 @@
   <flag name='input-linux'/>
   <flag name='query-display-options'/>
   <flag name='virtio-blk.queue-size'/>
+  <flag name='migrate-multifd'/>
   <version>4001050</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>61700242</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_4.2.0.ppc64.xml b/tests/qemucapabilitiesdata/caps_4.2.0.ppc64.xml
index 10f5a9e2c5..16c867a46b 100644
--- a/tests/qemucapabilitiesdata/caps_4.2.0.ppc64.xml
+++ b/tests/qemucapabilitiesdata/caps_4.2.0.ppc64.xml
@@ -160,6 +160,7 @@
   <flag name='input-linux'/>
   <flag name='query-display-options'/>
   <flag name='virtio-blk.queue-size'/>
+  <flag name='migrate-multifd'/>
   <version>4001050</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>42900242</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_4.2.0.s390x.xml b/tests/qemucapabilitiesdata/caps_4.2.0.s390x.xml
index 069777a49b..b584ba7352 100644
--- a/tests/qemucapabilitiesdata/caps_4.2.0.s390x.xml
+++ b/tests/qemucapabilitiesdata/caps_4.2.0.s390x.xml
@@ -128,6 +128,7 @@
   <flag name='input-linux'/>
   <flag name='query-display-options'/>
   <flag name='virtio-blk.queue-size'/>
+  <flag name='migrate-multifd'/>
   <version>4002000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>39100242</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_4.2.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_4.2.0.x86_64.xml
index 6b61214a0b..5023028678 100644
--- a/tests/qemucapabilitiesdata/caps_4.2.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_4.2.0.x86_64.xml
@@ -206,6 +206,7 @@
   <flag name='input-linux'/>
   <flag name='query-display-options'/>
   <flag name='virtio-blk.queue-size'/>
+  <flag name='migrate-multifd'/>
   <version>4002000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>43100242</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_5.0.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_5.0.0.aarch64.xml
index 4fd02e786d..c45b2e6cf6 100644
--- a/tests/qemucapabilitiesdata/caps_5.0.0.aarch64.xml
+++ b/tests/qemucapabilitiesdata/caps_5.0.0.aarch64.xml
@@ -175,6 +175,7 @@
   <flag name='virtio-blk.queue-size'/>
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
+  <flag name='migrate-multifd'/>
   <version>5000000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>61700241</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_5.0.0.ppc64.xml b/tests/qemucapabilitiesdata/caps_5.0.0.ppc64.xml
index f2f3558fdc..a3ad743d70 100644
--- a/tests/qemucapabilitiesdata/caps_5.0.0.ppc64.xml
+++ b/tests/qemucapabilitiesdata/caps_5.0.0.ppc64.xml
@@ -181,6 +181,7 @@
   <flag name='virtio-blk.queue-size'/>
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
+  <flag name='migrate-multifd'/>
   <version>5000000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>42900241</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_5.0.0.riscv64.xml b/tests/qemucapabilitiesdata/caps_5.0.0.riscv64.xml
index 557949d6d6..e1b5cac26b 100644
--- a/tests/qemucapabilitiesdata/caps_5.0.0.riscv64.xml
+++ b/tests/qemucapabilitiesdata/caps_5.0.0.riscv64.xml
@@ -167,6 +167,7 @@
   <flag name='virtio-blk.queue-size'/>
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
+  <flag name='migrate-multifd'/>
   <version>5000000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>0</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml
index f301d8a926..796adb9066 100644
--- a/tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml
@@ -215,6 +215,7 @@
   <flag name='virtio-blk.queue-size'/>
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
+  <flag name='migrate-multifd'/>
   <version>5000000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>43100241</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_5.1.0.sparc.xml b/tests/qemucapabilitiesdata/caps_5.1.0.sparc.xml
index 3a330ebdc0..cb203df125 100644
--- a/tests/qemucapabilitiesdata/caps_5.1.0.sparc.xml
+++ b/tests/qemucapabilitiesdata/caps_5.1.0.sparc.xml
@@ -87,6 +87,7 @@
   <flag name='input-linux'/>
   <flag name='query-display-options'/>
   <flag name='memory-backend-file.prealloc-threads'/>
+  <flag name='migrate-multifd'/>
   <version>5001000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>0</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml
index 53fcbf3417..7479d942a2 100644
--- a/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml
@@ -219,6 +219,7 @@
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>5001000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>43100242</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_5.2.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_5.2.0.aarch64.xml
index 824224302c..268d1444ad 100644
--- a/tests/qemucapabilitiesdata/caps_5.2.0.aarch64.xml
+++ b/tests/qemucapabilitiesdata/caps_5.2.0.aarch64.xml
@@ -182,6 +182,7 @@
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>5002000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>61700243</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_5.2.0.ppc64.xml b/tests/qemucapabilitiesdata/caps_5.2.0.ppc64.xml
index b949f88b5a..eabf4b600c 100644
--- a/tests/qemucapabilitiesdata/caps_5.2.0.ppc64.xml
+++ b/tests/qemucapabilitiesdata/caps_5.2.0.ppc64.xml
@@ -186,6 +186,7 @@
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>5002000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>42900243</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_5.2.0.riscv64.xml b/tests/qemucapabilitiesdata/caps_5.2.0.riscv64.xml
index 873923992d..0dbaf5a5ec 100644
--- a/tests/qemucapabilitiesdata/caps_5.2.0.riscv64.xml
+++ b/tests/qemucapabilitiesdata/caps_5.2.0.riscv64.xml
@@ -172,6 +172,7 @@
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>5002000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>0</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_5.2.0.s390x.xml b/tests/qemucapabilitiesdata/caps_5.2.0.s390x.xml
index 5e9560d7b7..b0fbab9cb5 100644
--- a/tests/qemucapabilitiesdata/caps_5.2.0.s390x.xml
+++ b/tests/qemucapabilitiesdata/caps_5.2.0.s390x.xml
@@ -140,6 +140,7 @@
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>5002000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>39100243</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml
index 3998da9253..1a1717bf2a 100644
--- a/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml
@@ -223,6 +223,7 @@
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>5002000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>43100243</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_6.0.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_6.0.0.aarch64.xml
index 51d3628eeb..1c18d122e2 100644
--- a/tests/qemucapabilitiesdata/caps_6.0.0.aarch64.xml
+++ b/tests/qemucapabilitiesdata/caps_6.0.0.aarch64.xml
@@ -190,6 +190,7 @@
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>6000000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>61700242</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_6.0.0.s390x.xml b/tests/qemucapabilitiesdata/caps_6.0.0.s390x.xml
index 2e5d0f197a..8fa4cb2307 100644
--- a/tests/qemucapabilitiesdata/caps_6.0.0.s390x.xml
+++ b/tests/qemucapabilitiesdata/caps_6.0.0.s390x.xml
@@ -148,6 +148,7 @@
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>6000000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>39100242</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_6.0.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_6.0.0.x86_64.xml
index 3498d6255b..70c67202b1 100644
--- a/tests/qemucapabilitiesdata/caps_6.0.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_6.0.0.x86_64.xml
@@ -232,6 +232,7 @@
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>6000000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>43100242</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_6.1.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_6.1.0.x86_64.xml
index ddeca62290..a5ec77878f 100644
--- a/tests/qemucapabilitiesdata/caps_6.1.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_6.1.0.x86_64.xml
@@ -236,6 +236,7 @@
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>6001000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>43100243</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_6.2.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_6.2.0.aarch64.xml
index 5538940372..92d8ceff7e 100644
--- a/tests/qemucapabilitiesdata/caps_6.2.0.aarch64.xml
+++ b/tests/qemucapabilitiesdata/caps_6.2.0.aarch64.xml
@@ -201,6 +201,7 @@
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>6001050</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>61700244</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_6.2.0.ppc64.xml b/tests/qemucapabilitiesdata/caps_6.2.0.ppc64.xml
index 9c9d9aa08e..f219912927 100644
--- a/tests/qemucapabilitiesdata/caps_6.2.0.ppc64.xml
+++ b/tests/qemucapabilitiesdata/caps_6.2.0.ppc64.xml
@@ -197,6 +197,7 @@
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>6002000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>42900244</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_6.2.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_6.2.0.x86_64.xml
index dba5ecaf87..38fd3878ea 100644
--- a/tests/qemucapabilitiesdata/caps_6.2.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_6.2.0.x86_64.xml
@@ -238,6 +238,7 @@
   <flag name='memory-backend-file.prealloc-threads'/>
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>6002000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>43100244</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_7.0.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_7.0.0.aarch64.xml
index 257b0f625d..522e225c8f 100644
--- a/tests/qemucapabilitiesdata/caps_7.0.0.aarch64.xml
+++ b/tests/qemucapabilitiesdata/caps_7.0.0.aarch64.xml
@@ -209,6 +209,7 @@
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-iommu.boot-bypass'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>6002092</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>61700243</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_7.0.0.ppc64.xml b/tests/qemucapabilitiesdata/caps_7.0.0.ppc64.xml
index 1ddca7d767..1eb43799c0 100644
--- a/tests/qemucapabilitiesdata/caps_7.0.0.ppc64.xml
+++ b/tests/qemucapabilitiesdata/caps_7.0.0.ppc64.xml
@@ -210,6 +210,7 @@
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-iommu.boot-bypass'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>6002092</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>42900243</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_7.0.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_7.0.0.x86_64.xml
index 8074c97ecd..e5023c4219 100644
--- a/tests/qemucapabilitiesdata/caps_7.0.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_7.0.0.x86_64.xml
@@ -242,6 +242,7 @@
   <flag name='virtio-iommu-pci'/>
   <flag name='virtio-iommu.boot-bypass'/>
   <flag name='virtio-net.rss'/>
+  <flag name='migrate-multifd'/>
   <version>7000000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>43100243</microcodeVersion>
-- 
2.35.3

QEMU folks,

It seems we do officially support multifd from version 4.0:

commit cbfd6c957a4437d4759ca660e621daa381bf2898
Author: Juan Quintela <quintela@redhat.com>
Date:   Wed Feb 6 13:54:06 2019 +0100

    multifd: Drop x-

    We make it supported from now on.

    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Reviewed-by: Markus Armbruster <armbru@redhat.com>
    Signed-off-by: Juan Quintela <quintela@redhat.com>

$ git tag --contains cbfd6c957a4437d4759ca660e621daa381bf2898 | sort -V | grep -v list | head -1
v4.0.0

Yet it seems we continue to prefix the migration property with "x-" (x-multifd). This prop was added here and we have continued to use it as is:

commit 30126bbf1f7fcad0bf4c65b01a21ff22a36a9759
Author: Juan Quintela <quintela@redhat.com>
Date:   Thu Jan 14 12:23:00 2016 +0100

    migration: Add multifd capability

    Signed-off-by: Juan Quintela <quintela@redhat.com>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Reviewed-by: Peter Xu <peterx@redhat.com>
    Reviewed-by: Daniel P. Berrange <berrange@redhat.com>

Can anyone explain why?

On Sat, May 7, 2022 at 7:13 PM Claudio Fontana <cfontana@suse.de> wrote:
Signed-off-by: Claudio Fontana <cfontana@suse.de>
other than the question above, Reviewed-by: Ani Sinha <ani@anisinha.ca>
[... diffstat and full patch quoted, trimmed ...]

Hi Daniel,

is this patch specifically controversial? I ask because this is likely to be painful to maintain.

Thanks,

Claudio

On 5/7/22 3:43 PM, Claudio Fontana wrote:
[... full patch quoted again, trimmed ...]

On Wed, May 11, 2022 at 14:56:35 +0200, Claudio Fontana wrote:
Hi Daniel,
is this patch specifically controversial?
I ask because this is likely to be painful to maintain.
Note that in August of this year it will become obsolete: that's when we'll be dropping support for Debian 10's old QEMU and thus bumping the minimum supported QEMU to at least qemu-4.2.

Additionally, we have a whole separate machinery for probing the migration capabilities via the migration parameters, so you can theoretically replace it with that code, especially since you'll need to set the multifd capability when attempting the migration anyway.

On 5/11/22 3:17 PM, Peter Krempa wrote:
On Wed, May 11, 2022 at 14:56:35 +0200, Claudio Fontana wrote:
Hi Daniel,
is this patch specifically controversial?
I ask because this is likely to be painful to maintain.
Note that in August of this year it will become obsolete: that's when we'll be dropping support for Debian 10's old QEMU and thus bumping the minimum supported QEMU to at least qemu-4.2.
Additionally, we have a whole separate machinery for probing the migration capabilities via the migration parameters, so you can theoretically replace it with that code, especially since you'll need to set the multifd capability when attempting the migration anyway.
Thanks for the information. Is the whole separate machinery the part I am doing in this series in patch 20/27 with the multifd-compression parameter? Or something else entirely? Could you point me at the ropes, so to speak?

Thanks,

C

On Wed, May 11, 2022 at 15:25:49 +0200, Claudio Fontana wrote:
On 5/11/22 3:17 PM, Peter Krempa wrote:
On Wed, May 11, 2022 at 14:56:35 +0200, Claudio Fontana wrote:
Hi Daniel,
is this patch specifically controversial?
I ask because this is likely to be painful to maintain.
Note that in August of this year it will become obsolete: that's when we'll be dropping support for Debian 10's old QEMU and thus bumping the minimum supported QEMU to at least qemu-4.2.
Additionally, we have a whole separate machinery for probing the migration capabilities via the migration parameters, so you can theoretically replace it with that code, especially since you'll need to set the multifd capability when attempting the migration anyway.
Thanks for the information. Is the whole separate machinery the part I am doing in this series in patch 20/27 with the multifd-compression parameter?
Or something else entirely? Could you point me at the ropes, so to speak?
Basically yes. You should be able to use 'qemuMigrationCapsGet' with 'QEMU_MIGRATION_CAP_MULTIFD' to probe whether QEMU supports it, instead of having to add the detection yourself via qemuCaps.
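For illustration, a minimal sketch of what that probe could look like in the save path (hypothetical caller code, not from the series; qemuMigrationCapsGet() and QEMU_MIGRATION_CAP_MULTIFD are the existing helpers in src/qemu/qemu_migration_params.[ch], and the capability bitmap is filled in when libvirt fetches the migration capabilities from QEMU):

    /* sketch only: reject --parallel early when QEMU lacks multifd */
    if (!qemuMigrationCapsGet(vm, QEMU_MIGRATION_CAP_MULTIFD)) {
        virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
                       _("QEMU does not support multifd migration"));
        return -1;
    }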

add both multifd compression and number of multifd channels fields
in the same commit, in order to change the format to version 3 in one go.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 src/qemu/qemu_saveimage.c | 19 ++++++++++++++++++-
 src/qemu/qemu_saveimage.h |  8 +++++---
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
index 2a9ef622be..cb6efe2338 100644
--- a/src/qemu/qemu_saveimage.c
+++ b/src/qemu/qemu_saveimage.c
@@ -67,6 +67,23 @@ VIR_ENUM_IMPL(qemuSaveCompression,
               "lzop",
 );
 
+typedef enum {
+    QEMU_SAVE_MULTIFD_COMP_NONE = 0,
+    QEMU_SAVE_MULTIFD_COMP_ZLIB = 1,
+    QEMU_SAVE_MULTIFD_COMP_ZSTD = 2,
+
+    /* used for the on-disk format, do not change/re-use numbers */
+    QEMU_SAVE_MULTIFD_COMP_LAST
+} virQEMUSaveMultiFdComp;
+
+VIR_ENUM_DECL(qemuSaveMultiFdComp);
+VIR_ENUM_IMPL(qemuSaveMultiFdComp,
+              QEMU_SAVE_MULTIFD_COMP_LAST,
+              "none",
+              "zlib",
+              "zstd",
+);
+
 static inline void
 qemuSaveImageBswapHeader(virQEMUSaveHeader *hdr)
 {
@@ -784,7 +801,7 @@ qemuSaveImageStartVM(virConnectPtr conn,
                                    virDomainXMLOptionGetSaveCookie(driver->xmlopt)) < 0)
         goto cleanup;
 
-    if ((header->version == 2) &&
+    if ((header->version >= 2) &&
         (header->compressed != QEMU_SAVE_FORMAT_RAW)) {
         if (!(cmd = qemuSaveImageGetCompressionCommand(header->compressed)))
             goto cleanup;
diff --git a/src/qemu/qemu_saveimage.h b/src/qemu/qemu_saveimage.h
index b775c5eb08..356fc7561e 100644
--- a/src/qemu/qemu_saveimage.h
+++ b/src/qemu/qemu_saveimage.h
@@ -30,7 +30,7 @@
  */
 #define QEMU_SAVE_MAGIC   "LibvirtQemudSave"
 #define QEMU_SAVE_PARTIAL "LibvirtQemudPart"
-#define QEMU_SAVE_VERSION 2
+#define QEMU_SAVE_VERSION 3
 
 G_STATIC_ASSERT(sizeof(QEMU_SAVE_MAGIC) == sizeof(QEMU_SAVE_PARTIAL));
 
@@ -42,8 +42,10 @@ struct _virQEMUSaveHeader {
     uint32_t was_running;
     uint32_t compressed;
    uint32_t cookieOffset;
-    uint32_t unused[14];
-};
+    uint16_t multifd_channels;
+    uint16_t multifd_comp;
+    uint32_t unused[13];
+} ATTRIBUTE_PACKED;
 
 typedef struct _virQEMUSaveData virQEMUSaveData;
-- 
2.35.3
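One observation on the new layout (illustrative note, not part of the patch): the two uint16_t fields occupy exactly the space of the one uint32_t removed from the unused[] array, so the packed version-3 header keeps the same on-disk size as the version-2 header. A compile-time check along these lines would make that invariant explicit:

    /* sketch only: 2 x uint16_t + 13 x uint32_t == 14 x uint32_t,
     * i.e. the v3 header layout is size-compatible with v2 */
    G_STATIC_ASSERT(2 * sizeof(uint16_t) + 13 * sizeof(uint32_t) ==
                    14 * sizeof(uint32_t));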

similarly to qemuMigrationParamsSetULL, we need to be able to set fields from qemu_saveimage. Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/qemu/qemu_migration_params.c | 22 ++++++++++++++++++++++ src/qemu/qemu_migration_params.h | 9 +++++++++ 2 files changed, 31 insertions(+) diff --git a/src/qemu/qemu_migration_params.c b/src/qemu/qemu_migration_params.c index df2384b213..36174a66d8 100644 --- a/src/qemu/qemu_migration_params.c +++ b/src/qemu/qemu_migration_params.c @@ -1109,6 +1109,28 @@ qemuMigrationParamsFetch(virQEMUDriver *driver, } +void +qemuMigrationParamsSetCap(qemuMigrationParams *migParams, + virQEMUCapsFlags flag) +{ + ignore_value(virBitmapSetBit(migParams->caps, flag)); +} + + +int +qemuMigrationParamsSetInt(qemuMigrationParams *migParams, + qemuMigrationParam param, + int value) +{ + if (qemuMigrationParamsCheckType(param, QEMU_MIGRATION_PARAM_TYPE_INT) < 0) + return -1; + + migParams->params[param].value.i = value; + migParams->params[param].set = true; + return 0; +} + + int qemuMigrationParamsSetULL(qemuMigrationParams *migParams, qemuMigrationParam param, diff --git a/src/qemu/qemu_migration_params.h b/src/qemu/qemu_migration_params.h index 4a8815e776..99af73b4a4 100644 --- a/src/qemu/qemu_migration_params.h +++ b/src/qemu/qemu_migration_params.h @@ -123,6 +123,15 @@ qemuMigrationParamsFetch(virQEMUDriver *driver, int asyncJob, qemuMigrationParams **migParams); +void +qemuMigrationParamsSetCap(qemuMigrationParams *migParams, + virQEMUCapsFlags flag); + +int +qemuMigrationParamsSetInt(qemuMigrationParams *migParams, + qemuMigrationParam param, + int value); + int qemuMigrationParamsSetULL(qemuMigrationParams *migParams, qemuMigrationParam param, -- 2.35.3
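The intended call pattern for the new setters, condensed from the qemuMigrationSrcToFileAux changes in a following patch of this series (sketch, error paths abbreviated):

    g_autoptr(qemuMigrationParams) migParams = NULL;

    if (!(migParams = qemuMigrationParamsNew()))
        return -1;
    /* enable the multifd capability and set the channel count */
    qemuMigrationParamsSetCap(migParams, QEMU_MIGRATION_CAP_MULTIFD);
    if (qemuMigrationParamsSetInt(migParams,
                                  QEMU_MIGRATION_PARAM_MULTIFD_CHANNELS,
                                  nchannels) < 0)
        return -1;
    if (qemuMigrationParamsApply(driver, vm, asyncJob, migParams) < 0)
        return -1;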

implement a function similar to qemuMigrationSrcToFile that migrates to multiple files using QEMU multifd. Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/qemu/qemu_migration.c | 131 +++++++++++++++++++++++++------------- src/qemu/qemu_migration.h | 7 ++ src/qemu/qemu_saveimage.c | 9 ++- 3 files changed, 99 insertions(+), 48 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index b735bdb391..004e84556c 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -5896,13 +5896,14 @@ qemuMigrationDstFinish(virQEMUDriver *driver, return dom; } - /* Helper function called while vm is active. */ -int -qemuMigrationSrcToFile(virQEMUDriver *driver, virDomainObj *vm, - int fd, - virCommand *compressor, - virDomainAsyncJob asyncJob) +static int +qemuMigrationSrcToFileAux(virQEMUDriver *driver, virDomainObj *vm, + int fd, + virCommand *compressor, + virDomainAsyncJob asyncJob, + const char *sun_path, + int nchannels) { qemuDomainObjPrivate *priv = vm->privateData; bool bwParam = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_PARAM_BANDWIDTH); @@ -5913,24 +5914,26 @@ qemuMigrationSrcToFile(virQEMUDriver *driver, virDomainObj *vm, char *errbuf = NULL; virErrorPtr orig_err = NULL; g_autoptr(qemuMigrationParams) migParams = NULL; + bool needParams = (bwParam || sun_path); + if (sun_path && !virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATE_MULTIFD)) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("QEMU does not seem to support multifd migration, required for parallel migration to files")); + return -1; + } if (qemuMigrationSetDBusVMState(driver, vm) < 0) return -1; /* Increase migration bandwidth to unlimited since target is a file. * Failure to change migration speed is not fatal. */ - if (bwParam) { - if (!(migParams = qemuMigrationParamsNew())) - return -1; + if (needParams && !((migParams = qemuMigrationParamsNew()))) + return -1; + if (bwParam) { if (qemuMigrationParamsSetULL(migParams, QEMU_MIGRATION_PARAM_MAX_BANDWIDTH, QEMU_DOMAIN_MIG_BANDWIDTH_MAX * 1024 * 1024) < 0) return -1; - - if (qemuMigrationParamsApply(driver, vm, asyncJob, migParams) < 0) - return -1; - priv->migMaxBandwidth = QEMU_DOMAIN_MIG_BANDWIDTH_MAX; } else { if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) == 0) { @@ -5941,6 +5944,17 @@ qemuMigrationSrcToFile(virQEMUDriver *driver, virDomainObj *vm, } } + if (sun_path) { + qemuMigrationParamsSetCap(migParams, QEMU_MIGRATION_CAP_MULTIFD); + if (qemuMigrationParamsSetInt(migParams, + QEMU_MIGRATION_PARAM_MULTIFD_CHANNELS, + nchannels) < 0) + return -1; + } + + if (needParams && qemuMigrationParamsApply(driver, vm, asyncJob, migParams) < 0) + return -1; + if (!virDomainObjIsActive(vm)) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("guest unexpectedly quit")); @@ -5948,45 +5962,53 @@ qemuMigrationSrcToFile(virQEMUDriver *driver, virDomainObj *vm, return -1; } - if (compressor && virPipe(pipeFD) < 0) + if (!sun_path && compressor && virPipe(pipeFD) < 0) return -1; - /* All right! We can use fd migration, which means that qemu - * doesn't have to open() the file, so while we still have to - * grant SELinux access, we can do it on fd and avoid cleanup - * later, as well as skip futzing with cgroup. */ - if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, - compressor ? 
pipeFD[1] : fd) < 0) - goto cleanup; - if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) goto cleanup; - if (!compressor) { - rc = qemuMonitorMigrateToFd(priv->mon, - QEMU_MONITOR_MIGRATE_BACKGROUND, - fd); + if (sun_path) { + rc = qemuMonitorMigrateToSocket(priv->mon, + QEMU_MONITOR_MIGRATE_BACKGROUND, + sun_path); } else { - virCommandSetInputFD(compressor, pipeFD[0]); - virCommandSetOutputFD(compressor, &fd); - virCommandSetErrorBuffer(compressor, &errbuf); - virCommandDoAsyncIO(compressor); - if (virSetCloseExec(pipeFD[1]) < 0) { - virReportSystemError(errno, "%s", - _("Unable to set cloexec flag")); - qemuDomainObjExitMonitor(vm); - goto cleanup; - } - if (virCommandRunAsync(compressor, NULL) < 0) { - qemuDomainObjExitMonitor(vm); + /* + * All right! We can use fd migration, which means that qemu + * doesn't have to open() the file, so while we still have to + * grant SELinux access, we can do it on fd and avoid cleanup + * later, as well as skip futzing with cgroup. + */ + if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, + compressor ? pipeFD[1] : fd) < 0) goto cleanup; + + if (!compressor) { + rc = qemuMonitorMigrateToFd(priv->mon, + QEMU_MONITOR_MIGRATE_BACKGROUND, + fd); + } else { + virCommandSetInputFD(compressor, pipeFD[0]); + virCommandSetOutputFD(compressor, &fd); + virCommandSetErrorBuffer(compressor, &errbuf); + virCommandDoAsyncIO(compressor); + if (virSetCloseExec(pipeFD[1]) < 0) { + virReportSystemError(errno, "%s", + _("Unable to set cloexec flag")); + qemuDomainObjExitMonitor(vm); + goto cleanup; + } + if (virCommandRunAsync(compressor, NULL) < 0) { + qemuDomainObjExitMonitor(vm); + goto cleanup; + } + rc = qemuMonitorMigrateToFd(priv->mon, + QEMU_MONITOR_MIGRATE_BACKGROUND, + pipeFD[1]); + if (VIR_CLOSE(pipeFD[0]) < 0 || + VIR_CLOSE(pipeFD[1]) < 0) + VIR_WARN("failed to close intermediate pipe"); } - rc = qemuMonitorMigrateToFd(priv->mon, - QEMU_MONITOR_MIGRATE_BACKGROUND, - pipeFD[1]); - if (VIR_CLOSE(pipeFD[0]) < 0 || - VIR_CLOSE(pipeFD[1]) < 0) - VIR_WARN("failed to close intermediate pipe"); } qemuDomainObjExitMonitor(vm); if (rc < 0) @@ -6007,7 +6029,7 @@ qemuMigrationSrcToFile(virQEMUDriver *driver, virDomainObj *vm, goto cleanup; } - if (compressor && virCommandWait(compressor, NULL) < 0) + if (!sun_path && compressor && virCommandWait(compressor, NULL) < 0) goto cleanup; qemuDomainEventEmitJobCompleted(driver, vm); @@ -6046,6 +6068,25 @@ qemuMigrationSrcToFile(virQEMUDriver *driver, virDomainObj *vm, return ret; } +int +qemuMigrationSrcToFile(virQEMUDriver *driver, virDomainObj *vm, + int fd, + virCommand *compressor, + virDomainAsyncJob asyncJob) +{ + return qemuMigrationSrcToFileAux(driver, vm, fd, compressor, + asyncJob, NULL, -1); +} + +int +qemuMigrationSrcToFilesMultiFd(virQEMUDriver *driver, virDomainObj *vm, + virDomainAsyncJob asyncJob, + const char *sun_path, + int nchannels) +{ + return qemuMigrationSrcToFileAux(driver, vm, -1, NULL, + asyncJob, sun_path, nchannels); +} int qemuMigrationSrcCancel(virQEMUDriver *driver, diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h index a8afa66119..ddc8e65489 100644 --- a/src/qemu/qemu_migration.h +++ b/src/qemu/qemu_migration.h @@ -213,6 +213,13 @@ qemuMigrationSrcToFile(virQEMUDriver *driver, virDomainAsyncJob asyncJob) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT; +int +qemuMigrationSrcToFilesMultiFd(virQEMUDriver *driver, virDomainObj *vm, + virDomainAsyncJob asyncJob, + const char *sun_path, + int nchannels) + ATTRIBUTE_NONNULL(1) 
ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT; + int qemuMigrationSrcCancel(virQEMUDriver *driver, virDomainObj *vm); diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c index cb6efe2338..485e3163b7 100644 --- a/src/qemu/qemu_saveimage.c +++ b/src/qemu/qemu_saveimage.c @@ -490,7 +490,7 @@ qemuSaveImageCreate(virQEMUDriver *driver, const char *path, virQEMUSaveData *data, virCommand *compressor, - int nconn G_GNUC_UNUSED, + int nconn, unsigned int flags, virDomainAsyncJob asyncJob) { @@ -513,6 +513,10 @@ qemuSaveImageCreate(virQEMUDriver *driver, goto cleanup; if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, saveFd.fd) < 0) goto cleanup; + + if (nconn > 0) + data->header.multifd_channels = nconn; + if (virQEMUSaveDataWrite(data, saveFd.fd, saveFd.path) < 0) goto cleanup; @@ -542,8 +546,7 @@ qemuSaveImageCreate(virQEMUDriver *driver, goto cleanup; if (chown(sun_path, cfg->user, cfg->group) < 0) goto cleanup; - /* still using single fd migration for now */ - if (qemuMigrationSrcToFile(driver, vm, saveFd.fd, compressor, asyncJob) < 0) + if (qemuMigrationSrcToFilesMultiFd(driver, vm, asyncJob, sun_path, nconn) < 0) goto cleanup; if (qemuSaveImageCloseMultiFd(multiFd, nconn, vm) < 0) goto cleanup; -- 2.35.3
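Condensed, the save-side call sequence looks as follows; the socket name under cfg->saveDir is an assumption for illustration, and the listening socket itself is set up by the multifd helper machinery wired up earlier in the series (sketch, not a literal excerpt):

    /* migrate to a UNIX socket served by the multifd helper instead of
     * writing through a single fd; one file per channel is produced */
    g_autofree char *sun_path = g_strdup_printf("%s/save-multifd.sock",
                                                cfg->saveDir); /* assumed name */
    if (chown(sun_path, cfg->user, cfg->group) < 0)
        goto cleanup;
    if (qemuMigrationSrcToFilesMultiFd(driver, vm, asyncJob,
                                       sun_path, nconn) < 0)
        goto cleanup;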

The distinction on whether to wait for the migration completion or not was made on the async job type, but with the future addition of multifd migration from files, we need a way to avoid waiting, so we can prepare multifd migration parameters before starting the transfers. Adapt all callers. Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/qemu/qemu_driver.c | 8 ++++---- src/qemu/qemu_migration.c | 18 ++++++++++-------- src/qemu/qemu_migration.h | 3 ++- src/qemu/qemu_process.c | 3 ++- src/qemu/qemu_process.h | 5 +++-- src/qemu/qemu_saveimage.c | 4 +++- src/qemu/qemu_saveimage.h | 1 + src/qemu/qemu_snapshot.c | 4 ++-- 8 files changed, 27 insertions(+), 19 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index f3d5f3937d..0e8dd7748c 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -1630,7 +1630,7 @@ static virDomainPtr qemuDomainCreateXML(virConnectPtr conn, } if (qemuProcessStart(conn, driver, vm, NULL, VIR_ASYNC_JOB_START, - NULL, -1, NULL, NULL, + NULL, -1, NULL, false, NULL, VIR_NETDEV_VPORT_PROFILE_OP_CREATE, start_flags) < 0) { virDomainAuditStart(vm, "booted", false); @@ -5906,7 +5906,7 @@ qemuDomainRestoreInternal(virConnectPtr conn, goto cleanup; ret = qemuSaveImageStartVM(conn, driver, vm, &saveFd.fd, data, path, - false, reset_nvram, VIR_ASYNC_JOB_START); + false, reset_nvram, true, VIR_ASYNC_JOB_START); qemuProcessEndJob(vm); @@ -6221,7 +6221,7 @@ qemuDomainObjRestore(virConnectPtr conn, virDomainObjAssignDef(vm, &def, true, NULL); ret = qemuSaveImageStartVM(conn, driver, vm, &saveFd.fd, data, path, - start_paused, reset_nvram, asyncJob); + start_paused, reset_nvram, true, asyncJob); cleanup: virQEMUSaveDataFree(data); @@ -6484,7 +6484,7 @@ qemuDomainObjStart(virConnectPtr conn, } ret = qemuProcessStart(conn, driver, vm, NULL, asyncJob, - NULL, -1, NULL, NULL, + NULL, -1, NULL, false, NULL, VIR_NETDEV_VPORT_PROFILE_OP_CREATE, start_flags); virDomainAuditStart(vm, "booted", ret >= 0); if (ret >= 0) { diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 004e84556c..93cd446b23 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2139,7 +2139,8 @@ int qemuMigrationDstRun(virQEMUDriver *driver, virDomainObj *vm, const char *uri, - virDomainAsyncJob asyncJob) + virDomainAsyncJob asyncJob, + bool wait) { qemuDomainObjPrivate *priv = vm->privateData; int rv; @@ -2160,14 +2161,15 @@ qemuMigrationDstRun(virQEMUDriver *driver, if (rv < 0) return -1; - if (asyncJob == VIR_ASYNC_JOB_MIGRATION_IN) { - /* qemuMigrationDstWaitForCompletion is called from the Finish phase */ - return 0; + if (wait) { + /* + * the Migration Finish phase, as well as the multifd load from files, + * need to call qemuMigrationDstWaitForCompletion separately, not here. 
+ */ + if (qemuMigrationDstWaitForCompletion(driver, vm, asyncJob, false) < 0) + return -1; } - if (qemuMigrationDstWaitForCompletion(driver, vm, asyncJob, false) < 0) - return -1; - return 0; } @@ -3041,7 +3043,7 @@ qemuMigrationDstPrepareAny(virQEMUDriver *driver, } if (qemuMigrationDstRun(driver, vm, incoming->uri, - VIR_ASYNC_JOB_MIGRATION_IN) < 0) + VIR_ASYNC_JOB_MIGRATION_IN, false) < 0) goto stopjob; if (qemuProcessFinishStartup(driver, vm, VIR_ASYNC_JOB_MIGRATION_IN, diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h index ddc8e65489..c3c48c19c0 100644 --- a/src/qemu/qemu_migration.h +++ b/src/qemu/qemu_migration.h @@ -255,7 +255,8 @@ int qemuMigrationDstRun(virQEMUDriver *driver, virDomainObj *vm, const char *uri, - virDomainAsyncJob asyncJob); + virDomainAsyncJob asyncJob, + bool wait); void qemuMigrationAnyPostcopyFailed(virQEMUDriver *driver, diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index b0b00eb0a2..0a1e7985fb 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -7788,6 +7788,7 @@ qemuProcessStart(virConnectPtr conn, const char *migrateFrom, int migrateFd, const char *migratePath, + bool wait_incoming, virDomainMomentObj *snapshot, virNetDevVPortProfileOp vmop, unsigned int flags) @@ -7850,7 +7851,7 @@ qemuProcessStart(virConnectPtr conn, relabel = true; if (incoming) { - if (qemuMigrationDstRun(driver, vm, incoming->uri, asyncJob) < 0) + if (qemuMigrationDstRun(driver, vm, incoming->uri, asyncJob, wait_incoming) < 0) goto stop; } else { /* Refresh state of devices from QEMU. During migration this happens diff --git a/src/qemu/qemu_process.h b/src/qemu/qemu_process.h index f81bfd930a..5a1d005cb0 100644 --- a/src/qemu/qemu_process.h +++ b/src/qemu/qemu_process.h @@ -86,8 +86,9 @@ int qemuProcessStart(virConnectPtr conn, virCPUDef *updatedCPU, virDomainAsyncJob asyncJob, const char *migrateFrom, - int stdin_fd, - const char *stdin_path, + int fd, + const char *migratePath, + bool wait_incoming, virDomainMomentObj *snapshot, virNetDevVPortProfileOp vmop, unsigned int flags); diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c index 485e3163b7..83dea78718 100644 --- a/src/qemu/qemu_saveimage.c +++ b/src/qemu/qemu_saveimage.c @@ -781,6 +781,7 @@ qemuSaveImageStartVM(virConnectPtr conn, const char *path, bool start_paused, bool reset_nvram, + bool wait_incoming, virDomainAsyncJob asyncJob) { qemuDomainObjPrivate *priv = vm->privateData; @@ -835,7 +836,8 @@ qemuSaveImageStartVM(virConnectPtr conn, priv->disableSlirp = true; if (qemuProcessStart(conn, driver, vm, cookie ? cookie->cpu : NULL, - asyncJob, "stdio", *fd, path, NULL, + asyncJob, "stdio", *fd, path, wait_incoming, + NULL, VIR_NETDEV_VPORT_PROFILE_OP_RESTORE, start_flags) == 0) started = true; diff --git a/src/qemu/qemu_saveimage.h b/src/qemu/qemu_saveimage.h index 356fc7561e..412624b968 100644 --- a/src/qemu/qemu_saveimage.h +++ b/src/qemu/qemu_saveimage.h @@ -99,6 +99,7 @@ qemuSaveImageStartVM(virConnectPtr conn, const char *path, bool start_paused, bool reset_nvram, + bool wait_incoming, virDomainAsyncJob asyncJob) ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5) ATTRIBUTE_NONNULL(6); diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c index 2e445e8296..626a5a14b9 100644 --- a/src/qemu/qemu_snapshot.c +++ b/src/qemu/qemu_snapshot.c @@ -2092,7 +2092,7 @@ qemuSnapshotRevertActive(virDomainObj *vm, rc = qemuProcessStart(snapshot->domain->conn, driver, vm, cookie ? 
cookie->cpu : NULL, - VIR_ASYNC_JOB_START, NULL, -1, NULL, snap, + VIR_ASYNC_JOB_START, NULL, -1, NULL, false, snap, VIR_NETDEV_VPORT_PROFILE_OP_CREATE, start_flags); virDomainAuditStart(vm, "from-snapshot", rc >= 0); @@ -2215,7 +2215,7 @@ qemuSnapshotRevertInactive(virDomainObj *vm, start_flags |= paused ? VIR_QEMU_PROCESS_START_PAUSED : 0; rc = qemuProcessStart(snapshot->domain->conn, driver, vm, NULL, - VIR_ASYNC_JOB_START, NULL, -1, NULL, NULL, + VIR_ASYNC_JOB_START, NULL, -1, NULL, false, NULL, VIR_NETDEV_VPORT_PROFILE_OP_CREATE, start_flags); virDomainAuditStart(vm, "from-snapshot", rc >= 0); -- 2.35.3
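In sketch form, the resulting contract for callers that need to apply multifd parameters between process startup and transfer completion (mirroring the code in this patch and the next one):

    /* start the incoming migration but do not block on completion */
    if (qemuMigrationDstRun(driver, vm, incoming->uri, asyncJob,
                            false /* wait */) < 0)
        goto stop;
    /* ... apply multifd caps/params, launch the helper ... */
    if (qemuMigrationDstWaitForCompletion(driver, vm, asyncJob, false) < 0)
        goto stop;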

use multifd to restore parallel saves. Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/qemu/qemu_driver.c | 10 +++- src/qemu/qemu_migration.c | 2 +- src/qemu/qemu_migration.h | 6 ++ src/qemu/qemu_saveimage.c | 119 +++++++++++++++++++++++++++++++++++++- src/qemu/qemu_saveimage.h | 8 ++- 5 files changed, 139 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 0e8dd7748c..72ab679336 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -5905,8 +5905,14 @@ qemuDomainRestoreInternal(virConnectPtr conn, flags) < 0) goto cleanup; - ret = qemuSaveImageStartVM(conn, driver, vm, &saveFd.fd, data, path, - false, reset_nvram, true, VIR_ASYNC_JOB_START); + if (flags & VIR_DOMAIN_SAVE_PARALLEL) { + ret = qemuSaveImageLoadMultiFd(conn, vm, oflags, data, reset_nvram, + &saveFd, VIR_ASYNC_JOB_START); + + } else { + ret = qemuSaveImageStartVM(conn, driver, vm, &saveFd.fd, data, path, + false, reset_nvram, true, VIR_ASYNC_JOB_START); + } qemuProcessEndJob(vm); diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 93cd446b23..12b7e84f25 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1933,7 +1933,7 @@ qemuMigrationSrcWaitForCompletion(virQEMUDriver *driver, } -static int +int qemuMigrationDstWaitForCompletion(virQEMUDriver *driver, virDomainObj *vm, virDomainAsyncJob asyncJob, diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h index c3c48c19c0..38f4877cf0 100644 --- a/src/qemu/qemu_migration.h +++ b/src/qemu/qemu_migration.h @@ -191,6 +191,12 @@ qemuMigrationDstFinish(virQEMUDriver *driver, int retcode, bool v3proto); +int +qemuMigrationDstWaitForCompletion(virQEMUDriver *driver, + virDomainObj *vm, + virDomainAsyncJob asyncJob, + bool postcopy); + int qemuMigrationSrcConfirm(virQEMUDriver *driver, virDomainObj *vm, diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c index 83dea78718..753e297226 100644 --- a/src/qemu/qemu_saveimage.c +++ b/src/qemu/qemu_saveimage.c @@ -579,6 +579,114 @@ qemuSaveImageCreate(virQEMUDriver *driver, } +int qemuSaveImageLoadMultiFd(virConnectPtr conn, virDomainObj *vm, int oflags, + virQEMUSaveData *data, bool reset_nvram, + virQEMUSaveFd *saveFd, virDomainAsyncJob asyncJob) +{ + virQEMUDriver *driver = conn->privateData; + qemuDomainObjPrivate *priv = vm->privateData; + virQEMUSaveFd *multiFd = NULL; + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + g_autoptr(virCommand) cmd = NULL; + g_autofree char *helper_path = NULL; + g_autofree char *sun_path = g_strdup_printf("%s/restore-multifd.sock", cfg->saveDir); + bool qemu_started = false; + int ret = -1; + int nchannels = data->header.multifd_channels; + + if (!(helper_path = virFileFindResource("libvirt_multifd_helper", + abs_top_builddir "/src", + LIBEXECDIR))) + goto cleanup; + cmd = virCommandNewArgList(helper_path, sun_path, NULL); + virCommandAddArgFormat(cmd, "%d", nchannels); + virCommandAddArgFormat(cmd, "%d", saveFd->fd); + virCommandPassFD(cmd, saveFd->fd, 0); + + /* Perform parallel multifd migration from files (main fd + channels) */ + if (!(multiFd = qemuSaveImageCreateMultiFd(driver, vm, cmd, saveFd->path, oflags, cfg, nchannels))) + goto cleanup; + if (qemuSaveImageStartVM(conn, driver, vm, NULL, data, sun_path, + false, reset_nvram, false, asyncJob) < 0) + goto cleanup; + + qemu_started = true; + + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATE_MULTIFD)) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("QEMU does not seem to support 
multifd migration, required for parallel migration from files")); + goto cleanup; + } else { + g_autoptr(qemuMigrationParams) migParams = qemuMigrationParamsNew(); + bool bwParam = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_PARAM_BANDWIDTH); + + if (bwParam) { + if (qemuMigrationParamsSetULL(migParams, + QEMU_MIGRATION_PARAM_MAX_BANDWIDTH, + QEMU_DOMAIN_MIG_BANDWIDTH_MAX * 1024 * 1024) < 0) + goto cleanup; + priv->migMaxBandwidth = QEMU_DOMAIN_MIG_BANDWIDTH_MAX; + } else { + if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) == 0) { + qemuMonitorSetMigrationSpeed(priv->mon, + QEMU_DOMAIN_MIG_BANDWIDTH_MAX); + priv->migMaxBandwidth = QEMU_DOMAIN_MIG_BANDWIDTH_MAX; + qemuDomainObjExitMonitor(vm); + } + } + qemuMigrationParamsSetCap(migParams, QEMU_MIGRATION_CAP_MULTIFD); + if (qemuMigrationParamsSetInt(migParams, + QEMU_MIGRATION_PARAM_MULTIFD_CHANNELS, + nchannels) < 0) + goto cleanup; + if (qemuMigrationParamsApply(driver, vm, asyncJob, migParams) < 0) + goto cleanup; + + if (!virDomainObjIsActive(vm)) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("guest unexpectedly quit")); + goto cleanup; + } + /* multifd helper can now connect, then wait for migration to complete */ + if (virCommandRunAsync(cmd, NULL) < 0) + goto cleanup; + + if (qemuMigrationDstWaitForCompletion(driver, vm, asyncJob, false) < 0) + goto cleanup; + + if (qemuSaveImageCloseMultiFd(multiFd, nchannels, vm) < 0) + goto cleanup; + + if (qemuProcessRefreshState(driver, vm, asyncJob) < 0) + goto cleanup; + + /* run 'cont' on the destination */ + if (qemuProcessStartCPUs(driver, vm, + VIR_DOMAIN_RUNNING_RESTORED, + asyncJob) < 0) { + if (virGetLastErrorCode() == VIR_ERR_OK) + virReportError(VIR_ERR_OPERATION_FAILED, + "%s", _("failed to resume domain")); + goto cleanup; + } + if (virDomainObjSave(vm, driver->xmlopt, cfg->stateDir) < 0) { + VIR_WARN("Failed to save status on vm %s", vm->def->name); + goto cleanup; + } + } + qemuDomainEventEmitJobCompleted(driver, vm); + ret = 0; + + cleanup: + if (ret < 0 && qemu_started) { + qemuProcessStop(driver, vm, VIR_DOMAIN_SHUTOFF_FAILED, + asyncJob, VIR_QEMU_PROCESS_STOP_MIGRATED); + } + ret = qemuSaveImageFreeMultiFd(multiFd, vm, nchannels, ret); + return ret; +} + + /* qemuSaveImageGetCompressionProgram: * @imageFormat: String representation from qemu.conf for the compression * image format being used (dump, save, or snapshot). @@ -789,6 +897,7 @@ qemuSaveImageStartVM(virConnectPtr conn, bool started = false; virObjectEvent *event; VIR_AUTOCLOSE intermediatefd = -1; + g_autofree char *migrate_from = NULL; g_autoptr(virCommand) cmd = NULL; g_autofree char *errbuf = NULL; g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); @@ -835,8 +944,14 @@ qemuSaveImageStartVM(virConnectPtr conn, if (cookie && !cookie->slirpHelper) priv->disableSlirp = true; + if (fd) { + migrate_from = g_strdup("stdio"); + } else { + migrate_from = g_strdup_printf("unix://%s", path); + } + if (qemuProcessStart(conn, driver, vm, cookie ? cookie->cpu : NULL, - asyncJob, "stdio", *fd, path, wait_incoming, + asyncJob, migrate_from, fd ? 
*fd : -1, path, wait_incoming, NULL, VIR_NETDEV_VPORT_PROFILE_OP_RESTORE, start_flags) == 0) @@ -860,7 +975,7 @@ qemuSaveImageStartVM(virConnectPtr conn, VIR_DEBUG("Decompression binary stderr: %s", NULLSTR(errbuf)); virErrorRestore(&orig_err); } - if (VIR_CLOSE(*fd) < 0) { + if (fd && VIR_CLOSE(*fd) < 0) { virReportSystemError(errno, _("cannot close file: %s"), path); rc = -1; } diff --git a/src/qemu/qemu_saveimage.h b/src/qemu/qemu_saveimage.h index 412624b968..719e6506a5 100644 --- a/src/qemu/qemu_saveimage.h +++ b/src/qemu/qemu_saveimage.h @@ -101,7 +101,7 @@ qemuSaveImageStartVM(virConnectPtr conn, bool reset_nvram, bool wait_incoming, virDomainAsyncJob asyncJob) - ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5) ATTRIBUTE_NONNULL(6); + ATTRIBUTE_NONNULL(5) ATTRIBUTE_NONNULL(6); int qemuSaveImageOpen(virQEMUDriver *driver, @@ -119,6 +119,12 @@ qemuSaveImageGetCompressionProgram(const char *imageFormat, bool use_raw_on_fail) ATTRIBUTE_NONNULL(2); +int qemuSaveImageLoadMultiFd(virConnectPtr conn, virDomainObj *vm, int oflags, + virQEMUSaveData *data, bool reset_nvram, + virQEMUSaveFd *saveFd, virDomainAsyncJob asyncJob) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(4) + ATTRIBUTE_NONNULL(6) G_GNUC_WARN_UNUSED_RESULT; + int qemuSaveImageCreate(virQEMUDriver *driver, virDomainObj *vm, -- 2.35.3
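The helper invocation at the heart of the restore path, condensed from qemuSaveImageLoadMultiFd above: the helper receives the socket path, the channel count and the inherited main save fd, and is only started once the migration parameters have been applied:

    cmd = virCommandNewArgList(helper_path, sun_path, NULL);
    virCommandAddArgFormat(cmd, "%d", nchannels);   /* channel count */
    virCommandAddArgFormat(cmd, "%d", saveFd->fd);  /* main image fd */
    virCommandPassFD(cmd, saveFd->fd, 0);
    /* QEMU listens on unix://sun_path; once multifd parameters are set,
     * the helper may connect and feed the channels */
    if (virCommandRunAsync(cmd, NULL) < 0)
        goto cleanup;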

Signed-off-by: Claudio Fontana <cfontana@suse.de> --- docs/manpages/virsh.rst | 23 +++++++++++++++++------ tools/virsh-domain.c | 24 ++++++++++++++++++++++-- 2 files changed, 39 insertions(+), 8 deletions(-) diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst index e73e590754..e9012b85d1 100644 --- a/docs/manpages/virsh.rst +++ b/docs/manpages/virsh.rst @@ -3803,15 +3803,18 @@ save :: save domain state-file [--bypass-cache] [--xml file] + [--parallel] [--parallel-connections connections] [{--running | --paused}] [--verbose] -Saves a running domain (RAM, but not disk state) to a state file so that -it can be restored -later. Once saved, the domain will no longer be running on the -system, thus the memory allocated for the domain will be free for -other domains to use. ``virsh restore`` restores from this state file. +Saves a paused or running domain (RAM, but not disk state) to one or more +state files, so that it can be restored later. +Once saved, the domain will no longer be running on the system, +thus the memory allocated for the domain will be free for +other domains to use. ``virsh restore`` restores from state file/s. + If *--bypass-cache* is specified, the save will avoid the file system -cache, although this may slow down the operation. +cache; depending on the specific scenario this may slow down or speed up +the operation. The progress may be monitored using ``domjobinfo`` virsh command and canceled with ``domjobabort`` command (sent by another virsh instance). Another option @@ -3833,6 +3836,14 @@ based on the state the domain was in when the save was done; passing either the *--running* or *--paused* flag will allow overriding which state the ``restore`` should use. +*--parallel* option will cause the save data to be sent over multiple +parallel connections to multiple files. The main save file is specified +with ``state-file``, and a number of additional connections can be +set using *--parallel-connections*, which will save to files named +``state-file``.1 , ``state-file``.2 ... up to ``connections``. + +Parallel connections may help in speeding up the save operation. + Domain saved state files assume that disk images will be unchanged between the creation and restore point. 
For a more complete system restore point, where the disk state is saved alongside the memory diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 2d90fba9b7..40791f2135 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -4174,6 +4174,14 @@ static const vshCmdOptDef opts_save[] = { .type = VSH_OT_BOOL, .help = N_("avoid file system cache when saving") }, + {.name = "parallel", + .type = VSH_OT_BOOL, + .help = N_("enable parallel save to files") + }, + {.name = "parallel-connections", + .type = VSH_OT_INT, + .help = N_("number of connections/files for parallel save") + }, {.name = "xml", .type = VSH_OT_STRING, .completer = virshCompletePathLocalExisting, @@ -4206,6 +4214,8 @@ doSave(void *opaque) virTypedParameterPtr params = NULL; int nparams = 0; int maxparams = 0; + int intOpt = 0; + int rv = -1; unsigned int flags = 0; const char *xmlfile = NULL; g_autofree char *xml = NULL; @@ -4227,6 +4237,15 @@ doSave(void *opaque) } if (vshCommandOptBool(cmd, "bypass-cache")) flags |= VIR_DOMAIN_SAVE_BYPASS_CACHE; + if (vshCommandOptBool(cmd, "parallel")) + flags |= VIR_DOMAIN_SAVE_PARALLEL; + if ((rv = vshCommandOptInt(ctl, cmd, "parallel-connections", &intOpt)) < 0) { + goto out; + } else if (rv > 0) { + if (virTypedParamsAddInt(¶ms, &nparams, &maxparams, + VIR_SAVE_PARAM_PARALLEL_CONNECTIONS, intOpt) < 0) + goto out; + } if (vshCommandOptBool(cmd, "running")) flags |= VIR_DOMAIN_SAVE_RUNNING; if (vshCommandOptBool(cmd, "paused")) @@ -4247,8 +4266,9 @@ doSave(void *opaque) goto out; } } - - if (flags || xml) { + if (flags & VIR_DOMAIN_SAVE_PARALLEL) { + rc = virDomainSaveParams(dom, params, nparams, flags); + } else if (flags || xml) { rc = virDomainSaveFlags(dom, to, xml, flags); } else { rc = virDomainSave(dom, to); -- 2.35.3
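From an API client's perspective the new virsh option maps onto virDomainSaveParams(). A minimal sketch follows; the connection URI, domain name and target path are placeholders, and VIR_DOMAIN_SAVE_PARALLEL plus VIR_SAVE_PARAM_PARALLEL_CONNECTIONS are the additions from this series:

    #include <stdio.h>
    #include <libvirt/libvirt.h>

    int main(void)
    {
        virConnectPtr conn = virConnectOpen("qemu:///system");
        virDomainPtr dom = conn ? virDomainLookupByName(conn, "demo") : NULL;
        virTypedParameterPtr params = NULL;
        int nparams = 0, maxparams = 0;
        int rc = -1;

        if (!dom)
            goto out;
        if (virTypedParamsAddString(&params, &nparams, &maxparams,
                                    VIR_SAVE_PARAM_FILE,
                                    "/var/lib/libvirt/save/demo.sav") < 0 ||
            virTypedParamsAddInt(&params, &nparams, &maxparams,
                                 VIR_SAVE_PARAM_PARALLEL_CONNECTIONS, 4) < 0)
            goto out;
        rc = virDomainSaveParams(dom, params, nparams,
                                 VIR_DOMAIN_SAVE_PARALLEL);
     out:
        virTypedParamsFree(params, nparams);
        if (dom)
            virDomainFree(dom);
        if (conn)
            virConnectClose(conn);
        return rc < 0;
    }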

Signed-off-by: Claudio Fontana <cfontana@suse.de> --- docs/manpages/virsh.rst | 12 ++++++++++-- tools/virsh-domain.c | 10 +++++++++- 2 files changed, 19 insertions(+), 3 deletions(-) diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst index e9012b85d1..dee748d870 100644 --- a/docs/manpages/virsh.rst +++ b/docs/manpages/virsh.rst @@ -3754,12 +3754,13 @@ restore :: restore state-file [--bypass-cache] [--xml file] - [{--running | --paused}] [--reset-nvram] + [{--running | --paused}] [--reset-nvram] [--parallel] Restores a domain from a ``virsh save`` state file. See *save* for more info. If *--bypass-cache* is specified, the restore will avoid the file system -cache, although this may slow down the operation. +cache; depending on the specific scenario this may slow down or speed up +the operation. *--xml* ``file`` is usually omitted, but can be used to supply an alternative XML file for use on the restored guest with changes only @@ -3775,6 +3776,13 @@ domain should be started in. If *--reset-nvram* is specified, any existing NVRAM file will be deleted and re-initialized from its pristine template. +*--parallel* option will cause the save data to be loaded from multiple +state files over parallel connections. The main save file is specified +with ``state-file``, and the state file itself contains the number of +additional channels (files) to load. + +Parallel connections may help in speeding up the restore operation. + ``Note``: To avoid corrupting file system contents within the domain, you should not reuse the saved state file for a second ``restore`` unless you have also reverted all storage volumes back to the same contents as when diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 40791f2135..c14f8e74ce 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -5312,6 +5312,10 @@ static const vshCmdOptDef opts_restore[] = { .type = VSH_OT_BOOL, .help = N_("avoid file system cache when restoring") }, + {.name = "parallel", + .type = VSH_OT_BOOL, + .help = N_("enable parallel restore") + }, {.name = "xml", .type = VSH_OT_STRING, .completer = virshCompletePathLocalExisting, @@ -5353,6 +5357,8 @@ cmdRestore(vshControl *ctl, const vshCmd *cmd) } if (vshCommandOptBool(cmd, "bypass-cache")) flags |= VIR_DOMAIN_SAVE_BYPASS_CACHE; + if (vshCommandOptBool(cmd, "parallel")) + flags |= VIR_DOMAIN_SAVE_PARALLEL; if (vshCommandOptBool(cmd, "running")) flags |= VIR_DOMAIN_SAVE_RUNNING; if (vshCommandOptBool(cmd, "paused")) @@ -5371,7 +5377,9 @@ cmdRestore(vshControl *ctl, const vshCmd *cmd) goto out; } - if (flags || xml) { + if (flags & VIR_DOMAIN_SAVE_PARALLEL) { + rc = virDomainRestoreParams(priv->conn, params, nparams, flags); + } else if (flags || xml) { rc = virDomainRestoreFlags(priv->conn, from, xml, flags); } else { rc = virDomainRestore(priv->conn, from); -- 2.35.3
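The restore side is symmetric; reusing the setup from the save sketch above, only the main state file needs to be passed, since the channel count is read from the image header:

    /* sketch: parallel restore via the params API */
    if (virTypedParamsAddString(&params, &nparams, &maxparams,
                                VIR_SAVE_PARAM_FILE,
                                "/var/lib/libvirt/save/demo.sav") < 0)
        goto out;
    rc = virDomainRestoreParams(conn, params, nparams,
                                VIR_DOMAIN_SAVE_PARALLEL);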

add it to both capabilities and migration parameters Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_migration_params.c | 2 ++ src/qemu/qemu_migration_params.h | 1 + tests/qemucapabilitiesdata/caps_5.0.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_5.0.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_5.0.0.riscv64.xml | 1 + tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_5.1.0.sparc.xml | 1 + tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0.riscv64.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_6.0.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_6.0.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_6.0.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_6.1.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_6.2.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_6.2.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_6.2.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_7.0.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_7.0.0.ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_7.0.0.x86_64.xml | 1 + 25 files changed, 27 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 581b6a40df..9c2d5643eb 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -675,6 +675,7 @@ VIR_ENUM_IMPL(virQEMUCaps, /* 430 */ "migrate-multifd", /* QEMU_CAPS_MIGRATE_MULTIFD */ + "migration-param.multifd-compression", /* QEMU_CAPS_MIGRATION_PARAM_MULTIFD_COMPRESSION */ ); @@ -1612,6 +1613,7 @@ static struct virQEMUCapsStringFlags virQEMUCapsQMPSchemaQueries[] = { { "migrate-set-parameters/arg-type/downtime-limit", QEMU_CAPS_MIGRATION_PARAM_DOWNTIME }, { "migrate-set-parameters/arg-type/xbzrle-cache-size", QEMU_CAPS_MIGRATION_PARAM_XBZRLE_CACHE_SIZE }, { "migrate-set-parameters/arg-type/block-bitmap-mapping/bitmaps/transform", QEMU_CAPS_MIGRATION_PARAM_BLOCK_BITMAP_MAPPING }, + { "migrate-set-parameters/arg-type/multifd-compression", QEMU_CAPS_MIGRATION_PARAM_MULTIFD_COMPRESSION }, { "nbd-server-start/arg-type/tls-creds", QEMU_CAPS_NBD_TLS }, { "nbd-server-add/arg-type/bitmap", QEMU_CAPS_NBD_BITMAP }, { "netdev_add/arg-type/+vhost-vdpa", QEMU_CAPS_NETDEV_VHOST_VDPA }, diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index b089f83da1..e226b1a51a 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -650,6 +650,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */ /* 430 */ QEMU_CAPS_MIGRATE_MULTIFD, /* migrate can set multifd parameter */ + QEMU_CAPS_MIGRATION_PARAM_MULTIFD_COMPRESSION, /* multifd-compression in migrate-set-parameters */ QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/src/qemu/qemu_migration_params.c b/src/qemu/qemu_migration_params.c index 36174a66d8..75b29e93a1 100644 --- a/src/qemu/qemu_migration_params.c +++ b/src/qemu/qemu_migration_params.c @@ -115,6 +115,7 @@ VIR_ENUM_IMPL(qemuMigrationParam, "xbzrle-cache-size", "max-postcopy-bandwidth", "multifd-channels", + "multifd-compression", ); typedef struct _qemuMigrationParamsAlwaysOnItem qemuMigrationParamsAlwaysOnItem; @@ -234,6 +235,7 @@ static const qemuMigrationParamType 
qemuMigrationParamTypes[] = { [QEMU_MIGRATION_PARAM_XBZRLE_CACHE_SIZE] = QEMU_MIGRATION_PARAM_TYPE_ULL, [QEMU_MIGRATION_PARAM_MAX_POSTCOPY_BANDWIDTH] = QEMU_MIGRATION_PARAM_TYPE_ULL, [QEMU_MIGRATION_PARAM_MULTIFD_CHANNELS] = QEMU_MIGRATION_PARAM_TYPE_INT, + [QEMU_MIGRATION_PARAM_MULTIFD_COMPRESSION] = QEMU_MIGRATION_PARAM_TYPE_STRING, }; G_STATIC_ASSERT(G_N_ELEMENTS(qemuMigrationParamTypes) == QEMU_MIGRATION_PARAM_LAST); diff --git a/src/qemu/qemu_migration_params.h b/src/qemu/qemu_migration_params.h index 99af73b4a4..8b2d6ab210 100644 --- a/src/qemu/qemu_migration_params.h +++ b/src/qemu/qemu_migration_params.h @@ -60,6 +60,7 @@ typedef enum { QEMU_MIGRATION_PARAM_XBZRLE_CACHE_SIZE, QEMU_MIGRATION_PARAM_MAX_POSTCOPY_BANDWIDTH, QEMU_MIGRATION_PARAM_MULTIFD_CHANNELS, + QEMU_MIGRATION_PARAM_MULTIFD_COMPRESSION, QEMU_MIGRATION_PARAM_LAST } qemuMigrationParam; diff --git a/tests/qemucapabilitiesdata/caps_5.0.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_5.0.0.aarch64.xml index c45b2e6cf6..1d032789b7 100644 --- a/tests/qemucapabilitiesdata/caps_5.0.0.aarch64.xml +++ b/tests/qemucapabilitiesdata/caps_5.0.0.aarch64.xml @@ -176,6 +176,7 @@ <flag name='memory-backend-file.prealloc-threads'/> <flag name='virtio-iommu-pci'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>5000000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>61700241</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_5.0.0.ppc64.xml b/tests/qemucapabilitiesdata/caps_5.0.0.ppc64.xml index a3ad743d70..4748bf791e 100644 --- a/tests/qemucapabilitiesdata/caps_5.0.0.ppc64.xml +++ b/tests/qemucapabilitiesdata/caps_5.0.0.ppc64.xml @@ -182,6 +182,7 @@ <flag name='memory-backend-file.prealloc-threads'/> <flag name='virtio-iommu-pci'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>5000000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>42900241</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_5.0.0.riscv64.xml b/tests/qemucapabilitiesdata/caps_5.0.0.riscv64.xml index e1b5cac26b..859638fb1c 100644 --- a/tests/qemucapabilitiesdata/caps_5.0.0.riscv64.xml +++ b/tests/qemucapabilitiesdata/caps_5.0.0.riscv64.xml @@ -168,6 +168,7 @@ <flag name='memory-backend-file.prealloc-threads'/> <flag name='virtio-iommu-pci'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>5000000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>0</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml index 796adb9066..305ae0bf26 100644 --- a/tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml @@ -216,6 +216,7 @@ <flag name='memory-backend-file.prealloc-threads'/> <flag name='virtio-iommu-pci'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>5000000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>43100241</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_5.1.0.sparc.xml b/tests/qemucapabilitiesdata/caps_5.1.0.sparc.xml index cb203df125..e0ff08a445 100644 --- a/tests/qemucapabilitiesdata/caps_5.1.0.sparc.xml +++ b/tests/qemucapabilitiesdata/caps_5.1.0.sparc.xml @@ -88,6 +88,7 @@ <flag name='query-display-options'/> <flag name='memory-backend-file.prealloc-threads'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>5001000</version> <kvmVersion>0</kvmVersion> 
<microcodeVersion>0</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml index 7479d942a2..9f089f551e 100644 --- a/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml @@ -220,6 +220,7 @@ <flag name='virtio-iommu-pci'/> <flag name='virtio-net.rss'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>5001000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>43100242</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_5.2.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_5.2.0.aarch64.xml index 268d1444ad..3de49adf5b 100644 --- a/tests/qemucapabilitiesdata/caps_5.2.0.aarch64.xml +++ b/tests/qemucapabilitiesdata/caps_5.2.0.aarch64.xml @@ -183,6 +183,7 @@ <flag name='virtio-iommu-pci'/> <flag name='virtio-net.rss'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>5002000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>61700243</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_5.2.0.ppc64.xml b/tests/qemucapabilitiesdata/caps_5.2.0.ppc64.xml index eabf4b600c..ec2ee445a1 100644 --- a/tests/qemucapabilitiesdata/caps_5.2.0.ppc64.xml +++ b/tests/qemucapabilitiesdata/caps_5.2.0.ppc64.xml @@ -187,6 +187,7 @@ <flag name='virtio-iommu-pci'/> <flag name='virtio-net.rss'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>5002000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>42900243</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_5.2.0.riscv64.xml b/tests/qemucapabilitiesdata/caps_5.2.0.riscv64.xml index 0dbaf5a5ec..cf250f2e42 100644 --- a/tests/qemucapabilitiesdata/caps_5.2.0.riscv64.xml +++ b/tests/qemucapabilitiesdata/caps_5.2.0.riscv64.xml @@ -173,6 +173,7 @@ <flag name='virtio-iommu-pci'/> <flag name='virtio-net.rss'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>5002000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>0</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_5.2.0.s390x.xml b/tests/qemucapabilitiesdata/caps_5.2.0.s390x.xml index b0fbab9cb5..6dc5f25f36 100644 --- a/tests/qemucapabilitiesdata/caps_5.2.0.s390x.xml +++ b/tests/qemucapabilitiesdata/caps_5.2.0.s390x.xml @@ -141,6 +141,7 @@ <flag name='virtio-iommu-pci'/> <flag name='virtio-net.rss'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>5002000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>39100243</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml index 1a1717bf2a..69bfdcea01 100644 --- a/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml @@ -224,6 +224,7 @@ <flag name='virtio-iommu-pci'/> <flag name='virtio-net.rss'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>5002000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>43100243</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_6.0.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_6.0.0.aarch64.xml index 1c18d122e2..cd25ee8ccb 100644 --- a/tests/qemucapabilitiesdata/caps_6.0.0.aarch64.xml +++ b/tests/qemucapabilitiesdata/caps_6.0.0.aarch64.xml @@ -191,6 +191,7 @@ <flag name='virtio-iommu-pci'/> <flag name='virtio-net.rss'/> <flag 
name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>6000000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>61700242</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_6.0.0.s390x.xml b/tests/qemucapabilitiesdata/caps_6.0.0.s390x.xml index 8fa4cb2307..659d952b08 100644 --- a/tests/qemucapabilitiesdata/caps_6.0.0.s390x.xml +++ b/tests/qemucapabilitiesdata/caps_6.0.0.s390x.xml @@ -149,6 +149,7 @@ <flag name='virtio-iommu-pci'/> <flag name='virtio-net.rss'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>6000000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>39100242</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_6.0.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_6.0.0.x86_64.xml index 70c67202b1..7a2fdde901 100644 --- a/tests/qemucapabilitiesdata/caps_6.0.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_6.0.0.x86_64.xml @@ -233,6 +233,7 @@ <flag name='virtio-iommu-pci'/> <flag name='virtio-net.rss'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>6000000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>43100242</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_6.1.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_6.1.0.x86_64.xml index a5ec77878f..f8c8266153 100644 --- a/tests/qemucapabilitiesdata/caps_6.1.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_6.1.0.x86_64.xml @@ -237,6 +237,7 @@ <flag name='virtio-iommu-pci'/> <flag name='virtio-net.rss'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>6001000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>43100243</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_6.2.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_6.2.0.aarch64.xml index 92d8ceff7e..09f7787797 100644 --- a/tests/qemucapabilitiesdata/caps_6.2.0.aarch64.xml +++ b/tests/qemucapabilitiesdata/caps_6.2.0.aarch64.xml @@ -202,6 +202,7 @@ <flag name='virtio-iommu-pci'/> <flag name='virtio-net.rss'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>6001050</version> <kvmVersion>0</kvmVersion> <microcodeVersion>61700244</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_6.2.0.ppc64.xml b/tests/qemucapabilitiesdata/caps_6.2.0.ppc64.xml index f219912927..5c3342d67a 100644 --- a/tests/qemucapabilitiesdata/caps_6.2.0.ppc64.xml +++ b/tests/qemucapabilitiesdata/caps_6.2.0.ppc64.xml @@ -198,6 +198,7 @@ <flag name='virtio-iommu-pci'/> <flag name='virtio-net.rss'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>6002000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>42900244</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_6.2.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_6.2.0.x86_64.xml index 38fd3878ea..2ce9f7bb6c 100644 --- a/tests/qemucapabilitiesdata/caps_6.2.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_6.2.0.x86_64.xml @@ -239,6 +239,7 @@ <flag name='virtio-iommu-pci'/> <flag name='virtio-net.rss'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>6002000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>43100244</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_7.0.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_7.0.0.aarch64.xml index 522e225c8f..9de4d8bc51 100644 --- a/tests/qemucapabilitiesdata/caps_7.0.0.aarch64.xml +++ 
b/tests/qemucapabilitiesdata/caps_7.0.0.aarch64.xml @@ -210,6 +210,7 @@ <flag name='virtio-iommu.boot-bypass'/> <flag name='virtio-net.rss'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>6002092</version> <kvmVersion>0</kvmVersion> <microcodeVersion>61700243</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_7.0.0.ppc64.xml b/tests/qemucapabilitiesdata/caps_7.0.0.ppc64.xml index 1eb43799c0..9a0e30b25a 100644 --- a/tests/qemucapabilitiesdata/caps_7.0.0.ppc64.xml +++ b/tests/qemucapabilitiesdata/caps_7.0.0.ppc64.xml @@ -211,6 +211,7 @@ <flag name='virtio-iommu.boot-bypass'/> <flag name='virtio-net.rss'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>6002092</version> <kvmVersion>0</kvmVersion> <microcodeVersion>42900243</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_7.0.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_7.0.0.x86_64.xml index e5023c4219..250993cd3f 100644 --- a/tests/qemucapabilitiesdata/caps_7.0.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_7.0.0.x86_64.xml @@ -243,6 +243,7 @@ <flag name='virtio-iommu.boot-bypass'/> <flag name='virtio-net.rss'/> <flag name='migrate-multifd'/> + <flag name='migration-param.multifd-compression'/> <version>7000000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>43100243</microcodeVersion> -- 2.35.3

Signed-off-by: Claudio Fontana <cfontana@suse.de> --- include/libvirt/libvirt-domain.h | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index 8bedeaff30..7628968e46 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -1618,6 +1618,17 @@ int virDomainRestoreParams (virConnectPtr conn, */ # define VIR_SAVE_PARAM_PARALLEL_CONNECTIONS "parallel.connections" +/** + * VIR_SAVE_PARAM_PARALLEL_COMPRESSION: + * + * this optional parameter is used in conjunction with the flag + * VIR_DOMAIN_SAVE_PARALLEL during save to ask the hypervisor for + * compressed channels to be used using this algorithm. + * + * Since: 8.4.0 + */ +# define VIR_SAVE_PARAM_PARALLEL_COMPRESSION "parallel.compression" + /* See below for virDomainSaveImageXMLFlags */ char * virDomainSaveImageGetXMLDesc (virConnectPtr conn, const char *file, -- 2.35.3
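A sketch of how a client would request compressed channels, extending the typed-parameter list from the earlier save example ("zstd" being one of the algorithms handled later in the series):

    if (virTypedParamsAddString(&params, &nparams, &maxparams,
                                VIR_SAVE_PARAM_PARALLEL_COMPRESSION,
                                "zstd") < 0)
        goto out;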

Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/qemu/qemu_driver.c | 11 ++++++----- src/qemu/qemu_saveimage.c | 1 + src/qemu/qemu_saveimage.h | 1 + src/qemu/qemu_snapshot.c | 2 +- 4 files changed, 9 insertions(+), 6 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 72ab679336..864825960d 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2641,7 +2641,8 @@ static int qemuDomainSaveInternal(virQEMUDriver *driver, virDomainObj *vm, const char *path, int compressed, virCommand *compressor, - const char *xmlin, int nconn, unsigned int flags) + const char *xmlin, int nconn, const char *pcomp, + unsigned int flags) { g_autofree char *xml = NULL; bool was_running = false; @@ -2722,7 +2723,7 @@ qemuDomainSaveInternal(virQEMUDriver *driver, xml = NULL; ret = qemuSaveImageCreate(driver, vm, path, data, compressor, - nconn, flags, VIR_ASYNC_JOB_SAVE); + nconn, pcomp, flags, VIR_ASYNC_JOB_SAVE); if (ret < 0) goto endjob; @@ -2791,7 +2792,7 @@ qemuDomainSaveFlags(virDomainPtr dom, const char *path, const char *dxml, goto cleanup; ret = qemuDomainSaveInternal(driver, vm, path, compressed, - compressor, dxml, -1, flags); + compressor, dxml, -1, NULL, flags); cleanup: virDomainObjEndAPI(&vm); @@ -2854,7 +2855,7 @@ qemuDomainSaveParams(virDomainPtr dom, goto cleanup; ret = qemuDomainSaveInternal(driver, vm, to, compressed, - compressor, dxml, nconn, flags); + compressor, dxml, nconn, NULL, flags); cleanup: virDomainObjEndAPI(&vm); @@ -2911,7 +2912,7 @@ qemuDomainManagedSave(virDomainPtr dom, unsigned int flags) VIR_INFO("Saving state of domain '%s' to '%s'", vm->def->name, name); ret = qemuDomainSaveInternal(driver, vm, name, compressed, - compressor, NULL, -1, flags); + compressor, NULL, -1, NULL, flags); if (ret == 0) vm->hasManagedSave = true; diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c index 753e297226..0162cb242d 100644 --- a/src/qemu/qemu_saveimage.c +++ b/src/qemu/qemu_saveimage.c @@ -491,6 +491,7 @@ qemuSaveImageCreate(virQEMUDriver *driver, virQEMUSaveData *data, virCommand *compressor, int nconn, + const char *pcomp G_GNUC_UNUSED, unsigned int flags, virDomainAsyncJob asyncJob) { diff --git a/src/qemu/qemu_saveimage.h b/src/qemu/qemu_saveimage.h index 719e6506a5..184cc17a68 100644 --- a/src/qemu/qemu_saveimage.h +++ b/src/qemu/qemu_saveimage.h @@ -132,6 +132,7 @@ qemuSaveImageCreate(virQEMUDriver *driver, virQEMUSaveData *data, virCommand *compressor, int nconn, + const char *pcomp, unsigned int flags, virDomainAsyncJob asyncJob); diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c index 626a5a14b9..daa72983b3 100644 --- a/src/qemu/qemu_snapshot.c +++ b/src/qemu/qemu_snapshot.c @@ -1457,7 +1457,7 @@ qemuSnapshotCreateActiveExternal(virQEMUDriver *driver, memory_existing = virFileExists(snapdef->memorysnapshotfile); if ((ret = qemuSaveImageCreate(driver, vm, snapdef->memorysnapshotfile, - data, compressor, -1, 0, + data, compressor, -1, NULL, 0, VIR_ASYNC_JOB_SNAPSHOT)) < 0) goto cleanup; -- 2.35.3

Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/qemu/qemu_driver.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 864825960d..4374728112 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2812,6 +2812,7 @@ qemuDomainSaveParams(virDomainPtr dom, { const char *to = NULL; const char *dxml = NULL; + const char *pcomp = NULL; virQEMUDriver *driver = dom->conn->privateData; int compressed; g_autoptr(virCommand) compressor = NULL; @@ -2829,6 +2830,7 @@ qemuDomainSaveParams(virDomainPtr dom, VIR_SAVE_PARAM_FILE, VIR_TYPED_PARAM_STRING, VIR_SAVE_PARAM_DXML, VIR_TYPED_PARAM_STRING, VIR_SAVE_PARAM_PARALLEL_CONNECTIONS, VIR_TYPED_PARAM_INT, + VIR_SAVE_PARAM_PARALLEL_COMPRESSION, VIR_TYPED_PARAM_STRING, NULL) < 0) return -1; @@ -2838,6 +2840,8 @@ qemuDomainSaveParams(virDomainPtr dom, return -1; if (virTypedParamsGetInt(params, nparams, VIR_SAVE_PARAM_PARALLEL_CONNECTIONS, &nconn) < 0) return -1; + if (virTypedParamsGetString(params, nparams, VIR_SAVE_PARAM_PARALLEL_COMPRESSION, &pcomp) < 0) + return -1; cfg = virQEMUDriverGetConfig(driver); if ((compressed = qemuSaveImageGetCompressionProgram(cfg->saveImageFormat, @@ -2855,7 +2859,7 @@ qemuDomainSaveParams(virDomainPtr dom, goto cleanup; ret = qemuDomainSaveInternal(driver, vm, to, compressed, - compressor, dxml, nconn, NULL, flags); + compressor, dxml, nconn, pcomp, flags); cleanup: virDomainObjEndAPI(&vm); -- 2.35.3

change from static to external linkage, and move the function close to the other similar ones, near qemuMigrationParamsSetULL. Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/qemu/qemu_migration_params.c | 47 +++++++++++++++----------------- src/qemu/qemu_migration_params.h | 5 ++++ 2 files changed, 27 insertions(+), 25 deletions(-) diff --git a/src/qemu/qemu_migration_params.c b/src/qemu/qemu_migration_params.c index 75b29e93a1..f6b9dc337d 100644 --- a/src/qemu/qemu_migration_params.c +++ b/src/qemu/qemu_migration_params.c @@ -900,31 +900,6 @@ qemuMigrationParamsApply(virQEMUDriver *driver, } -/** - * qemuMigrationParamsSetString: - * @migrParams: migration parameter object - * @param: parameter to set - * @value: new value - * - * Enables and sets the migration parameter @param in @migrParams. Returns 0 on - * success and -1 on error. Libvirt error is reported. - */ -static int -qemuMigrationParamsSetString(qemuMigrationParams *migParams, - qemuMigrationParam param, - const char *value) -{ - if (qemuMigrationParamsCheckType(param, QEMU_MIGRATION_PARAM_TYPE_STRING) < 0) - return -1; - - migParams->params[param].value.s = g_strdup(value); - - migParams->params[param].set = true; - - return 0; -} - - /* qemuMigrationParamsEnableTLS * @driver: pointer to qemu driver * @vm: domain object @@ -1146,6 +1121,28 @@ qemuMigrationParamsSetULL(qemuMigrationParams *migParams, return 0; } +/** + * qemuMigrationParamsSetString: + * @migrParams: migration parameter object + * @param: parameter to set + * @value: new value + * + * Enables and sets the migration parameter @param in @migrParams. Returns 0 on + * success and -1 on error. Libvirt error is reported. + */ +int +qemuMigrationParamsSetString(qemuMigrationParams *migParams, + qemuMigrationParam param, + const char *value) +{ + if (qemuMigrationParamsCheckType(param, QEMU_MIGRATION_PARAM_TYPE_STRING) < 0) + return -1; + + migParams->params[param].value.s = g_strdup(value); + migParams->params[param].set = true; + return 0; +} + /** * Returns -1 on error, diff --git a/src/qemu/qemu_migration_params.h b/src/qemu/qemu_migration_params.h index 8b2d6ab210..23a4e0c8a2 100644 --- a/src/qemu/qemu_migration_params.h +++ b/src/qemu/qemu_migration_params.h @@ -138,6 +138,11 @@ qemuMigrationParamsSetULL(qemuMigrationParams *migParams, qemuMigrationParam param, unsigned long long value); +int +qemuMigrationParamsSetString(qemuMigrationParams *migParams, + qemuMigrationParam param, + const char *value); + int qemuMigrationParamsGetULL(qemuMigrationParams *migParams, qemuMigrationParam param, -- 2.35.3
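With the symbol exported, the saveimage code in the following patches can set the string-typed multifd-compression parameter directly; in sketch form:

    if (qemuMigrationParamsSetString(migParams,
                                     QEMU_MIGRATION_PARAM_MULTIFD_COMPRESSION,
                                     "zstd") < 0)
        return -1;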

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 src/qemu/qemu_migration.c | 17 +++++++++++++----
 src/qemu/qemu_migration.h |  2 +-
 src/qemu/qemu_saveimage.c | 19 ++++++++++++++-----
 3 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 12b7e84f25..57bf1947f2 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -5905,7 +5905,7 @@ qemuMigrationSrcToFileAux(virQEMUDriver *driver, virDomainObj *vm,
                           virCommand *compressor,
                           virDomainAsyncJob asyncJob,
                           const char *sun_path,
-                          int nchannels)
+                          int nchannels, const char *pcomp)
 {
     qemuDomainObjPrivate *priv = vm->privateData;
     bool bwParam = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_PARAM_BANDWIDTH);
@@ -5952,6 +5952,15 @@ qemuMigrationSrcToFileAux(virQEMUDriver *driver, virDomainObj *vm,
                                       QEMU_MIGRATION_PARAM_MULTIFD_CHANNELS,
                                       nchannels) < 0)
             return -1;
+        if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_PARAM_MULTIFD_COMPRESSION)) {
+            if (qemuMigrationParamsSetString(migParams,
+                                             QEMU_MIGRATION_PARAM_MULTIFD_COMPRESSION, pcomp) < 0)
+                return -1;
+        } else {
+            virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
+                           _("QEMU does not seem to support multifd compression"));
+            return -1;
+        }
     }
 
     if (needParams && qemuMigrationParamsApply(driver, vm, asyncJob, migParams) < 0)
@@ -6077,17 +6086,17 @@ qemuMigrationSrcToFile(virQEMUDriver *driver, virDomainObj *vm,
                        virDomainAsyncJob asyncJob)
 {
     return qemuMigrationSrcToFileAux(driver, vm, fd, compressor,
-                                     asyncJob, NULL, -1);
+                                     asyncJob, NULL, -1, NULL);
 }
 
 int
 qemuMigrationSrcToFilesMultiFd(virQEMUDriver *driver, virDomainObj *vm,
                                virDomainAsyncJob asyncJob,
                                const char *sun_path,
-                               int nchannels)
+                               int nchannels, const char *pcomp)
 {
     return qemuMigrationSrcToFileAux(driver, vm, -1, NULL,
-                                     asyncJob, sun_path, nchannels);
+                                     asyncJob, sun_path, nchannels, pcomp);
 }
 
 int
diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h
index 38f4877cf0..d6185770b2 100644
--- a/src/qemu/qemu_migration.h
+++ b/src/qemu/qemu_migration.h
@@ -223,7 +223,7 @@ int
 qemuMigrationSrcToFilesMultiFd(virQEMUDriver *driver, virDomainObj *vm,
                                virDomainAsyncJob asyncJob,
                                const char *sun_path,
-                               int nchannels)
+                               int nchannels, const char *pcomp)
     ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT;
 
 int
diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
index 0162cb242d..df273eba66 100644
--- a/src/qemu/qemu_saveimage.c
+++ b/src/qemu/qemu_saveimage.c
@@ -491,13 +491,14 @@ qemuSaveImageCreate(virQEMUDriver *driver,
                     virQEMUSaveData *data,
                     virCommand *compressor,
                     int nconn,
-                    const char *pcomp G_GNUC_UNUSED,
+                    const char *pcomp,
                     unsigned int flags,
                     virDomainAsyncJob asyncJob)
 {
     g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver);
     virQEMUSaveFd saveFd = QEMU_SAVEFD_INVALID;
     virQEMUSaveFd *multiFd = NULL;
+    virQEMUSaveMultiFdComp multiComp = QEMU_SAVE_MULTIFD_COMP_NONE;
     unsigned int oflags = O_WRONLY | O_TRUNC | O_CREAT;
     int ret = -1;
 
@@ -509,15 +510,23 @@ qemuSaveImageCreate(virQEMUDriver *driver,
         }
         oflags |= O_DIRECT;
     }
-
+    if (!pcomp || !pcomp[0]) {
+        pcomp = qemuSaveMultiFdCompTypeToString(QEMU_SAVE_MULTIFD_COMP_NONE);
+    }
     if (virQEMUSaveFdInit(&saveFd, path, 0, oflags, cfg) < 0)
         goto cleanup;
     if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, saveFd.fd) < 0)
         goto cleanup;
 
-    if (nconn > 0)
+    if (nconn > 0) {
         data->header.multifd_channels = nconn;
-
+        if ((multiComp = qemuSaveMultiFdCompTypeFromString(pcomp)) < 0) {
+            virReportError(VIR_ERR_OPERATION_FAILED,
+                           _("Invalid %s multifd compression format specified"), pcomp);
+            goto cleanup;
+        }
+        data->header.multifd_comp = multiComp;
+    }
     if (virQEMUSaveDataWrite(data, saveFd.fd, saveFd.path) < 0)
         goto cleanup;
 
@@ -547,7 +556,7 @@ qemuSaveImageCreate(virQEMUDriver *driver,
             goto cleanup;
         if (chown(sun_path, cfg->user, cfg->group) < 0)
             goto cleanup;
-        if (qemuMigrationSrcToFilesMultiFd(driver, vm, asyncJob, sun_path, nconn) < 0)
+        if (qemuMigrationSrcToFilesMultiFd(driver, vm, asyncJob, sun_path, nconn, pcomp) < 0)
             goto cleanup;
         if (qemuSaveImageCloseMultiFd(multiFd, nconn, vm) < 0)
             goto cleanup;
-- 
2.35.3

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 src/qemu/qemu_saveimage.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
index df273eba66..eed52140ed 100644
--- a/src/qemu/qemu_saveimage.c
+++ b/src/qemu/qemu_saveimage.c
@@ -649,6 +649,23 @@ int qemuSaveImageLoadMultiFd(virConnectPtr conn, virDomainObj *vm, int oflags,
                                   QEMU_MIGRATION_PARAM_MULTIFD_CHANNELS,
                                   nchannels) < 0)
         goto cleanup;
+    if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_PARAM_MULTIFD_COMPRESSION)) {
+        const char *pcomp = qemuSaveMultiFdCompTypeToString(data->header.multifd_comp);
+        if (!pcomp) {
+            virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
+                           _("libvirt does not support parallel compression type %u"),
+                           data->header.multifd_comp);
+            goto cleanup;
+        }
+        if (qemuMigrationParamsSetString(migParams,
+                                         QEMU_MIGRATION_PARAM_MULTIFD_COMPRESSION,
+                                         pcomp) < 0)
+            goto cleanup;
+    } else if (data->header.multifd_comp != QEMU_SAVE_MULTIFD_COMP_NONE) {
+        virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
+                       _("QEMU does not seem to support multifd compression"));
+        goto cleanup;
+    }
 
     if (qemuMigrationParamsApply(driver, vm, asyncJob, migParams) < 0)
         goto cleanup;
-- 
2.35.3
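(Aside for readers: the qemuSaveMultiFdCompTypeToString/FromString helpers used above come from earlier in the series and are not shown in this hunk. In libvirt such helpers are typically generated with the VIR_ENUM macros; a minimal sketch under that assumption, with the string names mirroring QEMU's multifd compression values. This is an illustration, not code from the series.)

/* hypothetical sketch; requires util/virenum.h from libvirt */
typedef enum {
    QEMU_SAVE_MULTIFD_COMP_NONE = 0,
    QEMU_SAVE_MULTIFD_COMP_ZLIB,
    QEMU_SAVE_MULTIFD_COMP_ZSTD,

    QEMU_SAVE_MULTIFD_COMP_LAST
} virQEMUSaveMultiFdComp;

VIR_ENUM_DECL(qemuSaveMultiFdComp);
VIR_ENUM_IMPL(qemuSaveMultiFdComp,
              QEMU_SAVE_MULTIFD_COMP_LAST,
              "none", "zlib", "zstd");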

This completes the save side of the parallel compression support.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 docs/manpages/virsh.rst | 4 ++++
 tools/virsh-domain.c    | 12 ++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst
index dee748d870..5518e78160 100644
--- a/docs/manpages/virsh.rst
+++ b/docs/manpages/virsh.rst
@@ -3812,6 +3812,7 @@ save
    save domain state-file [--bypass-cache] [--xml file]
       [--parallel] [--parallel-connections connections]
+      [--parallel-compression algo]
       [{--running | --paused}] [--verbose]
 
 Saves a paused or running domain (RAM, but not disk state) to one or more
@@ -3852,6 +3853,9 @@ set using *--parallel-connections*, which will save to files named
 
 Parallel connections may help in speeding up the save operation.
 
+*--parallel-compression* can be used to ask the hypervisor to provide
+compressed channels in the save stream using algorithm ``algo``.
+
 Domain saved state files assume that disk images will be unchanged
 between the creation and restore point. For a more complete system
 restore point, where the disk state is saved alongside the memory
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
index c14f8e74ce..814dfa93d4 100644
--- a/tools/virsh-domain.c
+++ b/tools/virsh-domain.c
@@ -4182,6 +4182,10 @@ static const vshCmdOptDef opts_save[] = {
      .type = VSH_OT_INT,
      .help = N_("number of connections/files for parallel save")
     },
+    {.name = "parallel-compression",
+     .type = VSH_OT_STRING,
+     .help = N_("compression algorithm and format for parallel save")
+    },
     {.name = "xml",
      .type = VSH_OT_STRING,
      .completer = virshCompletePathLocalExisting,
@@ -4211,6 +4215,7 @@ doSave(void *opaque)
     g_autoptr(virshDomain) dom = NULL;
     const char *name = NULL;
     const char *to = NULL;
+    const char *pcomp = NULL;
     virTypedParameterPtr params = NULL;
     int nparams = 0;
     int maxparams = 0;
@@ -4246,6 +4251,13 @@ doSave(void *opaque)
                                   VIR_SAVE_PARAM_PARALLEL_CONNECTIONS, intOpt) < 0)
             goto out;
     }
+    if ((rv = vshCommandOptStringReq(ctl, cmd, "parallel-compression", &pcomp)) < 0) {
+        goto out;
+    } else if (pcomp) {
+        if (virTypedParamsAddString(&params, &nparams, &maxparams,
+                                    VIR_SAVE_PARAM_PARALLEL_COMPRESSION, pcomp) < 0)
+            goto out;
+    }
     if (vshCommandOptBool(cmd, "running"))
         flags |= VIR_DOMAIN_SAVE_RUNNING;
     if (vshCommandOptBool(cmd, "paused"))
-- 
2.35.3

On Sat, May 07, 2022 at 03:42:53PM +0200, Claudio Fontana wrote:
This is v8 of the multifd save prototype, which fixes a few bugs, adds a few more code splits, and records the number of channels as well as the compression algorithm, so the restore command is more user-friendly.
It is now possible to just say:
virsh save mydomain /mnt/saves/mysave --parallel
virsh restore /mnt/saves/mysave --parallel
and things work with the default of 2 channels, no compression.
It is also possible to say of course:
virsh save mydomain /mnt/saves/mysave --parallel --parallel-connections 16 --parallel-compression zstd
virsh restore /mnt/saves/mysave --parallel
and things also work fine, due to channels and compression being stored in the main save file.
For the sake of people following along, the above commands will result in creation of multiple files

  /mnt/saves/mysave
  /mnt/saves/mysave.0
  /mnt/saves/mysave.1
  ....
  /mnt/saves/mysave.n

Where 'n' is the number of threads used.

Overall I'm not very happy with the approach of doing any of this on the libvirt side.

Backing up, we know that QEMU can directly save to disk faster than libvirt can. We mitigated a lot of that overhead with previous patches to increase the pipe buffer size, but some still remains due to the extra copies inherent in handing this off to libvirt.

Using multifd on the libvirt side, IIUC, gets us better performance than QEMU can manage if doing non-multifd write to file directly, but we still have the extra copies in there due to the hand off to libvirt. If QEMU were to be directly capable of writing to disk with multifd, it should beat us again.

As a result of how we integrate with QEMU multifd, we're taking the approach of saving the state across multiple files, because it is easier than trying to get multiple threads writing to the same file. It could be solved by using file range locking on the save file. eg a thread can reserve say 500 MB of space, fill it up, and then reserve another 500 MB, etc, etc. It is a bit tedious though and won't align nicely. eg a 1 GB huge page would be 1 GB + a few bytes of QEMU RAM save state header.

The other downside of multiple files is that it complicates life for both libvirt and apps using libvirt. They need to be aware of multiple files and move them around together. This is not as simple as it might sound. For example, IIRC OpenStack would upload a save image state into a glance bucket for later use. Well, now it needs multiple distinct buckets and to keep track of them all. It also means we're forced to use the same concurrency level when restoring, which is not necessarily desirable if the host environment is different when restoring. ie The original host might have had 8 CPUs, but the new host might only have 4 available, or vice-versa.

I know it is appealing to do something on the libvirt side, because it is quicker than getting an enhancement into a new QEMU release. We have been down this route before with the migration support in libvirt in the past though, when we introduced the tunnelled live migration in order to work around QEMU's inability to do TLS encryption. I very much regret that we ever did this, because tunnelled migration was inherently limited, so for example failed to work with multifd, and failed to work with NBD based disk migration. In the end I did what I should have done at the beginning and just added TLS support to QEMU, making tunnelled migration obsolete, except we still have to carry the code around in libvirt indefinitely due to apps using it.

So I'm very concerned about not having history repeat itself and giving us a long term burden for a solution that turns out to be an evolutionary dead end.

I like the idea of parallel saving, but I really think we need to implement this directly in QEMU, not libvirt. As previously mentioned I think QEMU needs to get a 'file' migration protocol, along with the ability to directly map RAM segments into fixed positions in the file.
The benefits are many:

- It will save & restore faster because we're eliminating data copies that libvirt imposes via the iohelper

- It is simple for libvirt & mgmt apps as we still only have one file to manage

- It is space efficient because if a guest dirties a memory page, we just overwrite the existing contents at the fixed location in the file, instead of appending new contents to the file

- It will restore faster too because we only restore each memory page once, due to always overwriting the file in-place when the guest dirtied a page during save

- It can save and restore with differing number of threads, and can even dynamically change the number of threads in the middle of the save/restore operation

As David G has pointed out the impl is not trivial on the QEMU side, but from what I understand of the migration code, it is certainly viable. Most importantly I think it puts us in a better position for long term feature enhancements later by taking the middle man (libvirt) out of the equation, letting QEMU directly know what medium it is saving/restoring to/from.

With regards,
Daniel

-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
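(Aside for readers: a minimal sketch of the range-reservation idea mentioned above, using a shared atomic offset counter rather than the fcntl() range locks Daniel refers to, for brevity. All names here are hypothetical; this is an illustration, not code from the series.)

/* hypothetical sketch: each writer thread claims a fixed-size extent of
 * the single shared save file and fills it with pwrite(), so threads
 * never overlap even though they write concurrently */
#include <stdatomic.h>
#include <sys/types.h>
#include <unistd.h>

#define EXTENT_SIZE (500ULL * 1024 * 1024)     /* 500 MB per reservation */

static atomic_ullong next_free;                /* shared by all threads */

static off_t reserve_extent(void)              /* claim the next extent */
{
    return (off_t)atomic_fetch_add(&next_free, EXTENT_SIZE);
}

static ssize_t write_in_extent(int fd, off_t extent_base, off_t used,
                               const void *buf, size_t len)
{
    /* caller tracks 'used' and reserves a new extent when it fills up */
    return pwrite(fd, buf, len, extent_base + used);
}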

Hi Daniel,

thanks for looking at this,

On 5/10/22 8:38 PM, Daniel P. Berrangé wrote:
On Sat, May 07, 2022 at 03:42:53PM +0200, Claudio Fontana wrote:
This is v8 of the multifd save prototype, which fixes a few bugs, adds a few more code splits, and records the number of channels as well as the compression algorithm, so the restore command is more user-friendly.
It is now possible to just say:
virsh save mydomain /mnt/saves/mysave --parallel
virsh restore /mnt/saves/mysave --parallel
and things work with the default of 2 channels, no compression.
It is also possible to say of course:
virsh save mydomain /mnt/saves/mysave --parallel --parallel-connections 16 --parallel-compression zstd
virsh restore /mnt/saves/mysave --parallel
and things also work fine, due to channels and compression being stored in the main save file.
For the sake of people following along, the above commands will result in creation of multiple files
/mnt/saves/mysave /mnt/saves/mysave.0
just a minor correction, there is no .0
/mnt/saves/mysave.1 .... /mnt/saves/mysave.n
Where 'n' is the number of threads used.
Overall I'm not very happy with the approach of doing any of this on the libvirt side.
Ok I understand your concern.
Backing up, we know that QEMU can directly save to disk faster than libvirt can. We mitigated a lot of that overhead with previous patches to increase the pipe buffer size, but some still remains due to the extra copies inherent in handing this off to libvirt.
Right; still the performance we get is insufficient for the use case we are trying to address, even without libvirt in the picture. Instead, with parallel save + compression we can make the numbers add up. For parallel save using multifd, the overhead of libvirt is negligible.
Using multifd on the libvirt side, IIUC, gets us better performance than QEMU can manage if doing non-multifd write to file directly, but we still have the extra copies in there due to the hand off to libvirt. If QEMU were to be directly capable of writing to disk with multifd, it should beat us again.
Hmm I am thinking about this point, and at first glance I don't think this is 100% accurate; if we do parallel save like in this series with multifd, the overhead of libvirt is almost non-existent in my view compared with doing it with qemu only, skipping libvirt, it is limited to the one iohelper for the main channel (which is the smallest of the transfers), and maybe this could be removed as well. This is because even without libvirt in the picture, we are still migrating to a socket, and something needs to transfer data from that socket to a file. At that point I think both libvirt and a custom made script are in the same position.
As a result of how we integrate with QEMU multifd, we're taking the approach of saving the state across multiple files, because it is easier than trying to get multiple threads writing to the same file. It could be solved by using file range locking on the save file. eg a thread can reserve say 500 MB of space, fill it up, and then reserve another 500 MB, etc, etc. It is a bit tedious though and won't align nicely. eg a 1 GB huge page would be 1 GB + a few bytes of QEMU RAM save state header.

The other downside of multiple files is that it complicates life for both libvirt and apps using libvirt. They need to be aware of multiple files and move them around together. This is not as simple as it might sound. For example, IIRC OpenStack would upload a save image state into a glance bucket for later use. Well, now it needs multiple distinct buckets and to keep track of them all. It also means we're forced to use the same concurrency level when restoring, which is not necessarily desirable if the host environment is different when restoring. ie The original host might have had 8 CPUs, but the new host might only have 4 available, or vice-versa.

I know it is appealing to do something on the libvirt side, because it is quicker than getting an enhancement into a new QEMU release. We have been down this route before with the migration support in libvirt in the past though, when we introduced the tunnelled live migration in order to work around QEMU's inability to do TLS encryption. I very much regret that we ever did this, because tunnelled migration was inherently limited, so for example failed to work with multifd, and failed to work with NBD based disk migration. In the end I did what I should have done at the beginning and just added TLS support to QEMU, making tunnelled migration obsolete, except we still have to carry the code around in libvirt indefinitely due to apps using it.

So I'm very concerned about not having history repeat itself and giving us a long term burden for a solution that turns out to be an evolutionary dead end.

I like the idea of parallel saving, but I really think we need to implement this directly in QEMU, not libvirt. As previously mentioned I think QEMU needs to get a 'file' migration protocol, along with the ability to directly map RAM segments into fixed positions in the file. The benefits are many
- It will save & restore faster because we're eliminating data copies that libvirt imposes via the iohelper
- It is simple for libvirt & mgmt apps as we still only have one file to manage
- It is space efficient because if a guest dirties a memory page, we just overwrite the existing contents at the fixed location in the file, instead of appending new contents to the file
- It will restore faster too because we only restore each memory page once, due to always overwriting the file in-place when the guest dirtied a page during save
- It can save and restore with differing number of threads, and can even dynamically change the number of threads in the middle of the save/restore operation
As David G has pointed out the impl is not trivial on the QEMU side, but from what I understand of the migration code, it is certainly viable. Most importantly I think it puts us in a better position for long term feature enhancements later by taking the middle man (libvirt) out of the equation, letting QEMU directly know what medium it is saving/restoring to/from.
With regards, Daniel
It's probably possible to do this in QEMU, with extensive changes, in my view possibly to the migration stream itself, to have a more block-friendly, parallel migration stream to a file.

Thanks,
Claudio

On Wed, May 11, 2022 at 09:26:10AM +0200, Claudio Fontana wrote:
Hi Daniel,
thanks for looking at this,
On 5/10/22 8:38 PM, Daniel P. Berrangé wrote:
On Sat, May 07, 2022 at 03:42:53PM +0200, Claudio Fontana wrote:
This is v8 of the multifd save prototype, which fixes a few bugs, adds a few more code splits, and records the number of channels as well as the compression algorithm, so the restore command is more user-friendly.
It is now possible to just say:
virsh save mydomain /mnt/saves/mysave --parallel
virsh restore /mnt/saves/mysave --parallel
and things work with the default of 2 channels, no compression.
It is also possible to say of course:
virsh save mydomain /mnt/saves/mysave --parallel --parallel-connections 16 --parallel-compression zstd
virsh restore /mnt/saves/mysave --parallel
and things also work fine, due to channels and compression being stored in the main save file.
For the sake of people following along, the above commands will result in creation of multiple files
/mnt/saves/mysave /mnt/saves/mysave.0
just minor correction, there is no .0
Heh, off-by-1
/mnt/saves/mysave.1 .... /mnt/saves/mysave.n
Where 'n' is the number of threads used.
Overall I'm not very happy with the approach of doing any of this on the libvirt side.
Ok I understand your concern.
Backing up, we know that QEMU can directly save to disk faster than libvirt can. We mitigated a lot of that overhead with previous patches to increase the pipe buffer size, but some still remains due to the extra copies inherent in handing this off to libvirt.
Right; still the performance we get is insufficient for the use case we are trying to address, even without libvirt in the picture.
Instead, with parallel save + compression we can make the numbers add up. For parallel save using multifd, the overhead of libvirt is negligible.
Using multifd on the libvirt side, IIUC, gets us better performance than QEMU can manage if doing non-multifd write to file directly, but we still have the extra copies in there due to the hand off to libvirt. If QEMU were to be directly capable of writing to disk with multifd, it should beat us again.
Hmm I am thinking about this point, and at first glance I don't think this is 100% accurate;
if we do parallel save like in this series with multifd, the overhead of libvirt is almost non-existent in my view compared with doing it with qemu only, skipping libvirt, it is limited to the one iohelper for the main channel (which is the smallest of the transfers), and maybe this could be removed as well.
Libvirt adds overhead due to the multiple data copies in the save process. Using multifd doesn't get rid of this overhead, it merely distributes the overhead across many CPUs. The overall wallclock time is reduced but in aggregate the CPUs still have the same amount of total work to do copying data around.

I don't recall the scale of the libvirt overhead that remains after the pipe buffer optimizations, but whatever is left is still taking up host CPU time that can be used for other guests.

It also just occurred to me that currently our save/restore approach is bypassing all resource limits applied to the guest. eg block I/O rate limits, CPU affinity controls, etc, because most of the work is done in the iohelper. If we had this done in QEMU, then the save/restore process is confined by the existing CPU affinity / I/O limits applied to the guest. This means we would not negatively impact other co-hosted guests to the same extent.
This is because even without libvirt in the picture, we are still migrating to a socket, and something needs to transfer data from that socket to a file. At that point I think both libvirt and a custom made script are in the same position.
If QEMU had explicit support for a "file" backend, there would be no socket involved at all. QEMU would be copying guest RAM directly to a file with no intermediate steps. If QEMU mmap'd the save state file, then saving of the guest RAM could even possibly reduce to a mere 'memcpy()'.

With regards,
Daniel

-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
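(Aside for readers: a minimal sketch of the mmap() idea above, assuming the RAM region inside the save file starts at a page-aligned offset. The helper names are hypothetical; this is an illustration, not a proposed implementation.)

#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* hypothetical sketch: map the RAM region of the save file writable */
static void *map_save_file(int fd, size_t ram_size, off_t ram_start)
{
    void *p;

    if (ftruncate(fd, ram_start + ram_size) < 0)   /* make the region exist */
        return NULL;
    p = mmap(NULL, ram_size, PROT_WRITE, MAP_SHARED, fd, ram_start);
    return p == MAP_FAILED ? NULL : p;
}

/* saving a (possibly re-dirtied) guest page is then a plain memcpy() */
static void save_page(void *map, size_t page_off,
                      const void *guest_page, size_t page_size)
{
    memcpy((char *)map + page_off, guest_page, page_size);
}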

On 5/11/22 11:51 AM, Daniel P. Berrangé wrote:
On Wed, May 11, 2022 at 09:26:10AM +0200, Claudio Fontana wrote:
Hi Daniel,
thanks for looking at this,
On 5/10/22 8:38 PM, Daniel P. Berrangé wrote:
On Sat, May 07, 2022 at 03:42:53PM +0200, Claudio Fontana wrote:
This is v8 of the multifd save prototype, which fixes a few bugs, adds a few more code splits, and records the number of channels as well as the compression algorithm, so the restore command is more user-friendly.
It is now possible to just say:
virsh save mydomain /mnt/saves/mysave --parallel
virsh restore /mnt/saves/mysave --parallel
and things work with the default of 2 channels, no compression.
It is also possible to say of course:
virsh save mydomain /mnt/saves/mysave --parallel --parallel-connections 16 --parallel-compression zstd
virsh restore /mnt/saves/mysave --parallel
and things also work fine, due to channels and compression being stored in the main save file.
For the sake of people following along, the above commands will result in creation of multiple files
/mnt/saves/mysave /mnt/saves/mysave.0
just minor correction, there is no .0
Heh, off-by-1
/mnt/saves/mysave.1 .... /mnt/saves/mysave.n
Where 'n' is the number of threads used.
Overall I'm not very happy with the approach of doing any of this on the libvirt side.
Ok I understand your concern.
Backing up, we know that QEMU can directly save to disk faster than libvirt can. We mitigated a lot of that overhead with previous patches to increase the pipe buffer size, but some still remains due to the extra copies inherent in handing this off to libvirt.
Right; still the performance we get is insufficient for the use case we are trying to address, even without libvirt in the picture.
Instead, with parallel save + compression we can make the numbers add up. For parallel save using multifd, the overhead of libvirt is negligible.
Using multifd on the libvirt side, IIUC, gets us better performance than QEMU can manage if doing non-multifd write to file directly, but we still have the extra copies in there due to the hand off to libvirt. If QEMU were to be directly capable of writing to disk with multifd, it should beat us again.
Hmm I am thinking about this point, and at first glance I don't think this is 100% accurate;
if we do parallel save like in this series with multifd, the overhead of libvirt is almost non-existent in my view compared with doing it with qemu only, skipping libvirt, it is limited to the one iohelper for the main channel (which is the smallest of the transfers), and maybe this could be removed as well.
Libvirt adds overhead due to the multiple data copies in the save process. Using multifd doesn't get rid of this overhead, it merely distributes the overhead across many CPUs. The overall wallclock time is reduced but in aggregate the CPUs still have the same amount of total work to do copying data around.

I don't recall the scale of the libvirt overhead that remains after the pipe buffer optimizations, but whatever is left is still taking up host CPU time that can be used for other guests.

It also just occurred to me that currently our save/restore approach is bypassing all resource limits applied to the guest. eg block I/O rate limits, CPU affinity controls, etc, because most of the work is done in the iohelper. If we had this done in QEMU, then the save/restore process is confined by the existing CPU affinity / I/O limits applied to the guest. This means we would not negatively impact other co-hosted guests to the same extent.
This is because even without libvirt in the picture, we are still migrating to a socket, and something needs to transfer data from that socket to a file. At that point I think both libvirt and a custom made script are in the same position.
If QEMU had explicit support for a "file" backend, there would be no socket involved at all. QEMU would be copying guest RAM directly to a file with no intermediate steps. If QEMU mmap'd the save state file, then saving of the guest RAM could even possibly reduce to a mere 'memcpy()'
Agree, but still, to align with your requirement to have only one file, libvirt would need to add some padding after the libvirt header and before the QEMU VM starts in the file, so that the QEMU VM starts at a block-friendly address.

Thanks,
Claudio

On Wed, May 11, 2022 at 01:52:05PM +0200, Claudio Fontana wrote:
On 5/11/22 11:51 AM, Daniel P. Berrangé wrote:
On Wed, May 11, 2022 at 09:26:10AM +0200, Claudio Fontana wrote:
Hi Daniel,
thanks for looking at this,
On 5/10/22 8:38 PM, Daniel P. Berrangé wrote:
On Sat, May 07, 2022 at 03:42:53PM +0200, Claudio Fontana wrote:
This is v8 of the multifd save prototype, which fixes a few bugs, adds a few more code splits, and records the number of channels as well as the compression algorithm, so the restore command is more user-friendly.
It is now possible to just say:
virsh save mydomain /mnt/saves/mysave --parallel
virsh restore /mnt/saves/mysave --parallel
and things work with the default of 2 channels, no compression.
It is also possible to say of course:
virsh save mydomain /mnt/saves/mysave --parallel --parallel-connections 16 --parallel-compression zstd
virsh restore /mnt/saves/mysave --parallel
and things also work fine, due to channels and compression being stored in the main save file.
For the sake of people following along, the above commands will result in creation of multiple files
/mnt/saves/mysave /mnt/saves/mysave.0
just minor correction, there is no .0
Heh, off-by-1
/mnt/saves/mysave.1 .... /mnt/saves/mysave.n
Where 'n' is the number of threads used.
Overall I'm not very happy with the approach of doing any of this on the libvirt side.
Ok I understand your concern.
Backing up, we know that QEMU can directly save to disk faster than libvirt can. We mitigated a lot of that overhead with previous patches to increase the pipe buffer size, but some still remains due to the extra copies inherent in handing this off to libvirt.
Right; still the performance we get is insufficient for the use case we are trying to address, even without libvirt in the picture.
Instead, with parallel save + compression we can make the numbers add up. For parallel save using multifd, the overhead of libvirt is negligible.
Using multifd on the libvirt side, IIUC, gets us better performance than QEMU can manage if doing non-multifd write to file directly, but we still have the extra copies in there due to the hand off to libvirt. If QEMU were to be directly capable of writing to disk with multifd, it should beat us again.
Hmm I am thinking about this point, and at first glance I don't think this is 100% accurate;
if we do parallel save like in this series with multifd, the overhead of libvirt is almost non-existent in my view compared with doing it with qemu only, skipping libvirt, it is limited to the one iohelper for the main channel (which is the smallest of the transfers), and maybe this could be removed as well.
Libvirt adds overhead due to the multiple data copies in the save process. Using multifd doesn't get rid of this overhead, it merely distributes the overhead across many CPUs. The overall wallclock time is reduced but in aggregate the CPUs still have the same amount of total work to do copying data around.

I don't recall the scale of the libvirt overhead that remains after the pipe buffer optimizations, but whatever is left is still taking up host CPU time that can be used for other guests.

It also just occurred to me that currently our save/restore approach is bypassing all resource limits applied to the guest. eg block I/O rate limits, CPU affinity controls, etc, because most of the work is done in the iohelper. If we had this done in QEMU, then the save/restore process is confined by the existing CPU affinity / I/O limits applied to the guest. This means we would not negatively impact other co-hosted guests to the same extent.
This is because even without libvirt in the picture, we are still migrating to a socket, and something needs to transfer data from that socket to a file. At that point I think both libvirt and a custom made script are in the same position.
If QEMU had explicit support for a "file" backend, there would be no socket involved at all. QEMU would be copying guest RAM directly to a file with no intermediate steps. If QEMU mmap'd the save state file, then saving of the guest RAM could even possibly reduce to a mere 'memcpy()'
Agree, but still, to align with your requirement to have only one file, libvirt would need to add some padding after the libvirt header and before the QEMU VM starts in the file, so that the QEMU VM starts at a block-friendly address.
That's trivial, as we already add padding in this place.

With regards,
Daniel

-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
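(Aside for readers: the padding in question reduces to rounding the end of the libvirt header up to a block-friendly boundary. A sketch, with 64 KiB as an arbitrary alignment choice; the name is hypothetical.)

#include <stdint.h>

#define SAVE_ALIGN 65536ULL

/* hypothetical sketch: file offset where the QEMU stream would start */
static inline uint64_t qemuStreamStart(uint64_t headerEnd)
{
    return (headerEnd + SAVE_ALIGN - 1) & ~(SAVE_ALIGN - 1);
}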

On 5/11/22 2:02 PM, Daniel P. Berrangé wrote:
On Wed, May 11, 2022 at 01:52:05PM +0200, Claudio Fontana wrote:
On 5/11/22 11:51 AM, Daniel P. Berrangé wrote:
On Wed, May 11, 2022 at 09:26:10AM +0200, Claudio Fontana wrote:
Hi Daniel,
thanks for looking at this,
On 5/10/22 8:38 PM, Daniel P. Berrangé wrote:
On Sat, May 07, 2022 at 03:42:53PM +0200, Claudio Fontana wrote:
This is v8 of the multifd save prototype, which fixes a few bugs, adds a few more code splits, and records the number of channels as well as the compression algorithm, so the restore command is more user-friendly.
It is now possible to just say:
virsh save mydomain /mnt/saves/mysave --parallel
virsh restore /mnt/saves/mysave --parallel
and things work with the default of 2 channels, no compression.
It is also possible to say of course:
virsh save mydomain /mnt/saves/mysave --parallel --parallel-connections 16 --parallel-compression zstd
virsh restore /mnt/saves/mysave --parallel
and things also work fine, due to channels and compression being stored in the main save file.
For the sake of people following along, the above commands will result in creation of multiple files
/mnt/saves/mysave /mnt/saves/mysave.0
just minor correction, there is no .0
Heh, off-by-1
/mnt/saves/mysave.1 .... /mnt/saves/mysave.n
Where 'n' is the number of threads used.
Overall I'm not very happy with the approach of doing any of this on the libvirt side.
Ok I understand your concern.
Backing up, we know that QEMU can directly save to disk faster than libvirt can. We mitigated a lot of that overhead with previous patches to increase the pipe buffer size, but some still remains due to the extra copies inherent in handing this off to libvirt.
Right; still the performance we get is insufficient for the use case we are trying to address, even without libvirt in the picture.
Instead, with parallel save + compression we can make the numbers add up. For parallel save using multifd, the overhead of libvirt is negligible.
Using multifd on the libvirt side, IIUC, gets us better performance than QEMU can manage if doing non-multifd write to file directly, but we still have the extra copies in there due to the hand off to libvirt. If QEMU were to be directly capable of writing to disk with multifd, it should beat us again.
Hmm I am thinking about this point, and at first glance I don't think this is 100% accurate;
if we do parallel save like in this series with multifd, the overhead of libvirt is almost non-existent in my view compared with doing it with qemu only, skipping libvirt, it is limited to the one iohelper for the main channel (which is the smallest of the transfers), and maybe this could be removed as well.
Libvirt adds overhead due to the multiple data copies in the save process. Using multifd doesn't get rid of this overhead, it merely distributes the overhead across many CPUs. The overall wallclock time is reduced but in aggregate the CPUs still have the same amount of total work to do copying data around.

I don't recall the scale of the libvirt overhead that remains after the pipe buffer optimizations, but whatever is left is still taking up host CPU time that can be used for other guests.

It also just occurred to me that currently our save/restore approach is bypassing all resource limits applied to the guest. eg block I/O rate limits, CPU affinity controls, etc, because most of the work is done in the iohelper. If we had this done in QEMU, then the save/restore process is confined by the existing CPU affinity / I/O limits applied to the guest. This means we would not negatively impact other co-hosted guests to the same extent.
This is because even without libvirt in the picture, we are still migrating to a socket, and something needs to transfer data from that socket to a file. At that point I think both libvirt and a custom made script are in the same position.
If QEMU had explicit support for a "file" backend, there would be no socket involved at all. QEMU would be copying guest RAM directly to a file with no intermediate steps. If QEMU mmap'd the save state file, then saving of the guest RAM could even possibly reduce to a mere 'memcpy()'
Agree, but still, to align with your requirement to have only one file, libvirt would need to add some padding after the libvirt header and before the QEMU VM starts in the file, so that the QEMU VM starts at a block-friendly address.
That's trivial, as we already add padding in this place.
That's great, I love when things are simple.

If indeed we want to remove the copy in libvirt (which will also mean explicitly fsyncing elsewhere, as the iohelper would not be there anymore to do that for us on image creation), with QEMU having a "file" protocol support for migration, do we plan to have libvirt and QEMU both open the file for writing concurrently, with QEMU opening O_DIRECT?

The alternative being having libvirt open the file with O_DIRECT, write some libvirt stuff in a new, O_DIRECT-friendly format, and then pass the fd to qemu to migrate to, and QEMU sending its new O_DIRECT-friendly stream there.

In any case, the expectation here is to have a new "file://pathname" or "file://fdname" as an added feature in QEMU, where QEMU would write a new O_DIRECT-friendly stream directly into the file, taking care of both optional parallelization and compression. Is that the gist of it?

Seems a lot of work, just trying to roughly figure out the boundaries of this.

Thanks,
Claudio
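(Aside for readers: a minimal sketch of the second alternative described above, where libvirt opens the file O_DIRECT, writes one block-aligned header, and hands the fd over to QEMU. The function name, the 4 KiB block size and the open flags are assumptions for illustration, not settled design.)

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK 4096

/* hypothetical sketch: returns an fd suitable for handing to QEMU */
static int open_save_file(const char *path, const void *hdr, size_t hdrlen)
{
    void *buf = NULL;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);

    if (fd < 0 || hdrlen > BLOCK)
        goto error;
    if (posix_memalign(&buf, BLOCK, BLOCK) != 0)  /* O_DIRECT needs alignment */
        goto error;
    memset(buf, 0, BLOCK);
    memcpy(buf, hdr, hdrlen);
    if (write(fd, buf, BLOCK) != BLOCK)           /* header padded to 1 block */
        goto error;
    free(buf);
    return fd;                                    /* pass this fd to QEMU */
 error:
    free(buf);
    if (fd >= 0)
        close(fd);
    return -1;
}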

On Wed, May 11, 2022 at 07:31:45PM +0200, Claudio Fontana wrote:
That's great, I love when things are simple.
If indeed we want to remove the copy in libvirt (which will also mean explicitly fsyncing elsewhere, as the iohelper would not be there anymore to do that for us on image creation), with QEMU having a "file" protocol support for migration,
do we plan to have libvirt and QEMU both open the file for writing concurrently, with QEMU opening O_DIRECT?
For non-libvirt users, I expect QEMU would open the file directly. For libvirt usage, it is likely preferable to pass the pre-opened FD, because that simplifies file permission handling.

The alternative being having libvirt open the file with O_DIRECT, write some libvirt stuff in a new, O_DIRECT-friendly format, and then pass the fd to qemu to migrate to, and QEMU sending its new O_DIRECT-friendly stream there.
Yep.
In any case, the expectation here is to have a new "file://pathname" or "file://fdname" as an added feature in QEMU, where QEMU would write a new O_DIRECT-friendly stream directly into the file, taking care of both optional parallelization and compression.
I could see several distinct building blocks:

* First a "file:/some/path" migration protocol that can just do "normal" I/O, but still writing in the traditional migration data stream

* Modify existing 'fd:' protocol so that it fstat()s and passes over to the 'file' protocol handler if it sees the FD is not a socket/pipe

* Add a migration capability "direct-mapped" to indicate we want the RAM data written/read directly to/from fixed positions in the file, as opposed to a stream. Obviously only valid with a sub-set of migration protocols (file, and fd: if a seekable FD).

* Add a migration capability "bypass-cache" to indicate we want O_DIRECT to bypass host I/O cache. Again limited to some migration protocols

With regards,
Daniel

-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
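(Aside for readers: the check in the second building block amounts to something like this sketch. Illustrative only, not QEMU code; the function name is made up.)

#include <stdbool.h>
#include <sys/stat.h>

/* hypothetical sketch: fstat() an inherited fd to decide whether it keeps
 * the traditional stream path or falls through to the 'file' handler */
static bool fd_is_stream(int fd)
{
    struct stat st;

    if (fstat(fd, &st) < 0)
        return false;
    return S_ISSOCK(st.st_mode) || S_ISFIFO(st.st_mode);
}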

On 5/11/22 7:46 PM, Daniel P. Berrangé wrote:
On Wed, May 11, 2022 at 07:31:45PM +0200, Claudio Fontana wrote:
That's great, I love when things are simple.
If indeed we want to remove the copy in libvirt (which will also mean explicitly fsyncing elsewhere, as the iohelper would not be there anymore to do that for us on image creation), with QEMU having a "file" protocol support for migration,
do we plan to have libvirt and QEMU both open the file for writing concurrently, with QEMU opening O_DIRECT?
For non-libvirt users, I expect QEMU would open the file directly. For libvirt usage, it is likely preferable to pass the pre-opened FD, because that simplifies file permission handling.

The alternative being having libvirt open the file with O_DIRECT, write some libvirt stuff in a new, O_DIRECT-friendly format, and then pass the fd to qemu to migrate to, and QEMU sending its new O_DIRECT-friendly stream there.
Yep.
Currently I am working on this part, and I can use my prototype to test that the Direct part of the I/O works.

I can split things more to be able to provide more generally useful parts to the series, that we can use to test out things while the qemu stuff is hopefully also tackled.

Thanks,
Claudio
In any case, the expectation here is to have a new "file://pathname" or "file://fdname" as an added feature in QEMU, where QEMU would write a new O_DIRECT-friendly stream directly into the file, taking care of both optional parallelization and compression.
I could see several distinct building blocks
* First a "file:/some/path" migration protocol that can just do "normal" I/O, but still writing in the traditional migration data stream
* Modify existing 'fd:' protocol so that it fstat()s and passes over to the 'file' protocol handler if it sees the FD is not a socket/pipe
* Add a migration capability "direct-mapped" to indicate we want the RAM data written/read directly to/from fixed positions in the file, as opposed to a stream. Obviously only valid with a sub-set of migration protocols (file, and fd: if a seekable FD).
* Add a migration capability "bypass-cache" to indicate we want O_DIRECT to bypass host I/O cache. Again limited to some migration protocols
With regards, Daniel

On 5/12/22 3:38 PM, Claudio Fontana wrote:
On 5/11/22 7:46 PM, Daniel P. Berrangé wrote:
On Wed, May 11, 2022 at 07:31:45PM +0200, Claudio Fontana wrote:
That's great, I love when things are simple.
If indeed we want to remove the copy in libvirt (which will also mean explicitly fsyncing elsewhere, as the iohelper would not be there anymore to do that for us on image creation), with QEMU having a "file" protocol support for migration,
do we plan to have libvirt and QEMU both open the file for writing concurrently, with QEMU opening O_DIRECT?
For non-libvirt users, I expect QEMU would open the file directly. For libvirt usage, it is likely preferable to pass the pre-opened FD, because that simplifies file permission handling.

The alternative being having libvirt open the file with O_DIRECT, write some libvirt stuff in a new, O_DIRECT-friendly format, and then pass the fd to qemu to migrate to, and QEMU sending its new O_DIRECT-friendly stream there.
Yep.
Currently I am working on this part, and I can use my prototype to test that the Direct part of the I/O works.
Just as a follow-up and heads-up: I am reworking the Image Write and Read code in qemu_saveimage.c, and while doing that I see quite a few things that need improvement, especially missing validations on lengths etc.

Ciao,
Claudio
I can split things more to be able to provide more generally useful parts to the series, that we can use to test out things while the qemu stuff is hopefully also tackled.
Thanks,
Claudio
In any case, the expectation here is to have a new "file://pathname" or "file://fdname" as an added feature in QEMU, where QEMU would write a new O_DIRECT-friendly stream directly into the file, taking care of both optional parallelization and compression.
I could see several distinct building blocks
* First a "file:/some/path" migration protocol that can just do "normal" I/O, but still writing in the traditional migration data stream
* Modify existing 'fd:' protocol so that it fstat()s and passes over to the 'file' protocol handler if it sees the FD is not a socket/pipe
* Add a migration capability "direct-mapped" to indicate we want the RAM data written/read directly to/from fixed positions in the file, as opposed to a stream. Obviously only valid with a sub-set of migration protocols (file, and fd: if a seekable FD).
* Add a migration capability "bypass-cache" to indicate we want O_DIRECT to bypass host I/O cache. Again limited to some migration protocols
With regards, Daniel

* Daniel P. Berrangé (berrange@redhat.com) wrote:
On Wed, May 11, 2022 at 07:31:45PM +0200, Claudio Fontana wrote:
That's great, I love when things are simple.
If indeed we want to remove the copy in libvirt (which will also mean explicitly fsyncing elsewhere, as the iohelper would not be there anymore to do that for us on image creation), with QEMU having a "file" protocol support for migration,
do we plan to have libvirt and QEMU both open the file for writing concurrently, with QEMU opening O_DIRECT?
For non-libvirt users, I expect QEMU would open the file directly. For libvirt usage, it is likely preferable to pass the pre-opened FD, because that simplifies file permission handling.

The alternative being having libvirt open the file with O_DIRECT, write some libvirt stuff in a new, O_DIRECT-friendly format, and then pass the fd to qemu to migrate to, and QEMU sending its new O_DIRECT-friendly stream there.
Yep.
In any case, the expectation here is to have a new "file://pathname" or "file://fdname" as an added feature in QEMU, where QEMU would write a new O_DIRECT-friendly stream directly into the file, taking care of both optional parallelization and compression.
I could see several distinct building blocks
* First a "file:/some/path" migration protocol that can just do "normal" I/O, but still writing in the traditional migration data stream
* Modify existing 'fd:' protocol so that it fstat()s and passes over to the 'file' protocol handler if it sees the FD is not a socket/pipe
We used to have that at one point.
* Add a migration capability "direct-mapped" to indicate we want the RAM data written/read directly to/from fixed positions in the file, as opposed to a stream. Obviously only valid with a sub-set of migration protocols (file, and fd: if a seekable FD).
This worries me about how you're going to cleanly glue this into the migration code; it sounds like what you want it to do is very different to what it currently does.

Dave
* Add a migration capability "bypass-cache" to indicate we want O_DIRECT to bypass host I/O cache. Again limited to some migration protocols
With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
-- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

On Thu, May 12, 2022 at 05:58:46PM +0100, Dr. David Alan Gilbert wrote:
* Daniel P. Berrangé (berrange@redhat.com) wrote:
On Wed, May 11, 2022 at 07:31:45PM +0200, Claudio Fontana wrote:
That's great, I love when things are simple.
If indeed we want to remove the copy in libvirt (which will also mean explicitly fsyncing elsewhere, as the iohelper would not be there anymore to do that for us on image creation), with QEMU having a "file" protocol support for migration,
do we plan to have libvirt and QEMU both open the file for writing concurrently, with QEMU opening O_DIRECT?
For non-libvirt users, I expect QEMU would open the file directly. For libvirt usage, it is likely preferable to pass the pre-opened FD, because that simplifies file permission handling.

The alternative being having libvirt open the file with O_DIRECT, write some libvirt stuff in a new, O_DIRECT-friendly format, and then pass the fd to qemu to migrate to, and QEMU sending its new O_DIRECT-friendly stream there.
Yep.
In any case, the expectation here is to have a new "file://pathname" or "file://fdname" as an added feature in QEMU, where QEMU would write a new O_DIRECT-friendly stream directly into the file, taking care of both optional parallelization and compression.
I could see several distinct building blocks
* First a "file:/some/path" migration protocol that can just do "normal" I/O, but still writing in the traditional migration data stream
* Modify existing 'fd:' protocol so that it fstat()s and passes over to the 'file' protocol handler if it sees the FD is not a socket/pipe
We used to have that at one point.
* Add a migration capability "direct-mapped" to indicate we want the RAM data written/read directly to/from fixed positions in the file, as opposed to a stream. Obviously only valid with a sub-set of migration protocols (file, and fd: if a seekable FD).
This worries me about how you're going to cleanly glue this into the migration code; it sounds like what you want it to do is very different to what it currently does.
I've only investigated it lightly, but I see the key bit of code is this method which emits the header + ram page content:

static int save_normal_page(RAMState *rs, RAMBlock *block, ram_addr_t offset,
                            uint8_t *buf, bool async)
{
    ram_transferred_add(save_page_header(rs, rs->f, block,
                                         offset | RAM_SAVE_FLAG_PAGE));
    if (async) {
        qemu_put_buffer_async(rs->f, buf, TARGET_PAGE_SIZE,
                              migrate_release_ram() &&
                              migration_in_postcopy());
    } else {
        qemu_put_buffer(rs->f, buf, TARGET_PAGE_SIZE);
    }
    ram_transferred_add(TARGET_PAGE_SIZE);
    ram_counters.normal++;
    return 1;
}

my (perhaps wishful) thinking was that we just have an alternative impl of this which doesn't save the page header, and puts the page content at a fixed offset.

I'm fuzzy on how we figure out the right offset - I was hoping that "RAMState" or "RAMBlock" somehow gives us enough info to figure out a deterministic mapping to a file location.

With regards,
Daniel

-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
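(Aside for readers: a sketch of what such an alternative might look like, assuming a seekable fd and using block->offset + offset as the page's stable position, as suggested further down the thread. Illustration only, not a proposed patch.)

/* hypothetical sketch: no page header; the page is pwrite()n at a
 * deterministic file offset derived from its ram_addr_t position, so a
 * re-dirtied page overwrites its earlier copy in place */
static int save_normal_page_fixed(int fd, RAMBlock *block, ram_addr_t offset,
                                  uint8_t *buf, off_t ram_area_start)
{
    off_t file_off = ram_area_start + block->offset + offset;

    if (pwrite(fd, buf, TARGET_PAGE_SIZE, file_off) != TARGET_PAGE_SIZE)
        return -1;
    ram_counters.normal++;
    return 1;
}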

* Daniel P. Berrangé (berrange@redhat.com) wrote:
On Thu, May 12, 2022 at 05:58:46PM +0100, Dr. David Alan Gilbert wrote:
* Daniel P. Berrangé (berrange@redhat.com) wrote:
On Wed, May 11, 2022 at 07:31:45PM +0200, Claudio Fontana wrote:
That's great, I love when things are simple.
If indeed we want to remove the copy in libvirt (which will also mean explicitly fsyncing elsewhere, as the iohelper would not be there anymore to do that for us on image creation), with QEMU having a "file" protocol support for migration,
do we plan to have libvirt and QEMU both open the file for writing concurrently, with QEMU opening O_DIRECT?
For non-libvirt users, I expect QEMU would open the file directly. For libvirt usage, it is likely preferable to pass the pre-opened FD, because that simplifies file permission handling.

The alternative being having libvirt open the file with O_DIRECT, write some libvirt stuff in a new, O_DIRECT-friendly format, and then pass the fd to qemu to migrate to, and QEMU sending its new O_DIRECT-friendly stream there.
Yep.
In any case, the expectation here is to have a new "file://pathname" or "file://fdname" as an added feature in QEMU, where QEMU would write a new O_DIRECT-friendly stream directly into the file, taking care of both optional parallelization and compression.
I could see several distinct building blocks
* First a "file:/some/path" migration protocol that can just do "normal" I/O, but still writing in the traditional migration data stream
* Modify existing 'fd:' protocol so that it fstat()s and passes over to the 'file' protocol handler if it sees the FD is not a socket/pipe
We used to have that at one point.
* Add a migration capability "direct-mapped" to indicate we want the RAM data written/read directly to/from fixed positions in the file, as opposed to a stream. Obviously only valid with a sub-set of migration protocols (file, and fd: if a seekable FD).
This worries me about how you're going to cleanly glue this into the migration code; it sounds like what you want it to do is very different to what it currently does.
I've only investigated it lightly, but I see the key bit of code is this method which emits the header + ram page content:
static int save_normal_page(RAMState *rs, RAMBlock *block, ram_addr_t offset,
                            uint8_t *buf, bool async)
{
    ram_transferred_add(save_page_header(rs, rs->f, block,
                                         offset | RAM_SAVE_FLAG_PAGE));
    if (async) {
        qemu_put_buffer_async(rs->f, buf, TARGET_PAGE_SIZE,
                              migrate_release_ram() &&
                              migration_in_postcopy());
    } else {
        qemu_put_buffer(rs->f, buf, TARGET_PAGE_SIZE);
    }
    ram_transferred_add(TARGET_PAGE_SIZE);
    ram_counters.normal++;
    return 1;
}
my (perhaps wishful) thinking was that we just have an alternative impl of this which doesn't save the page header, and puts the page content at a fixed offset.
Hmm OK, probably can; note I think the multifd is separate code (and currently much cleaner - which you'd make more complex again).
I'm fuzzy on how we figure out the right offset - I was hoping that "RAMState" or "RAMBlock" somehow gives us enough info to figure out a deterministic mapping to a file location.
I think that's probably the ram_addr_t type, RAMBlock->offset + the index into the ramblock; that gets you the same thing as the dirty bitmap (hmm although we don't have a single one of those any more).

Dave
With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
-- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

On 5/11/22 2:02 PM, Daniel P. Berrangé wrote:
On Wed, May 11, 2022 at 01:52:05PM +0200, Claudio Fontana wrote:
On 5/11/22 11:51 AM, Daniel P. Berrangé wrote:
On Wed, May 11, 2022 at 09:26:10AM +0200, Claudio Fontana wrote:
Hi Daniel,
thanks for looking at this,
On 5/10/22 8:38 PM, Daniel P. Berrangé wrote:
On Sat, May 07, 2022 at 03:42:53PM +0200, Claudio Fontana wrote:
This is v8 of the multifd save prototype, which fixes a few bugs, adds a few more code splits, and records the number of channels as well as the compression algorithm, so the restore command is more user-friendly.
It is now possible to just say:
virsh save mydomain /mnt/saves/mysave --parallel
virsh restore /mnt/saves/mysave --parallel
and things work with the default of 2 channels, no compression.
It is also possible to say of course:
virsh save mydomain /mnt/saves/mysave --parallel --parallel-connections 16 --parallel-compression zstd
virsh restore /mnt/saves/mysave --parallel
and things also work fine, due to channels and compression being stored in the main save file.
For the sake of people following along, the above commands will result in creation of multiple files
/mnt/saves/mysave /mnt/saves/mysave.0
just minor correction, there is no .0
Heh, off-by-1
/mnt/saves/mysave.1 .... /mnt/saves/mysave.n
Where 'n' is the number of threads used.
Overall I'm not very happy with the approach of doing any of this on the libvirt side.
Ok I understand your concern.
Backing up, we know that QEMU can directly save to disk faster than libvirt can. We mitigated a lot of that overhead with previous patches to increase the pipe buffer size, but some still remains due to the extra copies inherent in handing this off to libvirt.
Right; still the performance we get is insufficient for the use case we are trying to address, even without libvirt in the picture.
Instead, with parallel save + compression we can make the numbers add up. For parallel save using multifd, the overhead of libvirt is negligible.
Using multifd on the libvirt side, IIUC, gets us better performance than QEMU can manage if doing non-multifd write to file directly, but we still have the extra copies in there due to the hand off to libvirt. If QEMU were to be directly capable of writing to disk with multifd, it should beat us again.
Hmm I am thinking about this point, and at first glance I don't think this is 100% accurate;
if we do parallel save like in this series with multifd, the overhead of libvirt is almost non-existent in my view compared with doing it with qemu only, skipping libvirt, it is limited to the one iohelper for the main channel (which is the smallest of the transfers), and maybe this could be removed as well.
Libvirt adds overhead due to the multiple data copies in the save process. Using multifd doesn't get rid of this overhead, it merely distributes the overhead across many CPUs. The overall wallclock time is reduced but in aggregate the CPUs still have the same amount of total work to do copying data around.

I don't recall the scale of the libvirt overhead that remains after the pipe buffer optimizations, but whatever is left is still taking up host CPU time that can be used for other guests.

It also just occurred to me that currently our save/restore approach is bypassing all resource limits applied to the guest. eg block I/O rate limits, CPU affinity controls, etc, because most of the work is done in the iohelper. If we had this done in QEMU, then the save/restore process is confined by the existing CPU affinity / I/O limits applied to the guest. This means we would not negatively impact other co-hosted guests to the same extent.
This is because even without libvirt in the picture, we are still migrating to a socket, and something needs to transfer data from that socket to a file. At that point I think both libvirt and a custom made script are in the same position.
If QEMU had explicit support for a "file" backend, there would be no socket involved at all. QEMU would be copying guest RAM directly to a file with no intermediate steps. If QEMU mmap'd the save state file, then saving of the guest RAM could even possibly reduce to a mere 'memcpy()'
Agree, but still, to align with your requirement to have only one file, libvirt would need to add some padding after the libvirt header and before the QEMU VM starts in the file, so that the QEMU VM starts at a block-friendly address.
That's trivial, as we already add padding in this place.
I posted a new series that now tries this in the initial commits, by aligning things in a more block-friendly and compatible way. It then validates the code with direct I/O multifd transfers without any use of the wrapper.

On the "single file" issue, what about the trivial solution of using separate channels in the file, of length provided in the header, and then eventually using UNIX holes to work around the problem of channels of different size?

Thanks,
Claudio
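(Aside for readers: a sketch of that single-file channel layout, assuming fixed-size per-channel regions and Linux fallocate() hole punching for the unused tails. The names and region sizing are hypothetical; illustration only.)

#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>

/* hypothetical sketch: channel i owns a fixed-size region of the file */
static off_t channel_base(off_t data_start, off_t region_len, int channel)
{
    return data_start + (off_t)channel * region_len;
}

/* punch the unused remainder of a channel's region into a hole, so that
 * differing channel sizes cost no disk space in the single file */
static int punch_unused_tail(int fd, off_t base, off_t used, off_t region_len)
{
    if (used >= region_len)
        return 0;
    return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                     base + used, region_len - used);
}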

On 10 May 2022, at 20:38, Daniel P. Berrangé <berrange@redhat.com> wrote:
On Sat, May 07, 2022 at 03:42:53PM +0200, Claudio Fontana wrote:
This is v8 of the multifd save prototype, which fixes a few bugs, adds a few more code splits, and records the number of channels as well as the compression algorithm, so the restore command is more user-friendly.
It is now possible to just say:
virsh save mydomain /mnt/saves/mysave --parallel
virsh restore /mnt/saves/mysave --parallel
and things work with the default of 2 channels, no compression.
It is also possible to say of course:
virsh save mydomain /mnt/saves/mysave --parallel --parallel-connections 16 --parallel-compression zstd
virsh restore /mnt/saves/mysave --parallel
and things also work fine, due to channels and compression being stored in the main save file.
For the sake of people following along, the above commands will result in creation of multiple files
/mnt/saves/mysave /mnt/saves/mysave.0 /mnt/saves/mysave.1 .... /mnt/saves/mysave.n
Where 'n' is the number of threads used.
Overall I'm not very happy with the approach of doing any of this on the libvirt side.
Backing up, we know that QEMU can directly save to disk faster than libvirt can. We mitigated alot of that overhead with previous patches to increase the pipe buffer size, but some still remains due to the extra copies inherant in handing this off to libvirt.
Using multifd on the libvirt side, IIUC, gets us better performance than QEMU can manage if doing non-multifd write to file directly, but we still have the extra copies in there due to the hand off to libvirt. If QEMU were to be directly capable to writing to disk with multifd, it should beat us again.
As a result of how we integrate with QEMU multifd, we're taking the approach of saving the state across multiple files, because it is easier than trying to get multiple threads writing to the same file. It could be solved by using file range locking on the save file, e.g. a thread can reserve say 500 MB of space, fill it up, and then reserve another 500 MB, etc. It is a bit tedious though and won't align nicely: e.g. a 1 GB huge page would become 1 GB + a few bytes of QEMU RAM save state header.
First, I do not understand why you would write things that are not page-aligned to start with? (As an aside, I don’t know how any dirty tracking would work if you do not keep things page-aligned.)
Could uffd_register_memory accept a memory range that is not aligned? If so, when? Should that be specified in the interface?
Second, instead of creating multiple files, why not write blocks at a location determined by a variable that you increment using atomic operations each time you need a new block? If you want to keep the blocks page-aligned in the file as well (which might help if you want to mmap the file at some point), then you need to build a map of the blocks that you tack on at the end of the file.
There may be good reasons not to do it that way, of course, but I am not familiar enough with the problem to know them.
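A minimal sketch of that atomic-counter scheme (illustrative only; the block size, names, and the trailing block map are assumptions, not anything implemented here):

#include <stdatomic.h>
#include <sys/types.h>

#define BLOCK_SIZE (4 * 1024 * 1024)

/* Shared between save threads: index of the next unreserved block. */
static _Atomic unsigned long next_block;

/* Reserve a block and return its file offset; each thread then
 * pwrite()s into its own block, and a map recording which stream
 * owns which block would be appended at the end of the file. */
static off_t reserve_block(void)
{
    unsigned long b = atomic_fetch_add(&next_block, 1);
    return (off_t)b * BLOCK_SIZE; /* page-aligned if BLOCK_SIZE is */
}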
The other downside of multiple files is that it complicates life for both libvirt and apps using libvirt. They need to be aware of multiple files and move them around together. This is not as simple as it might sound. For example, IIRC OpenStack would upload a save image into a glance bucket for later use. Well, now it needs multiple distinct buckets and to keep track of them all. It also means we're forced to use the same concurrency level when restoring, which is not necessarily desirable if the host environment is different when restoring, i.e. the original host might have had 8 CPUs, but the new host might only have 4 available, or vice versa.
I know it is appealing to do something on the libvirt side, because it is quicker than getting an enhancement into a new QEMU release. We have been down this route before with migration support in libvirt, though, when we introduced tunnelled live migration in order to work around QEMU's inability to do TLS encryption. I very much regret that we ever did this, because tunnelled migration was inherently limited, so for example it failed to work with multifd, and failed to work with NBD based disk migration. In the end I did what I should have done at the beginning and just added TLS support to QEMU, making tunnelled migration obsolete, except we still have to carry the code around in libvirt indefinitely due to apps using it.
So I'm very concerned about history repeating itself and leaving us with a long-term burden for a solution that turns out to be an evolutionary dead end.
I like the idea of parallel saving, but I really think we need to implement this directly in QEMU, not libvirt. As previously mentioned, I think QEMU needs to get a 'file' migration protocol, along with the ability to directly map RAM segments into fixed positions in the file. The benefits are many:
- It will save & restore faster because we're eliminating data copies that libvirt imposes via the iohelper
- It is simple for libvirt & mgmt apps as we still only have one file to manage
- It is space efficient because if a guest dirties a memory page, we just overwrite the existing contents at the fixed location in the file, instead of appending new contents to the file
- It will restore faster too because we only restore each memory page once, due to always overwriting the file in-place when the guest dirtied a page during save
- It can save and restore with a differing number of threads, and can even dynamically change the number of threads in the middle of the save/restore operation
As David G has pointed out the impl is not trivial on the QEMU side, but from what I understand of the migration code, it is certainly viable. Most importantly I think it puts us in a better position for long term feature enhancements later by taking the middle man (libvirt) out of the equation, letting QEMU directly know what medium it is saving/restoring to/from.
With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, May 11, 2022 at 10:27:30AM +0200, Christophe Marie Francois Dupont de Dinechin wrote:
On 10 May 2022, at 20:38, Daniel P. Berrangé <berrange@redhat.com> wrote:
On Sat, May 07, 2022 at 03:42:53PM +0200, Claudio Fontana wrote:
[...]
First, I do not understand why you would write things that are not page-aligned to start with? (As an aside, I don’t know how any dirty tracking would work if you do not keep things page-aligned).
Could uffd_register_memory accept a memory range that is not aligned? If so, when? Should that be specified in the interface?
Second, instead of creating multiple files, why not write blocks at a location determined by an variable that you increment using atomic operations each time you need a new block? If you want to keep the blocks page-aligned in the file as well (which might help if you want to mmap the file at some point), then you need to build a map of the blocks that you tack at the end of the file.
There may be good reasons not to do it that way, of course, but I am not familiar enough with the problem to know them.
This is all because QEMU is not actually writing to a file. From QEMU's POV it just thinks it is migrating to another QEMU via a pipe, so the questions of page alignment and position are irrelevant to QEMU's needs - it just has a stream. Libvirt is just capturing this raw migration stream and writing its contents out to a file. The contents of the stream are completely opaque to libvirt, and we don't want to be unpacking this stream to do anything more clever. It is better to invest in making QEMU know that it is writing to a file directly.
With regards, Daniel
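For readers following along, this stream capture amounts to roughly the following copy loop (a simplification; the real iohelper also handles partial writes, O_DIRECT, and so on):

#include <unistd.h>

/* Drain the opaque migration stream QEMU writes to a pipe/socket
 * into the save file. The pass through this userspace buffer is
 * exactly the extra copy under discussion. */
static int drain_stream(int pipefd, int filefd)
{
    char buf[64 * 1024];
    ssize_t n;

    while ((n = read(pipefd, buf, sizeof(buf))) > 0) {
        if (write(filefd, buf, n) != n)
            return -1; /* short-write handling elided */
    }
    return n < 0 ? -1 : 0;
}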

On 5/11/22 10:27 AM, Christophe Marie Francois Dupont de Dinechin wrote:
On 10 May 2022, at 20:38, Daniel P. Berrangé <berrange@redhat.com> wrote:
On Sat, May 07, 2022 at 03:42:53PM +0200, Claudio Fontana wrote:
[...]
As a result of how we integrate with QEMU multifd, we're taking the approach of saving the state across multiple files, because it is easier than trying to get multiple threads writing to the same file. It could be solved by using file range locking on the save file, e.g. a thread can reserve say 500 MB of space, fill it up, and then reserve another 500 MB, etc. It is a bit tedious though and won't align nicely: e.g. a 1 GB huge page would become 1 GB + a few bytes of QEMU RAM save state header.
I am not familiar enough to know if this approach would work with multifd without breaking the existing format, maybe David could answer this.
First, I do not understand why you would write things that are not page-aligned to start with? (As an aside, I don’t know how any dirty tracking would work if you do not keep things page-aligned).
Yes, alignment is one issue I encountered, and one that in my view would _still_ need to be solved, whatever we put inside QEMU in the future, as it also breaks any attempt to be more efficient (using alternative APIs to read/write etc.), and is the reason why the iohelper is still needed at all in my patchset for the main file, causing one extra copy for the main channel.

The libvirt header, including metadata, domain xml etc., that wraps the QEMU VM ends at an arbitrary address, e.g.:

00000000: 4c69 6276 6972 7451 656d 7564 5361 7665  LibvirtQemudSave
00000010: 0300 0000 5b13 0100 0100 0000 0000 0000  ....[...........
00000020: 3613 0000 0200 0000 0000 0000 0000 0000  6...............
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 3c64 6f6d  ............<dom
00000060: 6169 6e20 7479 7065 3d27 6b76 6d27 3e0a  ain type='kvm'>.

000113a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000113b0: 0000 0000 0000 0051 4556 4d00 0000 0307  .......QEVM.....
000113c0: 0000 000d 7063 2d69 3434 3066 782d 362e  ....pc-i440fx-6.
000113d0: 3201 0000 0003 0372 616d 0000 0000 0000  2......ram......
000113e0: 0004 0000 0008 c00c 2004 0670 632e 7261  ........ ..pc.ra
000113f0: 6d00 0000 08c0 0000 0014 2f72 6f6d 4065  m........./rom@e
00011400: 7463 2f61 6370 692f 7461 626c 6573 0000  tc/acpi/tables..
00011410: 0000 0002 0000 0770 632e 6269 6f73 0000  .......pc.bios..
00011420: 0000 0004 0000 1f30 3030 303a 3030 3a30  .......0000:00:0
00011430: 322e 302f 7669 7274 696f 2d6e 6574 2d70  2.0/virtio-net-p
00011440: 6369 2e72 6f6d 0000 0000 0004 0000 0670  ci.rom.........p
00011450: 632e 726f 6d00 0000 0000 0200 0015 2f72  c.rom........./r
00011460: 6f6d 4065 7463 2f74 6162 6c65 2d6c 6f61  om@etc/table-loa
00011470: 6465 7200 0000 0000 0010 0012 2f72 6f6d  der........./rom
00011480: 4065 7463 2f61 6370 692f 7273 6470 0000  @etc/acpi/rsdp..
00011490: 0000 0000 1000 0000 0000 0000 0010 7e00  ..............~.
000114a0: 0000 0302 0000 0003 0000 0000 0000 2002  .............. .
000114b0: 0670 632e 7261 6d00 0000 0000 0000 3022  .pc.ram.......0"

In my view, at the minimum we have to start by adding enough padding before the start of the QEMU VM (the QEVM magic) for it to be at a page-aligned address. I would add one patch to this effect to my prototype, as this should not be very controversial I think.

Regarding migrating the channels to a single file, with the suggestion of Daniel or some other method, the obvious comment from me is: if we had some way to know the size of each channel in advance, that would be feasible, but especially considering compression that seems pretty hard to know beforehand, so some trick is needed.
Second, instead of creating multiple files, why not write blocks at a location determined by an variable that you increment using atomic operations each time you need a new block? If you want to keep the blocks page-aligned in the file as well (which might help if you want to mmap the file at some point), then you need to build a map of the blocks that you tack at the end of the file.
Just wanted to throw the simplest idea in the basket: we could interleave the file, with each channel writing, for example, 4 MB at a time at a channel-specific offset, again starting at a nicely aligned address. 4 MB is just an example here; it would need to be determined by the best balance considering NVMe architecture and performance, and could even be a parameter. Storage gurus could advise on this part.
The biggest issue there is that the main channel does not have the same requirements as the other channels, so we would likely waste space reserving for the main channel, as the "main" channel is much, much smaller than the others. (Dave, could the size of the main channel be determined roughly before the transfer, based on guest RAM size?)
So the layout for a --parallel --parallel-connections 2 --parallel-interleave 4MB save _could_ be something like:

libvirt header (0 to the first 4 MB-aligned address, most likely 0 to 4 MB)
main channel  ( 4 MB to 20 MB)
channel 0     (20 MB to 24 MB)
channel 1     (24 MB to 28 MB)
channel 0     (28 MB to 32 MB)
channel 1     (32 MB to 36 MB)
...

for example. The multifd helper could do this and feed the channels properly during save and restore; a small sketch of the offset arithmetic follows below.
Thanks,
Claudio
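A minimal sketch of that interleaving arithmetic, using the example numbers from the layout above (all sizes are illustrative, and --parallel-interleave is a hypothetical option):

#include <sys/types.h>

#define STRIPE   (4ULL * 1024 * 1024)      /* --parallel-interleave 4MB */
#define HDR_END  STRIPE                    /* padded libvirt header     */
#define MAIN_END (HDR_END + 4 * STRIPE)    /* reserved main channel     */

/* File offset of channel 'channel's i-th stripe: stripes from the
 * nchannels data channels rotate round-robin after the reserved
 * main-channel region, so every stripe starts aligned. */
static off_t stripe_offset(unsigned channel, unsigned nchannels,
                           unsigned long long i)
{
    return (off_t)(MAIN_END + (i * nchannels + channel) * STRIPE);
}

With two channels this yields exactly the 20/24/28/32 MB pattern in the layout above.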

On Wed, May 11, 2022 at 01:47:13PM +0200, Claudio Fontana wrote:
On 5/11/22 10:27 AM, Christophe Marie Francois Dupont de Dinechin wrote:
On 10 May 2022, at 20:38, Daniel P. Berrangé <berrange@redhat.com> wrote:
On Sat, May 07, 2022 at 03:42:53PM +0200, Claudio Fontana wrote:
[...]
In my view, at the minimum we have to start by adding enough padding before the start of the QEMU VM (the QEVM magic) for it to be at a page-aligned address.
I would add one patch to this effect to my prototype, as this should not be very controversial I think.
We already add padding before the QEMU migration stream begins, but we're just doing a fixed 64kb; the intent was to allow us to edit the embedded XML. We could easily round this up to a sensible boundary if needed.
With regards, Daniel
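The rounding itself is a one-liner (a sketch; 64 KiB is chosen here only to match the currently reserved XML space, and the name is made up):

#include <stdint.h>

#define SAVE_ALIGN (64 * 1024) /* must be a power of two */

/* Round the end of the libvirt header + XML up so the QEMU
 * stream ("QEVM") starts at an aligned file offset. */
static inline uint64_t round_up_align(uint64_t off)
{
    return (off + SAVE_ALIGN - 1) & ~(uint64_t)(SAVE_ALIGN - 1);
}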

* Christophe Marie Francois Dupont de Dinechin (cdupontd@redhat.com) wrote:
On 10 May 2022, at 20:38, Daniel P. Berrangé <berrange@redhat.com> wrote:
On Sat, May 07, 2022 at 03:42:53PM +0200, Claudio Fontana wrote:
[...]
First, I do not understand why you would write things that are not page-aligned to start with? (As an aside, I don’t know how any dirty tracking would work if you do not keep things page-aligned).
Could uffd_register_memory accept a memory range that is not aligned? If so, when? Should that be specified in the interface?
This isn't about alignment in RAM, this is about alignment in the file. We always write chunks of RAM that are aligned, but where they end up in the file is typically not aligned, because the file is: <header> <pages> <header> <pages> - especially in the non-multifd world.
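Schematically (a simplification for illustration, not QEMU's actual wire format), the captured stream alternates records like:

#include <stdint.h>

/* Each run of pages is preceded by a small record header, so the
 * page payloads land at header-shifted, unaligned file offsets even
 * though the pages themselves are aligned in guest RAM. */
struct page_record {
    uint64_t addr_and_flags;  /* guest page address plus flag bits */
    /* ... page contents follow ... */
};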
Second, instead of creating multiple files, why not write blocks at a location determined by an variable that you increment using atomic operations each time you need a new block? If you want to keep the blocks page-aligned in the file as well (which might help if you want to mmap the file at some point), then you need to build a map of the blocks that you tack at the end of the file.
If you're going to make it that complicated you may as well go back to the old thing of writing into a qcow2.
Dave
-- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
participants (6):
- Ani Sinha
- Christophe Marie Francois Dupont de Dinechin
- Claudio Fontana
- Daniel P. Berrangé
- Dr. David Alan Gilbert
- Peter Krempa