On 4/21/22 7:08 PM, Daniel P. Berrangé wrote:
On Thu, Apr 14, 2022 at 09:54:16AM +0200, Claudio Fontana wrote:
> RFC, starting point for discussion.
>
> Sketch API changes to allow parallel Saves, and open up
> an implementation for QEMU to leverage multifd migration to files,
> with optional multifd compression.
>
> This makes it possible to improve save times for huge VMs.
>
> The idea is to issue commands like:
>
> virsh save domain /path/savevm --parallel --parallel-connections 2
>
> and have libvirt start a multifd migration to:
>
> /path/savevm : main migration connection
> /path/savevm.1 : multifd channel 1
> /path/savevm.2 : multifd channel 2
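
To make the naming scheme above concrete, here is a minimal sketch of how
the per-channel file names could be derived from the base path. The helper
save_channel_path is hypothetical and not part of this patch or of libvirt;
it only mirrors the "/path/savevm", "/path/savevm.1", "/path/savevm.2"
layout from the example.

```c
#include <stdio.h>

/*
 * Build the file name for one channel of a parallel save.
 * Channel 0 is the main migration connection and uses the base path
 * unchanged; channel N (N >= 1) uses "<base>.<N>".
 * Hypothetical helper, not a libvirt function.
 * Returns 0 on success, -1 if the name does not fit in buf.
 */
static int save_channel_path(char *buf, size_t buflen,
                             const char *base, unsigned int channel)
{
    int n;

    if (channel == 0)
        n = snprintf(buf, buflen, "%s", base);
    else
        n = snprintf(buf, buflen, "%s.%u", base, channel);

    /* snprintf reports the length it wanted; detect truncation. */
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With --parallel-connections 2 this would be called for channels 0, 1 and 2.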
At a conceptual level the idea would be to still have a single file,
but have threads writing to different regions of it. I don't think
that's possible with multifd though, as it doesn't partition RAM
up between threads, it just hands out pages on demand. So if one
thread happens to be quicker it'll send more RAM than another
thread. Also we're basically capturing the migration RAM, and the
multifd channels have control info, in addition to the RAM pages.
That makes me wonder actually, are the multifd streams unidirectional
or bidirectional? Our save-to-file logic relies on the streams
being unidirectional.
Unidirectional. In the meantime I completed an actual libvirt prototype that works (only
did the save part, not the restore yet).
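
For reference, the single-file layout discussed above (each thread owning a
fixed region of one file) could look roughly like the sketch below. It
assumes RAM is split statically into contiguous slices, which, as noted, is
not how multifd hands out pages. region_job, write_region and
save_ram_partitioned are hypothetical names, and the cap of 16 threads is
arbitrary.

```c
#define _GNU_SOURCE             /* for pwrite() under strict -std modes */
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_SAVE_THREADS 16     /* arbitrary cap for this sketch */

/* One statically assigned slice of guest RAM (hypothetical type). */
struct region_job {
    int fd;             /* shared output file descriptor */
    const char *src;    /* start of this thread's slice in memory */
    size_t len;         /* slice length in bytes */
    off_t file_off;     /* where the slice starts in the file */
    int ok;             /* set to 1 once the slice is fully written */
};

/* Thread body: write the slice at its fixed offset using pwrite(). */
static void *write_region(void *opaque)
{
    struct region_job *job = opaque;
    size_t done = 0;

    while (done < job->len) {
        ssize_t n = pwrite(job->fd, job->src + done, job->len - done,
                           job->file_off + (off_t)done);
        if (n <= 0)
            return NULL;        /* job->ok stays 0 */
        done += (size_t)n;
    }
    job->ok = 1;
    return NULL;
}

/*
 * Split [ram, ram + len) into nthreads contiguous slices and write them
 * in parallel into fd, starting at file offset base_off.
 * Returns 0 on success, -1 on failure.
 */
static int save_ram_partitioned(int fd, const char *ram, size_t len,
                                off_t base_off, unsigned int nthreads)
{
    pthread_t tids[MAX_SAVE_THREADS];
    struct region_job jobs[MAX_SAVE_THREADS];
    unsigned int started = 0;
    int ret = 0;

    if (nthreads == 0 || nthreads > MAX_SAVE_THREADS)
        return -1;

    size_t chunk = (len + nthreads - 1) / nthreads;   /* round up */

    for (unsigned int i = 0; i < nthreads; i++) {
        size_t start = (size_t)i * chunk;
        if (start > len)
            start = len;
        size_t this_len = (len - start < chunk) ? len - start : chunk;

        jobs[i] = (struct region_job){
            .fd = fd, .src = ram + start, .len = this_len,
            .file_off = base_off + (off_t)start, .ok = 0,
        };
        if (pthread_create(&tids[i], NULL, write_region, &jobs[i]) != 0) {
            ret = -1;
            break;
        }
        started++;
    }

    /* Wait for every thread that was started and collect failures. */
    for (unsigned int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        if (!jobs[i].ok)
            ret = -1;
    }
    return ret;
}
```

Because every slice has a fixed offset, a faster thread cannot take over
work from a slower one, which is the mismatch with multifd described above.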
You've got me thinking, however, whether we can take QEMU out of
the loop entirely for saving RAM.
IIUC with 'x-ignore-shared' migration capability QEMU will skip
saving of RAM region entirely (well technically any region marked
as 'shared', which I guess can cover more things).
Heh I have no idea about this.
If the QEMU process is configured with a file backed shared
memory, or memfd, I wonder if we can take advantage of this.
eg
1. pause the VM
2. write the libvirt header to save.img
3. sendfile(qemus-memfd, save.img-fd) to copy the entire
RAM after header
I don't understand this point very much... if the RAM is already backed by a file, why
are we sending it again?
4. QMP migrate with x-ignore-shared to copy device
state after RAM
Probably can do the same on restore too.
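
The header-plus-sendfile part of the list above could be sketched as
follows. This is only an illustration under assumptions: the function name
save_header_and_ram is hypothetical, the real libvirt header layout is not
modelled, ram_fd stands for whatever descriptor backs guest RAM (memfd or
file), and the caller is assumed to have paused the VM already. Note that
the real sendfile(2) call takes the output descriptor first.

```c
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write a header to save_fd, then copy ram_len bytes from ram_fd
 * directly after it using sendfile().
 * Returns the file offset just past the RAM copy, or -1 on error.
 */
static off_t save_header_and_ram(int save_fd, int ram_fd,
                                 const void *header, size_t header_len,
                                 size_t ram_len)
{
    const char *p = header;
    size_t left = header_len;
    off_t ram_off = 0;  /* read position in ram_fd, advanced by sendfile */

    /* write() may be partial, so loop until the header is out. */
    while (left > 0) {
        ssize_t n = write(save_fd, p, left);
        if (n <= 0)
            return -1;
        p += n;
        left -= (size_t)n;
    }

    /* sendfile() may also copy less than requested; 0 means early EOF. */
    left = ram_len;
    while (left > 0) {
        ssize_t n = sendfile(save_fd, ram_fd, &ram_off, left);
        if (n <= 0)
            return -1;
        left -= (size_t)n;
    }

    /* Device state from the QMP migrate step would start here. */
    return lseek(save_fd, 0, SEEK_CUR);
}
```

The returned offset is where the device state written by QEMU would begin.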
Do I understand correctly that you suggest to constantly update the RAM to file at
runtime?
Given the compute nature of the workload, I'd think this would slow things down.
We need to evict the memory to disk rarely, but when that happens it should be as fast as
possible.
The advantage of the multifd idea was that we have CPUs sitting there doing nothing that
were reserved for running the VM, so we may as well use them to reduce the size of the
problem substantially by compressing each stream separately.
Now, this would only work for a 'save' and 'restore', not
for snapshots, as it would rely on the VCPUs being paused
to stop RAM being modified.
>
> Signed-off-by: Claudio Fontana <cfontana(a)suse.de>
> ---
> include/libvirt/libvirt-domain.h | 5 +++++
> src/driver-hypervisor.h | 7 +++++++
> src/libvirt_public.syms | 5 +++++
> src/qemu/qemu_driver.c | 1 +
> tools/virsh-domain.c | 8 ++++++++
> 5 files changed, 26 insertions(+)
>
> diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
> index 2d5718301e..a7b9c4132d 100644
> --- a/include/libvirt/libvirt-domain.h
> +++ b/include/libvirt/libvirt-domain.h
> @@ -1270,6 +1270,7 @@ typedef enum {
> VIR_DOMAIN_SAVE_RUNNING = 1 << 1, /* Favor running over paused */
> VIR_DOMAIN_SAVE_PAUSED = 1 << 2, /* Favor paused over running */
> VIR_DOMAIN_SAVE_RESET_NVRAM = 1 << 3, /* Re-initialize NVRAM from template */
> + VIR_DOMAIN_SAVE_PARALLEL = 1 << 4, /* Parallel Save/Restore to multiple files */
> } virDomainSaveRestoreFlags;
>
> int virDomainSave (virDomainPtr domain,
> @@ -1278,6 +1279,10 @@ int virDomainSaveFlags (virDomainPtr domain,
> const char *to,
> const char *dxml,
> unsigned int flags);
> +int virDomainSaveParametersFlags (virDomainPtr domain,
> + virTypedParameterPtr params,
> + int nparams,
> + unsigned int flags);
> int virDomainRestore (virConnectPtr conn,
> const char *from);
> int virDomainRestoreFlags (virConnectPtr conn,
> diff --git a/src/driver-hypervisor.h b/src/driver-hypervisor.h
> index 4423eb0885..a4e1d21e76 100644
> --- a/src/driver-hypervisor.h
> +++ b/src/driver-hypervisor.h
> @@ -240,6 +240,12 @@ typedef int
> const char *dxml,
> unsigned int flags);
>
> +typedef int
> +(*virDrvDomainSaveParametersFlags)(virDomainPtr domain,
> + virTypedParameterPtr params,
> + int nparams,
> + unsigned int flags);
> +
> typedef int
> (*virDrvDomainRestore)(virConnectPtr conn,
> const char *from);
> @@ -1489,6 +1495,7 @@ struct _virHypervisorDriver {
> virDrvDomainGetControlInfo domainGetControlInfo;
> virDrvDomainSave domainSave;
> virDrvDomainSaveFlags domainSaveFlags;
> + virDrvDomainSaveParametersFlags domainSaveParametersFlags;
> virDrvDomainRestore domainRestore;
> virDrvDomainRestoreFlags domainRestoreFlags;
> virDrvDomainSaveImageGetXMLDesc domainSaveImageGetXMLDesc;
> diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms
> index f93692c427..eb3a7afb75 100644
> --- a/src/libvirt_public.syms
> +++ b/src/libvirt_public.syms
> @@ -916,4 +916,9 @@ LIBVIRT_8.0.0 {
> virDomainSetLaunchSecurityState;
> } LIBVIRT_7.8.0;
>
> +LIBVIRT_8.3.0 {
> + global:
> + virDomainSaveParametersFlags;
> +} LIBVIRT_8.0.0;
> +
> # .... define new API here using predicted next version number ....
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 77012eb527..249105356c 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -20826,6 +20826,7 @@ static virHypervisorDriver qemuHypervisorDriver = {
> .domainGetControlInfo = qemuDomainGetControlInfo, /* 0.9.3 */
> .domainSave = qemuDomainSave, /* 0.2.0 */
> .domainSaveFlags = qemuDomainSaveFlags, /* 0.9.4 */
> + .domainSaveParametersFlags = qemuDomainSaveParametersFlags, /* 8.3.0 */
> .domainRestore = qemuDomainRestore, /* 0.2.0 */
> .domainRestoreFlags = qemuDomainRestoreFlags, /* 0.9.4 */
> .domainSaveImageGetXMLDesc = qemuDomainSaveImageGetXMLDesc, /* 0.9.4 */
> diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
> index d5fd8be7c3..ccded6d265 100644
> --- a/tools/virsh-domain.c
> +++ b/tools/virsh-domain.c
> @@ -4164,6 +4164,14 @@ static const vshCmdOptDef opts_save[] = {
> .type = VSH_OT_BOOL,
> .help = N_("avoid file system cache when saving")
> },
> + {.name = "parallel",
> + .type = VSH_OT_BOOL,
> + .help = N_("enable parallel save to files")
> + },
> + {.name = "parallel-connections",
> + .type = VSH_OT_INT,
> + .help = N_("number of connections/files for parallel save")
> + },
> {.name = "xml",
> .type = VSH_OT_STRING,
> .completer = virshCompletePathLocalExisting,
> --
> 2.34.1
>
With regards,
Daniel