Hi Martin,
On 8/7/24 06:32, Martin Kletzander wrote:
On Thu, Jun 13, 2024 at 04:43:14PM -0600, Jim Fehlig via Devel
wrote:
> This series is a RFC for support of QEMU's mapped-ram migration
> capability [1] for saving and restoring VMs. It implements the first
> part of the design approach we discussed for supporting parallel
> save/restore [2]. In summary, the approach is
>
> 1. Add mapped-ram migration capability
> 2. Steal an element from save header 'unused' for a 'features'
variable
> and bump save version to 3.
> 3. Add /etc/libvirt/qemu.conf knob for the save format version,
> defaulting to latest v3
> 4. Use v3 (aka mapped-ram) by default
> 5. Use mapped-ram with BYPASS_CACHE for v3, old approach for v2
> 6. include: Define constants for parallel save/restore
> 7. qemu: Add support for parallel save. Implies mapped-ram, reject if v2
> 8. qemu: Add support for parallel restore. Implies mapped-ram.
> Reject if v2
> 9. tools: add parallel parameter to virsh save command
> 10. tools: add parallel parameter to virsh restore command
>
> This series implements 1-5, with the BYPASS_CACHE support in patches 8
> and 9 being quite hacky. They are included to discuss approaches to make
> them less hacky. See the patches for details.
>
They might seem tiny bit hacky, but it's not that big of a deal I think.
In the meantime I've been working on a V1 that's a bit less hacky :-).
You could eliminate two conditions by making the first FD always
non-direct (as in either there is no BYPASS_CACHE or it's already
wrapped by the I/O helper), but it would complicate other things in the
code and would get even more hairy IMHO.
In V1, the first FD is always buffered. The directio one, if needed, is opened a
bit later in the save process.
> The QEMU mapped-ram capability currently does not support directio.
> Fabino is working on that now [3]. This complicates merging support
> in libvirt. I don't think it's reasonable to enable mapped-ram by
> default when BYPASS_CACHE cannot be supported. Should we wait until
> the mapped-ram directio support is merged in QEMU before supporting
> mapped-ram in libvirt?
>
By the time I looked at this series the direct-io work has already went
in, but there is still the need for the second descriptor to do some
unaligned I/O.
Correct. However, ATM I don't know how to detect if qemu supports the direct-io
migration parameter.
From the QEMU patches I'm not sure whether you also need to set the
direct-io migration capability/flag when migrating to an fdset. Maybe
that's needed for migration into a file directly.
The direct-io migration parameter must be set to 'true' and QEMU provided a
second FD with O_DIRECT set.
> For the moment, compression is ignored in the new save version.
> Currently, libvirt connects the output of QEMU's save stream to the
> specified compression program via a pipe. This approach is incompatible
> with mapped-ram since the fd provided to QEMU must be seekable. One
> option is to reopen and compress the saved image after the actual save
> operation has completed. This has the downside of requiring the iohelper
> to handle BYPASS_CACHE, which would preclude us from removing it
> sometime in the future. Other suggestions much welcomed.
>
I was wondering whether it would make sense to use user-space block I/O,
but we'd have to use some compression on a block-by-block basis and
since you need to be able to compress each write separately, that means
you might just save few bytes here and there. And on top of that you'd
have to compress each individual block and that block needs to be
allocated as a whole, so no space would be saved at all. So that does
not make sense unless there is some new format.
And compression after the save is finished is in my opinion kind of
pointless. You don't save time and you only save disk space _after_ the
compression step is done. Not to mention you'd have to uncompress it
again before starting QEMU from it. I'd be fine with making users
choose between compression and mapped-ram, at least for now. They can
compress the resulting file on their own.
I'd prefer to make compression incompatible with mapped-ram too. Sounds like
Daniel is agreeable to that as well.
Regards,
Jim
> Note the logical file size of mapped-ram saved images is slightly
> larger than guest RAM size, so the files are often much larger than the
> files produced by the existing, sequential format. However, actual blocks
> written to disk is often lower with mapped-ram saved images. E.g. a saved
> image from a 30G, freshly booted, idle guest results in the following
> 'Size' and 'Blocks' values reported by stat(1)
>
> Size Blocks
> sequential 998595770 1950392
> mapped-ram 34368584225 1800456
>
> With the same guest running a workload that dirties memory
>
> Size Blocks
> sequential 33173330615 64791672
> mapped-ram 34368578210 64706944
>
That's fine and even better. It saves space, the only thing that
everyone needs to keep in mind is to treat it as a sparse file.
The other option for the space saving would be to consolidate the
streamed changes in the resulting file, but for little to no gain.
The mapped-ram is better.
> Thanks for any comments on this RFC!
>
> [1]
>
https://gitlab.com/qemu-project/qemu/-/blob/master/docs/devel/migration/m...
> [2]
>
https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/K...
> [3]
https://mail.gnu.org/archive/html/qemu-devel/2024-05/msg04432.html
Some more teeny tiny comments in two patches will follow.
Martin
>
> Jim Fehlig (9):
> qemu: Enable mapped-ram migration capability
> qemu_fd: Add function to retrieve fdset ID
> qemu: Add function to get migration params for save
> qemu: Add a 'features' element to save image header and bump version
> qemu: conf: Add setting for save image version
> qemu: Add support for mapped-ram on save
> qemu: Enable mapped-ram on restore
> qemu: Support O_DIRECT with mapped-ram on save
> qemu: Support O_DIRECT with mapped-ram on restore
>
> src/qemu/libvirtd_qemu.aug | 1 +
> src/qemu/qemu.conf.in | 6 +
> src/qemu/qemu_conf.c | 8 ++
> src/qemu/qemu_conf.h | 1 +
> src/qemu/qemu_driver.c | 25 ++--
> src/qemu/qemu_fd.c | 18 +++
> src/qemu/qemu_fd.h | 3 +
> src/qemu/qemu_migration.c | 99 ++++++++++++++-
> src/qemu/qemu_migration.h | 11 +-
> src/qemu/qemu_migration_params.c | 20 +++
> src/qemu/qemu_migration_params.h | 4 +
> src/qemu/qemu_monitor.c | 40 ++++++
> src/qemu/qemu_monitor.h | 5 +
> src/qemu/qemu_process.c | 63 +++++++---
> src/qemu/qemu_process.h | 16 ++-
> src/qemu/qemu_saveimage.c | 187 +++++++++++++++++++++++------
> src/qemu/qemu_saveimage.h | 20 ++-
> src/qemu/qemu_snapshot.c | 12 +-
> src/qemu/test_libvirtd_qemu.aug.in | 1 +
> 19 files changed, 455 insertions(+), 85 deletions(-)
>
> --
> 2.44.0
>