
Daniel P. Berrangé <berrange@redhat.com> writes:
On Fri, Apr 26, 2024 at 10:03:29AM -0300, Fabiano Rosas wrote:
Daniel P. Berrangé <berrange@redhat.com> writes:
On Wed, Apr 17, 2024 at 05:12:27PM -0600, Jim Fehlig via Devel wrote:
A good starting point on this journey is supporting the new mapped-ram capability in qemu 9.0 [2]. Since mapped-ram is a new on-disk format, I assume we'll need a new QEMU_SAVE_VERSION 3 when using it? Otherwise I'm not sure how to detect if a saved image is in mapped-ram format vs the existing, sequential stream format.
Yes, we'll need to be supporting 'mapped-ram', so that's a good first step.
A question is whether we make that feature mandatory for all save images, implied by another feature (parallel save), or a directly controllable, opt-in feature.
The first option breaks back compat with existing libvirt, while the latter two are net new and so have no compat implications.
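As a rough illustration (not libvirt's actual code), the restore path could branch on a bumped save version to pick the right parser. The struct layout, field names and the version value 3 below are assumptions for the sake of the sketch:

  #include <stdint.h>
  #include <stdbool.h>
  #include <string.h>

  #define SAVE_MAGIC               "LibvirtQemudSave"
  #define SAVE_VERSION_STREAM      2   /* existing sequential stream format */
  #define SAVE_VERSION_MAPPED_RAM  3   /* hypothetical new version for mapped-ram */

  typedef struct {
      char     magic[sizeof(SAVE_MAGIC) - 1];
      uint32_t version;
      /* ... remaining header fields elided ... */
  } SaveImageHeader;

  /* Returns true if the image should be parsed as mapped-ram. */
  static bool
  save_image_is_mapped_ram(const SaveImageHeader *hdr)
  {
      if (memcmp(hdr->magic, SAVE_MAGIC, sizeof(hdr->magic)) != 0)
          return false;   /* not a recognised save image at all */

      return hdr->version >= SAVE_VERSION_MAPPED_RAM;
  }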
In terms of actual data blocks written to disk, mapped-ram should be the same size as, or smaller than, the existing format.
In terms of logical file size, however, mapped-ram will almost always be larger.
This is because mapped-ram will result in a file whose logical size matches the guest RAM size, plus some header overhead, while being sparse so not all blocks are written.
If tools handling save images aren't sparse-aware this could come across as a surprise and even be considered a regression.
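To illustrate the logical-size vs allocated-size distinction, a tool can compare st_size against st_blocks from stat(); this is a generic POSIX sketch, not code from libvirt:

  /* Report logical vs allocated size of a save image, to show why a
   * mapped-ram file can look "huge" to tools that aren't sparse-aware. */
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/stat.h>

  int main(int argc, char **argv)
  {
      struct stat st;

      if (argc != 2 || stat(argv[1], &st) < 0) {
          perror("stat");
          return EXIT_FAILURE;
      }

      long long logical   = (long long)st.st_size;
      long long allocated = (long long)st.st_blocks * 512;  /* st_blocks is in 512-byte units */

      printf("logical size:   %lld bytes\n", logical);
      printf("allocated size: %lld bytes\n", allocated);
      printf("sparse:         %s\n", allocated < logical ? "yes" : "no");
      return EXIT_SUCCESS;
  }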
Mapped-ram is needed for parallel saves, since it lets each thread write to a specific region of the file.
Mapped-ram is good for non-parallel saves too, though, because RAM is mapped into the file at offsets suitably aligned for O_DIRECT to be used. Currently libvirt has to tunnel over its iohelper to futz with the alignment needed for O_DIRECT. This makes it desirable to use in general, but back compat hurts...
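For illustration of why the alignment matters (and why the iohelper detour exists today), here's a generic POSIX sketch of an O_DIRECT write; the buffer address, file offset and length must all be suitably aligned. The 4096-byte alignment is an assumption, real code would query the filesystem:

  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <stdlib.h>
  #include <string.h>
  #include <unistd.h>

  #define ALIGN 4096   /* assumed block alignment; query the fs in real code */

  /* Write one aligned chunk with O_DIRECT, bypassing the host buffer cache. */
  static ssize_t
  write_chunk_direct(const char *path, const void *data, size_t len, off_t offset)
  {
      void *buf = NULL;
      ssize_t ret = -1;
      int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0600);

      if (fd < 0)
          return -1;

      /* O_DIRECT needs an aligned buffer; copy the payload into one */
      if (posix_memalign(&buf, ALIGN, len) != 0)
          goto out;
      memcpy(buf, data, len);

      /* offset and len are assumed to already be multiples of ALIGN,
       * which is what mapped-ram's fixed file layout makes possible */
      ret = pwrite(fd, buf, len, offset);

   out:
      free(buf);
      close(fd);
      return ret;
  }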
Note that QEMU doesn't support O_DIRECT without multifd.
From mapped-ram patch series v4:
- Dropped support for direct-io with fixed-ram _without_ multifd. This is something I said I would do for this version, but I had to drop it because performance is really bad. I think the single-threaded precopy code cannot cope with the extra latency/synchronicity of O_DIRECT.
Note the reason for using O_DIRECT is *not* to make saving / restoring the guest VM faster. Rather it is to ensure that saving/restoring a VM does not trash the host I/O / buffer cache, which will negatively impact performance of all the *other* concurrently running VMs.
Well, there's surely a performance degradation threshold beyond which the benefit of preserving the caches is negated. But maybe it's not as low as I initially thought. The direct-io enablement is now posted to the qemu mailing list; please take a look when you get the chance. I'll revisit the direct-io no-parallel approach in the meantime; let's keep that option open for now.
With regards, Daniel