On Thu, Oct 10, 2024 at 12:06:51PM -0300, Fabiano Rosas wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
>
>> On Thu, Aug 08, 2024 at 05:38:03PM -0600, Jim Fehlig via Devel wrote:
>>> Introduce support for QEMU's new mapped-ram stream format [1].
>>> mapped-ram is enabled by default if the underlying QEMU advertises
>>> the mapped-ram migration capability. It can be disabled by changing
>>> the 'save_image_version' setting in qemu.conf to version '2'.
>>>
>>> To use mapped-ram with QEMU:
>>> - The 'mapped-ram' migration capability must be set to true
>>> - The 'multifd' migration capability must be set to true and
>>>   the 'multifd-channels' migration parameter must be set to 1
>>> - QEMU must be provided an fdset containing the migration fd
>>> - The 'migrate' qmp command is invoked with a URI referencing the
>>>   fdset and an offset where to start writing the data stream, e.g.
>>>
>>> {"execute":"migrate",
>>>  "arguments":{"detach":true,"resume":false,
>>>               "uri":"file:/dev/fdset/0,offset=0x11921"}}
>>>
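
For reference, the full QMP sequence implied by the steps above looks
roughly like this (the command, capability and parameter names are from
QEMU's QMP schema; the fdset id and offset just mirror the example, and
the migration fd itself is handed to 'add-fd' out of band via SCM_RIGHTS,
which libvirt arranges automatically):

  {"execute":"migrate-set-capabilities",
   "arguments":{"capabilities":[{"capability":"mapped-ram","state":true},
                                {"capability":"multifd","state":true}]}}

  {"execute":"migrate-set-parameters",
   "arguments":{"multifd-channels":1}}

  {"execute":"add-fd",
   "arguments":{"fdset-id":0}}

  {"execute":"migrate",
   "arguments":{"detach":true,"resume":false,
                "uri":"file:/dev/fdset/0,offset=0x11921"}}
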
>>> The mapped-ram stream, in conjunction with direct IO and multifd
>>> support provided by subsequent patches, can significantly improve
>>> the time required to save VM memory state. The following tables
>>> compare mapped-ram with the existing, sequential save stream. In
>>> all cases, the save and restore operations are to/from a block
>>> device comprised of two NVMe disks in RAID0 configuration with
>>> xfs (~8600MiB/s). The values in the 'save time' and 'restore time'
>>> columns were scraped from the 'real' time reported by time(1). The
>>> 'Size' and 'Blocks' columns were provided by the corresponding
>>> outputs of stat(1).
>>>
>>> VM: 32G RAM, 1 vcpu, idle (shortly after boot)
>>>
>>>                        |  save   | restore |
>>>                        |  time   |  time   |     Size     | Blocks
>>> -----------------------+---------+---------+--------------+--------
>>> legacy                 |  6.193s |  4.399s |    985744812 | 1925288
>>> -----------------------+---------+---------+--------------+--------
>>> mapped-ram             |  5.109s |  1.176s |  34368554354 | 1774472
>>
>> I'm surprised by the restore time speed up, as I didn't think
>> mapped-ram should make any perf difference without direct IO
>> and multifd.
>>
>>> -----------------------+---------+---------+--------------+--------
>>> legacy + direct IO     |  5.725s |  4.512s |    985765251 | 1925328
>>> -----------------------+---------+---------+--------------+--------
>>> mapped-ram + direct IO |  4.627s |  1.490s |  34368554354 | 1774304
>>
>> Still somewhat surprised by the speed up on restore here too
>
> Hmm, I'm thinking this might be caused by zero page handling. The non
> mapped-ram path has an extra buffer_is_zero() and memset() of the hva
> page.
>
> Now, is it an issue that mapped-ram skips that memset? I assume guest
> memory will always be clear at the start of migration. There won't be a
> situation where the destination VM starts with memory already
> dirty... *and* the save file is also different, otherwise it wouldn't
> make any difference.

Consider the snapshot use case. You're running the VM, so memory has
arbitrary contents, and now you restore to a saved snapshot. QEMU remains
running this whole time, so you can't assume the initial memory is zeroed.
Surely we need the memset?
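
To make the zero page point concrete, here is roughly what the legacy
stream does on load. This is a simplified paraphrase of the zero-page
path in QEMU's migration/ram.c, not the literal code, and the helper
names are only illustrative:

  #include <stdbool.h>
  #include <stddef.h>
  #include <string.h>

  /* Stand-in for QEMU's buffer_is_zero() (util/bufferiszero.c). */
  static bool buffer_is_zero(const void *buf, size_t len)
  {
      const unsigned char *p = buf;

      for (size_t i = 0; i < len; i++) {
          if (p[i]) {
              return false;
          }
      }
      return true;
  }

  /*
   * Legacy stream: a RAM_SAVE_FLAG_ZERO record means "this page is
   * zero", and the destination clears the host page unless it already
   * reads as zero.
   */
  static void handle_zero_page(void *host, size_t size)
  {
      if (!buffer_is_zero(host, size)) {
          memset(host, 0, size);
      }
  }

mapped-ram, by contrast, only loads pages that the file's bitmap marks
as present, so a page that was zero on the source is never touched on
the destination. That is harmless when restoring into a freshly started
QEMU, but in the snapshot-revert scenario above, where memory may
already hold stale data, the skipped memset is exactly the concern.
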
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|