* Daniel P. Berrangé (berrange(a)redhat.com) wrote:
> On Thu, May 12, 2022 at 05:58:46PM +0100, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berrange(a)redhat.com) wrote:
> > > On Wed, May 11, 2022 at 07:31:45PM +0200, Claudio Fontana wrote:
> > > > That's great, I love when things are simple.
> > > >
> > > > If indeed we want to remove the copy in libvirt (which will also mean
> > > > explicitly fsyncing elsewhere, as the iohelper would not be there
> > > > anymore to do that for us on image creation),
> > > > with QEMU having a "file" protocol support for migration,
> > > >
> > > > do we plan to have libvirt and QEMU both open the file for writing
> > > > concurrently, with QEMU opening O_DIRECT?
> > >
> > > For non-libvirt users, I expect QEMU would open the
> > > file directly. For libvirt usage, it is likely
> > > preferable to pass the pre-opened FD, because that
> > > simplifies file permission handling.
> > >
> > > > The alternative being having libvirt open the file with
> > > > O_DIRECT, write some libvirt stuff in a new, O_DIRECT-
> > > > friendly format, and then pass the fd to qemu to migrate to,
> > > > and QEMU sending its new O_DIRECT friendly stream there.
> > >
> > > Yep.
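
To make the O_DIRECT point concrete: the constraint is that the buffer
address, the transfer length and the file offset all have to be aligned
(512 bytes at minimum, in practice the filesystem block size, typically
4096). A minimal, purely illustrative POSIX sketch, not libvirt or QEMU
code:

#define _GNU_SOURCE                     /* for O_DIRECT on Linux */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define ALIGN 4096

/* Write 'len' bytes to 'path' through O_DIRECT: the buffer comes from
 * posix_memalign() and the length is padded up to the alignment, because
 * unaligned buffers, lengths or offsets make O_DIRECT writes fail. */
static int write_odirect(const char *path, const void *data, size_t len)
{
    size_t padded = (len + ALIGN - 1) & ~(size_t)(ALIGN - 1);
    void *buf = NULL;
    int fd, ret = -1;

    fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0600);
    if (fd < 0)
        return -1;

    if (posix_memalign(&buf, ALIGN, padded) != 0)
        goto out;

    memset(buf, 0, padded);
    memcpy(buf, data, len);             /* payload, zero-padded to ALIGN */

    if (pwrite(fd, buf, padded, 0) == (ssize_t)padded)
        ret = 0;

out:
    free(buf);
    close(fd);
    return ret;
}

Whatever libvirt writes at the front of the file would need to be padded
out to that alignment, so that the QEMU data following it starts on an
aligned offset.
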
> > >
> > > > In any case, the expectation here is to have a new
> > > > "file://pathname" or "file://fdname" as an added
> > > > feature in QEMU,
> > > > where QEMU would write a new O_DIRECT friendly stream
> > > > directly into the file, taking care of both optional
> > > > parallelization and compression.
> > >
> > > I could see several distinct building blocks
> > >
> > > * First a "file:/some/path" migration protocol
> > >   that can just do "normal" I/O, but still writing
> > >   in the traditional migration data stream
> > >
> > > * Modify existing 'fd:' protocol so that it fstat()s
> > >   and passes over to the 'file' protocol handler if
> > >   it sees the FD is not a socket/pipe
> >
> > We used to have that at one point.
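
For the 'fd:' fallback, the check itself is cheap; roughly something like
the following sketch (illustrative only, not the actual 'fd:' handler):

#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return true if the FD can be driven like a file (seekable, fixed size)
 * rather than like a socket or pipe. */
static bool fd_is_seekable_file(int fd)
{
    struct stat st;

    if (fstat(fd, &st) < 0)
        return false;

    if (!S_ISREG(st.st_mode) && !S_ISBLK(st.st_mode))
        return false;                   /* socket, FIFO, tty, ... */

    /* Belt and braces: make sure seeking actually works. */
    return lseek(fd, 0, SEEK_CUR) != (off_t)-1;
}
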
> >
> > > * Add a migration capability "direct-mapped" to
> > >   indicate we want the RAM data written/read directly
> > >   to/from fixed positions in the file, as opposed to
> > >   a stream. Obviously only valid with a sub-set
> > >   of migration protocols (file, and fd: if a seekable
> > >   FD).
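
A fixed page-to-offset mapping is also what makes the optional
parallelization mentioned above easier, since every worker knows its own
offsets and there is no shared stream cursor. A sketch with invented
names, not a real interface:

#include <stdint.h>
#include <unistd.h>

#define PAGE_SIZE 4096                  /* stand-in for the target page size */

/* Each worker loads (or, with pwrite(), saves) its own disjoint range of
 * pages; 'ram_area_start' is wherever the fixed-position RAM area begins
 * in the file. */
static int load_page_range(int fd, off_t ram_area_start, uint8_t *ram,
                           uint64_t first_page, uint64_t npages)
{
    for (uint64_t i = 0; i < npages; i++) {
        uint64_t page = first_page + i;
        off_t pos = ram_area_start + (off_t)(page * PAGE_SIZE);

        if (pread(fd, ram + page * PAGE_SIZE, PAGE_SIZE, pos) != PAGE_SIZE)
            return -1;
    }
    return 0;
}
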
> >
> > What worries me is how you're going to cleanly glue this into the
> > migration code; it sounds like what you want it to do is very different
> > to what it currently does.
> I've only investigated it lightly, but I see the key bit of code
> is this method which emits the header + ram page content:
>
> static int save_normal_page(RAMState *rs, RAMBlock *block, ram_addr_t offset,
>                             uint8_t *buf, bool async)
> {
>     ram_transferred_add(save_page_header(rs, rs->f, block,
>                                          offset | RAM_SAVE_FLAG_PAGE));
>     if (async) {
>         qemu_put_buffer_async(rs->f, buf, TARGET_PAGE_SIZE,
>                               migrate_release_ram() &&
>                               migration_in_postcopy());
>     } else {
>         qemu_put_buffer(rs->f, buf, TARGET_PAGE_SIZE);
>     }
>     ram_transferred_add(TARGET_PAGE_SIZE);
>     ram_counters.normal++;
>     return 1;
> }
> my (perhaps wishful) thinking was that we just have an alternative
> impl of this which doesn't save the page header, and puts the
> page content at a fixed offset.
Hmm OK, probably can; note I think the multifd is separate code
(and currently much cleaner - which you'd make more complex again).
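
Very roughly, such an alternative implementation could look like the
following sketch (invented names, not the real migration code): drop the
save_page_header() call and pwrite() the page body at a deterministic
position in the destination file.

#include <stdint.h>
#include <unistd.h>

#define PAGE_SIZE 4096                  /* stand-in for TARGET_PAGE_SIZE */

/* No per-page header at all: the page body goes to a position computed
 * from the block's base offset plus the page's offset within the block
 * (the ram_addr_t-style address discussed just below). */
static int save_direct_page(int file_fd, off_t ram_area_start,
                            uint64_t block_offset, uint64_t page_offset,
                            const uint8_t *page)
{
    off_t pos = ram_area_start + (off_t)(block_offset + page_offset);

    if (pwrite(file_fd, page, PAGE_SIZE, pos) != PAGE_SIZE)
        return -1;
    return 0;
}
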
> I'm fuzzy on how we figure out the right offset - I was hoping
> that "RAMState" or "RAMBlock" somehow gives us enough info to figure
> out a deterministic mapping to a file location.
I think that's probably the ram_addr_t type, RAMBlock->offset + the
index into the ramblock; that gets you the same thing as the dirty
bitmap (hmm although we don't have a single one of those any more).
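
Spelled out, that mapping would be roughly (invented names, 4 KiB pages
assumed):

#include <stdint.h>

#define PAGE_SHIFT 12                   /* 4 KiB pages assumed */

/* Same addressing the dirty bitmap uses: block base + page index within
 * the block; the file position is just that plus wherever the RAM area
 * starts in the file. */
static uint64_t page_file_offset(uint64_t ram_area_start,
                                 uint64_t block_offset,  /* RAMBlock->offset */
                                 uint64_t page_index)    /* index within block */
{
    return ram_area_start + block_offset + (page_index << PAGE_SHIFT);
}
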
Dave
--
Dr. David Alan Gilbert / dgilbert(a)redhat.com / Manchester, UK