On Mon, Oct 20, 2025 at 08:52:18 +0000, Rogério Vinhal Nunes wrote:
On 20 Oct 2025, at 09:36, Peter Krempa <pkrempa@redhat.com> wrote:
On Fri, Oct 17, 2025 at 13:47:54 +0000, Rogério Vinhal Nunes wrote:
On 17 Oct 2025, at 12:59, Peter Krempa <pkrempa@redhat.com> wrote:
On Thu, Oct 16, 2025 at 18:47:36 +0000, Rogério Vinhal Nunes wrote:
[...]
So with NBD/NFS and others it works in a way where both the source and the destination open the storage themselves. QEMU internally ensures that it hands over the state cleanly and doesn't write from the source after the handover.
Can't your storage do that? That way you could do the setup before migration on the destination and the tear-down after migration on the source, thus eliminating the extra unbounded latency at switchover?
The problem is that, as it's currently designed, it relies on cached writes that can be propagated after the domain starts on the destination, so we need the hook to, at least, flush the source before the destination becomes read-write.
So qemu internally uses posix_fadvise(POSIX_FADV_DONTNEED) (see qemu commit dd577a26ff03b6829721b1ffbbf9e7c411b72378) to drop caches before migration, specifically to support migration with caching enabled. Won't that work for your storage? Alternatively, the usual approach before that, and in many cases the still-suggested option, is to bypass caching on the host side, e.g. by using cache='none' for the disks. Would either of the above work, or is the cache inherent to the storage?
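For illustration, a minimal standalone C sketch of the flush-then-drop-caches approach described above; this is not qemu's actual code, and the function name and exact flow are only assumptions:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Standalone illustration (not qemu's code): flush a file and ask the
 * kernel to drop its cached pages, similar in spirit to what the commit
 * above does via posix_fadvise(POSIX_FADV_DONTNEED). */
static int
dropImageCache(const char *path)
{
    int fd = open(path, O_RDONLY);
    int ret;

    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* make sure dirty pages reach the storage first ... */
    if (fdatasync(fd) < 0) {
        perror("fdatasync");
        close(fd);
        return -1;
    }

    /* ... then drop the page cache for the whole file */
    ret = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    if (ret != 0)
        fprintf(stderr, "posix_fadvise: %d\n", ret);

    close(fd);
    return ret == 0 ? 0 : -1;
}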
An alternative could be to have an option that waits for a resume operation before progressing, as a client-defined migration flag exposing the pre-switchover state. This way maybe we could implement it as a client feature rather than a hook?
Once again specifying what you actually want to do would be helpful.
E.g. I can suggest that you can migrate the VM as paused, which ensures that once the migration completes it will not continue execution on the destination, which could give you the chance for additional synchronisation.
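For illustration, a minimal sketch of the "migrate as paused" idea via the libvirt C API; VIR_MIGRATE_PAUSED, virDomainMigrate() and virDomainResume() are the real API, but the function name and surrounding flow are assumed for the example:

#include <libvirt/libvirt.h>
#include <stdio.h>

/* Sketch only: migrate a domain live but leave it paused on the
 * destination (VIR_MIGRATE_PAUSED), do the external synchronisation,
 * then resume it explicitly.  Error reporting is trimmed. */
static int
migratePausedAndResume(virConnectPtr src, virConnectPtr dst, const char *name)
{
    virDomainPtr dom = virDomainLookupByName(src, name);
    virDomainPtr migrated = NULL;
    int ret = -1;

    if (!dom)
        return -1;

    migrated = virDomainMigrate(dom, dst,
                                VIR_MIGRATE_LIVE | VIR_MIGRATE_PAUSED,
                                NULL, NULL, 0);
    virDomainFree(dom);
    if (!migrated)
        return -1;

    /* ... flush / hand over the shared storage here ... */

    ret = virDomainResume(migrated);   /* let the guest run on the destination */
    virDomainFree(migrated);
    return ret;
}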
For us it's important to have as little interruption as possible, so we're very keen on a live migration here.
That's the reason I think a synchronous hook, which will block the migration from switching over while the hook is executing, is not a great idea.
The hook is supposed to take on the order of milliseconds, whilst the migration of the memory is supposed to take many seconds. I believe that pausing the domain would be worse in terms of interruption. WRT migrations that don't rely on it, we could add a migration flag that enables this.
The hook, as it happens on a critical path, would need some form of positive action from the user/management app when it is about to be used, so that we can avoid calling it for VMs which don't need it. A flag would likely work, but I was thinking of actually doing two hooks. The first one would be called in one of the preparation steps, and its return value would be used as an indication that the switchover hook should be invoked. This way the hook can exist purely in hook form, and thus not require changes to the applications using said deployment.

If nothing of what I've suggested alleviates your need for the hook, feel free to contribute it. In the code, hooks are invoked via virHookCall(); you'll have to find the appropriate places to put them. 'docs/hooks.rst' documents the user-facing side of the hooks, so your new actions will need to be added there.

Feel free to post another thread if you have any further questions about the design. Our contributor guidelines are at https://www.libvirt.org/hacking.html, including the AI policy (emphasized due to your email address).
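For illustration, a hook installed as /etc/libvirt/hooks/qemu can be any executable; a hypothetical C sketch handling the proposed switchover action could look like this (the "migrate-switchover" action name is only a placeholder for the new action being discussed and does not exist in libvirt today):

#include <stdio.h>
#include <string.h>

/* Hypothetical hook binary installed as /etc/libvirt/hooks/qemu.
 * libvirt calls it as: qemu <domain> <operation> <sub-operation> <extra>
 * and feeds the domain XML on stdin.  "migrate-switchover" is a
 * placeholder operation name, not an existing libvirt action. */
int
main(int argc, char **argv)
{
    const char *domain;
    const char *op;

    if (argc < 3)
        return 0;

    domain = argv[1];
    op = argv[2];

    if (strcmp(op, "migrate-switchover") == 0) {
        /* flush / hand over the shared storage for this domain here;
         * keep it fast, the guest is not running while this executes */
        fprintf(stderr, "switchover hook for %s\n", domain);
        /* a non-zero exit status would report failure back to libvirt */
    }

    return 0;
}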