Re: [libvirt] [PATCH v2 1/8] Added public API to enable post-copy migration

Thursday, 6 November 2014

On 01 Oct 2014, at 12:07, Jiri Denemark <jdenemar(a)redhat.com> wrote:

...
 On Wed, Oct 01, 2014 at 10:45:33 +0200, Cristian KLEIN wrote:
> On 2014-09-30 17:16, Daniel P. Berrange wrote:
>> On Tue, Sep 30, 2014 at 05:11:03PM +0200, Jiri Denemark wrote:
>>> On Tue, Sep 30, 2014 at 16:39:22 +0200, Cristian Klein wrote:
>>>> Signed-off-by: Cristian Klein <cristian.klein(a)cs.umu.se>
>>>> ---
>>>>  include/libvirt/libvirt.h.in | 1 +
>>>>  src/libvirt.c                | 7 +++++++
>>>>  2 files changed, 8 insertions(+)
>>>> 
>>>> diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
>>>> index 5217ab3..82f3aeb 100644
>>>> --- a/include/libvirt/libvirt.h.in
>>>> +++ b/include/libvirt/libvirt.h.in
>>>> @@ -1225,6 +1225,7 @@ typedef enum {
>>>>      VIR_MIGRATE_ABORT_ON_ERROR    = (1 << 12), /* abort migration on I/O errors happened during migration */
>>>>      VIR_MIGRATE_AUTO_CONVERGE     = (1 << 13), /* force convergence */
>>>>      VIR_MIGRATE_RDMA_PIN_ALL      = (1 << 14), /* RDMA memory pinning */
>>>> +    VIR_MIGRATE_POSTCOPY          = (1 << 15), /* enable (but don't start) post-copy */
>>>>  } virDomainMigrateFlags;
>>> 
>>> I still think we should add an extra flag to start post copy
>>> immediately. To address your concerns about it, I don't think it's
>>> implementing a policy in libvirt. It's for apps that want to make sure
>>> migration converges without having to spawn another thread and monitor
>>> the progress or wait for a timeout. It's a bit similar to migrating a
>>> paused domain vs. migrating a running domain and pausing it when it
>>> doesn't seem to converge.
>> 
>> Your point about spawning another thread makes me wonder if we should
>> actually look at adding a 'VIR_MIGRATE_ASYNC' method (that would require
>> P2P migration of course). If this flag were set, virDomainMigrateXXX would
>> only block for long enough to start the migration and then return.
>> 
>> Callers can use the job info API to monitor progress & success/failure.
>> 
>> Then we wouldn't have to keep adding flags like you suggest - apps can
>> just easily call the appropriate API right away with no threads needed
> 
> This would make a lot of sense. The user would call:
> 
> """
> virDomainMigrateXXX(..., VIR_MIGRATE_POSTCOPY | VIR_MIGRATE_ASYNC)
> virDomainMigrateStartPostCopy(...)
> """
> 
> Would this be seen as more cumbersome than having a dedicated 
> VIR_MIGRATE_POSTCOPY_AUTOSTART?

 The ASYNC flag Daniel suggested makes sense, so I guess you can just
 ignore my request for a special flag. Although, I don't think the ASYNC
 stuff needs to be done within this series, let's just focus on the
 post-copy stuff.

Hi Jirka,

I talked to the qemu post-copy guys (Andrea and Dave in CC). Starting post-copy
immediately is a poor choice for performance: the VM starts running on the destination
hypervisor before the read-only or kernel memory is there. Those pages then have to be
pulled on demand, which causes a lot of overhead and interruptions in the VM’s execution.

Instead, it is better to first do one pass of pre-copy and only then trigger post-copy. In
fact, I ran an experiment with a video streaming VM: starting post-copy after the first
pass of pre-copy (instead of starting post-copy immediately) reduced downtime from 3.5
seconds to under 1 second.

Given all of the above, I propose the following post-copy API in libvirt:

virDomainMigrateXXX(..., VIR_MIGRATE_ENABLE_POSTCOPY)
virDomainMigrateStartPostCopy(...) // from a different thread

This is for those who just need the post-copy mechanism and want to implement a policy
themselves.

virDomainMigrateXXX(..., VIR_MIGRATE_POSTCOPY_AFTER_PRECOPY)

This is for those who want to use post-copy without caring about any low-level details,
offering a good enough policy for most cases.
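
To make the second variant a bit more concrete, here is a rough sketch of the policy it
would encapsulate. Everything in it is hypothetical: the migration_progress struct and
both helper names are made up for illustration, and treating "bytes processed >= guest
memory size" as "first pre-copy pass finished" is only an approximation I am assuming,
not something qemu or libvirt guarantees. In real code the numbers would come from the
job info API and the switch itself would be a call to virDomainMigrateStartPostCopy(...).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of migration progress. In real code these values
 * would be read from the job info API mentioned earlier in the thread. */
typedef struct {
    unsigned long long mem_total;     /* guest memory size, in bytes */
    unsigned long long mem_processed; /* bytes transferred so far */
} migration_progress;

/* Assumed policy: the first pre-copy pass counts as finished once at least
 * mem_total bytes have been transferred. This is an approximation. */
static bool first_precopy_pass_done(const migration_progress *p)
{
    return p->mem_total > 0 && p->mem_processed >= p->mem_total;
}

/* Walk over a series of polled samples and return the index of the first
 * sample at which post-copy should be triggered, or -1 if the first
 * pre-copy pass never completes within the samples given. */
static int postcopy_switch_index(const migration_progress *samples, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (first_precopy_pass_done(&samples[i]))
            return (int)i;
    }
    return -1;
}
```

An application using the first variant would run the equivalent of this loop in its own
thread; the second variant would move it inside libvirt.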

What do you think? Would you accept patches that implement this API?

Cristian
