Re: [libvirt] [PATCH] qemu: Fix domain resume after failed migration

Tuesday, 19 June 2018

* Peter Krempa (pkrempa(a)redhat.com) wrote:
...
 On Mon, Jun 04, 2018 at 16:51:18 +0200, Jiri Denemark wrote:
 > Libvirt relies on being able to kill the destination domain and resume
 > the source one during migration until we call "cont" on the
 > destination. Unfortunately, QEMU automatically activates block devices
 > at the end of migration even when it's called with -S. This wasn't a big
 > issue in the past since the guest is not running and thus no data are
 > written to the block devices. However, when QEMU introduced its internal
 > block device locks, we can no longer resume the source domain once the
 > destination domain already activated the block devices (and thus
 > acquired all locks) unless the destination domain is killed first.
 > 
 > Since it's impossible to synchronize the destination and the source
 > libvirt daemons after a failed migration, QEMU introduced a new
 > migration capability called "late-block-activate" which ensures QEMU
 > won't activate block devices until it gets "cont". The only thing we
 > need to do is to enable this capability whenever QEMU supports it.

 I'm wondering when this new feature should _not_ be used. I did not get
 the information from the qemu commit message so I've cc'd David to shed
 some light.

 If it's desired to always pass it then I'm failing to see why they've
 added it in the first place. 

There was some worry that doing it by default would be a subtle API
change; personally I didn't really see it as a problem, but since people
were worried I made it switchable.

See:
https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg01300.html
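
For illustration only, here is a rough sketch of the "enable it whenever
QEMU supports it" logic from the patch description, expressed as raw QMP
messages. The QMP commands query-migrate-capabilities and
migrate-set-capabilities are real; the helper name build_enable_command
and the sample reply are hypothetical, and this is not how libvirt itself
implements the check.

```python
import json

# Capability name as discussed in this thread.
CAPABILITY = "late-block-activate"


def build_enable_command(query_reply):
    """Return a migrate-set-capabilities command, or None if unsupported.

    query_reply is the decoded JSON reply to query-migrate-capabilities,
    i.e. {"return": [{"capability": <name>, "state": <bool>}, ...]}.
    """
    supported = {entry["capability"] for entry in query_reply.get("return", [])}
    if CAPABILITY not in supported:
        # Older QEMU: nothing to enable, keep the previous behaviour.
        return None
    return {
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [{"capability": CAPABILITY, "state": True}]
        },
    }


# Hypothetical reply, made up for this example.
sample_reply = {
    "return": [
        {"capability": "xbzrle", "state": False},
        {"capability": "late-block-activate", "state": False},
    ]
}

print(json.dumps(build_enable_command(sample_reply)))
```

The command would have to be sent to the destination QEMU before the
incoming migration starts; a None result simply means the capability is
not offered by that QEMU.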

Dave
--
Dr. David Alan Gilbert / dgilbert(a)redhat.com / Manchester, UK
