On Wed, Nov 13, 2013 at 12:15:30PM +0800, Zheng Sheng ZS Zhou wrote:
> Hi Daniel,
>
> on 2013/11/12 20:23, Daniel P. Berrange wrote:
> > On Tue, Nov 12, 2013 at 08:14:11PM +0800, Zheng Sheng ZS Zhou wrote:
> >> Hi all,
> >>
> >> Recently QEMU developers are working on a feature to allow upgrading
> >> a live QEMU instance to a new version without restarting the VM. This
> >> is implemented as live migration between the old and new QEMU process
> >> on the same host [1]. Here is the use case:
> >>
> >> 1) Guests are running QEMU release 1.6.1.
> >> 2) Admin installs QEMU release 1.6.2 via RPM or deb.
> >> 3) Admin starts a new VM using the updated QEMU binary, and asks the old
> >> QEMU process to migrate the VM to the newly started VM.
> >>
> >> I think it will be very useful to support QEMU live upgrade in libvirt.
> >> After some investigation, I found that migrating to the same host breaks
> >> the current migration code. I'd like to propose a new work flow for
> >> QEMU live migration, implementing step 3) above.
> >
> > How does it break migration code? Your patch below is effectively
> > re-implementing the multistep migration workflow, leaving out many
> > important features (seamless reconnect to SPICE clients for example)
> > which is really bad for our ongoing code support burden, so not
> > something I want to see.
> >
> > Daniel
> >
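
For concreteness, the QEMU-level mechanics behind step 3) above look
roughly like the sketch below ("-incoming" and the monitor's migrate
command are real QEMU interfaces; the socket path and the elided guest
configuration are hypothetical):

  /* Destination side: start the upgraded binary waiting for an
   * incoming local migration over a Unix socket. */
  #include <stdio.h>
  #include <unistd.h>

  int main(void)
  {
      execlp("qemu-system-x86_64", "qemu-system-x86_64",
             /* ... exactly the same guest configuration as the old
              * process, then: */
             "-incoming", "unix:/var/run/qemu-upgrade-vm1.sock",
             (char *)NULL);
      perror("execlp");
      return 1;
  }

The old QEMU would then be told "migrate
unix:/var/run/qemu-upgrade-vm1.sock" on its monitor and be torn down
once the migration completes.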

> Actually I wrote another hacking patch to investigate how we
> can re-use the existing framework to do local migration. I found
> the following problems.
>
> (1) When migrating to a different host, the destination domain uses
> the same UUID and name as the source, and this is OK. When migrating
> to localhost, the destination domain's UUID and name conflict
> with the source's. The QEMU driver maintains a hash table of
> domain objects, keyed by the UUID of the virtual machine.
> closeCallbacks is also a hash table with the domain
> UUID as key, and maybe there are other data structures using
> the UUID as key. This implies we have to use a different name and
> UUID for the destination domain. In the migration framework, the
> Begin and Prepare stages call virDomainDefCheckABIStability,
> which prevents us from using a different UUID, and they also check
> that the hostname and host UUID differ. If we want to enable
> local migration, we have to skip these checks and generate a new
> UUID and name for the destination domain. Of course we restore the
> original UUID after migration. The UUID is used by higher level
> management software to identify virtual machines. It should
> stay the same after a QEMU live upgrade.

This point is something that needs to be solved regardless of
whether we use the migration framework or re-invent it. The QEMU
driver fundamentally assumes that there is only ever one single VM
with a given UUID, and that a VM has only one process. IMHO name +
uuid must be preserved during any live upgrade process, otherwise
mgmt will get confused. This has more problems because 'name' is
used for various resources created by QEMU on disk - eg the monitor
command path. We can't have 2 QEMUs using the same name, but at the
same time that's exactly what we'd need here.
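
To make the collision concrete, here is a minimal sketch of the
destination-side check (virDomainObjListFindByUUID/ByName are the real
internal lookup helpers; the surrounding function is made up, and the
monitor path in the comment only illustrates the usual naming pattern):

  #include "domain_conf.h"
  #include "virobject.h"

  static int
  prepareWouldConflict(virDomainObjListPtr doms, virDomainDefPtr def)
  {
      virDomainObjPtr vm;

      /* The driver's domain list is keyed by UUID, so on the same
       * host this lookup finds the *source* domain ... */
      if ((vm = virDomainObjListFindByUUID(doms, def->uuid))) {
          virObjectUnlock(vm);
          return -1;   /* ... and the incoming VM is rejected */
      }

      /* The name collides too, and 'name' is baked into on-disk
       * resources, e.g. a monitor socket like
       *     /var/lib/libvirt/qemu/<name>.monitor
       * which two QEMU processes cannot share. */
      if ((vm = virDomainObjListFindByName(doms, def->name))) {
          virObjectUnlock(vm);
          return -1;
      }

      return 0;
  }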

> (2) If I understand the code correctly, libvirt uses a thread
> pool to handle RPC requests. This means local migration may
> cause a deadlock in P2P migration mode. Suppose there are some
> concurrent local migration requests and all the worker threads
> are occupied by these requests. When the source libvirtd connects to
> the destination libvirtd on the same host to negotiate the migration,
> the negotiation request is queued, but it will never be handled,
> because the original migration request from the client is waiting
> for the negotiation request to finish before it can progress,
> while the negotiation request is queued waiting
> for the original request to end. This is one of the deadlock
> risks I can think of.
> I guess in traditional migration mode, in which the client
> opens two connections to the source and destination libvirtd,
> there is also a risk of deadlock.

Yes, it sounds like you could get a deadlock even with 2 separate
libvirtds, if both of them were migrating to the other concurrently.
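
The pool-exhaustion pattern is easy to reproduce in isolation. Below is
a self-contained sketch using plain pthreads (not libvirt's actual RPC
dispatcher): every worker handles a "migration" job that queues a
"negotiation" sub-job on the same pool and blocks until it completes,
so once all workers are busy nothing is left to run the queued work:

  #include <pthread.h>
  #include <stdio.h>

  #define WORKERS 2   /* all workers busy with "migration" jobs */

  static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
  static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
  static int queued;  /* "negotiation" jobs waiting for a worker */
  static int done;    /* "negotiation" jobs that actually ran */

  static void *migrationJob(void *opaque)
  {
      (void)opaque;
      pthread_mutex_lock(&lock);
      queued++;   /* source contacts the peer libvirtd ... */
      /* ... and blocks until the negotiation job has run.  That job
       * needs a free worker, but every worker is parked here. */
      while (done < queued)
          pthread_cond_wait(&cond, &lock);
      pthread_mutex_unlock(&lock);
      return NULL;
  }

  int main(void)
  {
      pthread_t workers[WORKERS];
      int i;

      for (i = 0; i < WORKERS; i++)
          pthread_create(&workers[i], NULL, migrationJob, NULL);

      /* Never returns: every worker waits for work that only an
       * (unavailable) worker could perform. */
      for (i = 0; i < WORKERS; i++)
          pthread_join(workers[i], NULL);

      printf("unreachable\n");
      return 0;
  }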

> (3) Libvirt supports Unix domain socket transport, but
> this is only used in tunnelled migration. For native
> migration, it only supports TCP. We need to enable Unix
> domain socket transport in native migration. We already
> have a hypervisor migration URI argument in the migration
> API, but there is no support for parsing and verifying a
> "unix:/full/path" URI and passing that URI transparently
> to QEMU. We could add this to the current migration framework,
> but direct Unix socket transport looks meaningless for
> normal migration.

Actually as far as QEMU is concerned libvirt uses fd: migration
only. Again though, this point seems pretty much unrelated to
the question of how we design the APIs & structure the code.
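
If native migration did learn to accept a unix: hypervisor URI, the
caller-visible side might look like the sketch below.
virDomainMigrateToURI() and VIR_MIGRATE_LIVE are the real public API;
the unix: parsing/passthrough and the socket path are the assumed,
not-yet-existing part:

  #include <libvirt/libvirt.h>

  int liveUpgradeMigrate(virDomainPtr dom)
  {
      /* Without VIR_MIGRATE_PEER2PEER the URI is hypervisor-specific
       * and would be handed through to QEMU, whose monitor already
       * accepts "migrate unix:<path>". */
      return virDomainMigrateToURI(dom,
                                   "unix:/var/run/libvirt/qemu/upgrade-vm1.sock",
                                   VIR_MIGRATE_LIVE,
                                   NULL,   /* keep the same domain name */
                                   0);     /* no bandwidth cap */
  }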

> (4) When migration fails, the source domain is resumed, and
> this may not work if we enable page-flipping in QEMU. With
> page-flipping enabled, QEMU transfers memory page ownership
> to the destination QEMU, so the source virtual machine
> should be restarted, not resumed, when the migration fails.

IMHO that is not an acceptable approach. The whole point of doing
live upgrades in place is that you consider the VMs to be
"precious". If you were OK with VMs being killed & restarted then
we'd not bother doing any of this live upgrade pain at all.
So if we're going to support live upgrades, we *must* be able to
guarantee that they will either succeed, or the existing QEMU is
left intact. Killing the VM and restarting it is not an option on
failure.
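
Spelled out as control flow, the contract looks like this (every
function here is hypothetical; the point is purely the ordering - the
old QEMU must stay runnable until the new one has fully taken over,
which page-flipping cannot honour):

  int start_new_qemu_paused(void);   /* hypothetical helpers */
  int migrate_old_to_new(void);
  void kill_new_qemu(void);
  void resume_old_qemu(void);
  void resume_new_qemu(void);
  void kill_old_qemu(void);

  int liveUpgrade(void)
  {
      if (start_new_qemu_paused() < 0)
          return -1;                 /* old QEMU never touched */

      if (migrate_old_to_new() < 0) {
          kill_new_qemu();           /* roll back ... */
          resume_old_qemu();         /* ... needs its memory intact */
          return -1;
      }

      resume_new_qemu();             /* commit ... */
      kill_old_qemu();               /* ... only after success */
      return 0;
  }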

> So I propose a new and compact work flow dedicated to QEMU
> live upgrade. After all, it's an upgrade operation based on
> tricky migration. When developing the previous RFC patch for
> the new API, I focused on the correctness of the work flow,
> so many other things are missing. I think I can add things
> like Spice seamless migration when I submit new versions.

This way lies madness. We do not want 2 impls of the internal
migration framework.

> I would also be really happy if you could give me some advice on how
> to re-use the migration framework. Re-using the current framework
> would save a lot of effort.

I consider using the internal migration framework a mandatory
requirement here, even if the public API is different.

Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|