
On Tue, Nov 12, 2013 at 09:54:44PM -0700, Eric Blake wrote:
On 11/12/2013 05:14 AM, Zheng Sheng ZS Zhou wrote:
From 2b659584f2cbe676c843ddeaf198c9a8368ff0ff Mon Sep 17 00:00:00 2001
From: Zhou Zheng Sheng <zhshzhou@linux.vnet.ibm.com>
Date: Wed, 30 Oct 2013 15:36:49 +0800
Subject: [PATCH] RFC: Support QEMU live upgrade
This patch adds support for upgrading the QEMU version without restarting the virtual machine.
Add a new API, virDomainQemuLiveUpgrade(), and a new virsh command, qemu-live-upgrade. virDomainQemuLiveUpgrade() migrates a running VM to the same host as a new VM with a new name and a new UUID. It then shuts down the original VM and drops the new VM definition without shutting down the QEMU process of the new VM. Finally, it attaches the original VM to the new QEMU process.
First the admin installs the new QEMU package, then runs virsh qemu-live-upgrade domain_name to trigger the virDomainQemuLiveUpgrade() upgrade flow.
In general, I agree that we need a new API (in fact, I think I helped suggest why we need it as opposed to reusing the existing migration API, precisely for some of the deadlock reasons you called out in your reply to Dan). But the new API should still reuse as much of the existing migration code as possible (refactor that code to be reusable, rather than bulk copying it into completely new code). High-level review below (I didn't test whether things work or look for details like memory leaks; this is a first impression of style problems and even some major design problems).
I really don't like the idea of adding a new API for this - IMHO we need to address the deadlock scenario and fit this into our existing migration APIs. In particular, calling this "live upgrades" is wrong, as that is just one specific use case. Functionally this is "localhost migration" and so belongs in the migration APIs.
As mentioned in my other message, I believe the deadlock scenario mentioned could even occur in non-localhost migration, if two libvirtds were doing concurrent migrations in opposite directions. So this seems like something we need to look at fixing somehow. Perhaps it needs a dedicated thread pool, or a spawn-on-demand thread, just for doing the specific migration RPC call that could deadlock, so we can guarantee we can always succeed in it?
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
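The dedicated-pool idea raised in the message above can be illustrated with a toy model. This is only a sketch under stated assumptions, not libvirt code: the Daemon class, its method names, and the pool sizes are all hypothetical, and Python thread pools stand in for libvirtd's RPC workers. Each daemon has a bounded general pool for client jobs and a separate pool reserved for the incoming migration call, so two daemons migrating towards each other cannot starve each other of workers.

```python
from concurrent.futures import ThreadPoolExecutor


class Daemon:
    """Toy stand-in for a daemon with a bounded worker pool (hypothetical)."""

    def __init__(self, name, workers=1):
        self.name = name
        self.peer = None
        # General pool: runs client-initiated jobs such as "start a migration".
        self.general = ThreadPoolExecutor(max_workers=workers)
        # Dedicated pool: reserved for the incoming migration call only, so
        # jobs occupying the general pool can never starve it.
        self.migration_rpc = ThreadPoolExecutor(max_workers=workers)

    def prepare_incoming(self, vm):
        # Destination-side step of a migration (name is an assumption).
        return f"{self.name} ready for {vm}"

    def migrate_out(self, vm):
        # Runs on a general worker and blocks until the peer answers.
        # If this were submitted to self.peer.general instead, two daemons
        # migrating towards each other with all general workers busy would
        # each wait on the other forever.
        reply = self.peer.migration_rpc.submit(self.peer.prepare_incoming, vm)
        return reply.result(timeout=5)

    def start_migration(self, vm):
        return self.general.submit(self.migrate_out, vm)

    def shutdown(self):
        self.general.shutdown()
        self.migration_rpc.shutdown()


def demo():
    """Run two migrations in opposite directions and return both replies."""
    a, b = Daemon("hostA"), Daemon("hostB")
    a.peer, b.peer = b, a
    fa = a.start_migration("vm1")
    fb = b.start_migration("vm2")
    results = (fa.result(timeout=5), fb.result(timeout=5))
    a.shutdown()
    b.shutdown()
    return results
```

With one general worker per daemon, both workers can be blocked in migrate_out() at the same time, yet each incoming call is still served because it runs on the reserved pool. A spawn-on-demand thread for the same call would give the same guarantee without keeping idle workers around.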