Hi Daniel,

On 2013/11/12 20:23, Daniel P. Berrange wrote:
> On Tue, Nov 12, 2013 at 08:14:11PM +0800, Zheng Sheng ZS Zhou wrote:
> > Hi all,
> >
> > Recently QEMU developers are working on a feature to allow upgrading
> > a live QEMU instance to a new version without restarting the VM. This
> > is implemented as live migration between the old and new QEMU process
> > on the same host [1]. Here is the use case:
> >
> > 1) Guests are running QEMU release 1.6.1.
> > 2) Admin installs QEMU release 1.6.2 via RPM or deb.
> > 3) Admin starts a new VM using the updated QEMU binary, and asks the old
> > QEMU process to migrate the VM to the newly started VM.
> >
> > I think it will be very useful to support QEMU live upgrade in libvirt.
> > After some investigations, I found migrating to the same host breaks
> > the current migration code. I'd like to propose a new work flow for
> > QEMU live migration. It is to implement the above step 3).
>
> How does it break migration code? Your patch below is effectively
> re-implementing the multistep migration workflow, leaving out many
> important features (seamless reconnect to SPICE clients for example)
> which is really bad for our ongoing code support burden, so not
> something I want to see.
>
> Daniel

Actually, I wrote another experimental patch to investigate how we could re-use the
existing framework for local migration. I found the following problems.

(1) When migrating to a different host, the destination domain uses the same UUID and name
as the source, and this is OK. When migrating to localhost, the destination domain's UUID
and name conflict with the source's. The QEMU driver maintains a hash table of domain
objects keyed by the UUID of the virtual machine. closeCallbacks is also a hash table
keyed by domain UUID, and there may be other data structures that use the UUID as key.
This implies we must use a different name and UUID for the destination domain. However,
during the Begin and Prepare stages the migration framework calls
virDomainDefCheckABIStability, which prevents us from using a different UUID, and it also
checks that the hostname and host UUID differ from the source's. To enable local
migration, we have to skip these checks and generate a new UUID and name for the
destination domain. Of course, we restore the original UUID after migration: the UUID is
used by higher-level management software to identify virtual machines, so it should stay
the same after a QEMU live upgrade.
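
To illustrate the conflict in (1), here is a rough Python sketch. It is not libvirt code: DomainRegistry, name_of and local_upgrade are hypothetical names, and the "-upgrade" suffix is just an assumption about how a temporary name could be chosen. It only shows why a UUID-keyed table rejects a second domain with the same UUID, and how a temporary identity could be swapped back afterwards.

```python
import uuid


class DomainRegistry:
    """Toy stand-in for a UUID-keyed table of domain objects (hypothetical)."""

    def __init__(self):
        self._by_uuid = {}

    def add(self, dom_uuid, name):
        # A second entry with the same key is exactly the local-migration conflict.
        if dom_uuid in self._by_uuid:
            raise KeyError(f"domain with UUID {dom_uuid} already exists")
        self._by_uuid[dom_uuid] = name

    def remove(self, dom_uuid):
        return self._by_uuid.pop(dom_uuid)

    def name_of(self, dom_uuid):
        return self._by_uuid[dom_uuid]

    def __len__(self):
        return len(self._by_uuid)


def local_upgrade(registry, src_uuid):
    """Register the destination under a temporary identity, then restore it."""
    name = registry.name_of(src_uuid)
    tmp_uuid = uuid.uuid4()
    registry.add(tmp_uuid, name + "-upgrade")  # temporary UUID and name
    # ... the actual migration between the two QEMU processes would run here ...
    registry.remove(src_uuid)   # the old QEMU process is gone
    registry.remove(tmp_uuid)   # drop the temporary identity
    registry.add(src_uuid, name)  # destination takes over the original identity
    return src_uuid
```

Management software that looks the guest up by UUID sees the same key before and after the upgrade; only during the migration do two entries exist.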

(2) If I understand the code correctly, libvirt uses a thread pool to handle RPC requests.
This means local migration may deadlock in P2P migration mode. Suppose there are several
concurrent local migration requests and all the worker threads are occupied by them. When
the source libvirtd connects to the destination libvirtd on the same host to negotiate the
migration, the negotiation request is queued but never handled: the original migration
request from the client is waiting for the negotiation request to finish before it can
progress, while the negotiation request is queued waiting for the original request to
release a worker thread. This is one of the deadlock risks I can think of.
I guess the traditional migration mode, in which the client opens two connections, one to
the source and one to the destination libvirtd, also carries a risk of deadlock.
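
The starvation described in (2) can be reproduced with any fixed-size worker pool. The following is only a sketch using Python's concurrent.futures, not libvirt's RPC code; negotiate, migrate_p2p and run are hypothetical stand-ins, and the timeout exists only so that the demonstration terminates instead of hanging.

```python
import concurrent.futures as cf


def negotiate():
    """Stand-in for the negotiation request sent to the same daemon."""
    return "prepared"


def migrate_p2p(pool, timeout=0.2):
    """Stand-in for a P2P migration request running on a worker thread.

    It queues the negotiation on the same pool and then waits for it,
    as happens when source and destination are the same daemon.
    """
    inner = pool.submit(negotiate)
    try:
        return inner.result(timeout=timeout)
    except cf.TimeoutError:
        # Without the timeout this wait would never return.
        return "stuck"


def run(workers, requests):
    """Issue `requests` migrations against a pool of `workers` threads."""
    with cf.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(migrate_p2p, pool) for _ in range(requests)]
        return [f.result() for f in futures]
```

With a single worker, run(1, 1) reports the request as stuck, because the only worker is the one waiting. With a spare worker, run(2, 1) completes. The same starvation can appear at any pool size once the number of concurrent local migrations reaches the number of workers.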

(3) Libvirt supports Unix domain socket transport, but only for tunnelled migration. For
native migration, it only supports TCP. We need to enable Unix domain socket transport in
native migration. We already have a hypervisor migration URI argument in the migration
API, but there is no support for parsing and verifying a "unix:/full/path" URI and passing
that URI transparently to QEMU. We can add this to the current migration framework, but
direct Unix socket transport looks meaningless for normal (cross-host) migration.
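
As a sketch of the parsing and verification step mentioned in (3), something along these lines would be needed. This is illustrative Python, not the libvirt implementation; parse_migration_uri is a hypothetical helper, and it only handles the two URI forms used in this mail ("tcp:host:port" and "unix:/full/path").

```python
def parse_migration_uri(uri):
    """Split a hypervisor migration URI into a transport tuple.

    Returns ("unix", path) or ("tcp", host, port); raises ValueError otherwise.
    """
    transport, sep, rest = uri.partition(":")
    if not sep:
        raise ValueError(f"missing transport in {uri!r}")
    if transport == "unix":
        # Only absolute socket paths are accepted, so the path can be
        # handed to QEMU unchanged.
        if not rest.startswith("/"):
            raise ValueError("unix migration URI needs an absolute socket path")
        return ("unix", rest)
    if transport == "tcp":
        # Split on the last colon so a bracketed IPv6 host stays intact.
        host, sep, port = rest.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected tcp:host:port, got {uri!r}")
        return ("tcp", host, int(port))
    raise ValueError(f"unsupported transport {transport!r}")
```

For example, parse_migration_uri("unix:/full/path") yields ("unix", "/full/path"), which the caller could pass on to QEMU as-is.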

(4) When migration fails, the source domain is resumed, and this may not work if we enable
page-flipping in QEMU. With page-flipping enabled, QEMU transfers memory page ownership to
the destination QEMU, so the source virtual machine should be restarted rather than
resumed when the migration fails.
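
The failure handling in (4) boils down to one extra decision. A minimal Python sketch, with hypothetical names (Recovery, recovery_after_failed_migration) and assuming the caller already knows whether page-flipping was enabled for this migration:

```python
from enum import Enum


class Recovery(Enum):
    RESUME = "resume"    # continue the paused source guest in place
    RESTART = "restart"  # boot the guest again from scratch


def recovery_after_failed_migration(page_flipping: bool) -> Recovery:
    """Choose how to bring the source guest back after a failed migration."""
    if page_flipping:
        # Memory pages may already belong to the destination QEMU, so the
        # source cannot simply continue from where it stopped.
        return Recovery.RESTART
    return Recovery.RESUME
```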

To summarize, I wrote up the migration call flow, marking the things I hacked to enable
local migration in the existing migration framework. It's a bit long, so I put it at the
end of this mail. I found that if I were to re-use the migration framework, I would need
to change the interfaces of a few functions, add some flags, and pass them deep into the
inner functions.

So I propose a new and compact work flow dedicated to QEMU live upgrade. After all, it's
an upgrade operation built on a migration trick. When developing the previous RFC patch
for the new API, I focused on the correctness of the work flow, so many other things are
missing. I think I can add things like SPICE seamless migration when I submit new
versions. I would also be really happy if you could give me some advice on re-using the
migration framework. Re-using the current framework could save a lot of effort.

Appendix

Call flow to enable local migration in the current migration framework

All conn->XXX() and dconn->XXX() calls are remote calls to libvirtd, which then
dispatches the request to the QEMU driver.
"domain" and "conn" mean the source domain and libvirt connection;
"ddomain" and "dconn" mean the destination domain and libvirt connection.
Things I hacked are marked with /* HACKED */.

virDomainMigrate(...)
  virDomainMigrateVersion3(...) -> virDomainMigrateVersion3Full(...)
    dom_xml = conn->domainMigrateBegin3(&cookieout...) => qemuDomainMigrateBegin3
      vm = qemuDomObjFromDomain(domain)
      qemuMigrationBegin(vm, ...)
        qemuMigrationBeginPhase(vm, ...)
          Generate migration cookie: qemuMigrationEatCookie(NULL, ...);
                                     qemuMigrationBakeCookie(...)
          if (xmlin) def = virDomainDefParseString(xmlin, ...)
            virDomainDefCheckABIStability(def, vm->def)
                /* HACKED to skip domain uuid check */
          qemuDomainDefFormatLive(def / vm->def, ...)
    cookiein = cookieout
    dconn->domainMigratePrepare3(uri, &uri_out, cookiein, &cookieout, dname,
                                 dom_xml, ...) => qemuDomainMigratePrepare3
      def = qemuMigrationPrepareDef(dom_xml, ...)
      qemuMigrationPrepareDirect(uri, uri_out, &def, ...)
        parse uri and build uri_out;
        qemuMigrationPrepareAny(def, ...)
          new_def = call migration hooks to manipulate def
          if (virDomainDefCheckABIStability(def, new_def, ...))
                /* HACKED to skip domain uuid check */
            def = new_def;
          migrateFrom = "tcp:host:port" /* HACKED to "unix:/full/path" */
          vm = virDomainObjListAdd(domains, def, ...)
          Verify cookie: qemuMigrationEatCookie(cookiein, ...)
            qemuMigrationCookieXMLParseStr(...)
              qemuMigrationCookieXMLParse(...)
                verify our hostname and host UUID are different from source
                    /* HACKED to skip host uuid & name check */
          qemuProcessStart(vm, migrateFrom, ...)
            qemu --incoming "tcp:host:port" --hda ...
          qemuMigrationBakeCookie(cookieout, ...)
    uri = uri_out
    cookiein = cookieout
    conn->domainMigratePerform3(uri, cookiein, &cookieout, ...)
        => qemuDomainMigratePerform3
      vm = qemuDomObjFromDomain(dom)
      qemuMigrationPerform(uri, ...)
        qemuMigrationPerformPhase(uri, ...)
          doNativeMigrate(vm, uri, ...)
            build qemu migration spec
                /* HACKED to replace tcp transport spec with unix socket spec */
            qemuMigrationRun(vm, spec, ...)
              qemuMigrationEatCookie(...) /* HACKED to skip host uuid & name check */
              qemuMonitorMigrateToHost / qemuMonitorMigrateToUnix /
              qemuMonitorMigrateToFd
                qemuMonitorJSONMigrate(uri, ...)
                  cmd = qemuMonitorJSONMakeCommand("migrate", "s:uri", uri, ...)
                  qemuMonitorJSONCommand(cmd)
              start tunneling between src and dst unix socket
                  /* HACKED to skip tunneling */
              qemuMigrationWaitForCompletion(...)
                Poll every 50ms for progress, check error, and allow cancellation
              qemuMigrationBakeCookie(cookieout, ...)
    cookiein = cookieout
    ddomain = dconn->domainMigrateFinish3(dname, uri, cookiein, &cookieout...)
        => qemuDomainMigrateFinish3
      vm = virDomainObjListFindByName(driver->domains, dname)
      qemuMigrationFinish(vm, ...)
        qemuMigrationEatCookie(...) /* HACKED to skip host uuid & name check */
        qemuProcessStartCPUs(...) -> ... Send QMP command "cont" to QEMU
        dom = virGetDomain(dconn, vm->def->name, vm->def->uuid)
        qemuMigrationBakeCookie(cookieout, ...)
        return dom
    cookiein = cookieout
    conn->domainMigrateConfirm3(cookiein, ...) => qemuDomainMigrateConfirm3
      vm = qemuDomObjFromDomain(domain)
      qemuMigrationConfirm(vm, cookiein, ...)
        qemuMigrationConfirmPhase(vm, cookiein, ...)
          qemuMigrationEatCookie(cookiein, ...)
              /* HACKED to skip host uuid & name check */
          qemuProcessStop(vm, ..)
    return ddomain
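
The cookie hand-off in the flow above (each phase's cookieout becomes the next phase's cookiein, alternating between source and destination) and the host check I had to skip can be modelled in a few lines. This is a simplified Python sketch, not the libvirt data structures: Cookie, bake_cookie, eat_cookie, run_migration and the allow_local flag are all hypothetical, and the real cookie carries much more than a host UUID.

```python
from dataclasses import dataclass


@dataclass
class Cookie:
    host_uuid: str  # UUID of the host that baked the cookie
    phase: str      # phase that produced it


class CookieError(Exception):
    pass


def bake_cookie(host_uuid, phase):
    """Produce the cookieout for a phase."""
    return Cookie(host_uuid, phase)


def eat_cookie(cookie, local_host_uuid, allow_local=False):
    """Verify an incoming cookie; reject one baked on this very host."""
    if cookie is None:
        return  # Begin has no incoming cookie
    if cookie.host_uuid == local_host_uuid and not allow_local:
        raise CookieError("attempt to migrate guest to the same host")


def run_migration(src_host, dst_host, allow_local=False):
    """Drive the five phases, chaining cookieout -> cookiein."""
    phases = [
        ("begin", src_host),
        ("prepare", dst_host),
        ("perform", src_host),
        ("finish", dst_host),
        ("confirm", src_host),
    ]
    cookie = None
    completed = []
    for phase, host in phases:
        eat_cookie(cookie, host, allow_local)
        completed.append(phase)
        if phase != "confirm":  # Confirm only consumes a cookie
            cookie = bake_cookie(host, phase)
    return completed
```

With two different hosts all five phases run. With src_host equal to dst_host, the Prepare phase rejects the cookie unless allow_local is set, which corresponds to the host uuid & name checks marked HACKED above.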
Thanks and best regards!
_____________________________
Zhou Zheng Sheng / 周征晟
Software Engineer
E-mail: zhshzhou(a)cn.ibm.com
Telephone: 86-10-82454397