[libvirt] [PATCH 0/6] correctly migrate paused VMs (and a bunch of bugfixes)

Currently, the QEMU driver tries to migrate paused VMs, but the effect is that the migrated virtual machine will always run on the destination. Furthermore, the state is erroneously stored as paused, so that the driver is left confused and it is not possible to re-pause the VM without doing a (useless except to make the libvirtd state consistent) "virsh resume" first. While pausing is not very used, this feature is still nice-to-have in case for example you're migrating all VMs in emergency, or (when QEMU will have asynchronous notifications) if you want to run QEMU with the werror=stop option (*). This set of patches is structured as follows: 1) patches 1 and 2 fix two bugs that were introduced by the recent reorganization. 2) patch 3 fixes a bug that happens to be in the paths touched later on; 3) patch 4 adds the possibility to migrate a VM and leave it suspended on the destination VM. I added the feature for debugging and thought I might as well submit it for inclusion. Especially if the last two patches turn out to be a no-go (at least in their current form), this patch at least provides the infrastructure and a way to test it. 4) patches 5 and 6 actually fix the bug. Unfortunately, this requires a change to the RPC protocol. I'm not sure whether it is required the libvirtd's on both sides of the migration to have the same version, or rather if this is a deal breaker. The change to the protocol is in patch 5; the bug fix is in patch 6. Please review and ack (or, for the last two, nack). Paolo (*) While the state of the migrated VM is currently fetched at the beginning of migration, the communication happens at the end of it (from Perform to Finish), so this is ready for being improved in the future once QEMU provides asynchronous notifications.

Fix migration, broken in two different ways by the QEMU monitor abstraction. Note that the QEMU console emits a "\r\n" as the line-ending. * src/qemu/qemu_monitor_text.c (qemuMonitorGetMigrationStatus): Fix "info migrate" command and its output's parsing. --- src/qemu/qemu_monitor_text.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_monitor_text.c b/src/qemu/qemu_monitor_text.c index 5054adf..66526dc 100644 --- a/src/qemu/qemu_monitor_text.c +++ b/src/qemu/qemu_monitor_text.c @@ -1072,7 +1072,7 @@ int qemuMonitorGetMigrationStatus(const virDomainObjPtr vm, *remaining = 0; *total = 0; - if (qemuMonitorCommand(vm, "info migration", &reply) < 0) { + if (qemuMonitorCommand(vm, "info migrate", &reply) < 0) { qemudReportError(NULL, NULL, NULL, VIR_ERR_OPERATION_FAILED, "%s", _("cannot query migration status")); return -1; @@ -1080,7 +1080,7 @@ int qemuMonitorGetMigrationStatus(const virDomainObjPtr vm, if ((tmp = strstr(reply, MIGRATION_PREFIX)) != NULL) { tmp += strlen(MIGRATION_PREFIX); - end = strchr(tmp, '\n'); + end = strchr(tmp, '\r'); *end = '\0'; if ((*status = qemuMonitorMigrationStatusTypeFromString(tmp)) < 0) { -- 1.6.2.5

Paolo Bonzini wrote:
Fix migration, broken in two different ways by the QEMU monitor abstraction. Note that the QEMU console emits a "\r\n" as the line-ending.
* src/qemu/qemu_monitor_text.c (qemuMonitorGetMigrationStatus): Fix "info migrate" command and its output's parsing.
Pushed, thanks. -- Chris Lalancette

Fix "make rpcgen", broken by the directory reorganization. * src/Makefile.am (rpcgen): Fix path to rpcgen_fix.pl. --- src/Makefile.am | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/Makefile.am b/src/Makefile.am index de7765c..f3d4559 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -332,8 +332,8 @@ rpcgen: $(RPCGEN) -h -o rp.h-t $(srcdir)/remote/remote_protocol.x $(RPCGEN) -c -o rp.c-t $(srcdir)/remote/remote_protocol.x if HAVE_GLIBC_RPCGEN - perl -w $(srcdir)/rpcgen_fix.pl rp.h-t > rp.h-t1 - perl -w $(srcdir)/rpcgen_fix.pl rp.c-t > rp.c-t1 + perl -w $(srcdir)/remote/rpcgen_fix.pl rp.h-t > rp.h-t1 + perl -w $(srcdir)/remote/rpcgen_fix.pl rp.c-t > rp.c-t1 (echo '#include <config.h>'; cat rp.c-t1) > rp.c-t2 chmod 0444 rp.c-t2 rp.h-t1 mv -f rp.h-t1 $(srcdir)/remote/remote_protocol.h -- 1.6.2.5

Paolo Bonzini wrote:
Fix "make rpcgen", broken by the directory reorganization.
* src/Makefile.am (rpcgen): Fix path to rpcgen_fix.pl. --- src/Makefile.am | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/Makefile.am b/src/Makefile.am index de7765c..f3d4559 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -332,8 +332,8 @@ rpcgen: $(RPCGEN) -h -o rp.h-t $(srcdir)/remote/remote_protocol.x $(RPCGEN) -c -o rp.c-t $(srcdir)/remote/remote_protocol.x if HAVE_GLIBC_RPCGEN - perl -w $(srcdir)/rpcgen_fix.pl rp.h-t > rp.h-t1 - perl -w $(srcdir)/rpcgen_fix.pl rp.c-t > rp.c-t1 + perl -w $(srcdir)/remote/rpcgen_fix.pl rp.h-t > rp.h-t1 + perl -w $(srcdir)/remote/rpcgen_fix.pl rp.c-t > rp.c-t1 (echo '#include <config.h>'; cat rp.c-t1) > rp.c-t2 chmod 0444 rp.c-t2 rp.h-t1 mv -f rp.h-t1 $(srcdir)/remote/remote_protocol.h
Pushed, thanks. -- Chris Lalancette

This makes a small change on the failed-migration path. Up to now, all VMs that failed non-live migration after the "stop" command were restarted. This must not be done when the VM was paused in the first place. * src/qemu/qemu_driver.c (qemudDomainMigratePerform): Do not restart a paused VM that fails migration. Set paused state after "stop", reset it after failure. --- src/qemu/qemu_driver.c | 4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 70e9c70..3d5ef92 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -6008,8 +6008,9 @@ qemudDomainMigratePerform (virDomainPtr dom, /* Pause domain for non-live migration */ if (qemuMonitorStopCPUs(vm) < 0) goto cleanup; - paused = 1; + paused = (vm->state == VIR_DOMAIN_RUNNING); + vm->state = VIR_DOMAIN_PAUSED; event = virDomainEventNewFromObj(vm, VIR_DOMAIN_EVENT_SUSPENDED, VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED); @@ -6085,6 +6086,7 @@ cleanup: vm->def->name); } + vm->state = VIR_DOMAIN_RUNNING; event = virDomainEventNewFromObj(vm, VIR_DOMAIN_EVENT_RESUMED, VIR_DOMAIN_EVENT_RESUMED_MIGRATED); -- 1.6.2.5

Paolo Bonzini wrote:
This makes a small change on the failed-migration path. Up to now, all VMs that failed non-live migration after the "stop" command were restarted. This must not be done when the VM was paused in the first place.
OK, yes, after thinking about this, I agree with this semantic. However, I don't really agree with the implementation below. Here's why: If your domain is already paused, then you'll be issuing a "qemuMonitorStopCPUs" command unnecessarily. That by itself isn't a huge deal; it will essentially be a no-op. But you will also generate an event of "VIR_DOMAIN_EVENT_SUSPENDED", which is wrong; you didn't really suspend the domain, an earlier command did. I think the implementation should be something more along the lines of: paused = 0; if (!(flags & VIR_MIGRATE_LIVE) && vm->state == VIR_DOMAIN_RUNNING) { stop_cpus(); generate_event(); paused = 1; vm->state = PAUSED; } Make sense? One other thing to possibly investigate is whether the "vm->state == VIR_DOMAIN_RUNNING" is sufficient. Can we (legitimately) be in state VIR_DOMAIN_NOSTATE or VIR_DOMAIN_BLOCKED, yet still be "running"? I'm not sure, but I'd want to make sure before using that as a condition. -- Chris Lalancette

If your domain is already paused, then you'll be issuing a "qemuMonitorStopCPUs" command unnecessarily. That by itself isn't a huge deal; it will essentially be a no-op. But you will also generate an event of "VIR_DOMAIN_EVENT_SUSPENDED", which is wrong; you didn't really suspend the domain, an earlier command did.
Yes, I agree it's better. Since the semantics are fine, only the implementation will vary (and will be exactly what you suggest in your email) I'll make the change when I send the rebased patch set.
One other thing to possibly investigate is whether the "vm->state == VIR_DOMAIN_RUNNING" is sufficient. Can we (legitimately) be in state VIR_DOMAIN_NOSTATE or VIR_DOMAIN_BLOCKED, yet still be "running"?
I'm actually not sure (but I think the answer is no from a quick grep) if a qemu guest can be in any state but PAUSED, RUNNING, SHUTOFF. Paolo

This adds a new flag, VIR_MIGRATE_PAUSED, that mandates pausing the migrated VM before starting it. * include/libvirt/libvirt.h.in (virDomainMigrateFlags): Add VIR_MIGRATE_PAUSED. * src/libvirt.c (virDomainMigrateVersion2): "Optimize" combination of VIR_MIGRATE_LIVE and VIR_MIGRATE_PAUSED. * src/qemu/qemu_driver.c (qemudDomainMigrateFinish2): Handle VIR_MIGRATE_PAUSED. * tools/virsh.c (opts_migrate): Add --suspend. (cmdMigrate): Handle it. * tools/virsh.pod (migrate): Document it. * docs/libvirt-api.xml: Regenerate. * docs/libvirt-refs.xml: Regenerate. --- I'm including regenerated files to aid whoever commits the patch. docs/libvirt-api.xml | 9 ++++++--- docs/libvirt-refs.xml | 14 ++++++++++++++ include/libvirt/libvirt.h.in | 1 + src/libvirt.c | 9 ++++++++- src/qemu/qemu_driver.c | 28 ++++++++++++++++++---------- tools/virsh.c | 4 ++++ tools/virsh.pod | 7 ++++--- 7 files changed, 55 insertions(+), 17 deletions(-) diff --git a/docs/libvirt-api.xml b/docs/libvirt-api.xml index 88dc246..f93980b 100644 --- a/docs/libvirt-api.xml +++ b/docs/libvirt-api.xml @@ -47,6 +47,7 @@ <exports symbol='VIR_MIGRATE_LIVE' type='enum'/> <exports symbol='VIR_DOMAIN_EVENT_STOPPED_DESTROYED' type='enum'/> <exports symbol='VIR_DOMAIN_EVENT_DEFINED_ADDED' type='enum'/> + <exports symbol='VIR_STORAGE_POOL_DELETE_NORMAL' type='enum'/> <exports symbol='VIR_SECRET_USAGE_TYPE_NONE' type='enum'/> <exports symbol='VIR_DOMAIN_EVENT_STARTED_MIGRATED' type='enum'/> <exports symbol='VIR_STREAM_EVENT_HANGUP' type='enum'/> @@ -63,7 +64,7 @@ <exports symbol='VIR_EVENT_HANDLE_WRITABLE' type='enum'/> <exports symbol='VIR_STORAGE_POOL_BUILD_NEW' type='enum'/> <exports symbol='VIR_DOMAIN_EVENT_SUSPENDED_PAUSED' type='enum'/> - <exports symbol='VIR_STORAGE_POOL_DELETE_NORMAL' type='enum'/> + <exports symbol='VIR_MIGRATE_PAUSED' type='enum'/> <exports symbol='VIR_MEMORY_PHYSICAL' type='enum'/> <exports symbol='VIR_DOMAIN_SCHED_FIELD_INT' type='enum'/> <exports symbol='VIR_DOMAIN_SCHED_FIELD_ULLONG' type='enum'/> @@ -718,7 +719,8 @@ <enum name='VIR_FROM_XML' file='virterror' value='5' type='virErrorDomain' info='Error in the XML code'/> <enum name='VIR_MEMORY_PHYSICAL' file='libvirt' value='2' type='virDomainMemoryFlags' info=' addresses are physical addresses'/> <enum name='VIR_MEMORY_VIRTUAL' file='libvirt' value='1' type='virDomainMemoryFlags' info='addresses are virtual addresses'/> - <enum name='VIR_MIGRATE_LIVE' file='libvirt' value='1' type='virDomainMigrateFlags' info=' live migration'/> + <enum name='VIR_MIGRATE_LIVE' file='libvirt' value='1' type='virDomainMigrateFlags' info='live migration'/> + <enum name='VIR_MIGRATE_PAUSED' file='libvirt' value='2' type='virDomainMigrateFlags' info=' pause on remote side'/> <enum name='VIR_SECRET_USAGE_TYPE_NONE' file='libvirt' value='0' type='virSecretUsageType'/> <enum name='VIR_SECRET_USAGE_TYPE_VOLUME' file='libvirt' value='1' type='virSecretUsageType' info=' Expect more owner types later...'/> <enum name='VIR_STORAGE_POOL_BUILDING' file='libvirt' value='1' type='virStoragePoolState' info='Initializing pool, not available'/> @@ -1595,7 +1597,8 @@ connection you should split large requests to <= 65536 bytes.]]></info> host given by dconn (a connection to the destination host). Flags may be one of more of the following: - VIR_MIGRATE_LIVE Attempt a live migration. + VIR_MIGRATE_LIVE Attempt a live migration. + VIR_MIGRATE_PAUSED Leave the domain suspended on the remote side. If a hypervisor supports renaming domains during migration, then you may set the dname parameter to the new name (otherwise diff --git a/docs/libvirt-refs.xml b/docs/libvirt-refs.xml index 08034e4..b72ecf2 100644 --- a/docs/libvirt-refs.xml +++ b/docs/libvirt-refs.xml @@ -154,6 +154,7 @@ <reference name='VIR_MEMORY_PHYSICAL' href='html/libvirt-libvirt.html#VIR_MEMORY_PHYSICAL'/> <reference name='VIR_MEMORY_VIRTUAL' href='html/libvirt-libvirt.html#VIR_MEMORY_VIRTUAL'/> <reference name='VIR_MIGRATE_LIVE' href='html/libvirt-libvirt.html#VIR_MIGRATE_LIVE'/> + <reference name='VIR_MIGRATE_PAUSED' href='html/libvirt-libvirt.html#VIR_MIGRATE_PAUSED'/> <reference name='VIR_NODEINFO_MAXCPUS' href='html/libvirt-libvirt.html#VIR_NODEINFO_MAXCPUS'/> <reference name='VIR_SECRET_USAGE_TYPE_NONE' href='html/libvirt-libvirt.html#VIR_SECRET_USAGE_TYPE_NONE'/> <reference name='VIR_SECRET_USAGE_TYPE_VOLUME' href='html/libvirt-libvirt.html#VIR_SECRET_USAGE_TYPE_VOLUME'/> @@ -659,6 +660,7 @@ <ref name='VIR_MEMORY_PHYSICAL'/> <ref name='VIR_MEMORY_VIRTUAL'/> <ref name='VIR_MIGRATE_LIVE'/> + <ref name='VIR_MIGRATE_PAUSED'/> <ref name='VIR_NODEINFO_MAXCPUS'/> <ref name='VIR_SECRET_USAGE_TYPE_NONE'/> <ref name='VIR_SECRET_USAGE_TYPE_VOLUME'/> @@ -1596,6 +1598,7 @@ <ref name='VIR_MEMORY_PHYSICAL'/> <ref name='VIR_MEMORY_VIRTUAL'/> <ref name='VIR_MIGRATE_LIVE'/> + <ref name='VIR_MIGRATE_PAUSED'/> <ref name='VIR_NODEINFO_MAXCPUS'/> <ref name='VIR_SECRET_USAGE_TYPE_NONE'/> <ref name='VIR_SECRET_USAGE_TYPE_VOLUME'/> @@ -2463,6 +2466,9 @@ <word name='Launch'> <ref name='virDomainCreateXML'/> </word> + <word name='Leave'> + <ref name='virDomainMigrate'/> + </word> <word name='Length'> <ref name='_virConnectCredential'/> </word> @@ -2912,6 +2918,9 @@ <word name='VIR_MIGRATE_LIVE'> <ref name='virDomainMigrate'/> </word> + <word name='VIR_MIGRATE_PAUSED'> + <ref name='virDomainMigrate'/> + </word> <word name='VIR_SECRET_USAGE_TYPE_VOLUME'> <ref name='virSecretGetUsageID'/> </word> @@ -6569,6 +6578,7 @@ <ref name='virDomainBlockPeek'/> <ref name='virDomainCoreDump'/> <ref name='virDomainMemoryPeek'/> + <ref name='virDomainMigrate'/> <ref name='virStoragePoolRefresh'/> </word> <word name='remove'> @@ -6954,6 +6964,9 @@ <ref name='virDomainDestroy'/> <ref name='virNetworkDestroy'/> </word> + <word name='side'> + <ref name='virDomainMigrate'/> + </word> <word name='significant'> <ref name='virDomainPinVcpu'/> </word> @@ -7239,6 +7252,7 @@ <ref name='virDomainSave'/> </word> <word name='suspended'> + <ref name='virDomainMigrate'/> <ref name='virDomainResume'/> </word> <word name='synchronization'> diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 4e63e48..0bc0c06 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -337,6 +337,7 @@ typedef virDomainInterfaceStatsStruct *virDomainInterfaceStatsPtr; /* Domain migration flags. */ typedef enum { VIR_MIGRATE_LIVE = 1, /* live migration */ + VIR_MIGRATE_PAUSED = 2, /* pause on remote side */ } virDomainMigrateFlags; /* Domain migration. */ diff --git a/src/libvirt.c b/src/libvirt.c index dacf8c4..8569fcb 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -3057,7 +3057,8 @@ virDomainMigrateVersion2 (virDomainPtr domain, * host given by dconn (a connection to the destination host). * * Flags may be one of more of the following: - * VIR_MIGRATE_LIVE Attempt a live migration. + * VIR_MIGRATE_LIVE Attempt a live migration. + * VIR_MIGRATE_PAUSED Leave the domain suspended on the remote side. * * If a hypervisor supports renaming domains during migration, * then you may set the dname parameter to the new name (otherwise @@ -3106,6 +3107,12 @@ virDomainMigrate (virDomainPtr domain, virResetLastError(); + if ((flags & VIR_MIGRATE_PAUSED)) { + /* Not doing migration live in this case reduces traffic, + and the effect is not visible. */ + flags &= ~VIR_MIGRATE_LIVE; + } + /* First checkout the source */ if (!VIR_IS_CONNECTED_DOMAIN (domain)) { virLibDomainError(NULL, VIR_ERR_INVALID_DOMAIN, __FUNCTION__); diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 3d5ef92..d0b2942 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -6131,21 +6131,29 @@ qemudDomainMigrateFinish2 (virConnectPtr dconn, if (retcode == 0) { dom = virGetDomain (dconn, vm->def->name, vm->def->uuid); - /* run 'cont' on the destination, which allows migration on qemu - * >= 0.10.6 to work properly. This isn't strictly necessary on - * older qemu's, but it also doesn't hurt anything there - */ - if (qemuMonitorStartCPUs(dconn, vm) < 0) { - if (virGetLastError() == NULL) - qemudReportError(dconn, NULL, NULL, VIR_ERR_INTERNAL_ERROR, - "%s", _("resume operation failed")); - goto cleanup; + if (!(flags & VIR_MIGRATE_PAUSED)) { + /* run 'cont' on the destination, which allows migration on qemu + * >= 0.10.6 to work properly. This isn't strictly necessary on + * older qemu's, but it also doesn't hurt anything there + */ + if (qemuMonitorStartCPUs(dconn, vm) < 0) { + if (virGetLastError() == NULL) + qemudReportError(dconn, NULL, NULL, VIR_ERR_INTERNAL_ERROR, + "%s", _("resume operation failed")); + goto cleanup; + } + vm->state = VIR_DOMAIN_RUNNING; } - vm->state = VIR_DOMAIN_RUNNING; event = virDomainEventNewFromObj(vm, VIR_DOMAIN_EVENT_RESUMED, VIR_DOMAIN_EVENT_RESUMED_MIGRATED); + if (vm->state == VIR_DOMAIN_PAUSED) { + qemuDomainEventQueue(driver, event); + event = virDomainEventNewFromObj(vm, + VIR_DOMAIN_EVENT_SUSPENDED, + VIR_DOMAIN_EVENT_SUSPENDED_PAUSED); + } virDomainSaveStatus(dconn, driver->stateDir, vm); } else { qemudShutdownVMDaemon (dconn, driver, vm); diff --git a/tools/virsh.c b/tools/virsh.c index 3482389..307807e 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -2462,6 +2462,7 @@ static const vshCmdInfo info_migrate[] = { static const vshCmdOptDef opts_migrate[] = { {"live", VSH_OT_BOOL, 0, gettext_noop("live migration")}, + {"suspend", VSH_OT_BOOL, 0, gettext_noop("do not restart the domain on the destination host")}, {"domain", VSH_OT_DATA, VSH_OFLAG_REQ, gettext_noop("domain name, id or uuid")}, {"desturi", VSH_OT_DATA, VSH_OFLAG_REQ, gettext_noop("connection URI of the destination host")}, {"migrateuri", VSH_OT_DATA, 0, gettext_noop("migration URI, usually can be omitted")}, @@ -2499,6 +2500,9 @@ cmdMigrate (vshControl *ctl, const vshCmd *cmd) if (vshCommandOptBool (cmd, "live")) flags |= VIR_MIGRATE_LIVE; + if (vshCommandOptBool (cmd, "suspend")) + flags |= VIR_MIGRATE_PAUSED; + /* Temporarily connect to the destination host. */ dconn = virConnectOpenAuth (desturi, virConnectAuthPtrDefault, 0); if (!dconn) goto done; diff --git a/tools/virsh.pod b/tools/virsh.pod index 55ec64a..34c9b34 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -303,10 +303,11 @@ except that it does some error checking. The editor used can be supplied by the C<$EDITOR> environment variable, or if that is not defined defaults to C<vi>. -=item B<migrate> optional I<--live> I<domain-id> I<desturi> I<migrateuri> +=item B<migrate> optional I<--live> I<--suspend> I<domain-id> I<desturi> I<migrateuri> -Migrate domain to another host. Add --live for live migration. The I<desturi> -is the connection URI of the destination host, and I<migrateuri> is the +Migrate domain to another host. Add --live for live migration; --suspend +leaves the domain paused on the destination host. The I<desturi> is the +connection URI of the destination host, and I<migrateuri> is the migration URI, which usually can be omitted. =item B<reboot> I<domain-id> -- 1.6.2.5

On Thu, Oct 01, 2009 at 08:18:31PM +0200, Paolo Bonzini wrote:
This adds a new flag, VIR_MIGRATE_PAUSED, that mandates pausing the migrated VM before starting it.
* include/libvirt/libvirt.h.in (virDomainMigrateFlags): Add VIR_MIGRATE_PAUSED. * src/libvirt.c (virDomainMigrateVersion2): "Optimize" combination of VIR_MIGRATE_LIVE and VIR_MIGRATE_PAUSED. * src/qemu/qemu_driver.c (qemudDomainMigrateFinish2): Handle VIR_MIGRATE_PAUSED. * tools/virsh.c (opts_migrate): Add --suspend. (cmdMigrate): Handle it. * tools/virsh.pod (migrate): Document it. * docs/libvirt-api.xml: Regenerate. * docs/libvirt-refs.xml: Regenerate. [...] diff --git a/src/libvirt.c b/src/libvirt.c index dacf8c4..8569fcb 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -3057,7 +3057,8 @@ virDomainMigrateVersion2 (virDomainPtr domain, * host given by dconn (a connection to the destination host). * * Flags may be one of more of the following: - * VIR_MIGRATE_LIVE Attempt a live migration. + * VIR_MIGRATE_LIVE Attempt a live migration. + * VIR_MIGRATE_PAUSED Leave the domain suspended on the remote side. * * If a hypervisor supports renaming domains during migration, * then you may set the dname parameter to the new name (otherwise @@ -3106,6 +3107,12 @@ virDomainMigrate (virDomainPtr domain,
virResetLastError();
Hum, what happens if the remote node has an old libvirtd which doesn't know the VIR_MIGRATE_PAUSED flag ? I would expect some kind of flag checking on the receiving end might generate an error, true ?
+ if ((flags & VIR_MIGRATE_PAUSED)) { + /* Not doing migration live in this case reduces traffic, + and the effect is not visible. */ + flags &= ~VIR_MIGRATE_LIVE; + }
Hum ... that's still a change of semantic, LIVE and PAUSED would actually still run the domain until the data are transfered on the other side (mostly). If we keep that overriding semantic then that should be in the description IMHO That sounds reasonnable to me, but I wonder what's the use case :-) Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

A return code of say 1 from Perform is currently considered an error, but it could also be passed simply to Finish. This makes the v2 protocol much more powerful. * src/remote/remote_protocol.x (remote_domain_migrate_perform_ret): New. * src/qemu/qemu_driver.c (qemudDomainMigrateFinish2): Succeed if qemudDomainMigratePerform returned a non-zero but positive value. * daemon/remote.c (remoteDispatchDomainMigratePerform): Adjust for remote_domain_migrate_perform_ret. * daemon/remote_dispatch_prototypes.h: Regenerate. * daemon/remote_dispatch_ret.h: Regenerate. * daemon/remote_dispatch_table.h: Regenerate. * src/remote/remote_driver.c (remoteDomainMigratePerform): Adjust for remote_domain_migrate_perform_ret. * src/remote/remote_protocol.c: Regenerate. * src/remote/remote_protocol.h: Regenerate. --- I'm including regenerated files to aid whoever commits the patch. daemon/remote.c | 3 ++- daemon/remote_dispatch_prototypes.h | 2 +- daemon/remote_dispatch_ret.h | 1 + daemon/remote_dispatch_table.h | 2 +- src/qemu/qemu_driver.c | 2 +- src/remote/remote_driver.c | 6 ++++-- src/remote/remote_protocol.c | 11 ++++++++++- src/remote/remote_protocol.h | 7 +++++++ src/remote/remote_protocol.x | 4 ++++ 9 files changed, 31 insertions(+), 7 deletions(-) diff --git a/daemon/remote.c b/daemon/remote.c index ba97379..67e85a4 100644 --- a/daemon/remote.c +++ b/daemon/remote.c @@ -1364,7 +1364,7 @@ remoteDispatchDomainMigratePerform (struct qemud_server *server ATTRIBUTE_UNUSED remote_message_header *hdr ATTRIBUTE_UNUSED, remote_error *rerr, remote_domain_migrate_perform_args *args, - void *ret ATTRIBUTE_UNUSED) + remote_domain_migrate_perform_ret *ret) { int r; virDomainPtr dom; @@ -1389,6 +1389,7 @@ remoteDispatchDomainMigratePerform (struct qemud_server *server ATTRIBUTE_UNUSED return -1; } + ret->code = r; return 0; } diff --git a/daemon/remote_dispatch_prototypes.h b/daemon/remote_dispatch_prototypes.h index 16e8bb0..ba22df5 100644 --- a/daemon/remote_dispatch_prototypes.h +++ b/daemon/remote_dispatch_prototypes.h @@ -281,7 +281,7 @@ static int remoteDispatchDomainMigratePerform( remote_message_header *hdr, remote_error *err, remote_domain_migrate_perform_args *args, - void *ret); + remote_domain_migrate_perform_ret *ret); static int remoteDispatchDomainMigratePrepare( struct qemud_server *server, struct qemud_client *client, diff --git a/daemon/remote_dispatch_ret.h b/daemon/remote_dispatch_ret.h index 6ced13a..8c37917 100644 --- a/daemon/remote_dispatch_ret.h +++ b/daemon/remote_dispatch_ret.h @@ -39,6 +39,7 @@ remote_get_hostname_ret val_remote_get_hostname_ret; remote_supports_feature_ret val_remote_supports_feature_ret; remote_domain_migrate_prepare_ret val_remote_domain_migrate_prepare_ret; + remote_domain_migrate_perform_ret val_remote_domain_migrate_perform_ret; remote_domain_migrate_finish_ret val_remote_domain_migrate_finish_ret; remote_domain_block_stats_ret val_remote_domain_block_stats_ret; remote_domain_interface_stats_ret val_remote_domain_interface_stats_ret; diff --git a/daemon/remote_dispatch_table.h b/daemon/remote_dispatch_table.h index 6b5df80..d2e66d4 100644 --- a/daemon/remote_dispatch_table.h +++ b/daemon/remote_dispatch_table.h @@ -315,7 +315,7 @@ { /* DomainMigratePerform => 62 */ .fn = (dispatch_fn) remoteDispatchDomainMigratePerform, .args_filter = (xdrproc_t) xdr_remote_domain_migrate_perform_args, - .ret_filter = (xdrproc_t) xdr_void, + .ret_filter = (xdrproc_t) xdr_remote_domain_migrate_perform_ret, }, { /* DomainMigrateFinish => 63 */ .fn = (dispatch_fn) remoteDispatchDomainMigrateFinish, diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d0b2942..d2429de 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -6128,7 +6128,7 @@ qemudDomainMigrateFinish2 (virConnectPtr dconn, /* Did the migration go as planned? If yes, return the domain * object, but if no, clean up the empty qemu process. */ - if (retcode == 0) { + if (retcode >= 0) { dom = virGetDomain (dconn, vm->def->name, vm->def->uuid); if (!(flags & VIR_MIGRATE_PAUSED)) { diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index 731b213..5ad833b 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -2577,6 +2577,7 @@ remoteDomainMigratePerform (virDomainPtr domain, { int rv = -1; remote_domain_migrate_perform_args args; + remote_domain_migrate_perform_ret ret; struct private_data *priv = domain->conn->privateData; remoteDriverLock(priv); @@ -2589,12 +2590,13 @@ remoteDomainMigratePerform (virDomainPtr domain, args.dname = dname == NULL ? NULL : (char **) &dname; args.resource = resource; + memset (&ret, 0, sizeof (ret)); if (call (domain->conn, priv, 0, REMOTE_PROC_DOMAIN_MIGRATE_PERFORM, (xdrproc_t) xdr_remote_domain_migrate_perform_args, (char *) &args, - (xdrproc_t) xdr_void, (char *) NULL) == -1) + (xdrproc_t) xdr_remote_domain_migrate_perform_ret, (char *) &ret) == -1) goto done; - rv = 0; + rv = ret.code; done: remoteDriverUnlock(priv); diff --git a/src/remote/remote_protocol.c b/src/remote/remote_protocol.c index 1d2d242..0fab007 100644 --- a/src/remote/remote_protocol.c +++ b/src/remote/remote_protocol.c @@ -4,7 +4,7 @@ * It was generated using rpcgen. */ -#include "remote_protocol.h" +#include "./remote/remote_protocol.h" #include "internal.h" #include <arpa/inet.h> @@ -974,6 +974,15 @@ xdr_remote_domain_migrate_perform_args (XDR *xdrs, remote_domain_migrate_perform } bool_t +xdr_remote_domain_migrate_perform_ret (XDR *xdrs, remote_domain_migrate_perform_ret *objp) +{ + + if (!xdr_int (xdrs, &objp->code)) + return FALSE; + return TRUE; +} + +bool_t xdr_remote_domain_migrate_finish_args (XDR *xdrs, remote_domain_migrate_finish_args *objp) { char **objp_cpp0 = (char **) (void *) &objp->cookie.cookie_val; diff --git a/src/remote/remote_protocol.h b/src/remote/remote_protocol.h index 64da9fa..85fb491 100644 --- a/src/remote/remote_protocol.h +++ b/src/remote/remote_protocol.h @@ -525,6 +525,11 @@ struct remote_domain_migrate_perform_args { }; typedef struct remote_domain_migrate_perform_args remote_domain_migrate_perform_args; +struct remote_domain_migrate_perform_ret { + int code; +}; +typedef struct remote_domain_migrate_perform_ret remote_domain_migrate_perform_ret; + struct remote_domain_migrate_finish_args { remote_nonnull_string dname; struct { @@ -1790,6 +1795,7 @@ extern bool_t xdr_remote_domain_dump_xml_ret (XDR *, remote_domain_dump_xml_ret extern bool_t xdr_remote_domain_migrate_prepare_args (XDR *, remote_domain_migrate_prepare_args*); extern bool_t xdr_remote_domain_migrate_prepare_ret (XDR *, remote_domain_migrate_prepare_ret*); extern bool_t xdr_remote_domain_migrate_perform_args (XDR *, remote_domain_migrate_perform_args*); +extern bool_t xdr_remote_domain_migrate_perform_ret (XDR *, remote_domain_migrate_perform_ret*); extern bool_t xdr_remote_domain_migrate_finish_args (XDR *, remote_domain_migrate_finish_args*); extern bool_t xdr_remote_domain_migrate_finish_ret (XDR *, remote_domain_migrate_finish_ret*); extern bool_t xdr_remote_domain_migrate_prepare2_args (XDR *, remote_domain_migrate_prepare2_args*); @@ -2044,6 +2050,7 @@ extern bool_t xdr_remote_domain_dump_xml_ret (); extern bool_t xdr_remote_domain_migrate_prepare_args (); extern bool_t xdr_remote_domain_migrate_prepare_ret (); extern bool_t xdr_remote_domain_migrate_perform_args (); +extern bool_t xdr_remote_domain_migrate_perform_ret (); extern bool_t xdr_remote_domain_migrate_finish_args (); extern bool_t xdr_remote_domain_migrate_finish_ret (); extern bool_t xdr_remote_domain_migrate_prepare2_args (); diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x index 6b0a784..3c7873e 100644 --- a/src/remote/remote_protocol.x +++ b/src/remote/remote_protocol.x @@ -573,6 +573,10 @@ struct remote_domain_migrate_perform_args { unsigned hyper resource; }; +struct remote_domain_migrate_perform_ret { + int code; +}; + struct remote_domain_migrate_finish_args { remote_nonnull_string dname; opaque cookie<REMOTE_MIGRATE_COOKIE_MAX>; -- 1.6.2.5

On Thu, Oct 01, 2009 at 08:18:32PM +0200, Paolo Bonzini wrote:
A return code of say 1 from Perform is currently considered an error, but it could also be passed simply to Finish. This makes the v2 protocol much more powerful.
* src/remote/remote_protocol.x (remote_domain_migrate_perform_ret): New. * src/qemu/qemu_driver.c (qemudDomainMigrateFinish2): Succeed if qemudDomainMigratePerform returned a non-zero but positive value.
* daemon/remote.c (remoteDispatchDomainMigratePerform): Adjust for remote_domain_migrate_perform_ret. * daemon/remote_dispatch_prototypes.h: Regenerate. * daemon/remote_dispatch_ret.h: Regenerate. * daemon/remote_dispatch_table.h: Regenerate.
* src/remote/remote_driver.c (remoteDomainMigratePerform): Adjust for remote_domain_migrate_perform_ret. * src/remote/remote_protocol.c: Regenerate. * src/remote/remote_protocol.h: Regenerate.
Sounds fine by me, but I wonder about the change of semantic for client server not at the same version level. Could you explain what might happen in those case ? Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

Sounds fine by me, but I wonder about the change of semantic for client server not at the same version level. Could you explain what might happen in those case ?
Here is the result of my testing (source libvirtd is always new): source virsh source method destination libvirtd result ----------------------------------------------------------------------- old qemu:// old runs (bad) old qemu:// new runs (bad) old qemu+tcp:// old runs (bad) old qemu+tcp:// new runs (bad) new qemu:// old runs (bad) new qemu:// new paused (ok) new qemu+tcp:// old runs (bad) new qemu+tcp:// new paused (ok) So you do need not just a new daemon, but also a new virsh to fix the bug. However, the good news is that in no case the migration fails, even when using an old virsh and connecting to the daemon via a remote connection. Paolo

On Wed, Oct 07, 2009 at 03:09:17PM +0200, Paolo Bonzini wrote:
Sounds fine by me, but I wonder about the change of semantic for client server not at the same version level. Could you explain what might happen in those case ?
Here is the result of my testing (source libvirtd is always new):
This is why you're missing the ABI incompatability problems. The perform() RPC call that you changed is invoked against the source libvirtd. So a mis-matched version for client vs source libvirt will get a RPC error. Regards, Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On Wed, Oct 07, 2009 at 03:09:17PM +0200, Paolo Bonzini wrote:
Sounds fine by me, but I wonder about the change of semantic for client server not at the same version level. Could you explain what might happen in those case ?
Here is the result of my testing (source libvirtd is always new):
source virsh source method destination libvirtd result ----------------------------------------------------------------------- old qemu:// old runs (bad) old qemu:// new runs (bad) old qemu+tcp:// old runs (bad) old qemu+tcp:// new runs (bad) new qemu:// old runs (bad) new qemu:// new paused (ok) new qemu+tcp:// old runs (bad) new qemu+tcp:// new paused (ok)
So you do need not just a new daemon, but also a new virsh to fix the bug. However, the good news is that in no case the migration fails, even when using an old virsh and connecting to the daemon via a remote connection.
Excellent, sounds fine then, but we need the rebased version once Dan patches are applied. thanks ! Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

Excellent, sounds fine then, but we need the rebased version once Dan patches are applied.
Of course. But it sounds like Dan Berrange doesn't fancy 5/6 very much because he wants the "old virsh, new daemon" case to work too. He also didn't like the hack I posted (which I understand even more...). For v2, I could add a driver feature saying whether the driver supports VIR_MIGRATE_PAUSED, and if it does I'll do the resume from virDomainMigrate. This will fix the QEMU driver, and will cost (almost) nothing if the driver can actually transfer the suspended/running state. This will also not require a new ABI for the MigratePerform method. Paolo

On Wed, Oct 07, 2009 at 04:54:27PM +0200, Paolo Bonzini wrote:
Excellent, sounds fine then, but we need the rebased version once Dan patches are applied.
Of course. But it sounds like Dan Berrange doesn't fancy 5/6 very much because he wants the "old virsh, new daemon" case to work too. He also didn't like the hack I posted (which I understand even more...).
For v2, I could add a driver feature saying whether the driver supports VIR_MIGRATE_PAUSED, and if it does I'll do the resume from virDomainMigrate. This will fix the QEMU driver, and will cost (almost) nothing if the driver can actually transfer the suspended/running state.
That sounds like a reasonable approach. If the client is too old, then everything will work as currently. If the server is too old, the client can report a error to caller that this capability isn't available. If the client & server are new,then everything will work Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On Wed, Oct 07, 2009 at 03:01:14PM +0200, Daniel Veillard wrote:
On Thu, Oct 01, 2009 at 08:18:32PM +0200, Paolo Bonzini wrote:
A return code of say 1 from Perform is currently considered an error, but it could also be passed simply to Finish. This makes the v2 protocol much more powerful.
* src/remote/remote_protocol.x (remote_domain_migrate_perform_ret): New. * src/qemu/qemu_driver.c (qemudDomainMigrateFinish2): Succeed if qemudDomainMigratePerform returned a non-zero but positive value.
* daemon/remote.c (remoteDispatchDomainMigratePerform): Adjust for remote_domain_migrate_perform_ret. * daemon/remote_dispatch_prototypes.h: Regenerate. * daemon/remote_dispatch_ret.h: Regenerate. * daemon/remote_dispatch_table.h: Regenerate.
* src/remote/remote_driver.c (remoteDomainMigratePerform): Adjust for remote_domain_migrate_perform_ret. * src/remote/remote_protocol.c: Regenerate. * src/remote/remote_protocol.h: Regenerate.
Sounds fine by me, but I wonder about the change of semantic for client server not at the same version level. Could you explain what might happen in those case ?
In the case of old server and new client, the client should get a RPC error when trying to re-serialize the reply, because the data packet it will be getting back from the server will be too small (ie missing the new field this patch adds). The RPC protocol should be considered ABI, and no existing methods ever changed, only new methods can be added so NACK to this patch Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

In the case of old server and new client, the client should get a RPC error when trying to re-serialize the reply, because the data packet it will be getting back from the server will be too small (ie missing the new field this patch adds).
The RPC protocol should be considered ABI, and no existing methods ever changed, only new methods can be added so NACK to this patch
I had actually prepared this patch too, but after my testing I wasn't sure if it was necessary. Would it be fine to add it to the respin? Paolo

On Wed, Oct 07, 2009 at 03:52:10PM +0200, Paolo Bonzini wrote:
In the case of old server and new client, the client should get a RPC error when trying to re-serialize the reply, because the data packet it will be getting back from the server will be too small (ie missing the new field this patch adds).
The RPC protocol should be considered ABI, and no existing methods ever changed, only new methods can be added so NACK to this patch
I had actually prepared this patch too, but after my testing I wasn't sure if it was necessary. Would it be fine to add it to the respin?
This hack means that it would be ignoring *all* deserialization errors, which IMHO is just as bad. This is ABI and we shouldn't try to change it, or workaround it because it is just going to cause pain somewhere down the line Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On 10/07/2009 03:59 PM, Daniel P. Berrange wrote:
This hack means that it would be ignoring*all* deserialization errors, which IMHO is just as bad.
These deserialization errors however do not include networking errors, because libvirt calls xdr routines only after it has extracted the xdr data from the reply packet. And the reply packet is also synthesized via xdr, so basically the only errors that would be let through would be bugs in xdr, or mismatches between daemon/dispatch.c and src/remote_driver/remote_driver.c. Both cases would have likely showed up long before the invocation of MigratePerform. (I know it is a hack though). Would a patch introducing MigratePerform2 (or 3, or 4??? not joking) be appropriate? I hoped I wouldn't have to go down that way, but given I have Chris's tunneling patch to follow the model, it shouldn't be hard. Paolo

In order to correctly pass the paused/unpaused state on the remote side, we use the newly introduced return code of qemudDomainMigratePerform. The return code does not need to be public (it is internal to the QEMU driver). A return code of 0 specifies the old behavior. * src/qemu/qemu_driver.c (qemudDomainMigratePerform): Check if the machine will have to be resumed on the destination side, pass a return value to indicate this. (qemudDomainMigrateFinish2): Conditionalize resumption on the return code from qemudDomainMigratePerform. * src/qemu/qemu_driver.h (qemuDomainMigratePerformResult): New enum. --- src/qemu/qemu_driver.c | 14 ++++++++++---- src/qemu/qemu_driver.h | 6 ++++++ 2 files changed, 16 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d2429de..5811ba2 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -37,6 +37,7 @@ #include <errno.h> #include <sys/utsname.h> #include <sys/stat.h> +#include <assert.h> #include <fcntl.h> #include <signal.h> #include <paths.h> @@ -5969,7 +5970,9 @@ cleanup: return ret; } -/* Perform is the second step, and it runs on the source host. */ +/* Perform is the second step, and it runs on the source host. It + returns a combination of actions that should (or should not) be + done by MigrateFinish2, currently QEMU_MIGRATE_NO_RESUME. */ static int qemudDomainMigratePerform (virDomainPtr dom, const char *cookie ATTRIBUTE_UNUSED, @@ -5983,7 +5986,7 @@ qemudDomainMigratePerform (virDomainPtr dom, virDomainObjPtr vm; virDomainEventPtr event = NULL; int ret = -1; - int paused = 0; + int paused = 0, need_resume = 0; int status; xmlURIPtr uribits = NULL; unsigned long long transferred, remaining, total; @@ -6004,6 +6007,7 @@ qemudDomainMigratePerform (virDomainPtr dom, goto cleanup; } + need_resume = (vm->state == VIR_DOMAIN_RUNNING); if (!(flags & VIR_MIGRATE_LIVE)) { /* Pause domain for non-live migration */ if (qemuMonitorStopCPUs(vm) < 0) @@ -6072,7 +6076,7 @@ qemudDomainMigratePerform (virDomainPtr dom, virDomainRemoveInactive(&driver->domains, vm); vm = NULL; } - ret = 0; + ret = need_resume ? 0 : QEMU_MIGRATE_NO_RESUME; cleanup: if (paused) { @@ -6131,7 +6135,9 @@ qemudDomainMigrateFinish2 (virConnectPtr dconn, if (retcode >= 0) { dom = virGetDomain (dconn, vm->def->name, vm->def->uuid); - if (!(flags & VIR_MIGRATE_PAUSED)) { + assert (vm->state == VIR_DOMAIN_PAUSED); + if (!(retcode & QEMU_MIGRATE_NO_RESUME) + && !(flags & VIR_MIGRATE_PAUSED)) { /* run 'cont' on the destination, which allows migration on qemu * >= 0.10.6 to work properly. This isn't strictly necessary on * older qemu's, but it also doesn't hurt anything there diff --git a/src/qemu/qemu_driver.h b/src/qemu/qemu_driver.h index 17b184f..74ec089 100644 --- a/src/qemu/qemu_driver.h +++ b/src/qemu/qemu_driver.h @@ -47,6 +47,12 @@ # define KVM_CAP_NR_VCPUS 9 /* returns max vcpus per vm */ #endif +/* Flags passed from virDomainMigratePerform to virDomainMigrateFinish. */ + +typedef enum { + QEMU_MIGRATE_NO_RESUME = 1 /* destination domain should stay paused */ +} qemuDomainMigratePerformResult; + int qemuRegister(void); #endif /* QEMUD_DRIVER_H */ -- 1.6.2.5

On Thu, Oct 01, 2009 at 08:18:33PM +0200, Paolo Bonzini wrote:
In order to correctly pass the paused/unpaused state on the remote side, we use the newly introduced return code of qemudDomainMigratePerform.
The return code does not need to be public (it is internal to the QEMU driver). A return code of 0 specifies the old behavior.
* src/qemu/qemu_driver.c (qemudDomainMigratePerform): Check if the machine will have to be resumed on the destination side, pass a return value to indicate this. (qemudDomainMigrateFinish2): Conditionalize resumption on the return code from qemudDomainMigratePerform. * src/qemu/qemu_driver.h (qemuDomainMigratePerformResult): New enum. --- src/qemu/qemu_driver.c | 14 ++++++++++---- src/qemu/qemu_driver.h | 6 ++++++ 2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d2429de..5811ba2 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -37,6 +37,7 @@ #include <errno.h> #include <sys/utsname.h> #include <sys/stat.h> +#include <assert.h> #include <fcntl.h> #include <signal.h> #include <paths.h> @@ -5969,7 +5970,9 @@ cleanup: return ret; }
-/* Perform is the second step, and it runs on the source host. */ +/* Perform is the second step, and it runs on the source host. It + returns a combination of actions that should (or should not) be + done by MigrateFinish2, currently QEMU_MIGRATE_NO_RESUME. */ static int qemudDomainMigratePerform (virDomainPtr dom, const char *cookie ATTRIBUTE_UNUSED, @@ -5983,7 +5986,7 @@ qemudDomainMigratePerform (virDomainPtr dom, virDomainObjPtr vm; virDomainEventPtr event = NULL; int ret = -1; - int paused = 0; + int paused = 0, need_resume = 0; int status; xmlURIPtr uribits = NULL; unsigned long long transferred, remaining, total; @@ -6004,6 +6007,7 @@ qemudDomainMigratePerform (virDomainPtr dom, goto cleanup; }
+ need_resume = (vm->state == VIR_DOMAIN_RUNNING); if (!(flags & VIR_MIGRATE_LIVE)) { /* Pause domain for non-live migration */ if (qemuMonitorStopCPUs(vm) < 0) @@ -6072,7 +6076,7 @@ qemudDomainMigratePerform (virDomainPtr dom, virDomainRemoveInactive(&driver->domains, vm); vm = NULL; } - ret = 0; + ret = need_resume ? 0 : QEMU_MIGRATE_NO_RESUME;
cleanup: if (paused) { @@ -6131,7 +6135,9 @@ qemudDomainMigrateFinish2 (virConnectPtr dconn, if (retcode >= 0) { dom = virGetDomain (dconn, vm->def->name, vm->def->uuid);
- if (!(flags & VIR_MIGRATE_PAUSED)) { + assert (vm->state == VIR_DOMAIN_PAUSED);
NACK ! an error in migration should not result in a dead daemon assert/exit forbidden even in error handling paths.
+ if (!(retcode & QEMU_MIGRATE_NO_RESUME) + && !(flags & VIR_MIGRATE_PAUSED)) { /* run 'cont' on the destination, which allows migration on qemu * >= 0.10.6 to work properly. This isn't strictly necessary on * older qemu's, but it also doesn't hurt anything there diff --git a/src/qemu/qemu_driver.h b/src/qemu/qemu_driver.h index 17b184f..74ec089 100644 --- a/src/qemu/qemu_driver.h +++ b/src/qemu/qemu_driver.h @@ -47,6 +47,12 @@ # define KVM_CAP_NR_VCPUS 9 /* returns max vcpus per vm */ #endif
+/* Flags passed from virDomainMigratePerform to virDomainMigrateFinish. */ + +typedef enum { + QEMU_MIGRATE_NO_RESUME = 1 /* destination domain should stay paused */ +} qemuDomainMigratePerformResult; + int qemuRegister(void);
#endif /* QEMUD_DRIVER_H */ -- 1.6.2.5
-- Libvir-list mailing list Libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
-- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

On 10/07/2009 03:03 PM, Daniel Veillard wrote:
NACK ! an error in migration should not result in a dead daemon assert/exit forbidden even in error handling paths.
I have no problem removing the assertion, however notice that I added it simply to match vm->state = migrateFrom ? VIR_DOMAIN_PAUSED : VIR_DOMAIN_RUNNING; in qemudStartVMDaemon. It is not an error handling path. Paolo

On Wed, Oct 07, 2009 at 03:14:18PM +0200, Paolo Bonzini wrote:
On 10/07/2009 03:03 PM, Daniel Veillard wrote:
NACK ! an error in migration should not result in a dead daemon assert/exit forbidden even in error handling paths.
I have no problem removing the assertion, however notice that I added it simply to match
vm->state = migrateFrom ? VIR_DOMAIN_PAUSED : VIR_DOMAIN_RUNNING;
in qemudStartVMDaemon. It is not an error handling path.
Okay, but still, we should not call exit even if something might be slightly broken :-) Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

Paolo Bonzini wrote:
Currently, the QEMU driver tries to migrate paused VMs, but the effect is that the migrated virtual machine will always run on the destination.
Furthermore, the state is erroneously stored as paused, so that the driver is left confused and it is not possible to re-pause the VM without doing a (useless except to make the libvirtd state consistent) "virsh resume" first.
While pausing is not very used, this feature is still nice-to-have in case for example you're migrating all VMs in emergency, or (when QEMU will have asynchronous notifications) if you want to run QEMU with the werror=stop option (*).
This set of patches is structured as follows:
1) patches 1 and 2 fix two bugs that were introduced by the recent reorganization.
2) patch 3 fixes a bug that happens to be in the paths touched later on;
3) patch 4 adds the possibility to migrate a VM and leave it suspended on the destination VM. I added the feature for debugging and thought I might as well submit it for inclusion. Especially if the last two patches turn out to be a no-go (at least in their current form), this patch at least provides the infrastructure and a way to test it.
4) patches 5 and 6 actually fix the bug. Unfortunately, this requires a change to the RPC protocol. I'm not sure whether it is required the libvirtd's on both sides of the migration to have the same version, or rather if this is a deal breaker.
The change to the protocol is in patch 5; the bug fix is in patch 6.
Please review and ack (or, for the last two, nack).
Paolo
(*) While the state of the migrated VM is currently fetched at the beginning of migration, the communication happens at the end of it (from Perform to Finish), so this is ready for being improved in the future once QEMU provides asynchronous notifications.
I haven't reviewed the actual code yet, but I like the idea. Unfortunately this patch series is going to clash horribly with the tunnelled migration code that I'm going to commit today, so I'd suggest waiting until I commit, rebasing, and then reposting. -- Chris Lalancette

I haven't reviewed the actual code yet, but I like the idea. Unfortunately this patch series is going to clash horribly with the tunnelled migration code that I'm going to commit today, so I'd suggest waiting until I commit, rebasing, and then reposting.
Yes, that's fine. In the meanwhile I'll play with mismatched virsh/daemon versions. Patches 1 and 2 are obvious and unrelated to the paths you touch, so they can be committed now. Paolo
participants (5)
-
Chris Lalancette
-
Daniel P. Berrange
-
Daniel Veillard
-
Paolo Bonzini
-
Paolo Bonzini