[libvirt] [PATCH v2 0/3] Fix migration of paused VMs

This is the final version of QEMU support for migrating paused VMs.
The patch series is the same as what I posted on Nov 25, except for
the check on the return value of virDomainGetInfo.  Patch 2 also had
some easily-solved conflicts upon rebasing.

I haven't yet implemented an error message when inactive domains are
migrated; I'll do so in a follow-up patch.

Paolo

 include/libvirt/libvirt.h.in |    1 +
 src/libvirt.c                |   15 ++++++++++++++-
 src/qemu/qemu_driver.c       |   37 ++++++++++++++++++++++++-------------
 src/xen/xend_internal.c      |    9 +++++++++
 tools/virsh.c                |    5 ++++-
 tools/virsh.pod              |    7 ++++---
 6 files changed, 56 insertions(+), 18 deletions(-)

Paolo Bonzini (3):
  fix migration of paused vms upon failure
  add virsh --suspend
  retrieve paused/running state at the beginning of migration

This makes a small change on the failed-migration path.  Up to now,
all VMs that failed non-live migration after the "stop" command
were restarted.  This must not be done when the VM was paused in the
first place.

* src/qemu/qemu_driver.c (qemudDomainMigratePerform): Do not restart
a paused VM that fails migration.  Set paused state after "stop",
reset it after failure.
---
 src/qemu/qemu_driver.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 7e60d0e..b7bc677 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -7389,7 +7389,7 @@ qemudDomainMigratePerform (virDomainPtr dom,
         goto endjob;
     }
 
-    if (!(flags & VIR_MIGRATE_LIVE)) {
+    if (!(flags & VIR_MIGRATE_LIVE) && vm->state == VIR_DOMAIN_RUNNING) {
         qemuDomainObjPrivatePtr priv = vm->privateData;
         /* Pause domain for non-live migration */
         qemuDomainObjEnterMonitorWithDriver(driver, vm);
@@ -7400,6 +7400,7 @@ qemudDomainMigratePerform (virDomainPtr dom,
         qemuDomainObjExitMonitorWithDriver(driver, vm);
         paused = 1;
 
+        vm->state = VIR_DOMAIN_PAUSED;
         event = virDomainEventNewFromObj(vm,
                                          VIR_DOMAIN_EVENT_SUSPENDED,
                                          VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED);
@@ -7447,6 +7448,7 @@ endjob:
         }
         qemuDomainObjExitMonitorWithDriver(driver, vm);
 
+        vm->state = VIR_DOMAIN_RUNNING;
         event = virDomainEventNewFromObj(vm,
                                          VIR_DOMAIN_EVENT_RESUMED,
                                          VIR_DOMAIN_EVENT_RESUMED_MIGRATED);
--
1.6.5.2

This adds a new flag, VIR_MIGRATE_PAUSED, that mandates pausing
the migrated VM before starting it.

* include/libvirt/libvirt.h.in (virDomainMigrateFlags): Add VIR_MIGRATE_PAUSED.
* src/qemu/qemu_driver.c (qemudDomainMigrateFinish2): Handle VIR_MIGRATE_PAUSED.
* tools/virsh.c (opts_migrate): Add --suspend.  (cmdMigrate): Handle it.
* tools/virsh.pod (migrate): Document it.
---
 include/libvirt/libvirt.h.in |    1 +
 src/libvirt.c                |    1 +
 src/qemu/qemu_driver.c       |   33 +++++++++++++++++++++------------
 tools/virsh.c                |    5 ++++-
 tools/virsh.pod              |    7 ++++---
 5 files changed, 31 insertions(+), 16 deletions(-)

diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
index 5bc7694..0488cbf 100644
--- a/include/libvirt/libvirt.h.in
+++ b/include/libvirt/libvirt.h.in
@@ -341,6 +341,7 @@ typedef enum {
     VIR_MIGRATE_TUNNELLED = (1 << 2), /* tunnel migration data over libvirtd connection */
     VIR_MIGRATE_PERSIST_DEST = (1 << 3), /* persist the VM on the destination */
     VIR_MIGRATE_UNDEFINE_SOURCE = (1 << 4), /* undefine the VM on the source */
+    VIR_MIGRATE_PAUSED = (1 << 5), /* pause on remote side */
 } virDomainMigrateFlags;
 
 /* Domain migration. */
diff --git a/src/libvirt.c b/src/libvirt.c
index 05e45f3..2ced604 100644
--- a/src/libvirt.c
+++ b/src/libvirt.c
@@ -3179,6 +3179,7 @@ virDomainMigrateDirect (virDomainPtr domain,
  *                            on the destination host.
  *   VIR_MIGRATE_UNDEFINE_SOURCE If the migration is successful, undefine the
  *                               domain on the source host.
+ *   VIR_MIGRATE_PAUSED      Leave the domain suspended on the remote side.
  *
  * VIR_MIGRATE_TUNNELLED requires that VIR_MIGRATE_PEER2PEER be set.
  * Applications using the VIR_MIGRATE_PEER2PEER flag will probably
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index b7bc677..4569998 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -7528,24 +7528,33 @@ qemudDomainMigrateFinish2 (virConnectPtr dconn,
         qemuDomainObjPrivatePtr priv = vm->privateData;
         dom = virGetDomain (dconn, vm->def->name, vm->def->uuid);
 
-        /* run 'cont' on the destination, which allows migration on qemu
-         * >= 0.10.6 to work properly.  This isn't strictly necessary on
-         * older qemu's, but it also doesn't hurt anything there
-         */
-        qemuDomainObjEnterMonitorWithDriver(driver, vm);
-        if (qemuMonitorStartCPUs(priv->mon, dconn) < 0) {
-            if (virGetLastError() == NULL)
-                qemudReportError(dconn, NULL, NULL, VIR_ERR_INTERNAL_ERROR,
-                                 "%s", _("resume operation failed"));
+        if (!(flags & VIR_MIGRATE_PAUSED)) {
+            /* run 'cont' on the destination, which allows migration on qemu
+             * >= 0.10.6 to work properly.  This isn't strictly necessary on
+             * older qemu's, but it also doesn't hurt anything there
+             */
+            qemuDomainObjEnterMonitorWithDriver(driver, vm);
+            if (qemuMonitorStartCPUs(priv->mon, dconn) < 0) {
+                if (virGetLastError() == NULL)
+                    qemudReportError(dconn, NULL, NULL, VIR_ERR_INTERNAL_ERROR,
+                                     "%s", _("resume operation failed"));
+                qemuDomainObjExitMonitorWithDriver(driver, vm);
+                goto endjob;
+            }
             qemuDomainObjExitMonitorWithDriver(driver, vm);
-            goto endjob;
+
+            vm->state = VIR_DOMAIN_RUNNING;
         }
-        qemuDomainObjExitMonitorWithDriver(driver, vm);
 
-        vm->state = VIR_DOMAIN_RUNNING;
         event = virDomainEventNewFromObj(vm,
                                          VIR_DOMAIN_EVENT_RESUMED,
                                          VIR_DOMAIN_EVENT_RESUMED_MIGRATED);
+        if (vm->state == VIR_DOMAIN_PAUSED) {
+            qemuDomainEventQueue(driver, event);
+            event = virDomainEventNewFromObj(vm,
+                                             VIR_DOMAIN_EVENT_SUSPENDED,
+                                             VIR_DOMAIN_EVENT_SUSPENDED_PAUSED);
+        }
         virDomainSaveStatus(dconn, driver->caps, driver->stateDir, vm);
     } else {
         qemudShutdownVMDaemon (dconn, driver, vm);
diff --git a/tools/virsh.c b/tools/virsh.c
index 9faac35..9871b4b 100644
--- a/tools/virsh.c
+++ b/tools/virsh.c
@@ -2478,6 +2478,7 @@ static const vshCmdOptDef opts_migrate[] = {
     {"tunnelled", VSH_OT_BOOL, 0, gettext_noop("tunnelled migration")},
     {"persistent", VSH_OT_BOOL, 0, gettext_noop("persist VM on destination")},
     {"undefinesource", VSH_OT_BOOL, 0, gettext_noop("undefine VM on source")},
+    {"suspend", VSH_OT_BOOL, 0, gettext_noop("do not restart the domain on the destination host")},
     {"domain", VSH_OT_DATA, VSH_OFLAG_REQ, gettext_noop("domain name, id or uuid")},
     {"desturi", VSH_OT_DATA, VSH_OFLAG_REQ, gettext_noop("connection URI of the destination host")},
     {"migrateuri", VSH_OT_DATA, 0, gettext_noop("migration URI, usually can be omitted")},
@@ -2519,10 +2520,12 @@ cmdMigrate (vshControl *ctl, const vshCmd *cmd)
     if (vshCommandOptBool (cmd, "persistent"))
         flags |= VIR_MIGRATE_PERSIST_DEST;
 
-
     if (vshCommandOptBool (cmd, "undefinesource"))
         flags |= VIR_MIGRATE_UNDEFINE_SOURCE;
 
+    if (vshCommandOptBool (cmd, "suspend"))
+        flags |= VIR_MIGRATE_PAUSED;
+
     if ((flags & VIR_MIGRATE_PEER2PEER) ||
         vshCommandOptBool (cmd, "direct")) {
         /* For peer2peer migration or direct migration we only expect one URI
diff --git a/tools/virsh.pod b/tools/virsh.pod
index 6ff0151..3830464 100644
--- a/tools/virsh.pod
+++ b/tools/virsh.pod
@@ -302,10 +302,11 @@ except that it does some error checking.
 The editor used can be supplied by the C<$EDITOR> environment
 variable, or if that is not defined defaults to C<vi>.
 
-=item B<migrate> optional I<--live> I<domain-id> I<desturi> I<migrateuri>
+=item B<migrate> optional I<--live> I<--suspend> I<domain-id> I<desturi> I<migrateuri>
 
-Migrate domain to another host.  Add --live for live migration. The I<desturi>
-is the connection URI of the destination host, and I<migrateuri> is the
+Migrate domain to another host.  Add --live for live migration; --suspend
+leaves the domain paused on the destination host. The I<desturi> is the
+connection URI of the destination host, and I<migrateuri> is the
 migration URI, which usually can be omitted.
 
 =item B<reboot> I<domain-id>
--
1.6.5.2

This patch fixes the bug where paused/running state is not
transmitted during migration.  As a result, in the QEMU driver
for example the machine was always started on the destination
end.

In order to do so, I just read the state and if it is appropriate I
set the VIR_MIGRATE_PAUSED flag.

* src/libvirt.c (virDomainMigrateVersion1, virDomainMigrateVersion2):
Automatically add VIR_MIGRATE_PAUSED when appropriate.
* src/xen/xend_internal.c (xenDaemonDomainMigratePerform): Give a nicer
error message when migration of paused domains is attempted.
---
 src/libvirt.c           |   14 +++++++++++++-
 src/xen/xend_internal.c |    9 +++++++++
 2 files changed, 22 insertions(+), 1 deletions(-)

diff --git a/src/libvirt.c b/src/libvirt.c
index 2ced604..008e322 100644
--- a/src/libvirt.c
+++ b/src/libvirt.c
@@ -2963,7 +2963,13 @@ virDomainMigrateVersion1 (virDomainPtr domain,
     virDomainPtr ddomain = NULL;
     char *uri_out = NULL;
     char *cookie = NULL;
-    int cookielen = 0;
+    int cookielen = 0, ret;
+    virDomainInfo info;
+
+    ret = virDomainGetInfo (domain, &info);
+    if (ret == 0 && info.state == VIR_DOMAIN_PAUSED) {
+        flags |= VIR_MIGRATE_PAUSED;
+    }
 
     /* Prepare the migration.
      *
@@ -3028,6 +3034,7 @@ virDomainMigrateVersion2 (virDomainPtr domain,
     char *cookie = NULL;
     char *dom_xml = NULL;
     int cookielen = 0, ret;
+    virDomainInfo info;
 
     /* Prepare the migration.
      *
@@ -3054,6 +3061,11 @@ virDomainMigrateVersion2 (virDomainPtr domain,
     if (!dom_xml)
         return NULL;
 
+    ret = virDomainGetInfo (domain, &info);
+    if (ret == 0 && info.state == VIR_DOMAIN_PAUSED) {
+        flags |= VIR_MIGRATE_PAUSED;
+    }
+
     ret = dconn->driver->domainMigratePrepare2
         (dconn, &cookie, &cookielen, uri, &uri_out, flags, dname,
          bandwidth, dom_xml);
diff --git a/src/xen/xend_internal.c b/src/xen/xend_internal.c
index 8822f44..aa1c07d 100644
--- a/src/xen/xend_internal.c
+++ b/src/xen/xend_internal.c
@@ -4440,6 +4440,15 @@ xenDaemonDomainMigratePerform (virDomainPtr domain,
     if (flags & VIR_MIGRATE_PERSIST_DEST)
         flags &= ~VIR_MIGRATE_PERSIST_DEST;
 
+    /* This is buggy in Xend, but could be supported in principle.  Give
+     * a nice error message.
+     */
+    if (flags & VIR_MIGRATE_PAUSED) {
+        virXendError (conn, VIR_ERR_NO_SUPPORT,
+                      "%s", _("xenDaemonDomainMigrate: xend cannot migrate paused domains"));
+        return -1;
+    }
+
     /* XXX we could easily do tunnelled & peer2peer migration too
        if we want to. support these... */
     if (flags != 0) {
--
1.6.5.2
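
The source-side rule in this patch (query the state once, add the flag only when the query succeeded and the guest is paused) can be written as a pure function. The sketch below is an assumption-laden model, not libvirt code: the detect_* names are hypothetical, getinfo_ret stands for the return value of virDomainGetInfo (0 on success), and the Xen check is reduced to a return code.

```c
#include <assert.h>
#include <stddef.h>

/* Same bit as VIR_MIGRATE_PAUSED in patch 2; the name is hypothetical. */
#define DETECT_MIGRATE_PAUSED (1u << 5)

enum detect_state { DETECT_RUNNING, DETECT_PAUSED };

/* Hypothetical stand-in for the state field of virDomainInfo. */
struct detect_info {
    enum detect_state state;
};

/*
 * Returns the migration flags to pass on to the prepare step.
 * getinfo_ret models the result of the state query: 0 means *info
 * is valid, anything else means it must not be inspected.
 */
unsigned int detect_add_paused_flag(int getinfo_ret,
                                    const struct detect_info *info,
                                    unsigned int flags)
{
    assert(info != NULL);

    /* Short-circuit keeps us from reading info after a failed query. */
    if (getinfo_ret == 0 && info->state == DETECT_PAUSED)
        flags |= DETECT_MIGRATE_PAUSED;
    return flags;
}

/*
 * Model of the Xen-side check: refuse the migration outright when
 * the paused flag is present.  Returns 0 if accepted, -1 if refused.
 */
int detect_xend_accepts(unsigned int flags)
{
    return (flags & DETECT_MIGRATE_PAUSED) ? -1 : 0;
}
```

A failed state query leaves the flags unchanged, so in this model migration proceeds as it did before the patch rather than failing because the state could not be read.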

On Wed, Dec 09, 2009 at 02:38:25PM +0100, Paolo Bonzini wrote:
ACK to this series

Daniel
--
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On Thu, Dec 10, 2009 at 11:38:16AM +0000, Daniel P. Berrange wrote:
> ACK to this series
Yup, pushed! Thanks, Paolo!

Daniel

--
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/