[libvirt] [PATCH] Introduce yet another migration version in API.

Migration just seems to go from bad to worse. We already had to introduce a
second migration protocol when adding the QEMU driver, since the one from Xen
was insufficiently flexible to cope with passing the data the QEMU driver
required. It turns out that this protocol still has some flaws that we need
to address. The current sequence is

 * Src: DumpXML
        - Generate XML to pass to dst
 * Dst: Prepare
        - Get ready to accept incoming VM
        - Generate optional cookie to pass to src
 * Src: Perform
        - Start migration and wait for send completion
        - Kill off VM if successful, resume if failed
 * Dst: Finish
        - Wait for recv completion and check status
        - Kill off VM if unsuccessful

The problems with this are:

 - Since the first step is a generic 'DumpXML' call, we can't add in other
   migration specific data, e.g. we can't include any VM lease data from
   lock manager plugins
 - Since the first step is a generic 'DumpXML' call, we can't emit any
   'migration begin' event on the source, or have any hook that runs right
   at the start of the process
 - Since there is no final step on the source, if the Finish method fails to
   receive all migration data & has to kill the VM, then there's no way to
   resume the original VM on the source

This patch attempts to introduce a version 3 that uses the improved 5 step
sequence

 * Src: Begin
        - Generate XML to pass to dst
        - Generate optional cookie to pass to dst
 * Dst: Prepare
        - Get ready to accept incoming VM
        - Generate optional cookie to pass to src
 * Src: Perform
        - Start migration and wait for send completion
        - Generate optional cookie to pass to dst
 * Dst: Finish
        - Wait for recv completion and check status
        - Kill off VM if failed, resume if success
        - Generate optional cookie to pass to src
 * Src: Confirm
        - Kill off VM if success, resume if failed

The API is designed to allow both input and output cookies in all methods
where applicable. This lets us pass around arbitrary extra driver specific
data between src & dst during migration.
Combined with the extra 'Begin' method this lets us pass lease information
from source to dst at the start of migration.

Moving the killing of the source VM out of Perform and into Confirm means we
can now recover if the dst host can't successfully Finish receiving migration
data.
---
 src/driver.h           |   64 +++++++++++++++++++
 src/libvirt.c          |  164 ++++++++++++++++++++++++++++++++++++++++++++++++
 src/libvirt_internal.h |    7 ++
 3 files changed, 235 insertions(+), 0 deletions(-)

diff --git a/src/driver.h b/src/driver.h
index e3a232d..b2514ed 100644
--- a/src/driver.h
+++ b/src/driver.h
@@ -486,6 +486,65 @@ typedef int
                          virStreamPtr st,
                          unsigned int flags);

+typedef char *
+    (*virDrvDomainMigrateBegin3)
+                    (virConnectPtr dconn,
+                     char **cookiesrc,
+                     int *cookiesrclen,
+                     const char *uri,
+                     unsigned long flags,
+                     const char *dname,
+                     unsigned long resource);
+
+typedef int
+    (*virDrvDomainMigratePrepare3)
+                    (virConnectPtr dconn,
+                     const char *cookiesrc,
+                     int cookiesrclen,
+                     char **cookiedst,
+                     int *cookiedstlen,
+                     const char *uri_in,
+                     char **uri_out,
+                     unsigned long flags,
+                     const char *dname,
+                     unsigned long resource,
+                     const char *dom_xml);
+
+typedef int
+    (*virDrvDomainMigratePerform3)
+                    (virDomainPtr domain,
+                     char **cookiesrc,
+                     int *cookiesrclen,
+                     const char *cookiedst,
+                     int cookiedstlen,
+                     const char *uri,
+                     unsigned long flags,
+                     const char *dname,
+                     unsigned long resource);
+
+typedef virDomainPtr
+    (*virDrvDomainMigrateFinish3)
+                    (virConnectPtr dconn,
+                     const char *dname,
+                     const char *cookiesrc,
+                     int cookiesrclen,
+                     char **cookiedst,
+                     int *cookiedstlen,
+                     const char *uri,
+                     unsigned long flags,
+                     int retcode);
+
+typedef int
+    (*virDrvDomainMigrateConfirm3)
+                    (virConnectPtr dconn,
+                     const char *dname,
+                     const char *cookiesrc,
+                     int cookiesrclen,
+                     const char *cookiedst,
+                     int cookiedstlen,
+                     const char *uri,
+                     unsigned long flags,
+                     int retcode);

 /**
  * _virDriver:
@@ -604,6 +663,11 @@ struct _virDriver {
     virDrvDomainSetMemoryParameters domainSetMemoryParameters;
     virDrvDomainGetMemoryParameters domainGetMemoryParameters;
     virDrvDomainOpenConsole domainOpenConsole;
+    virDrvDomainMigrateBegin3 domainMigrateBegin3;
+    virDrvDomainMigratePrepare3 domainMigratePrepare3;
+    virDrvDomainMigratePerform3 domainMigratePerform3;
+    virDrvDomainMigrateFinish3 domainMigrateFinish3;
+    virDrvDomainMigrateConfirm3 domainMigrateConfirm3;
 };

 typedef int
diff --git a/src/libvirt.c b/src/libvirt.c
index e24fb9e..04a7cd0 100644
--- a/src/libvirt.c
+++ b/src/libvirt.c
@@ -3342,6 +3342,22 @@ error:
 }

+/*
+ * Sequence v1:
+ *
+ *  Dst: Prepare
+ *        - Get ready to accept incoming VM
+ *        - Generate optional cookie to pass to src
+ *
+ *  Src: Perform
+ *        - Start migration and wait for send completion
+ *        - Kill off VM if successful, resume if failed
+ *
+ *  Dst: Finish
+ *        - Wait for recv completion and check status
+ *        - Kill off VM if unsuccessful
+ *
+ */
 static virDomainPtr
 virDomainMigrateVersion1 (virDomainPtr domain,
                           virConnectPtr dconn,
@@ -3411,6 +3427,25 @@ virDomainMigrateVersion1 (virDomainPtr domain,
     return ddomain;
 }

+/*
+ * Sequence v2:
+ *
+ *  Src: DumpXML
+ *        - Generate XML to pass to dst
+ *
+ *  Dst: Prepare
+ *        - Get ready to accept incoming VM
+ *        - Generate optional cookie to pass to src
+ *
+ *  Src: Perform
+ *        - Start migration and wait for send completion
+ *        - Kill off VM if successful, resume if failed
+ *
+ *  Dst: Finish
+ *        - Wait for recv completion and check status
+ *        - Kill off VM if unsuccessful
+ *
+ */
 static virDomainPtr
 virDomainMigrateVersion2 (virDomainPtr domain,
                           virConnectPtr dconn,
@@ -3505,6 +3540,130 @@ virDomainMigrateVersion2 (virDomainPtr domain,
 }

+/*
+ * Sequence v3:
+ *
+ *  Src: Begin
+ *        - Generate XML to pass to dst
+ *        - Generate optional cookie to pass to dst
+ *
+ *  Dst: Prepare
+ *        - Get ready to accept incoming VM
+ *        - Generate optional cookie to pass to src
+ *
+ *  Src: Perform
+ *        - Start migration and wait for send completion
+ *        - Generate optional cookie to pass to dst
+ *
+ *  Dst: Finish
+ *        - Wait for recv completion and check status
+ *        - Kill off VM if failed, resume if success
+ *        - Generate optional cookie to pass to src
+ *
+ *  Src: Confirm
+ *        - Kill off VM if success, resume if failed
+ *
+ */
+static virDomainPtr
+virDomainMigrateVersion3 (virDomainPtr domain,
+                          virConnectPtr dconn,
+                          unsigned long flags,
+                          const char *dname,
+                          const char *uri,
+                          unsigned long bandwidth)
+{
+    virDomainPtr ddomain = NULL;
+    char *uri_out = NULL;
+    char *cookiesrc = NULL;
+    char *cookiedst = NULL;
+    char *dom_xml = NULL;
+    int cookiesrclen = 0;
+    int cookiedstlen = 0;
+    int ret;
+    virDomainInfo info;
+    virErrorPtr orig_err = NULL;
+
+    if (!domain->conn->driver->domainMigrateBegin3) {
+        virLibConnError (domain->conn, VIR_ERR_INTERNAL_ERROR, __FUNCTION__);
+        virDispatchError(domain->conn);
+        return NULL;
+    }
+
+    dom_xml = domain->conn->driver->domainMigrateBegin3
+        (domain->conn, &cookiesrc, &cookiesrclen, uri, flags, dname,
+         bandwidth);
+    if (!dom_xml)
+        goto done;
+
+    ret = virDomainGetInfo (domain, &info);
+    if (ret == 0 && info.state == VIR_DOMAIN_PAUSED) {
+        flags |= VIR_MIGRATE_PAUSED;
+    }
+
+    ret = dconn->driver->domainMigratePrepare3
+        (dconn, cookiesrc, cookiesrclen, &cookiedst, &cookiedstlen,
+         uri, &uri_out, flags, dname, bandwidth, dom_xml);
+    VIR_FREE (dom_xml);
+    if (ret == -1)
+        goto done;
+
+    if (uri == NULL && uri_out == NULL) {
+        virLibConnError (domain->conn, VIR_ERR_INTERNAL_ERROR,
+                         _("domainMigratePrepare2 did not set uri"));
+        virDispatchError(domain->conn);
+        goto done;
+    }
+    if (uri_out)
+        uri = uri_out; /* Did domainMigratePrepare2 change URI? */
+    assert (uri != NULL);
+
+    /* Perform the migration. The driver isn't supposed to return
+     * until the migration is complete. The src VM should remain
+     * running, but in paused state until the destination can
+     * confirm migration completion.
+     */
+    ret = domain->conn->driver->domainMigratePerform3
+        (domain, &cookiesrc, &cookiesrclen, cookiedst, cookiedstlen,
+         uri, flags, dname, bandwidth);
+
+    /* Perform failed. Make sure Finish doesn't overwrite the error */
+    if (ret < 0)
+        orig_err = virSaveLastError();
+
+    /*
+     * The status code from the source is passed to the destination.
+     * The dest can clean up if the source indicated it failed to
+     * send all migration data. Returns NULL for ddomain if
+     * the dest was unable to complete migration.
+     */
+    dname = dname ? dname : domain->name;
+    ddomain = dconn->driver->domainMigrateFinish3
+        (dconn, dname, cookiesrc, cookiesrclen, &cookiedst, &cookiedstlen,
+         uri, flags, ret);
+
+    /*
+     * If Perform3 indicated an error, or if NULL was returned
+     * from Finish3, then the status code tells the source
+     * to resume CPUs on the original VM.
+     */
+    ret = domain->conn->driver->domainMigrateConfirm3
+        (domain->conn, dname, cookiesrc, cookiesrclen, cookiedst, cookiedstlen,
+         uri, flags, ret >= 0 && domain ? 0 : -1);
+    /* XXX is there anything we can do if Confirm3 returns -1.
+     * Probably nothing beyond praying
+     */
+
+ done:
+    if (orig_err) {
+        virSetError(orig_err);
+        virFreeError(orig_err);
+    }
+    VIR_FREE(uri_out);
+    VIR_FREE(cookiesrc);
+    VIR_FREE(cookiedst);
+    return ddomain;
+}
+
+
 /*
  * This is sort of a migration v3
  *
@@ -3716,6 +3875,11 @@ virDomainMigrate (virDomainPtr domain,
                  VIR_DRV_SUPPORTS_FEATURE(dconn->driver, dconn,
                                           VIR_DRV_FEATURE_MIGRATION_V2))
             ddomain = virDomainMigrateVersion2(domain, dconn, flags, dname, uri, bandwidth);
+        else if (VIR_DRV_SUPPORTS_FEATURE(domain->conn->driver, domain->conn,
+                                          VIR_DRV_FEATURE_MIGRATION_V3) &&
+                 VIR_DRV_SUPPORTS_FEATURE(dconn->driver, dconn,
+                                          VIR_DRV_FEATURE_MIGRATION_V3))
+            ddomain = virDomainMigrateVersion3(domain, dconn, flags, dname, uri, bandwidth);
         else {
             /* This driver does not support any migration method */
             virLibConnError(domain->conn, VIR_ERR_NO_SUPPORT, __FUNCTION__);
diff --git a/src/libvirt_internal.h b/src/libvirt_internal.h
index 1c4fa4f..d781d6b 100644
--- a/src/libvirt_internal.h
+++ b/src/libvirt_internal.h
@@ -66,6 +66,13 @@ enum {
      * perform step is used.
      */
     VIR_DRV_FEATURE_MIGRATION_DIRECT = 5,
+
+    /*
+     * Driver supports V3-style virDomainMigrate, ie domainMigrateBegin3/
+     * domainMigratePrepare3/domainMigratePerform3/domainMigrateFinish3/
+     * domainMigrateConfirm3.
+     */
+    VIR_DRV_FEATURE_MIGRATION_V3 = 6,
 };
-- 
1.7.2.3

On Wed, Nov 03, 2010 at 12:17:22AM +1100, Justin Clift wrote:
On 11/03/2010 12:08 AM, Daniel P. Berrange wrote:
Migration just seems to go from bad to worse.
Ahem. "Evolving" and getting more resilient aren't such a bad thing, are they? :)
It has a very big cost in code complexity and in the size of the testing
matrix needed to cover all possible migration API scenarios, let alone
considering different QEMU/host OS versions.

Daniel
-- 
|: Red Hat, Engineering, London    -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org        -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On Tue, Nov 02, 2010 at 01:08:53PM +0000, Daniel P. Berrange wrote:
This patch attempts to introduce a version 3 that uses the improved 5 step sequence
* Src: Begin - Generate XML to pass to dst - Generate optional cookie to pass to dst
* Dst: Prepare - Get ready to accept incoming VM - Generate optional cookie to pass to src
* Src: Perform - Start migration and wait for send completion - Generate optional cookie to pass to dst
* Dst: Finish - Wait for recv completion and check status - Kill off VM if failed, resume if success - Generate optional cookie to pass to src
* Src: Confirm - Kill off VM if success, resume if failed
What happens if no confirmation is received (destination silently died)? Will
there be a timeout (bad idea), or at least a means by which higher level
management can cancel migration? Either way, I did not get the suggestion to
contact divinity.
+    /*
+     * If Perform3 indicated an error, or if NULL was returned
+     * from Finish3, then the status code tells the source
+     * to resume CPUs on the original VM.
+     */
+    ret = domain->conn->driver->domainMigrateConfirm3
+        (domain->conn, dname, cookiesrc, cookiesrclen, cookiedst, cookiedstlen,
+         uri, flags, ret >= 0 && domain ? 0 : -1);
+    /* XXX is there anything we can do if Confirm3 returns -1.
+     * Probably nothing beyond praying
+     */

On Wed, Nov 03, 2010 at 04:51:18PM +0200, Dan Kenigsberg wrote:
On Tue, Nov 02, 2010 at 01:08:53PM +0000, Daniel P. Berrange wrote:
This patch attempts to introduce a version 3 that uses the improved 5 step sequence
* Src: Begin - Generate XML to pass to dst - Generate optional cookie to pass to dst
* Dst: Prepare - Get ready to accept incoming VM - Generate optional cookie to pass to src
* Src: Perform - Start migration and wait for send completion - Generate optional cookie to pass to dst
* Dst: Finish - Wait for recv completion and check status - Kill off VM if failed, resume if success - Generate optional cookie to pass to src
* Src: Confirm - Kill off VM if success, resume if failed
What happens if no confirmation is received (destination silently died)? Will
there be a timeout (bad idea), or at least a means by which higher level
management can cancel migration?
Either way, I did not get the suggestion to contact divinity.
In a protocol of this kind, you always have the problem of what to do if you
get an error in the very last step. You can't keep adding further steps, so
at some point you have to decide that you've done enough that the
consequences of failure are not critical.

If the 'Confirm' step fails, this means we've either failed to kill off the
paused VM, or failed to unpause the VM CPUs. Neither of those scenarios is a
showstopper. It just means a VM is left around; the app can try a destroy to
take care of it later. The lock manager plugins should ensure that the VM
can't do anything bad.

Daniel

On 11/02/2010 07:08 AM, Daniel P. Berrange wrote:
+static virDomainPtr
+virDomainMigrateVersion3 (virDomainPtr domain,
+                          virConnectPtr dconn,
+                          unsigned long flags,
+                          const char *dname,
+                          const char *uri,
+                          unsigned long bandwidth)
+{
+    virDomainPtr ddomain = NULL;
+    char *uri_out = NULL;
+    char *cookiesrc = NULL;
+    char *cookiedst = NULL;
+    char *dom_xml = NULL;
+    int cookiesrclen = 0;
+    int cookiedstlen = 0;
+    int ret;
+    virDomainInfo info;
+    virErrorPtr orig_err = NULL;
+
+    if (!domain->conn->driver->domainMigrateBegin3) {
+        virLibConnError (domain->conn, VIR_ERR_INTERNAL_ERROR, __FUNCTION__);
+        virDispatchError(domain->conn);
+        return NULL;
+    }
Do we ever have a step (either here, or when a driver is registered) that ensures that if domainMigrateBegin3 exists, then all 5 domainMigrate*3 functions exist? Otherwise, a dlopen()'d driver could crash libvirt by maliciously supplying one but not all 5 functions.
+
+    if (uri == NULL && uri_out == NULL) {
+        virLibConnError (domain->conn, VIR_ERR_INTERNAL_ERROR,
+                         _("domainMigratePrepare2 did not set uri"));
+        virDispatchError(domain->conn);
+        goto done;
+    }
+    if (uri_out)
+        uri = uri_out; /* Did domainMigratePrepare2 change URI? */
+    assert (uri != NULL);
Do we really want assert() here? If someone compiles with NDEBUG, would that let us fall through and dereference NULL?
@@ -3716,6 +3875,11 @@ virDomainMigrate (virDomainPtr domain,
                  VIR_DRV_SUPPORTS_FEATURE(dconn->driver, dconn,
                                           VIR_DRV_FEATURE_MIGRATION_V2))
             ddomain = virDomainMigrateVersion2(domain, dconn, flags, dname, uri, bandwidth);
+        else if (VIR_DRV_SUPPORTS_FEATURE(domain->conn->driver, domain->conn,
+                                          VIR_DRV_FEATURE_MIGRATION_V3) &&
+                 VIR_DRV_SUPPORTS_FEATURE(dconn->driver, dconn,
+                                          VIR_DRV_FEATURE_MIGRATION_V3))
+            ddomain = virDomainMigrateVersion3(domain, dconn, flags, dname, uri, bandwidth);
Should migrateVersion3 be attempted with a higher priority than version2? As
written, this attempts version2 first, so you'll never get the benefits of
version3.

-- 
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

On 11/02/10 - 01:08:53PM, Daniel P. Berrange wrote:
This patch attempts to introduce a version 3 that uses the improved 5 step sequence
* Src: Begin - Generate XML to pass to dst - Generate optional cookie to pass to dst
* Dst: Prepare - Get ready to accept incoming VM - Generate optional cookie to pass to src
* Src: Perform - Start migration and wait for send completion - Generate optional cookie to pass to dst
* Dst: Finish - Wait for recv completion and check status - Kill off VM if failed, resume if success - Generate optional cookie to pass to src
* Src: Confirm - Kill off VM if success, resume if failed
The API is designed to allow both input and output cookies in all methods where applicable. This lets us pass around arbitrary extra driver specific data between src & dst during migration. Combined with the extra 'Begin' method this lets us pass lease information from source to dst at the start of migration
Moving the killing of the source VM out of Perform and into Confirm, means we can now recover if the dst host can't successfully Finish receiving migration data.
Yeah, this seems like a pretty good sequence. As mentioned below, if the last
step fails there isn't a whole lot we can do, except make sure to report it
and let management kill it off. One comment inline...

<snip>
diff --git a/src/libvirt.c b/src/libvirt.c
index e24fb9e..04a7cd0 100644
--- a/src/libvirt.c
+++ b/src/libvirt.c
...
+/*
+ * Sequence v3:
+ *
+ *  Src: Begin
+ *        - Generate XML to pass to dst
+ *        - Generate optional cookie to pass to dst
+ *
+ *  Dst: Prepare
+ *        - Get ready to accept incoming VM
+ *        - Generate optional cookie to pass to src
+ *
+ *  Src: Perform
+ *        - Start migration and wait for send completion
+ *        - Generate optional cookie to pass to dst
+ *
+ *  Dst: Finish
+ *        - Wait for recv completion and check status
+ *        - Kill off VM if failed, resume if success
+ *        - Generate optional cookie to pass to src
+ *
+ *  Src: Confirm
+ *        - Kill off VM if success, resume if failed
+ *
+ */
+static virDomainPtr
+virDomainMigrateVersion3 (virDomainPtr domain,
+                          virConnectPtr dconn,
+                          unsigned long flags,
+                          const char *dname,
+                          const char *uri,
+                          unsigned long bandwidth)
+{
+    virDomainPtr ddomain = NULL;
+    char *uri_out = NULL;
+    char *cookiesrc = NULL;
+    char *cookiedst = NULL;
+    char *dom_xml = NULL;
+    int cookiesrclen = 0;
+    int cookiedstlen = 0;
+    int ret;
+    virDomainInfo info;
+    virErrorPtr orig_err = NULL;
+
+    if (!domain->conn->driver->domainMigrateBegin3) {
+        virLibConnError (domain->conn, VIR_ERR_INTERNAL_ERROR, __FUNCTION__);
+        virDispatchError(domain->conn);
+        return NULL;
+    }
+
+    dom_xml = domain->conn->driver->domainMigrateBegin3
+        (domain->conn, &cookiesrc, &cookiesrclen, uri, flags, dname,
+         bandwidth);
+    if (!dom_xml)
+        goto done;
+
+    ret = virDomainGetInfo (domain, &info);
+    if (ret == 0 && info.state == VIR_DOMAIN_PAUSED) {
+        flags |= VIR_MIGRATE_PAUSED;
+    }
+
+    ret = dconn->driver->domainMigratePrepare3
+        (dconn, cookiesrc, cookiesrclen, &cookiedst, &cookiedstlen,
+         uri, &uri_out, flags, dname, bandwidth, dom_xml);
+    VIR_FREE (dom_xml);
+    if (ret == -1)
+        goto done;
+
+    if (uri == NULL && uri_out == NULL) {
+        virLibConnError (domain->conn, VIR_ERR_INTERNAL_ERROR,
+                         _("domainMigratePrepare2 did not set uri"));
+        virDispatchError(domain->conn);
+        goto done;
+    }
+    if (uri_out)
+        uri = uri_out; /* Did domainMigratePrepare2 change URI? */
+    assert (uri != NULL);
I think we should get rid of this whole "uri_out" concept while we are adding
a new protocol. Basically, it allows the destination to "tell" the source
where to migrate to, but there is no way to really discover that. What you
really need are:

1) The ability to let the user specify which network should be used for
   migration.
2) The ability to determine the "default" network, in the absence of 1).

We already get 1) from the passed-in uri, if it is specified. We can easily
figure out 2) from the source, since we know we can reach the destination via
dconn. So if nothing is specified for 1), we can just default to the same
network that the libvirt transport is using.

This will eliminate a lot of the headaches we have with migration protocol
version 2 with trying to determine the hostname on the destination side.

-- 
Chris Lalancette

On Tue, Nov 09, 2010 at 01:33:50PM -0500, Chris Lalancette wrote:
On 11/02/10 - 01:08:53PM, Daniel P. Berrange wrote:
This patch attempts to introduce a version 3 that uses the improved 5 step sequence
* Src: Begin - Generate XML to pass to dst - Generate optional cookie to pass to dst
* Dst: Prepare - Get ready to accept incoming VM - Generate optional cookie to pass to src
* Src: Perform - Start migration and wait for send completion - Generate optional cookie to pass to dst
* Dst: Finish - Wait for recv completion and check status - Kill off VM if failed, resume if success - Generate optional cookie to pass to src
* Src: Confirm - Kill off VM if success, resume if failed
The API is designed to allow both input and output cookies in all methods where applicable. This lets us pass around arbitrary extra driver specific data between src & dst during migration. Combined with the extra 'Begin' method this lets us pass lease information from source to dst at the start of migration
Moving the killing of the source VM out of Perform and into Confirm, means we can now recover if the dst host can't successfully Finish receiving migration data.
Yeah, this seems like a pretty good sequence. As mentioned below, if the last
step fails there isn't a whole lot we can do, except make sure to report it
and let management kill it off. One comment inline...
<snip>
I think we should get rid of this whole "uri_out" concept, while we are adding a new protocol. Basically, it allows the destination to "tell" the source where to migrate to, but there is no way to really discover that. What you really need are:
1) The ability to let the user specify which network should be used for migration. 2) The ability to determine the "default" network, in the absence of 1).
We already get 1) from the passed in uri, if it is specified. We can easily figure out 2) from the source, since we know we can reach the destination via dconn. So if nothing is specified for 1), we can just default to the same network that the libvirt transport is using.
This will eliminate a lot of the headaches we have with migration protocol version 2 with trying to determine the hostname on the destination side.
While I understand your desire to change this, I think it would be a very bad
idea to do so. These semantics aren't expressed in the API at all, and
neither the app nor the user knows whether libvirt is about to use version 1,
2 or 3 of the migration protocol. Thus having the semantics of the URI change
based on version 1, 2 or 3 would be very surprising, and would make it
impossible to predict the semantics when invoking the API.

Daniel

On 11/02/2010 07:08 AM, Daniel P. Berrange wrote:
Migration just seems to go from bad to worse. We already had to introduce a
second migration protocol when adding the QEMU driver, since the one from Xen
was insufficiently flexible to cope with passing the data the QEMU driver
required.
This patch attempts to introduce a version 3 that uses the improved 5 step sequence
* Src: Begin - Generate XML to pass to dst - Generate optional cookie to pass to dst
* Dst: Prepare - Get ready to accept incoming VM - Generate optional cookie to pass to src
* Src: Perform - Start migration and wait for send completion - Generate optional cookie to pass to dst
* Dst: Finish - Wait for recv completion and check status - Kill off VM if failed, resume if success - Generate optional cookie to pass to src
* Src: Confirm - Kill off VM if success, resume if failed
Are we any closer to implementing this new migration protocol?

Meanwhile, I just came up with a question - how does live migration interact
with transient changes? For example, consider Osier's recent patches for
things like virDomainIsUpdated, which tracks whether a running guest has had
transient changes that should be tracked as long as the guest runs, but which
should not impact the persistent XML definition of the guest when it stops
and is restarted. Does that mean that live migration has to send two XML
definitions over the wire: both the persistent definition (to be used if the
guest is stopped and restarted) and the transient definition (to accurately
reflect the running state of the guest, including currently hot-plugged
devices)? Does this protocol allow for the transfer of two differing XML
descriptions associated with a single running domain?

-- 
Eric Blake
participants (5):
- Chris Lalancette
- Dan Kenigsberg
- Daniel P. Berrange
- Eric Blake
- Justin Clift