On Wed, Jul 18, 2007 at 08:22:37PM +0100, Richard W.M. Jones wrote:
This is version 4 of the virDomainMigrate patch. It includes remote
support, but no qemu, virsh or python yet. And it needs lots more
testing which I intend to do more of tomorrow.
Interface
---------
The interface is now this:
virDomainPtr
virDomainMigrate (virDomainPtr domain, virConnectPtr dconn,
unsigned long flags, const char *dname,
const char *uri, unsigned long resource);
I would probably rename resource as 'bandwidth' it's more precise.
The caller may set dname, uri and resource to 0/NULL and forget about
them. Or else the caller may set, in particular, uri to allow for more
complicated migration strategies (especially for qemu).
https://www.redhat.com/archives/libvir-list/2007-July/msg00249.html
Sounds fine, I would have loved a simpler API though.
As outlined in the diagram in this email:
https://www.redhat.com/archives/libvir-list/2007-July/msg00264.html
migration happens in two stages.
To some extend that's "implementation details" compared to the API
but I understand why two steps are needed under the hood, fine by me.
Firstly we send a "prepare" message to the destination
host. The
destination host may reply with a cookie. It may also suggest a URI (in
the current Xen implementation it just returns gethostname). Secondly
we send a "perform" message to the source host.
Correspondingly, there are two new driver API functions:
typedef int
(*virDrvDomainMigratePrepare)
(virConnectPtr dconn,
char **cookie,
int *cookielen,
const char *uri_in,
char **uri_out,
unsigned long flags,
const char *dname,
unsigned long resource);
typedef int
(*virDrvDomainMigratePerform)
(virDomainPtr domain,
const char *cookie,
int cookielen,
const char *uri,
unsigned long flags,
const char *dname,
unsigned long resource);
I wonder if 2 steps are really sufficient. I have the feeling that a third
step virDrvDomainMigrateFinish() might be needed, it could for example
resume on the target side and also verify the domain is actually okay.
That could improve error handling and feels a lot more like a transactional
system where you really want an atomic work/fail operation and nothing else.
There are two corresponding wire messages
(REMOTE_PROC_DOMAIN_MIGRATE_PREPARE and
REMOTE_PROC_DOMAIN_MIGRATE_PERFORM) but they just do dumb argument
shuffling, albeit rather complicated because of the number of arguments
passed in and out.
The complete list of messages which go across the wire during a
migration is:
client -- prepare --> destination host
client <-- prepare reply -- destination host
client -- perform --> source host
client <-- perform reply -- source host
client -- lookupbyname --> destination host
client <-- lookupbyname reply -- destination host
Okay, instead of trying to reuse lookupbyname to assert completion,
I would rather make a third special entry point. Sounds more generic
to me, but again it's an implementation point, not a blocker, all this
is hidden behind the API.
Xen URIs
--------
Xen recognises the following forms of URI:
hostname
hostname:port
tcp://hostname/
tcp://hostname:port/
From an URI perspective "tcp" is not proper IMHO, it should be a protocol
name, make that xenmigr or something.
Capabilities
------------
I have extended capabilities with <migration_features>. For Xen this is:
<capabilities>
<host>
<migration_features>
<live/>
<uri_transports>
<uri_transport>tcp</uri_transport>
</uri_transports>
</migration_features>
Nice, but what is the expected set of values for uri_transport ?
Theorically that can be any scheme name from rfc2396 (or later)
scheme = alpha *( alpha | digit | "+" | "-" | "."
)
unless I misunderstood this.
[...]
+/* Domain migration flags. */
+#define VIR_MIGRATE_LIVE 1
+
+/* Domain migration. */
+virDomainPtr virDomainMigrate (virDomainPtr domain, virConnectPtr dconn,
+ unsigned long flags, const char *dname,
+ const char *uri, unsigned long resource);
I would rename resource to bandwidth :-)
[...]
#define REMOTE_CPUMAPS_MAX 16384
+#define REMOTE_MIGRATE_COOKIE_MAX 256
hum, what is that ? Sorry to show up ignorance, feel free to point
me to an xdr FAQ !
@@ -632,6 +663,8 @@ enum remote_procedure {
REMOTE_PROC_DOMAIN_GET_SCHEDULER_PARAMETERS = 57,
REMOTE_PROC_DOMAIN_SET_SCHEDULER_PARAMETERS = 58,
REMOTE_PROC_GET_HOSTNAME = 59,
+ REMOTE_PROC_DOMAIN_MIGRATE_PERFORM = 60,
+ REMOTE_PROC_DOMAIN_MIGRATE_PREPARE = 61,
};
I would like to see a distinct third step FINISH, as I think it would be
more generic.
[...]
+ * virDomainMigrate:
+ * @domain: a domain object
+ * @dconn: destination host (a connection object)
+ * @flags: flags
+ * @dname: (optional) rename domain to this at destination
+ * @uri: (optional) dest hostname/URI as seen from the source host
+ * @resource: (optional) specify resource limit in Mbps
+ *
+ * Migrate the domain object from its current host to the destination
+ * host given by dconn (a connection to the destination host).
+ *
+ * Flags may be one of more of the following:
+ * VIR_MIGRATE_LIVE Attempt a live migration.
+ *
+ * If a hypervisor supports renaming domains during migration,
+ * then you may set the dname parameter to the new name (otherwise
+ * it keeps the same name). If this is not supported by the
+ * hypervisor, dname must be NULL or else you will get an error.
hum, sounds a bit drastic to me, but might be the right thing.
In any case the capability to rename should be added to the
<capabilities><migration_features> probably as a <rename/> optional
element.
+ * Since typically the two hypervisors connect directly to each
+ * other in order to perform the migration, you may need to specify
+ * a path from the source to the destination. This is the purpose
+ * of the uri parameter. If uri is NULL, then libvirt will try to
+ * find the best method. Uri may specify the hostname or IP address
+ * of the destination host as seen from the source. Or uri may be
+ * a URI giving transport, hostname, user, port, etc. in the usual
+ * form. Refer to driver documentation for the particular URIs
+ * supported.
+ *
+ * The maximum bandwidth (in Mbps) that will be used to do migration
+ * can be specified with the resource parameter. If set to 0,
+ * libvirt will choose a suitable default. Some hypervisors do
+ * not support this feature and will return an error if resource
+ * is not 0.
Do you really want to fail there too ? Similary the capability to
limit bandwidth should be added to the <capabilities><migration_features>
possibly as a <bandwidth/> optional element.
+ * To see which features are supported by the current hypervisor,
+ * see virConnectGetCapabilities, /capabilities/host/migration_features.
+ *
+ * There are many limitations on migration imposed by the underlying
+ * technology - for example it may not be possible to migrate between
+ * different processors even with the same architecture, or between
+ * different types of hypervisor.
+ *
+ * Returns the new domain object if the migration was successful,
+ * or NULL in case of error.
+ */
+virDomainPtr
+virDomainMigrate (virDomainPtr domain,
+ virConnectPtr dconn,
+ unsigned long flags,
+ const char *dname,
+ const char *uri,
+ unsigned long resource)
+{
+ virConnectPtr conn;
+ char *uri_out = NULL;
+ char *cookie = NULL;
+ int cookielen = 0, ret;
+ DEBUG("domain=%p, dconn=%p, flags=%lu, dname=%s, uri=%s, resource=%lu",
+ domain, dconn, flags, dname, uri, resource);
+
+ if (!VIR_IS_DOMAIN (domain)) {
+ virLibDomainError(domain, VIR_ERR_INVALID_DOMAIN, __FUNCTION__);
+ return NULL;
+ }
+ conn = domain->conn; /* Source connection. */
+ if (!VIR_IS_CONNECT (dconn)) {
+ virLibConnError (conn, VIR_ERR_INVALID_CONN, __FUNCTION__);
+ return NULL;
+ }
+
+ /* Check that migration is supported.
+ * Note that in the remote case these functions always exist.
+ * But the remote driver will throw NO_SUPPORT errors later.
+ */
+ if (!conn->driver->domainMigratePerform ||
+ !dconn->driver->domainMigratePrepare) {
+ virLibConnError (conn, VIR_ERR_NO_SUPPORT, __FUNCTION__);
+ return NULL;
+ }
+
+ /* Prepare the migration.
+ *
+ * The destination host may return a cookie, or leave cookie as
+ * NULL.
+ *
+ * The destination host MUST set uri_out if uri_in is NULL.
+ *
+ * If uri_in is non-NULL, then the destination host may modify
+ * the URI by setting uri_out. If it does not wish to modify
+ * the URI, it should leave uri_out as NULL.
+ */
+ ret = dconn->driver->domainMigratePrepare
+ (dconn, &cookie, &cookielen, uri, &uri_out, flags, dname,
resource);
+ if (ret == -1) return NULL;
+ if (uri == NULL && uri_out == NULL) {
+ virLibConnError (conn, VIR_ERR_INTERNAL_ERROR,
+ "domainMigratePrepare did not set uri");
+ if (cookie) free (cookie);
+ return NULL;
+ }
+ if (uri_out) uri = uri_out; /* Did domainMigratePrepare change URI? */
+
+ assert (uri != NULL);
+
+ /* Perform the migration. The driver isn't supposed to return
+ * until the migration is complete.
+ */
+ ret = conn->driver->domainMigratePerform
+ (domain, cookie, cookielen, uri, flags, dname, resource);
+
+ if (uri_out) free (uri_out);
+ if (cookie) free (cookie);
+
+ if (ret == -1) return NULL;
+
+ /* Get the destination domain and return it or error. */
+ return virDomainLookupByName (dconn, dname ? dname : domain->name);
+}
Looks fine, but a finish operation returning a virDomainPtr may be
better. It would carry the cookie and names to identify the migration.
Crazy question, what happens if the remote task gets migrated away again
or just killed before the lookup :-) , I'm not sure we can come up with
an atomic operation in the general case, but that would be nice to allow
it in the design if the hypervisor makes it possible.
[...]
--- src/xen_internal.c 12 Jul 2007 08:57:52 -0000 1.85
+++ src/xen_internal.c 18 Jul 2007 19:00:20 -0000
@@ -2173,6 +2173,12 @@ xenHypervisorMakeCapabilitiesXML(virConn
"\
</features>\n\
</cpu>\n\
+ <migration_features>\n\
+ <live/>\n\
plus
<bandwidth/>
<rename/>
+ <uri_transports>\n\
+ <uri_transport>tcp</uri_transport>\n\
+ </uri_transports>\n\
+ </migration_features>\n\
</host>\n", -1);
if (r == -1) goto vir_buffer_failed;
[...]
+ /* Xen (at least up to 3.1.0) takes a resource parameter but
+ * ignores it.
+ */
+ if (resource) {
+ virXendError (conn, VIR_ERR_NO_SUPPORT,
+ "xenDaemonDomainMigrate: Xen does not support resource limits
during migration");
+ return -1;
+ }
Hum, strange, the classical Xen bandwidth during live migration of http
picture uses a 100Mbps capped transfer, so that ought to work at least
at some point in the past, weird.
[...]
+ /* Set hostname and port.
+ *
+ * URI is non-NULL (guaranteed by caller). We expect either
+ * "hostname", "hostname:port" or
"tcp://hostname[:port]/".
+ */
+ if (strstr (uri, "//")) { /* Full URI. */
+ xmlURIPtr uriptr = xmlParseURI (uri);
+ if (!uriptr) {
+ virXendError (conn, VIR_ERR_INVALID_ARG,
+ "xenDaemonDomainMigrate: invalid URI");
+ return -1;
+ }
+ if (uriptr->scheme && STRCASENEQ (uriptr->scheme,
"tcp")) {
+ virXendError (conn, VIR_ERR_INVALID_ARG,
+ "xenDaemonDomainMigrate: only tcp:// migrations are
supported by Xen");
+ xmlFreeURI (uriptr);
+ return -1;
+ }
+ if (!uriptr->server) {
+ virXendError (conn, VIR_ERR_INVALID_ARG,
+ "xenDaemonDomainMigrate: a hostname must be specified in
the URI");
+ xmlFreeURI (uriptr);
+ return -1;
+ }
+ hostname = strdup (uriptr->server);
+ if (!hostname) {
+ virXendError (conn, VIR_ERR_NO_MEMORY, "strdup");
+ xmlFreeURI (uriptr);
+ return -1;
+ }
+ if (uriptr->port)
+ snprintf (port, sizeof port, "%d", uriptr->port);
+ xmlFreeURI (uriptr);
+ }
+ else if ((p = strrchr (uri, ':')) != NULL) { /* "hostname:port"
*/
+ int port_nr, n;
+
+ if (sscanf (p+1, "%d", &port_nr) != 1) {
+ virXendError (conn, VIR_ERR_INVALID_ARG,
+ "xenDaemonDomainMigrate: invalid port number");
+ return -1;
+ }
+ snprintf (port, sizeof port, "%d", port_nr);
+
+ /* Get the hostname. */
+ n = p - uri; /* n = Length of hostname in bytes. */
+ hostname = strdup (uri);
+ if (!hostname) {
+ virXendError (conn, VIR_ERR_NO_MEMORY, "strdup");
+ return -1;
+ }
+ hostname[n] = '\0';
+ }
+ else { /* "hostname" (or IP address) */
+ hostname = strdup (uri);
+ if (!hostname) {
+ virXendError (conn, VIR_ERR_NO_MEMORY, "strdup");
+ return -1;
+ }
+ }
Hum, we say it's an uri, but we interpret is differently if it's not absolute
this could lead to confusion. But I'm not sure being a purist here would help
that much from an user POV.
In general this looks really good to me. I'm just worried about the 3rd
part of the operation, I don't think it would be that hard to add, but
would IMHO make the protocol at the network level more future-proof,
if we could avoid breaking it in the future that woud be nice to put
in now. But that's not a blocker to me.
thanks !
Daniel
--
Red Hat Virtualization group
http://redhat.com/virtualization/
Daniel Veillard | virtualization library
http://libvirt.org/
veillard(a)redhat.com | libxml GNOME XML XSLT toolkit
http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine
http://rpmfind.net/