On Fri, Jul 17, 2009 at 02:59:36PM +0200, Chris Lalancette wrote:
All,
Attached is the current version of the tunnelled migration patch, based
upon danpb's generic datastream work. In order to use this work, you must first
grab danpb's data-streams git branch here:
http://gitorious.org/~berrange/libvirt/staging
and then apply this patch on top.
In some basic testing, this seems to work fine for me, although I have not
subjected it to a difficult scenario or measured CPU utilization with these
patches in place.
DanB, these patches take a slightly different approach than you and I
discussed yesterday on IRC. Just to recap, you suggested a new version of
virMigratePrepare (called virMigratePrepareTunnel) that would take in as one of
the arguments a datastream, and during the prepare step properly setup the
datastream. Unless I'm missing something (which is entirely possible), this
would also require passing that same datastream into the perform and finish
stages, meaning that I'd essentially have an all-new migration protocol, version 3.
To try to avoid that, during the prepare I store the port that we used to
start the listening qemu in a new field in the virDomainObj structure. Then
during the perform step, I create a datastream on the destination and run a new
RPC function called virDomainMigratePrepareTunnel. This looks that port back
up, associates it with the current stream, and returns back to the caller. Then
the source side just does virStreamSend for all the data, and we have tunnelled
migration.
Ah, now I understand why you were having trouble with this.
With this patch, the flow of control is logically
virDomainMigrate(src, dst, uri)
+- virDomainMigratePrepare(dst)
+- virDomainMigratePerform(src, uri)
| +- dst2 = virConnectOpen(uri)
| +- virDomainMigratePrepareTunnel(dst2)
| +- while (1)
| | +- virStreamSend(dst2, data)
| +- virConnectClose(uri)
+- virDomainMigrateFinish(dst)
If we remember the requirement from the libvirt-qpid guys, which is
to remove the need for an application to pass in the destination
handle, this scheme won't work because 'dst' will be NULL.
To cope with that requirement we'd need the logical flow to be
virDomainMigrate(src, NULL, uri)
+- virDomainMigratePerform(src, uri)
+- dst = virConnectOpen(uri)
+- virDomainMigratePrepare(dst)
+- virDomainMigratePrepareTunnel(dst)
+- while (1)
| +- virStreamSend(dst, data)
+- virDomainMigrateFinish(dst)
+- virConnectClose(uri)
At which point, having separate virDomainMigratePrepare vs
virDomainMigratePrepareTunnel is overkill; we might as well
just have virDomainMigratePrepareTunnel, which does all
the virDomainMigratePrepare logic itself, avoiding the need
to pass a TCP port around in virDomainObjPtr.
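Something along these lines is what I have in mind for the combined entry
point (the exact parameter list here is only a guess, mirroring the existing
virDomainMigratePrepare arguments plus the stream):

```c
/* Hypothetical combined entry point: does everything the current
 * virDomainMigratePrepare does, but associates the incoming qemu
 * process directly with the passed-in stream, instead of recording
 * a TCP port in virDomainObjPtr for a later lookup. */
int virDomainMigratePrepareTunnel(virConnectPtr dconn,
                                  virStreamPtr st,
                                  unsigned long flags,
                                  const char *dname,
                                  unsigned long bandwidth,
                                  const char *dom_xml);
```

The source side would then open the destination connection, create the
stream, and call this once, with no separate Prepare step needed.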
TODO:
- More testing, especially under worst-case scenarios (VM constantly
changing its memory during migration)
- CPU utilization testing to make sure that we aren't using a lot of CPU
time doing this
- Wall-clock testing
- Switch over to using Unix domain sockets instead of localhost TCP
migration. With a patch I put into upstream qemu (which is now in F-12), we can
completely get rid of scanning localhost ports to find a free one, and just
use Unix domain sockets. That should make the whole thing more robust.
The downside of using exec+dd to open a UNIX domain socket is that
it would add in yet another data copy. It really is annoying that
QEMU doesn't have more of this stuff built-in.
+ st = virStreamNew(dconn, 0);
+ if (st == NULL)
+ /* FIXME: do we need to set an error here or did virStreamNew do it? */
+ goto close_dconn;
+
+ if (virDomainMigratePrepareTunnel(dconn, st, vm->def->uuid, 0) < 0)
+ /* FIXME: do we need to set an error here or did PrepareTunnel do it? */
+ goto close_stream;
+
+ for (;;) {
+ bytes = saferead(client_sock, buffer, MAX_BUFFER);
+ if (bytes < 0) {
+ qemudReportError (dom->conn, dom, NULL, VIR_ERR_OPERATION_FAILED,
+ _("Failed to read from qemu: %s"),
+ virStrerror(errno, ebuf, sizeof ebuf));
+ goto close_stream;
You should call virStreamAbort() here before virStreamFree() to
inform the remote end that you're terminating the data channel
abnormally.
+ }
+ else if (bytes == 0)
+ /* EOF; get out of here */
+ break;
+
+ if (virStreamSend(st, buffer, bytes) < 0) {
+ qemudReportError (dom->conn, dom, NULL, VIR_ERR_OPERATION_FAILED,
+                              _("Failed to write migration data to remote libvirtd"));
+ goto close_stream;
+ }
+ }
+
+ virStreamFinish(st);
+ /* FIXME: check for errors */
A simple check of the virStreamFinish return code, as you already
do for virStreamSend, would be sufficient here. virStreamFinish
does a round-trip handshake to ensure it sees any of the
async errors from virStreamSend.
+
+ retval = 0;
+
+close_stream:
+ virStreamFree(st);
+
+close_dconn:
+ virConnectClose(dconn);
+
+close_client_sock:
+ close(client_sock);
+
+qemu_cancel_migration:
+ if (retval != 0)
+ qemudMonitorCommand(vm, "migrate_cancel", &info);
+ VIR_FREE(info);
+
+close_qemu_sock:
+ close(qemu_sock);
+
+ return retval;
+}
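Putting my two comments above together, the tail of that loop would look
something like this (sketch only; error-reporting arguments elided):

```c
        if (virStreamSend(st, buffer, bytes) < 0) {
            qemudReportError(...);
            virStreamAbort(st);  /* tell the remote end we're bailing out */
            goto close_stream;
        }
    }

    if (virStreamFinish(st) < 0)  /* round-trip; surfaces async send errors */
        goto close_stream;

    retval = 0;
```

The same virStreamAbort() call belongs on the saferead failure path too, for
the same reason.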
+
/* Perform is the second step, and it runs on the source host. */
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|