
On 06/05/2012 02:36 AM, Guannan Ren wrote:
Bugzilla:https://bugzilla.redhat.com/show_bug.cgi?id=822839 add two general virsh options to support keepalive message protocol
-i | --keepalive_interval interval time value (default 5 seconds) -n | --keepalive_count number of heartbeats (default 5 times)
For non-p2p migration, start keepalive for remote driver to determine the status of network connection, aborting migrating job after defined amount of interval time. --- tools/virsh.c | 88 +++++++++++++++++++++++++++++++++++++++++++++----------- 1 files changed, 70 insertions(+), 18 deletions(-)
Missing a counterpart documentation in virsh.pod.
@@ -7213,6 +7219,12 @@ doMigrate (void *opaque) dconn = virConnectOpenAuth (desturi, virConnectAuthPtrDefault, 0); if (!dconn) goto out;
+ data->dconn = dconn; + if (virConnectSetKeepAlive(dconn, + ctl->keepalive_interval, + ctl->keepalive_count) < 0) + vshDebug(ctl, VSH_ERR_WARNING, "migrate: Failed to start keepalive\n");
This makes it impossible to migrate to an older server that lacks virConnectSetKeepAlive() support. You need to avoid doing this if keepalive_interval and/or keepalive_count is set to a value that avoids keepalive. I think that also means you need to distinguish between a normal default of 5 seconds with new servers but unspecified by the user (where we just gracefully continue without keepalive), compared to an explicit user request of 5 seconds (must fail if the keepalive request cannot be honored), and compared to an explicit user request of no keepalive.
@@ -7329,6 +7342,13 @@ repoll: goto cleanup; }
+ if (data->dconn && virConnectIsAlive(data->dconn) <= 0) { + virDomainAbortJob(dom); + vshError(ctl, "%s", + _("Lost connection to destination host")); + goto cleanup; + }
Again, need to be careful of older servers that lack virConnectIsAlive() support.
@@ -18681,6 +18703,18 @@ vshCommandRun(vshControl *ctl, const vshCmd *cmd) if (enable_timing) GETTIMEOFDAY(&before);
+ /* start keepalive for remote driver */ + driver = virConnectGetType(ctl->conn); + if (STREQ_NULLABLE(driver, "QEMU") || + STREQ_NULLABLE(driver, "xenlight") || + STREQ_NULLABLE(driver, "UML") || + STREQ_NULLABLE(driver, "LXC") || + STREQ_NULLABLE(driver, "remote"))
Why are you limiting this to particular hypervisors? That feels wrong. Rather, you should blindly attempt it, and gracefully detect when it is unsupported.
@@ -19959,17 +19993,19 @@ vshUsage(void) fprintf(stdout, _("\n%s [options]... [<command_string>]" "\n%s [options]... <command> [args...]\n\n" " options:\n" - " -c | --connect=URI hypervisor connection URI\n"
+ " -c | --connect=URI hypervisor connection URI\n"
Rather than reindenting everything, I would just alter your new lines to use multiple lines:
+ " -i | --keepalive_interval interval time value (default 5 seconds)\n" + " -n | --keepalive_count number of heartbeats (default 5 times)\n\n"
-c | --connect=URI hypervisor connection URI -i | --keepalive_interval interval time value (default 5 seconds)
/* Standard (non-command) options. The leading + ensures that no * argument reordering takes place, so that command options are * not confused with top-level virsh options. */ - while ((arg = getopt_long(argc, argv, "+d:hqtc:vVrl:e:", opt, NULL)) != -1) { + while ((arg = getopt_long(argc, argv, "+d:hqtc:vVrl:e:i:n:", opt, NULL)) != -1) {
Pre-existing, but I would welcome an independent patch that cleaned this up into alphabetic ordering to make it easier to find code for a given option now that we have more options to deal with. Looking forward to v3. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org