On 06/05/2012 02:36 AM, Guannan Ren wrote:
Bugzilla:https://bugzilla.redhat.com/show_bug.cgi?id=822839
add two general virsh options to support keepalive message protocol
-i | --keepalive_interval interval time value (default 5 seconds)
-n | --keepalive_count number of heartbeats (default 5 times)
For non-p2p migration, start keepalive for remote driver to
determine the status of network connection, aborting migrating job
after defined amount of interval time.
---
tools/virsh.c | 88 +++++++++++++++++++++++++++++++++++++++++++++-----------
1 files changed, 70 insertions(+), 18 deletions(-)
Missing a counterpart documentation in virsh.pod.
@@ -7213,6 +7219,12 @@ doMigrate (void *opaque)
dconn = virConnectOpenAuth (desturi, virConnectAuthPtrDefault, 0);
if (!dconn) goto out;
+ data->dconn = dconn;
+ if (virConnectSetKeepAlive(dconn,
+ ctl->keepalive_interval,
+ ctl->keepalive_count) < 0)
+ vshDebug(ctl, VSH_ERR_WARNING, "migrate: Failed to start
keepalive\n");
This makes it impossible to migrate to an older server that lacks
virConnectSetKeepAlive() support. You need to avoid doing this if
keepalive_interval and/or keepalive_count is set to a value that avoids
keepalive. I think that also means you need to distinguish between a
normal default of 5 seconds with new servers but unspecified by the user
(where we just gracefully continue without keepalive), compared to an
explicit user request of 5 seconds (must fail if the keepalive request
cannot be honored), and compared to an explicit user request of no
keepalive.
@@ -7329,6 +7342,13 @@ repoll:
goto cleanup;
}
+ if (data->dconn && virConnectIsAlive(data->dconn) <= 0) {
+ virDomainAbortJob(dom);
+ vshError(ctl, "%s",
+ _("Lost connection to destination host"));
+ goto cleanup;
+ }
Again, need to be careful of older servers that lack virConnectIsAlive()
support.
@@ -18681,6 +18703,18 @@ vshCommandRun(vshControl *ctl, const vshCmd
*cmd)
if (enable_timing)
GETTIMEOFDAY(&before);
+ /* start keepalive for remote driver */
+ driver = virConnectGetType(ctl->conn);
+ if (STREQ_NULLABLE(driver, "QEMU") ||
+ STREQ_NULLABLE(driver, "xenlight") ||
+ STREQ_NULLABLE(driver, "UML") ||
+ STREQ_NULLABLE(driver, "LXC") ||
+ STREQ_NULLABLE(driver, "remote"))
Why are you limiting this to particular hypervisors? That feels wrong.
Rather, you should blindly attempt it, and gracefully detect when it is
unsupported.
@@ -19959,17 +19993,19 @@ vshUsage(void)
fprintf(stdout, _("\n%s [options]... [<command_string>]"
"\n%s [options]... <command> [args...]\n\n"
" options:\n"
- " -c | --connect=URI hypervisor connection
URI\n"
+ " -c | --connect=URI
hypervisor connection URI\n"
Rather than reindenting everything, I would just alter your new lines to
use multiple lines:
+ " -i | --keepalive_interval interval
time value (default 5 seconds)\n"
+ " -n | --keepalive_count number of heartbeats
(default 5 times)\n\n"
-c | --connect=URI hypervisor connection URI
-i | --keepalive_interval
interval time value (default 5 seconds)
/* Standard (non-command) options. The leading + ensures that no
* argument reordering takes place, so that command options are
* not confused with top-level virsh options. */
- while ((arg = getopt_long(argc, argv, "+d:hqtc:vVrl:e:", opt, NULL)) !=
-1) {
+ while ((arg = getopt_long(argc, argv, "+d:hqtc:vVrl:e:i:n:", opt, NULL))
!= -1) {
Pre-existing, but I would welcome an independent patch that cleaned this
up into alphabetic ordering to make it easier to find code for a given
option now that we have more options to deal with.
Looking forward to v3.
--
Eric Blake eblake(a)redhat.com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org