[libvirt] [PATCH] Better error reporting for failed migration.

The comment in the code says most of it, but when the destination hostname resolution is screwed up, print a proper error instead of the very unhelpful "unknown error". Note that I'm not overly fond of the wording in the error message, so I'm open to suggestions. Signed-off-by: Chris Lalancette <clalance@redhat.com> --- src/qemu/qemu_driver.c | 16 ++++++++++++++++ 1 files changed, 16 insertions(+), 0 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d37b184..02bb5cb 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -6288,6 +6288,22 @@ qemudDomainMigratePrepare2 (virConnectPtr dconn, goto cleanup; } + /* Remember that we are running on the destination. The hostname that + * we resolve here will be used on the source machine in the "migrate" + * monitor command. Because of that, localhost is almost always the + * wrong thing. Adding this check explicitly breaks localhost + * migration, but only for those machines that have improperly + * configured hostname resolution. + */ + if (STREQ(hostname, "localhost")) { + VIR_FREE(hostname); + qemudReportError(dconn, NULL, NULL, VIR_ERR_INVALID_ARG, "%s", + _("Could not resolve destination hostname; " + "either fix destination to resolve hostname, " + "or use the optional URI migration parameter")); + goto cleanup; + } + /* XXX this really should have been a properly well-formed * URI, but we can't add in tcp:// now without breaking * compatability with old targets. We at least make the -- 1.6.0.6

On Thu, Oct 15, 2009 at 11:26:38AM +0200, Chris Lalancette wrote:
The comment in the code says most of it, but when the destination hostname resolution is screwed up, print a proper error instead of the very unhelpful "unknown error".
Note that I'm not overly fond of the wording in the error message, so I'm open to suggestions.
Signed-off-by: Chris Lalancette <clalance@redhat.com> --- src/qemu/qemu_driver.c | 16 ++++++++++++++++ 1 files changed, 16 insertions(+), 0 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d37b184..02bb5cb 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -6288,6 +6288,22 @@ qemudDomainMigratePrepare2 (virConnectPtr dconn, goto cleanup; }
+ /* Remember that we are running on the destination. The hostname that + * we resolve here will be used on the source machine in the "migrate" + * monitor command. Because of that, localhost is almost always the + * wrong thing. Adding this check explicitly breaks localhost + * migration, but only for those machines that have improperly + * configured hostname resolution. + */
NB, localhost migration has never worked, nor do we intend it to. You will certainly deadlock libvirtd if you tried it with tunnelled migration We should in fact try to protect against this in the 'perform' method. After opening the libvirtd connection to the dest, it shoud call the virGetHotsname(dconn) and compare it to virGetHostname(conn) and reject it if it is the same
+ if (STREQ(hostname, "localhost")) { + VIR_FREE(hostname); + qemudReportError(dconn, NULL, NULL, VIR_ERR_INVALID_ARG, "%s", + _("Could not resolve destination hostname; " + "either fix destination to resolve hostname, " + "or use the optional URI migration parameter")); + goto cleanup; + } +
I think I'd be inclined to actually resolve the hostname, and then check it agaist 127.0.0.1 and ::1. You can get quite a few variations which ultimtely might point to localhost. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Daniel P. Berrange wrote:
+ if (STREQ(hostname, "localhost")) { + VIR_FREE(hostname); + qemudReportError(dconn, NULL, NULL, VIR_ERR_INVALID_ARG, "%s", + _("Could not resolve destination hostname; " + "either fix destination to resolve hostname, " + "or use the optional URI migration parameter")); + goto cleanup; + } +
I think I'd be inclined to actually resolve the hostname, and then check it agaist 127.0.0.1 and ::1. You can get quite a few variations which ultimtely might point to localhost.
I did some testing here. Basically I wrote a small program that just calls getaddrinfo() like virHostname() does, iterates over all of the addresses returned, and prints various information. On my F-11 box, due to a possibly glibc bug, it only returns IPv4 addresses, and the fields look like: flags 0x82, family 2, socktype 1, protocol 6, addrlen 16, addr 0x1521b80, canonname intel2.xmen.org flags 0x82, family 2, socktype 2, protocol 17, addrlen 16, addr 0x1521bd0, canonname (null) flags 0x82, family 2, socktype 3, protocol 0, addrlen 16, addr 0x1521c20, canonname (null) On my F-12 box, which is the one that originally showed the problems, it shows both IPv6 and IPv4 addresses, and the output looks like: flags 0x82, family 10, socktype 1, protocol 6, addrlen 28, addr 0x19ad220, canonname localhost flags 0x82, family 10, socktype 2, protocol 17, addrlen 28, addr 0x19ad280, canonname (null) flags 0x82, family 10, socktype 3, protocol 0, addrlen 28, addr 0x19ad2e0, canonname (null) flags 0x82, family 2, socktype 1, protocol 6, addrlen 16, addr 0x19ad110, canonname (null) flags 0x82, family 2, socktype 2, protocol 17, addrlen 16, addr 0x19ad180, canonname (null) flags 0x82, family 2, socktype 3, protocol 0, addrlen 16, addr 0x19ad1d0, canonname (null) So it seems like the canonname resolves to "localhost" in both the IPv4 (family 2) and IPv6 (family 10) cases. I guess we could pass the result of virGetHostname() back into getaddrinfo() and look at the results, but I don't know that it would be significantly different from what we already get. Do you still think it is worthwhile? -- Chris Lalancette
participants (2)
-
Chris Lalancette
-
Daniel P. Berrange