[libvirt] libvirt: xen: Shutdown detection fixes

The issue is easy to observe when using virt-manager as a frontend. Its main window seems to open connections for the up/down main display once and then only repeats DomainGetInfo on the same connection to check whether the instance is up or shut down. The initial open on the connection sets the domain id to -1 for inactive managed domains. And once running, the hypervisor driver returns the status. But after shutdown, the domain cannot be found by the hypervisor driver and the domain info is obtained by the xend driver. However, that expects the domain id to be -1 for inactive domains. This is not true (anymore). In testing I found that domain/status seems to be always present and better suited to check whether a domain is running or not (0 for shutdown and 2 for running, there are probably other values, too). Beside the change, which does fix this problem, the question is whether there should be a mechanism that resets domain->id to -1 when the instance is found to be shut down. And if that should happen, should it be done by any sub-driver or by the unified driver. The second problem I observed when I shut down an instance from the vnc console provided by virt-manager. As soon as the instance was down, the log produced internal errors about every second. Found this again, to be related to the unified driver approach. The call that causes the many error messages is GetVcpus. While the instance is up, the hypervisor driver returns those. But after shutdown, the domain cannot be found by that driver. The xend driver works, though. So what happens is that the hypervisor driver logs an internal error but the call succeeds because the xend driver returns successful. This generally opens the question whether a sub-driver should log any errors (at least not on error level, maybe debug)? The second patch only quiets down the one message I saw repeated that often. There are some other errors logged without any obvious problems interacting with the instances, but those are by far less often. -Stefan Please cc: me on replies as I am (yeah still) not subscribed to this m-l. Thanks.

xend_internal: Use domain/status for shutdown check On newer xend (v3.x and after) there is no state and domid reported for inactive domains. When initially creating connections this is handled in various places by assigning domain->id = -1. But once an instance has been running, the id is set to the current domain id. And it does not change when the instance is shut down. So when querying the domain info, the hypervisor driver, which gets asked first will indicate it cannot find information, then the xend driver is asked and will set the status to NOSTATE because it checks for the -1 domain id. Checking domain/status for 0 seems to be more reliable for that. One note: I am not sure whether the domain->id also should get set back to -1 whenever any sub-driver thinks the instance is no longer running. BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=746007 BugLink: http://bugs.launchpad.net/bugs/929626S Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Index: libvirt-0.9.8/src/xen/xend_internal.c =================================================================== --- libvirt-0.9.8.orig/src/xen/xend_internal.c 2011-12-04 08:15:00.000000000 +0100 +++ libvirt-0.9.8/src/xen/xend_internal.c 2012-03-23 11:07:43.575529377 +0100 @@ -989,9 +989,11 @@ state = VIR_DOMAIN_BLOCKED; else if (strchr(flags, 'r')) state = VIR_DOMAIN_RUNNING; - } else if (domain->id < 0) { - /* Inactive domains don't have a state reported, so - mark them SHUTOFF, rather than NOSTATE */ + } else if (sexpr_int(root, "domain/status") == 0) { + /* As far as I can see the domain->id is a bad sign for checking + * inactive domains as this is inaccurate after the domain has + * been running once. However domain/status from xend seems to + * be always present and 0 for inactive domains. */ state = VIR_DOMAIN_SHUTOFF; }

xen-hypervisor: GetVcpus may not find a certain domain Observed on connections that have been running and then shut down. The hypervisor subdriver fills the log with internal errors while the xend driver actually can handle the query. This only handles the case which I observed after shutting down an instance via virt-manager and leaving its console window open. The same reasoning would probably be true for other internal errors as long as they potentially get recovered by other sub-drivers. BugLink: http://bugs.launchpad.net/bugs/963006 Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Index: libvirt-0.9.8/src/xen/xen_driver.c =================================================================== --- libvirt-0.9.8.orig/src/xen/xen_driver.c 2011-12-02 04:59:50.000000000 +0100 +++ libvirt-0.9.8/src/xen/xen_driver.c 2012-03-23 11:31:53.982620050 +0100 @@ -1190,6 +1190,8 @@ if (ret > 0) return ret; } + + xenUnifiedError(VIR_ERR_NO_SUPPORT, __FUNCTION__); return -1; } Index: libvirt-0.9.8/src/xen/xen_hypervisor.c =================================================================== --- libvirt-0.9.8.orig/src/xen/xen_hypervisor.c 2012-03-23 10:58:29.000000000 +0100 +++ libvirt-0.9.8/src/xen/xen_hypervisor.c 2012-03-23 11:26:50.741137585 +0100 @@ -3612,8 +3612,11 @@ &dominfo); if ((ret < 0) || (XEN_GETDOMAININFO_DOMAIN(dominfo) != domain->id)) { - virXenErrorFunc(VIR_ERR_INTERNAL_ERROR, __FUNCTION__, - _("cannot get domain details"), 0); + /* This can happen if an instance is just shut down. It is probably + * better to leave the shouting to the unified caller. + * virXenErrorFunc(VIR_ERR_INTERNAL_ERROR, __FUNCTION__, + * _("cannot get domain details"), 0); + */ return (-1); } nbinfo = XEN_GETDOMAININFO_CPUCOUNT(dominfo) + 1;
participants (1)
-
Stefan Bader