[libvirt] [PATCH 0/2] Fix xen domain listing

These patches are a re-post of Jonas Eriksson's work from a couple of months ago, that we mistakenly forgot to commit. The first patch is unchanged, apart from being adjusted for new source code layout. The second patch is changed so that it checks against the hypervisor, not Xend, since using Xend has a very serious performance impact (as much as 1 second per domain queried has been seen) Daniel

From: Jonas Eriksson <jonas.j.eriksson@ericsson.com> The xenUnifiedNumOfDomains and xenUnifiedListDomains methods work together as a pair, so it is critical they both apply the same logic. With the current mis-matched logic it is possible to sometimes get into a state when you miss certain active guests. * src/xen/xen_driver.c: Change xenUnifiedNumOfDomains ordering to match xenUnifiedListDomains. --- src/xen/proxy_internal.c | 3 +-- src/xen/proxy_internal.h | 1 + src/xen/xen_driver.c | 30 ++++++++++++++++++++++++------ src/xen/xend_internal.c | 2 +- src/xen/xend_internal.h | 1 + 5 files changed, 28 insertions(+), 9 deletions(-) diff --git a/src/xen/proxy_internal.c b/src/xen/proxy_internal.c index f9d2fad..22ff172 100644 --- a/src/xen/proxy_internal.c +++ b/src/xen/proxy_internal.c @@ -40,7 +40,6 @@ static virDrvOpenStatus xenProxyOpen(virConnectPtr conn, virConnectAuthPtr auth, static int xenProxyGetVersion(virConnectPtr conn, unsigned long *hvVer); static int xenProxyNodeGetInfo(virConnectPtr conn, virNodeInfoPtr info); static char *xenProxyGetCapabilities(virConnectPtr conn); -static int xenProxyNumOfDomains(virConnectPtr conn); static unsigned long xenProxyDomainGetMaxMemory(virDomainPtr domain); static int xenProxyDomainGetInfo(virDomainPtr domain, virDomainInfoPtr info); static char *xenProxyDomainGetOSType(virDomainPtr domain); @@ -607,7 +606,7 @@ xenProxyListDomains(virConnectPtr conn, int *ids, int maxids) * * Returns the number of domain found or -1 in case of error */ -static int +int xenProxyNumOfDomains(virConnectPtr conn) { virProxyPacket req; diff --git a/src/xen/proxy_internal.h b/src/xen/proxy_internal.h index 185fa67..19df751 100644 --- a/src/xen/proxy_internal.h +++ b/src/xen/proxy_internal.h @@ -96,4 +96,5 @@ extern char * xenProxyDomainDumpXML(virDomainPtr domain, int flags); extern int xenProxyListDomains(virConnectPtr conn, int *ids, int maxids); +extern int xenProxyNumOfDomains(virConnectPtr conn); #endif /* __LIBVIR_PROXY_H__ */ diff --git a/src/xen/xen_driver.c b/src/xen/xen_driver.c index 76b896a..fc8bbe6 100644 --- a/src/xen/xen_driver.c +++ b/src/xen/xen_driver.c @@ -576,13 +576,31 @@ static int xenUnifiedNumOfDomains (virConnectPtr conn) { GET_PRIVATE(conn); - int i, ret; + int ret; - for (i = 0; i < XEN_UNIFIED_NR_DRIVERS; ++i) - if (priv->opened[i] && drivers[i]->numOfDomains) { - ret = drivers[i]->numOfDomains (conn); - if (ret >= 0) return ret; - } + /* Try xenstore. */ + if (priv->opened[XEN_UNIFIED_XS_OFFSET]) { + ret = xenStoreNumOfDomains (conn); + if (ret >= 0) return ret; + } + + /* Try HV. */ + if (priv->opened[XEN_UNIFIED_HYPERVISOR_OFFSET]) { + ret = xenHypervisorNumOfDomains (conn); + if (ret >= 0) return ret; + } + + /* Try xend. */ + if (priv->opened[XEN_UNIFIED_XEND_OFFSET]) { + ret = xenDaemonNumOfDomains (conn); + if (ret >= 0) return ret; + } + + /* Try proxy. */ + if (priv->opened[XEN_UNIFIED_PROXY_OFFSET]) { + ret = xenProxyNumOfDomains (conn); + if (ret >= 0) return ret; + } return -1; } diff --git a/src/xen/xend_internal.c b/src/xen/xend_internal.c index 49bdba9..f362ae6 100644 --- a/src/xen/xend_internal.c +++ b/src/xen/xend_internal.c @@ -3685,7 +3685,7 @@ error: * * Returns the number of domain found or -1 in case of error */ -static int +int xenDaemonNumOfDomains(virConnectPtr conn) { struct sexpr *root = NULL; diff --git a/src/xen/xend_internal.h b/src/xen/xend_internal.h index 9d2571b..8b00737 100644 --- a/src/xen/xend_internal.h +++ b/src/xen/xend_internal.h @@ -187,5 +187,6 @@ int xenDaemonDomainMigratePerform (virDomainPtr domain, const char *cookie, int int xenDaemonDomainBlockPeek (virDomainPtr domain, const char *path, unsigned long long offset, size_t size, void *buffer); int xenDaemonListDomains(virConnectPtr conn, int *ids, int maxids); +int xenDaemonNumOfDomains(virConnectPtr conn); #endif /* __XEND_INTERNAL_H_ */ -- 1.6.2.5

On Fri, Oct 09, 2009 at 10:35:48AM +0100, Daniel P. Berrange wrote:
From: Jonas Eriksson <jonas.j.eriksson@ericsson.com>
The xenUnifiedNumOfDomains and xenUnifiedListDomains methods work together as a pair, so it is critical they both apply the same logic. With the current mis-matched logic it is possible to sometimes get into a state when you miss certain active guests.
* src/xen/xen_driver.c: Change xenUnifiedNumOfDomains ordering to match xenUnifiedListDomains. --- src/xen/proxy_internal.c | 3 +-- src/xen/proxy_internal.h | 1 + src/xen/xen_driver.c | 30 ++++++++++++++++++++++++------ src/xen/xend_internal.c | 2 +- src/xen/xend_internal.h | 1 + 5 files changed, 28 insertions(+), 9 deletions(-)
diff --git a/src/xen/proxy_internal.c b/src/xen/proxy_internal.c index f9d2fad..22ff172 100644 --- a/src/xen/proxy_internal.c +++ b/src/xen/proxy_internal.c @@ -40,7 +40,6 @@ static virDrvOpenStatus xenProxyOpen(virConnectPtr conn, virConnectAuthPtr auth, static int xenProxyGetVersion(virConnectPtr conn, unsigned long *hvVer); static int xenProxyNodeGetInfo(virConnectPtr conn, virNodeInfoPtr info); static char *xenProxyGetCapabilities(virConnectPtr conn); -static int xenProxyNumOfDomains(virConnectPtr conn); static unsigned long xenProxyDomainGetMaxMemory(virDomainPtr domain); static int xenProxyDomainGetInfo(virDomainPtr domain, virDomainInfoPtr info); static char *xenProxyDomainGetOSType(virDomainPtr domain); @@ -607,7 +606,7 @@ xenProxyListDomains(virConnectPtr conn, int *ids, int maxids) * * Returns the number of domain found or -1 in case of error */ -static int +int xenProxyNumOfDomains(virConnectPtr conn) { virProxyPacket req; diff --git a/src/xen/proxy_internal.h b/src/xen/proxy_internal.h index 185fa67..19df751 100644 --- a/src/xen/proxy_internal.h +++ b/src/xen/proxy_internal.h @@ -96,4 +96,5 @@ extern char * xenProxyDomainDumpXML(virDomainPtr domain, int flags); extern int xenProxyListDomains(virConnectPtr conn, int *ids, int maxids); +extern int xenProxyNumOfDomains(virConnectPtr conn); #endif /* __LIBVIR_PROXY_H__ */ diff --git a/src/xen/xen_driver.c b/src/xen/xen_driver.c index 76b896a..fc8bbe6 100644 --- a/src/xen/xen_driver.c +++ b/src/xen/xen_driver.c @@ -576,13 +576,31 @@ static int xenUnifiedNumOfDomains (virConnectPtr conn) { GET_PRIVATE(conn); - int i, ret; + int ret;
- for (i = 0; i < XEN_UNIFIED_NR_DRIVERS; ++i) - if (priv->opened[i] && drivers[i]->numOfDomains) { - ret = drivers[i]->numOfDomains (conn); - if (ret >= 0) return ret; - } + /* Try xenstore. */ + if (priv->opened[XEN_UNIFIED_XS_OFFSET]) { + ret = xenStoreNumOfDomains (conn); + if (ret >= 0) return ret; + } + + /* Try HV. */ + if (priv->opened[XEN_UNIFIED_HYPERVISOR_OFFSET]) { + ret = xenHypervisorNumOfDomains (conn); + if (ret >= 0) return ret; + } + + /* Try xend. */ + if (priv->opened[XEN_UNIFIED_XEND_OFFSET]) { + ret = xenDaemonNumOfDomains (conn); + if (ret >= 0) return ret; + } + + /* Try proxy. */ + if (priv->opened[XEN_UNIFIED_PROXY_OFFSET]) { + ret = xenProxyNumOfDomains (conn); + if (ret >= 0) return ret; + }
return -1; } diff --git a/src/xen/xend_internal.c b/src/xen/xend_internal.c index 49bdba9..f362ae6 100644 --- a/src/xen/xend_internal.c +++ b/src/xen/xend_internal.c @@ -3685,7 +3685,7 @@ error: * * Returns the number of domain found or -1 in case of error */ -static int +int xenDaemonNumOfDomains(virConnectPtr conn) { struct sexpr *root = NULL; diff --git a/src/xen/xend_internal.h b/src/xen/xend_internal.h index 9d2571b..8b00737 100644 --- a/src/xen/xend_internal.h +++ b/src/xen/xend_internal.h @@ -187,5 +187,6 @@ int xenDaemonDomainMigratePerform (virDomainPtr domain, const char *cookie, int
int xenDaemonDomainBlockPeek (virDomainPtr domain, const char *path, unsigned long long offset, size_t size, void *buffer); int xenDaemonListDomains(virConnectPtr conn, int *ids, int maxids); +int xenDaemonNumOfDomains(virConnectPtr conn);
#endif /* __XEND_INTERNAL_H_ */
ACK ! Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

The xenstore database sometimes has stale domain IDs which are not present in the hypervisor anymore. Filter these out to avoid causing confusion * src/xen/xs_internal.c: Filter domain IDs against HV's list * src/xen/xen_hypervisor.h, src/xen/xen_hypervisor.c: Add new xenHypervisorHasDomain() method for checking ID validity --- src/xen/xen_hypervisor.c | 22 ++++++++++++++++++++++ src/xen/xen_hypervisor.h | 4 +++- src/xen/xs_internal.c | 35 ++++++++++++++++++++++++++--------- 3 files changed, 51 insertions(+), 10 deletions(-) diff --git a/src/xen/xen_hypervisor.c b/src/xen/xen_hypervisor.c index 3aa3c30..6ab2431 100644 --- a/src/xen/xen_hypervisor.c +++ b/src/xen/xen_hypervisor.c @@ -2780,6 +2780,28 @@ xenHypervisorDomainGetOSType (virDomainPtr dom) return strdup("linux"); } +int +xenHypervisorHasDomain(virConnectPtr conn, + int id) +{ + xenUnifiedPrivatePtr priv; + xen_getdomaininfo dominfo; + + priv = (xenUnifiedPrivatePtr) conn->privateData; + if (priv->handle < 0) + return 0; + + XEN_GETDOMAININFO_CLEAR(dominfo); + + if (virXen_getdomaininfo(priv->handle, id, &dominfo) < 0) + return 0; + + if (XEN_GETDOMAININFO_DOMAIN(dominfo) != id) + return 0; + + return 1; +} + virDomainPtr xenHypervisorLookupDomainByID(virConnectPtr conn, int id) diff --git a/src/xen/xen_hypervisor.h b/src/xen/xen_hypervisor.h index 766f676..5971a90 100644 --- a/src/xen/xen_hypervisor.h +++ b/src/xen/xen_hypervisor.h @@ -23,7 +23,9 @@ int xenHypervisorInit (void); virCapsPtr xenHypervisorMakeCapabilities (virConnectPtr conn); /* The following calls are made directly by the Xen proxy: */ - +int + xenHypervisorHasDomain(virConnectPtr conn, + int id); virDomainPtr xenHypervisorLookupDomainByID (virConnectPtr conn, int id); diff --git a/src/xen/xs_internal.c b/src/xen/xs_internal.c index 0fabcf8..c83cfda 100644 --- a/src/xen/xs_internal.c +++ b/src/xen/xs_internal.c @@ -545,8 +545,9 @@ int xenStoreNumOfDomains(virConnectPtr conn) { unsigned int num; - char **idlist; - int ret = -1; + char **idlist = NULL, *endptr; + int i, ret = -1, realnum = 0; + long id; xenUnifiedPrivatePtr priv; if (conn == NULL) { @@ -559,10 +560,22 @@ xenStoreNumOfDomains(virConnectPtr conn) virXenStoreError(conn, VIR_ERR_INVALID_ARG, __FUNCTION__); return(-1); } + idlist = xs_directory(priv->xshandle, 0, "/local/domain", &num); if (idlist) { - free(idlist); - ret = num; + for (i = 0; i < num; i++) { + id = strtol(idlist[i], &endptr, 10); + if ((endptr == idlist[i]) || (*endptr != 0)) + goto out; + + /* Sometimes xenstore has stale domain IDs, so filter + against the hypervisor's info */ + if (xenHypervisorHasDomain(conn, (int)id)) + realnum++; + } +out: + VIR_FREE (idlist); + ret = realnum; } return(ret); } @@ -579,7 +592,7 @@ xenStoreNumOfDomains(virConnectPtr conn) * Returns the number of domain found or -1 in case of error */ static int -xenStoreDoListDomains(xenUnifiedPrivatePtr priv, int *ids, int maxids) +xenStoreDoListDomains(virConnectPtr conn, xenUnifiedPrivatePtr priv, int *ids, int maxids) { char **idlist = NULL, *endptr; unsigned int num, i; @@ -597,7 +610,11 @@ xenStoreDoListDomains(xenUnifiedPrivatePtr priv, int *ids, int maxids) id = strtol(idlist[i], &endptr, 10); if ((endptr == idlist[i]) || (*endptr != 0)) goto out; - ids[ret++] = (int) id; + + /* Sometimes xenstore has stale domain IDs, so filter + against the hypervisor's info */ + if (xenHypervisorHasDomain(conn, (int)id)) + ids[ret++] = (int) id; } out: @@ -629,7 +646,7 @@ xenStoreListDomains(virConnectPtr conn, int *ids, int maxids) priv = (xenUnifiedPrivatePtr) conn->privateData; xenUnifiedLock(priv); - ret = xenStoreDoListDomains(priv, ids, maxids); + ret = xenStoreDoListDomains(conn, priv, ids, maxids); xenUnifiedUnlock(priv); return(ret); @@ -1275,7 +1292,7 @@ retry: virReportOOMError(NULL); return -1; } - nread = xenStoreDoListDomains(priv, new_domids, new_domain_cnt); + nread = xenStoreDoListDomains(conn, priv, new_domids, new_domain_cnt); if (nread != new_domain_cnt) { // mismatch. retry this read VIR_FREE(new_domids); @@ -1356,7 +1373,7 @@ retry: virReportOOMError(NULL); return -1; } - nread = xenStoreDoListDomains(priv, new_domids, new_domain_cnt); + nread = xenStoreDoListDomains(conn, priv, new_domids, new_domain_cnt); if (nread != new_domain_cnt) { // mismatch. retry this read VIR_FREE(new_domids); -- 1.6.2.5

On Fri, Oct 09, 2009 at 10:35:49AM +0100, Daniel P. Berrange wrote:
The xenstore database sometimes has stale domain IDs which are not present in the hypervisor anymore. Filter these out to avoid causing confusion
* src/xen/xs_internal.c: Filter domain IDs against HV's list * src/xen/xen_hypervisor.h, src/xen/xen_hypervisor.c: Add new xenHypervisorHasDomain() method for checking ID validity --- src/xen/xen_hypervisor.c | 22 ++++++++++++++++++++++ src/xen/xen_hypervisor.h | 4 +++- src/xen/xs_internal.c | 35 ++++++++++++++++++++++++++--------- 3 files changed, 51 insertions(+), 10 deletions(-)
diff --git a/src/xen/xen_hypervisor.c b/src/xen/xen_hypervisor.c index 3aa3c30..6ab2431 100644 --- a/src/xen/xen_hypervisor.c +++ b/src/xen/xen_hypervisor.c @@ -2780,6 +2780,28 @@ xenHypervisorDomainGetOSType (virDomainPtr dom) return strdup("linux"); }
+int +xenHypervisorHasDomain(virConnectPtr conn, + int id) +{ + xenUnifiedPrivatePtr priv; + xen_getdomaininfo dominfo; + + priv = (xenUnifiedPrivatePtr) conn->privateData; + if (priv->handle < 0) + return 0; + + XEN_GETDOMAININFO_CLEAR(dominfo); + + if (virXen_getdomaininfo(priv->handle, id, &dominfo) < 0) + return 0; + + if (XEN_GETDOMAININFO_DOMAIN(dominfo) != id) + return 0; + + return 1; +} + virDomainPtr xenHypervisorLookupDomainByID(virConnectPtr conn, int id) diff --git a/src/xen/xen_hypervisor.h b/src/xen/xen_hypervisor.h index 766f676..5971a90 100644 --- a/src/xen/xen_hypervisor.h +++ b/src/xen/xen_hypervisor.h @@ -23,7 +23,9 @@ int xenHypervisorInit (void); virCapsPtr xenHypervisorMakeCapabilities (virConnectPtr conn);
/* The following calls are made directly by the Xen proxy: */ - +int + xenHypervisorHasDomain(virConnectPtr conn, + int id); virDomainPtr xenHypervisorLookupDomainByID (virConnectPtr conn, int id); diff --git a/src/xen/xs_internal.c b/src/xen/xs_internal.c index 0fabcf8..c83cfda 100644 --- a/src/xen/xs_internal.c +++ b/src/xen/xs_internal.c @@ -545,8 +545,9 @@ int xenStoreNumOfDomains(virConnectPtr conn) { unsigned int num; - char **idlist; - int ret = -1; + char **idlist = NULL, *endptr; + int i, ret = -1, realnum = 0; + long id; xenUnifiedPrivatePtr priv;
if (conn == NULL) { @@ -559,10 +560,22 @@ xenStoreNumOfDomains(virConnectPtr conn) virXenStoreError(conn, VIR_ERR_INVALID_ARG, __FUNCTION__); return(-1); } + idlist = xs_directory(priv->xshandle, 0, "/local/domain", &num); if (idlist) { - free(idlist); - ret = num; + for (i = 0; i < num; i++) { + id = strtol(idlist[i], &endptr, 10); + if ((endptr == idlist[i]) || (*endptr != 0)) + goto out; + + /* Sometimes xenstore has stale domain IDs, so filter + against the hypervisor's info */ + if (xenHypervisorHasDomain(conn, (int)id)) + realnum++; + } +out: + VIR_FREE (idlist); + ret = realnum; } return(ret); } @@ -579,7 +592,7 @@ xenStoreNumOfDomains(virConnectPtr conn) * Returns the number of domain found or -1 in case of error */ static int -xenStoreDoListDomains(xenUnifiedPrivatePtr priv, int *ids, int maxids) +xenStoreDoListDomains(virConnectPtr conn, xenUnifiedPrivatePtr priv, int *ids, int maxids) { char **idlist = NULL, *endptr; unsigned int num, i; @@ -597,7 +610,11 @@ xenStoreDoListDomains(xenUnifiedPrivatePtr priv, int *ids, int maxids) id = strtol(idlist[i], &endptr, 10); if ((endptr == idlist[i]) || (*endptr != 0)) goto out; - ids[ret++] = (int) id; + + /* Sometimes xenstore has stale domain IDs, so filter + against the hypervisor's info */ + if (xenHypervisorHasDomain(conn, (int)id)) + ids[ret++] = (int) id; }
out: @@ -629,7 +646,7 @@ xenStoreListDomains(virConnectPtr conn, int *ids, int maxids) priv = (xenUnifiedPrivatePtr) conn->privateData;
xenUnifiedLock(priv); - ret = xenStoreDoListDomains(priv, ids, maxids); + ret = xenStoreDoListDomains(conn, priv, ids, maxids); xenUnifiedUnlock(priv);
return(ret); @@ -1275,7 +1292,7 @@ retry: virReportOOMError(NULL); return -1; } - nread = xenStoreDoListDomains(priv, new_domids, new_domain_cnt); + nread = xenStoreDoListDomains(conn, priv, new_domids, new_domain_cnt); if (nread != new_domain_cnt) { // mismatch. retry this read VIR_FREE(new_domids); @@ -1356,7 +1373,7 @@ retry: virReportOOMError(NULL); return -1; } - nread = xenStoreDoListDomains(priv, new_domids, new_domain_cnt); + nread = xenStoreDoListDomains(conn, priv, new_domids, new_domain_cnt); if (nread != new_domain_cnt) { // mismatch. retry this read VIR_FREE(new_domids);
Okay so we filter out though an hypervisor call instead of slow xend lookup, ACK ! Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

On Friday 09 October 2009 11:35:47 Daniel P. Berrange wrote:
These patches are a re-post of Jonas Eriksson's work from a couple of months ago, that we mistakenly forgot to commit.
The first patch is unchanged, apart from being adjusted for new source code layout. The second patch is changed so that it checks against the hypervisor, not Xend, since using Xend has a very serious performance impact (as much as 1 second per domain queried has been seen)
Looks very good so far, stable and fast! Thanks a lot, -t-

On Fri, Oct 09, 2009 at 10:35:47AM +0100, Daniel P. Berrange wrote:
These patches are a re-post of Jonas Eriksson's work from a couple of months ago, that we mistakenly forgot to commit.
The first patch is unchanged, apart from being adjusted for new source code layout. The second patch is changed so that it checks against the hypervisor, not Xend, since using Xend has a very serious performance impact (as much as 1 second per domain queried has been seen)
For some reason I forgot to commit this, yet again. So it really is pushed now Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
participants (3)
-
Daniel P. Berrange
-
Daniel Veillard
-
Thomas Treutner