[libvirt] XenStore fix

Hi, I have been examining a bug where libvirtd (and virsh) does not show all virtual machines on a xen host. This proved to be because of this program flow: 1. virConnectNumOfDomains -> .. -> xenUnifiedNumOfDomains -> xenHypervisorNumOfDomains => 3 2. virConnectListDomains(max=3) -> .. -> xenUnifiedListDomains(max=3) -> xenStoreNumOfDomains(max=3) => { 0, 2, 7 } The domain with ID 2 is then removed when it is discovered that it is not a running domain, which leads to this: xenhost# xm list Name ID Mem VCPUs State Time(s) Domain-0 0 14970 2 r----- 2544.7 vm1 7 512 1 -b---- 2191.7 vm4 512 1 28.0 vm5 12 512 1 -b---- 467.1 vm6 512 1 0.0 vm7 512 1 482.4 xenhost# virsh list Id Name State ---------------------------------- 0 Domain-0 running 7 vm1 idle xenhost# But where does "2" come from? If we check all "directories" in /local/domain which is queried by the xenstore driver, it is apparent that xenstore is not properly cleaned. We find the sequence {0, 2, 7} as the first entries: xenhost# xenstore ls /local/domain |grep '^[^ ]' 0 = "" 2 = "" 7 = "" 9 = "" 10 = "" 11 = "" 12 = "" xenhost# This patch checks that the path found in /local/domain/<domid>/vm exists in xenstore before adding the domid to the return list. The same thing is done for xenStoreNumOfDomains. I use SLES11 with Xen 3.3.1_18546_12-3.1. /Jonas

Make sure that the domains found in xenstore are active when reporting the number of domains and when reporting which domains that are active. --- src/xs_internal.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++------- 1 files changed, 46 insertions(+), 7 deletions(-) diff --git a/src/xs_internal.c b/src/xs_internal.c index 1f54b1f..5b0ce1e 100644 --- a/src/xs_internal.c +++ b/src/xs_internal.c @@ -542,9 +542,10 @@ xenStoreDomainGetMaxMemory(virDomainPtr domain) int xenStoreNumOfDomains(virConnectPtr conn) { - unsigned int num; + unsigned int num, realnum, len; char **idlist; - int ret = -1; + char *ldpath = NULL, *vmpath = NULL, *vmvalue = NULL; + int i, ret = -1; xenUnifiedPrivatePtr priv; if (conn == NULL) { @@ -558,9 +559,29 @@ xenStoreNumOfDomains(virConnectPtr conn) return(-1); } idlist = xs_directory(priv->xshandle, 0, "/local/domain", &num); + + /* Check which of the found domains that are running */ + realnum = num; + for (i = 0; i < num; i++) { + if (asprintf(&ldpath, "/local/domain/%s/vm", idlist[i]) < 0) { + virReportOOMError(NULL); + return -1; + } + vmpath = xs_read (priv->xshandle, 0, ldpath, &len); + if (len > 0) + vmvalue = xs_read (priv->xshandle, 0, vmpath, &len); + + if (vmvalue == NULL) + realnum--; + + VIR_FREE(ldpath); + VIR_FREE(vmpath); + VIR_FREE(vmvalue); + } + if (idlist) { free(idlist); - ret = num; + ret = realnum; } return(ret); } @@ -580,7 +601,8 @@ static int xenStoreDoListDomains(xenUnifiedPrivatePtr priv, int *ids, int maxids) { char **idlist = NULL, *endptr; - unsigned int num, i; + char *vmpath = NULL, *vmvalue = NULL, *ldpath = NULL; + unsigned int num, i, len; int ret = -1; long id; @@ -592,14 +614,31 @@ xenStoreDoListDomains(xenUnifiedPrivatePtr priv, int *ids, int maxids) goto out; for (ret = 0, i = 0; (i < num) && (ret < maxids); i++) { - id = strtol(idlist[i], &endptr, 10); - if ((endptr == idlist[i]) || (*endptr != 0)) + if (asprintf(&ldpath, "/local/domain/%s/vm", idlist[i]) < 0) { + virReportOOMError(NULL); goto out; - ids[ret++] = (int) id; + } + vmpath = xs_read (priv->xshandle, 0, ldpath, &len); + if (len > 0) + vmvalue = xs_read (priv->xshandle, 0, vmpath, &len); + + if (vmvalue != NULL) { + id = strtol(idlist[i], &endptr, 10); + if ((endptr == idlist[i]) || (*endptr != 0)) + goto out; + ids[ret++] = (int) id; + } + + VIR_FREE (ldpath); + VIR_FREE (vmpath); + VIR_FREE (vmvalue); } out: VIR_FREE (idlist); + VIR_FREE (ldpath); + VIR_FREE (vmpath); + VIR_FREE (vmvalue); return ret; } -- 1.6.2

On Tue, Jul 28, 2009 at 08:30:12AM +0200, Jonas Eriksson wrote:
Hi,
I have been examining a bug where libvirtd (and virsh) does not show all virtual machines on a xen host. This proved to be because of this program flow: 1. virConnectNumOfDomains -> .. -> xenUnifiedNumOfDomains -> xenHypervisorNumOfDomains => 3 2. virConnectListDomains(max=3) -> .. -> xenUnifiedListDomains(max=3) -> xenStoreNumOfDomains(max=3) => { 0, 2, 7 }
Actually the problem I see is that ListDomains should really go through the Hypervisor API i.e. xenHypervisorListDomains(), which is *way* faster and garanteed to be acurate. We should try the hypervisor first, IMHO, the function code was modified end of last year to avoid Xend not properly cleaning up: http://www.mail-archive.com/libvir-list@redhat.com/msg09855.html The problem is that we have put xenstore driver call first, while it's clearly slower and has a higher chance of getting things wrong than the hypervisor itself (if the HV get it wrong I guess there is no cure :-) Could you try changing xenUnifiedListDomains() and make the xenHypervisorListDomains try first, then check if it still works with 3.3, if yes that's even better. Now the patch looks reasonable to me but I think the reorder should be done too if this works, and will lead to really good results ... as long as the hypercall works. thanks ! Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

On Tue, Jul 28, 2009 at 10:15:10AM +0200 Daniel Veillard wrote:
On Tue, Jul 28, 2009 at 08:30:12AM +0200, Jonas Eriksson wrote:
Hi,
I have been examining a bug where libvirtd (and virsh) does not show all virtual machines on a xen host. This proved to be because of this program flow: 1. virConnectNumOfDomains -> .. -> xenUnifiedNumOfDomains -> xenHypervisorNumOfDomains => 3 2. virConnectListDomains(max=3) -> .. -> xenUnifiedListDomains(max=3) -> xenStoreNumOfDomains(max=3) => { 0, 2, 7 }
Actually the problem I see is that ListDomains should really go through the Hypervisor API i.e. xenHypervisorListDomains(), which is *way* faster and garanteed to be acurate. We should try the hypervisor first, IMHO, the function code was modified end of last year to avoid Xend not properly cleaning up:
http://www.mail-archive.com/libvir-list@redhat.com/msg09855.html
Ah, yes.. This seems very related. I remember seeing this earlier but never had the time to look into it back then.
The problem is that we have put xenstore driver call first, while it's clearly slower and has a higher chance of getting things wrong than the hypervisor itself (if the HV get it wrong I guess there is no cure :-)
Could you try changing xenUnifiedListDomains() and make the xenHypervisorListDomains try first, then check if it still works with 3.3, if yes that's even better. Now the patch looks reasonable to me but I think the reorder should be done too if this works, and will lead to really good results ... as long as the hypercall works.
Yes, this works. I tried with the reordering in xenUnifiedListDomains, both with and without my patch to make sure that libvirt used the Hypervisor interface instead of XenStore. Do you want a revised patch for this? In that case, I think that XenStore should be last in the prio-list due to both the performance issues, not helped by my patch, and because it behaves..differently. /Jonas -- Jonas Eriksson Consultant at AS/EAB/FLJ/IL Phone: +46 8 58086281 Combitech AB Älvsjö, Sweden

On Tue, Jul 28, 2009 at 10:15:10AM +0200, Daniel Veillard wrote:
On Tue, Jul 28, 2009 at 08:30:12AM +0200, Jonas Eriksson wrote:
Hi,
I have been examining a bug where libvirtd (and virsh) does not show all virtual machines on a xen host. This proved to be because of this program flow: 1. virConnectNumOfDomains -> .. -> xenUnifiedNumOfDomains -> xenHypervisorNumOfDomains => 3 2. virConnectListDomains(max=3) -> .. -> xenUnifiedListDomains(max=3) -> xenStoreNumOfDomains(max=3) => { 0, 2, 7 }
Actually the problem I see is that ListDomains should really go through the Hypervisor API i.e. xenHypervisorListDomains(), which is *way* faster and garanteed to be acurate. We should try the hypervisor first, IMHO, the function code was modified end of last year to avoid Xend not properly cleaning up:
Going to the Hypervisor first will just re-introduce the bug shown in this mail you quote:
http://www.mail-archive.com/libvir-list@redhat.com/msg09855.html
ie, the Hypervisor will report a domain exists, while XenD will claim it wouldn't, and thus virsh will print out lots of errors libvir: Xen Daemon error : GET operation failed: xend_get: error from xen
The problem is that we have put xenstore driver call first, while it's clearly slower and has a higher chance of getting things wrong than the hypervisor itself (if the HV get it wrong I guess there is no cure :-)
The problem here is that there are 2 definitions of 'right'. - The hypervisor reports all guest domains - XenD reports all guest domains that it knows about We have to ask XenD for the guest configuration later, so if we get the list of domain IDs from the HV, it is inevitable that we will get errors from XenD for some domains. If we are to avoid errors then we need to get a list of domain IDs that matches XenD's view. We can't ask XenD directly because that is insanely slow, so XenStore is the next best, however, that seems to have some domains that have gone away. I think part of the problem is that this cannot be solved with a simple prioritization of HV, XenStored, XenD. We need to implement something that combines information from xenstore & HV. - Get list of domain IDs from XenStore. - Remove any domain IDs from this list that don't exist in the HyperVisor And crucially, 'numOfDomains' API has to follow same logic as the 'listDomains' API. Currently numOfDomains goes to HV, while listDomains goes to XenStore which is a nasty mist-match This should ensure we don't get any domain IDs from the HV that XenD has stopped reporting, and it should also ensure we don't report stale domain IDs from xenstore. Regards, Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On Tue, Jul 28, 2009 at 09:53:46AM +0100, Daniel P. Berrange wrote:
On Tue, Jul 28, 2009 at 10:15:10AM +0200, Daniel Veillard wrote:
On Tue, Jul 28, 2009 at 08:30:12AM +0200, Jonas Eriksson wrote:
Hi,
I have been examining a bug where libvirtd (and virsh) does not show all virtual machines on a xen host. This proved to be because of this program flow: 1. virConnectNumOfDomains -> .. -> xenUnifiedNumOfDomains -> xenHypervisorNumOfDomains => 3 2. virConnectListDomains(max=3) -> .. -> xenUnifiedListDomains(max=3) -> xenStoreNumOfDomains(max=3) => { 0, 2, 7 }
Actually the problem I see is that ListDomains should really go through the Hypervisor API i.e. xenHypervisorListDomains(), which is *way* faster and garanteed to be acurate. We should try the hypervisor first, IMHO, the function code was modified end of last year to avoid Xend not properly cleaning up:
Going to the Hypervisor first will just re-introduce the bug shown in this mail you quote:
http://www.mail-archive.com/libvir-list@redhat.com/msg09855.html
ie, the Hypervisor will report a domain exists, while XenD will claim it wouldn't, and thus virsh will print out lots of errors
libvir: Xen Daemon error : GET operation failed: xend_get: error from xen
The problem is that we have put xenstore driver call first, while it's clearly slower and has a higher chance of getting things wrong than the hypervisor itself (if the HV get it wrong I guess there is no cure :-)
The problem here is that there are 2 definitions of 'right'.
- The hypervisor reports all guest domains
As far as I know it's the hypervisor which allocates resources
- XenD reports all guest domains that it knows about
and xend/xenstore layers are just there for the management. IMHO if the hypervisor allocates ressources to a domain, that should be reported, even if xend or xenstore is confused.
We have to ask XenD for the guest configuration later, so if we get the list of domain IDs from the HV, it is inevitable that we will get errors from XenD for some domains. If we are to avoid
Well then that's IMHO indicative of Xend problems. Rogue domains hidden from Xend or xenstore for some reason should be reported even if incomplete status is given.
I think part of the problem is that this cannot be solved with a simple prioritization of HV, XenStored, XenD. We need to implement something that combines information from xenstore & HV.
- Get list of domain IDs from XenStore. - Remove any domain IDs from this list that don't exist in the HyperVisor
Considering the various possible problem in the XenStore or XenD userland, if someone finds a way to hide his domain from them it would get 0 accounting as a result, if you consider hosting companies and similar use case, I really would prefer some kind of error popping out when running domains aren't listed in the management layer than silently ignoring them.
And crucially, 'numOfDomains' API has to follow same logic as the 'listDomains' API. Currently numOfDomains goes to HV, while listDomains goes to XenStore which is a nasty mist-match
having the same loging on both side makes sense.
This should ensure we don't get any domain IDs from the HV that XenD has stopped reporting, and it should also ensure we don't report stale domain IDs from xenstore.
The real question is why xend should stop reporting about domains that are still present at the hypervisor level. Except for state transitions taht smells fishy to me ! Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

On 07/28/09 12:28, Daniel Veillard wrote:
Going to the Hypervisor first will just re-introduce the bug shown in this mail you quote:
http://www.mail-archive.com/libvir-list@redhat.com/msg09855.html ie, the Hypervisor will report a domain exists, while XenD will claim it wouldn't, and thus virsh will print out lots of errors
The real question is why xend should stop reporting about domains that are still present at the hypervisor level. Except for state transitions taht smells fishy to me !
One case where this happens is domain shutdown. xend shuts down the domain, cleans xenstore. xen can't zap all pages owned by the domain because some pages are still referenced. xen will report the domain until all references are gone, although the management knows nothing about it. Usually it takes only a very short time until all references are dropped (basically the backends have to notice the frontend is gone and release all grant mappings). It may take longer or not happen at all though. Which indeed indicates a bug somewhere. Still it may happen. I suspect you can trigger it using "kill -STOP xenconsoled". cheers, Gerd

On Tue, Jul 28, 2009 at 08:30:12AM +0200, Jonas Eriksson wrote:
Hi,
I have been examining a bug where libvirtd (and virsh) does not show all virtual machines on a xen host. This proved to be because of this program flow: 1. virConnectNumOfDomains -> .. -> xenUnifiedNumOfDomains -> xenHypervisorNumOfDomains => 3 2. virConnectListDomains(max=3) -> .. -> xenUnifiedListDomains(max=3) -> xenStoreNumOfDomains(max=3) => { 0, 2, 7 }
That in itself is a nice indication of trouble - xenUnifiedNumOfDomains and xenUnifiedListDomains should both follow the exact same logic. At least part of your problem is due to one using the HV first, while the other uses XenStore - they need to be consistent Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On Tue, Jul 28, 2009 at 09:54:53AM +0100 Daniel P. Berrange wrote:
On Tue, Jul 28, 2009 at 08:30:12AM +0200, Jonas Eriksson wrote:
Hi,
I have been examining a bug where libvirtd (and virsh) does not show all virtual machines on a xen host. This proved to be because of this program flow: 1. virConnectNumOfDomains -> .. -> xenUnifiedNumOfDomains -> xenHypervisorNumOfDomains => 3 2. virConnectListDomains(max=3) -> .. -> xenUnifiedListDomains(max=3) -> xenStoreNumOfDomains(max=3) => { 0, 2, 7 }
That in itself is a nice indication of trouble - xenUnifiedNumOfDomains and xenUnifiedListDomains should both follow the exact same logic. At least part of your problem is due to one using the HV first, while the other uses XenStore - they need to be consistent
- Get list of domain IDs from XenStore. - Remove any domain IDs from this list that don't exist in the HyperVisor This would mean that this functionality should be implemented in
I agree that both xenUnifiedNumOfDomains and xenUnifiedListDomains really should use the same logic. We should however not forget that my fix addresses the case when xenStoreNumOfDomains and xenStoreListDomains gets confused due to old data left in XenStore by verifying the data agains other parts of XenStore. In some other thread an implementation like this were suggested: the Xen unified driver, correct? Plus, in order to get the same logic in both numOfDomains as listDomains, we would have to implement the same thing there. As for the discussion about rouge VMs, I must admit that I found it to be a bit strange. Either we can trust XenStore, which will mean that it should not be any problems in verifying that the data is valid through XenStore as well, or am I missing some case here? If we cannot trust XenStore, should it be used at all? Best regards, Jonas -- Jonas Eriksson Consultant at AS/EAB/FLJ/IL Combitech AB Älvsjö, Sweden

On Tue, Jul 28, 2009 at 08:30:12AM +0200 Jonas Eriksson wrote:
Hi,
I have been examining a bug where libvirtd (and virsh) does not show all virtual machines on a xen host. This proved to be because of this program flow: 1. virConnectNumOfDomains -> .. -> xenUnifiedNumOfDomains -> xenHypervisorNumOfDomains => 3 2. virConnectListDomains(max=3) -> .. -> xenUnifiedListDomains(max=3) -> xenStoreNumOfDomains(max=3) => { 0, 2, 7 }
The last call is of corse to xenStoreListDomains, nothing else. /Jonas -- Jonas Eriksson Consultant at AS/EAB/FLJ/IL Combitech AB Älvsjö, Sweden

Hi again, How should we tackle this? The discussions seems to have stopped, but as this is a bug, it would be nice to have the fix in 0.7.0 if possible. Any suggestions of what an eveutual reworked patch should contain, or is this in need of a discussion better saved to a time where a release is not right around the corner? Regards, Jonas On Tue, Jul 28, 2009 at 08:30:12AM +0200 Jonas Eriksson wrote:
Hi,
I have been examining a bug where libvirtd (and virsh) does not show all virtual machines on a xen host. This proved to be because of this program flow: 1. virConnectNumOfDomains -> .. -> xenUnifiedNumOfDomains -> xenHypervisorNumOfDomains => 3 2. virConnectListDomains(max=3) -> .. -> xenUnifiedListDomains(max=3) -> xenStoreNumOfDomains(max=3) => { 0, 2, 7 } The domain with ID 2 is then removed when it is discovered that it is not a running domain, which leads to this:
xenhost# xm list Name ID Mem VCPUs State Time(s) Domain-0 0 14970 2 r----- 2544.7 vm1 7 512 1 -b---- 2191.7 vm4 512 1 28.0 vm5 12 512 1 -b---- 467.1 vm6 512 1 0.0 vm7 512 1 482.4 xenhost# virsh list Id Name State ---------------------------------- 0 Domain-0 running 7 vm1 idle xenhost#
But where does "2" come from? If we check all "directories" in /local/domain which is queried by the xenstore driver, it is apparent that xenstore is not properly cleaned. We find the sequence {0, 2, 7} as the first entries: xenhost# xenstore ls /local/domain |grep '^[^ ]' 0 = "" 2 = "" 7 = "" 9 = "" 10 = "" 11 = "" 12 = "" xenhost#
This patch checks that the path found in /local/domain/<domid>/vm exists in xenstore before adding the domid to the return list. The same thing is done for xenStoreNumOfDomains.
I use SLES11 with Xen 3.3.1_18546_12-3.1.
/Jonas
-- Libvir-list mailing list Libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
-- Jonas Eriksson Consultant at AS/EAB/FLJ/IL Combitech AB Älvsjö, Sweden

On Thu, Jul 30, 2009 at 01:51:40PM +0200, Jonas Eriksson wrote:
Hi again,
How should we tackle this? The discussions seems to have stopped, but as this is a bug, it would be nice to have the fix in 0.7.0 if possible. Any suggestions of what an eveutual reworked patch should contain, or is this in need of a discussion better saved to a time where a release is not right around the corner?
I think we need todo 2 patches - Make xenUnifiedNumOfDomains use the same ordering as xenUnifiedListDomains - Make xenStoreNumOfDomains/xenStoreListDomains filter out any domain IDs they find that aren't visible in the hypervisor. The latter would entail xs_internal.c, calling into xen_internal.c but I don't consider that a problem really Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Something like this works for me. Did test each patch individualy as well. Had to do a fix to xenDaemonDomainLookupByID as it's behaviour did not reflect the documentation. As this is the only time it is called with uuid=NULL that I can see, I included this in the same patch.

On Thu, Jul 30, 2009 at 05:15:09PM +0200, Jonas Eriksson wrote:
Something like this works for me. Did test each patch individualy as well. Had to do a fix to xenDaemonDomainLookupByID as it's behaviour did not reflect the documentation. As this is the only time it is called with uuid=NULL that I can see, I included this in the same patch.
Okay, I still find a bit annoying that the hypervisor doesn't have the last word on what's running or not, but I'm fine being in the minority. In any case the UUID check should be commited it's a real bug, ACK, thanks ! Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

Hi, just woundering of the status of this patch in good time before 0.7.1. Regards, Jonas On Thu, Jul 30, 2009 at 05:35:24PM +0200 Daniel Veillard wrote:
On Thu, Jul 30, 2009 at 05:15:09PM +0200, Jonas Eriksson wrote:
Something like this works for me. Did test each patch individualy as well. Had to do a fix to xenDaemonDomainLookupByID as it's behaviour did not reflect the documentation. As this is the only time it is called with uuid=NULL that I can see, I included this in the same patch.
Okay, I still find a bit annoying that the hypervisor doesn't have the last word on what's running or not, but I'm fine being in the minority. In any case the UUID check should be commited it's a real bug,
ACK,
thanks !
Daniel
-- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/
-- Jonas Eriksson Consultant at AS/EAB/FLJ/IL Combitech AB Älvsjö, Sweden

src/xen_unified.c: xenUnifiedNumOfDomains used backends in same order as array priv->opened. Changed this to reflect the priorities used in xenUnifiedListDomains --- src/proxy_internal.c | 3 +-- src/proxy_internal.h | 1 + src/xen_unified.c | 30 ++++++++++++++++++++++++------ src/xend_internal.c | 2 +- src/xend_internal.h | 1 + 5 files changed, 28 insertions(+), 9 deletions(-) diff --git a/src/proxy_internal.c b/src/proxy_internal.c index 5b92ad8..8d8877a 100644 --- a/src/proxy_internal.c +++ b/src/proxy_internal.c @@ -40,7 +40,6 @@ static virDrvOpenStatus xenProxyOpen(virConnectPtr conn, virConnectAuthPtr auth, static int xenProxyGetVersion(virConnectPtr conn, unsigned long *hvVer); static int xenProxyNodeGetInfo(virConnectPtr conn, virNodeInfoPtr info); static char *xenProxyGetCapabilities(virConnectPtr conn); -static int xenProxyNumOfDomains(virConnectPtr conn); static unsigned long xenProxyDomainGetMaxMemory(virDomainPtr domain); static int xenProxyDomainGetInfo(virDomainPtr domain, virDomainInfoPtr info); static char *xenProxyDomainGetOSType(virDomainPtr domain); @@ -604,7 +603,7 @@ xenProxyListDomains(virConnectPtr conn, int *ids, int maxids) * * Returns the number of domain found or -1 in case of error */ -static int +int xenProxyNumOfDomains(virConnectPtr conn) { virProxyPacket req; diff --git a/src/proxy_internal.h b/src/proxy_internal.h index 185fa67..19df751 100644 --- a/src/proxy_internal.h +++ b/src/proxy_internal.h @@ -96,4 +96,5 @@ extern char * xenProxyDomainDumpXML(virDomainPtr domain, int flags); extern int xenProxyListDomains(virConnectPtr conn, int *ids, int maxids); +extern int xenProxyNumOfDomains(virConnectPtr conn); #endif /* __LIBVIR_PROXY_H__ */ diff --git a/src/xen_unified.c b/src/xen_unified.c index f2ffc25..c04ea37 100644 --- a/src/xen_unified.c +++ b/src/xen_unified.c @@ -576,13 +576,31 @@ static int xenUnifiedNumOfDomains (virConnectPtr conn) { GET_PRIVATE(conn); - int i, ret; + int ret; - for (i = 0; i < XEN_UNIFIED_NR_DRIVERS; ++i) - if (priv->opened[i] && drivers[i]->numOfDomains) { - ret = drivers[i]->numOfDomains (conn); - if (ret >= 0) return ret; - } + /* Try xenstore. */ + if (priv->opened[XEN_UNIFIED_XS_OFFSET]) { + ret = xenStoreNumOfDomains (conn); + if (ret >= 0) return ret; + } + + /* Try HV. */ + if (priv->opened[XEN_UNIFIED_HYPERVISOR_OFFSET]) { + ret = xenHypervisorNumOfDomains (conn); + if (ret >= 0) return ret; + } + + /* Try xend. */ + if (priv->opened[XEN_UNIFIED_XEND_OFFSET]) { + ret = xenDaemonNumOfDomains (conn); + if (ret >= 0) return ret; + } + + /* Try proxy. */ + if (priv->opened[XEN_UNIFIED_PROXY_OFFSET]) { + ret = xenProxyNumOfDomains (conn); + if (ret >= 0) return ret; + } return -1; } diff --git a/src/xend_internal.c b/src/xend_internal.c index e23ae2b..867f85f 100644 --- a/src/xend_internal.c +++ b/src/xend_internal.c @@ -3678,7 +3678,7 @@ error: * * Returns the number of domain found or -1 in case of error */ -static int +int xenDaemonNumOfDomains(virConnectPtr conn) { struct sexpr *root = NULL; diff --git a/src/xend_internal.h b/src/xend_internal.h index 9d2571b..8b00737 100644 --- a/src/xend_internal.h +++ b/src/xend_internal.h @@ -187,5 +187,6 @@ int xenDaemonDomainMigratePerform (virDomainPtr domain, const char *cookie, int int xenDaemonDomainBlockPeek (virDomainPtr domain, const char *path, unsigned long long offset, size_t size, void *buffer); int xenDaemonListDomains(virConnectPtr conn, int *ids, int maxids); +int xenDaemonNumOfDomains(virConnectPtr conn); #endif /* __XEND_INTERNAL_H_ */ -- 1.6.2

* src/xend_internal.c: xenDaemonDomainLookupByID said to accept that uuid == NULL, but would gladly dereference it without checking this. * src/xs_internal.c: xenStoreNumOfDomains and xenStoreListDomains now make sure that the domains that are reported as existing also exist in xend. xenStoreDoListDomains needed to be modified to handle the virConnectPtr to be able to call xenDaemonDomainLookupByID. --- src/xend_internal.c | 5 +++-- src/xs_internal.c | 36 ++++++++++++++++++++++++++---------- 2 files changed, 29 insertions(+), 12 deletions(-) diff --git a/src/xend_internal.c b/src/xend_internal.c index 867f85f..fad5e60 100644 --- a/src/xend_internal.c +++ b/src/xend_internal.c @@ -1042,7 +1042,8 @@ xenDaemonDomainLookupByID(virConnectPtr xend, const char *name = NULL; struct sexpr *root; - memset(uuid, 0, VIR_UUID_BUFLEN); + if (uuid != NULL) + memset(uuid, 0, VIR_UUID_BUFLEN); root = sexpr_get(xend, "/xend/domain/%d?detail=1", id); if (root == NULL) @@ -1057,7 +1058,7 @@ xenDaemonDomainLookupByID(virConnectPtr xend, if (domname) *domname = strdup(name); - if (sexpr_uuid(uuid, root, "domain/uuid") < 0) { + if (uuid != NULL && sexpr_uuid(uuid, root, "domain/uuid") < 0) { virXendError(xend, VIR_ERR_INTERNAL_ERROR, "%s", _("domain information incomplete, missing uuid")); goto error; diff --git a/src/xs_internal.c b/src/xs_internal.c index 1f54b1f..8a7516e 100644 --- a/src/xs_internal.c +++ b/src/xs_internal.c @@ -36,6 +36,7 @@ #include "xen_unified.h" #include "xs_internal.h" #include "xen_internal.h" +#include "xend_internal.h" #define VIR_FROM_THIS VIR_FROM_XEN @@ -543,8 +544,9 @@ int xenStoreNumOfDomains(virConnectPtr conn) { unsigned int num; - char **idlist; - int ret = -1; + char **idlist = NULL, *endptr; + int i, r, ret = -1, realnum = 0; + long id; xenUnifiedPrivatePtr priv; if (conn == NULL) { @@ -557,10 +559,21 @@ xenStoreNumOfDomains(virConnectPtr conn) virXenStoreError(conn, VIR_ERR_INVALID_ARG, __FUNCTION__); return(-1); } + idlist = xs_directory(priv->xshandle, 0, "/local/domain", &num); if (idlist) { - free(idlist); - ret = num; + for (i = 0; i < num; i++) { + id = strtol(idlist[i], &endptr, 10); + if ((endptr == idlist[i]) || (*endptr != 0)) + goto out; + + r = xenDaemonDomainLookupByID(conn, (int) id, NULL, NULL); + if (r == 0) + realnum++; + } +out: + VIR_FREE (idlist); + ret = realnum; } return(ret); } @@ -577,11 +590,11 @@ xenStoreNumOfDomains(virConnectPtr conn) * Returns the number of domain found or -1 in case of error */ static int -xenStoreDoListDomains(xenUnifiedPrivatePtr priv, int *ids, int maxids) +xenStoreDoListDomains(virConnectPtr conn, xenUnifiedPrivatePtr priv, int *ids, int maxids) { char **idlist = NULL, *endptr; unsigned int num, i; - int ret = -1; + int r, ret = -1; long id; if (priv->xshandle == NULL) @@ -595,7 +608,10 @@ xenStoreDoListDomains(xenUnifiedPrivatePtr priv, int *ids, int maxids) id = strtol(idlist[i], &endptr, 10); if ((endptr == idlist[i]) || (*endptr != 0)) goto out; - ids[ret++] = (int) id; + + r = xenDaemonDomainLookupByID(conn, (int) id, NULL, NULL); + if (r == 0) + ids[ret++] = (int) id; } out: @@ -627,7 +643,7 @@ xenStoreListDomains(virConnectPtr conn, int *ids, int maxids) priv = (xenUnifiedPrivatePtr) conn->privateData; xenUnifiedLock(priv); - ret = xenStoreDoListDomains(priv, ids, maxids); + ret = xenStoreDoListDomains(conn, priv, ids, maxids); xenUnifiedUnlock(priv); return(ret); @@ -1276,7 +1292,7 @@ retry: virReportOOMError(NULL); return -1; } - nread = xenStoreDoListDomains(priv, new_domids, new_domain_cnt); + nread = xenStoreDoListDomains(conn, priv, new_domids, new_domain_cnt); if (nread != new_domain_cnt) { // mismatch. retry this read VIR_FREE(new_domids); @@ -1357,7 +1373,7 @@ retry: virReportOOMError(NULL); return -1; } - nread = xenStoreDoListDomains(priv, new_domids, new_domain_cnt); + nread = xenStoreDoListDomains(conn, priv, new_domids, new_domain_cnt); if (nread != new_domain_cnt) { // mismatch. retry this read VIR_FREE(new_domids); -- 1.6.2
participants (4)
-
Daniel P. Berrange
-
Daniel Veillard
-
Gerd Hoffmann
-
Jonas Eriksson