On Tue, Jul 28, 2009 at 10:15:10AM +0200 Daniel Veillard wrote:
On Tue, Jul 28, 2009 at 08:30:12AM +0200, Jonas Eriksson wrote:
> Hi,
>
> I have been examining a bug where libvirtd (and virsh) does not show
> all virtual machines on a xen host. This proved to be because of this
> program flow:
> 1. virConnectNumOfDomains -> .. -> xenUnifiedNumOfDomains
> -> xenHypervisorNumOfDomains => 3
> 2. virConnectListDomains(max=3) -> .. -> xenUnifiedListDomains(max=3)
> -> xenStoreNumOfDomains(max=3) => { 0, 2, 7 }
Actually the problem I see is that ListDomains should really go
through the Hypervisor API, i.e. xenHypervisorListDomains(), which
is *way* faster and guaranteed to be accurate. We should try the
hypervisor first, IMHO. The function code was modified at the end of
last year to work around Xend not cleaning up properly:
http://www.mail-archive.com/libvir-list@redhat.com/msg09855.html
Ah, yes.. This seems very related. I remember seeing this earlier
but never had the time to look into it back then.
The problem is that we have put the xenstore driver call first, while
it's clearly slower and has a higher chance of getting things wrong than
the hypervisor itself (if the HV gets it wrong I guess there is no cure :-)
Could you try changing xenUnifiedListDomains() so that it tries
xenHypervisorListDomains() first, then check whether it still works with
3.3? If yes, that's even better. Now the patch looks reasonable to me,
but I think the reorder should be done too if this works, and it will lead
to really good results ... as long as the hypercall works.
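For illustration only, the reordering suggested above could be sketched
roughly as below. This is a self-contained toy, not the libvirt code: the
list_domains_fn type, the toy_* backends and unified_list_domains() are
hypothetical names invented for this sketch, and the convention that a
backend returns -1 on failure is an assumption. The only point is the
ordering: the caller passes the hypervisor backend first and xenstore last,
and the first backend that succeeds wins.

```c
#include <stddef.h>

/* Hypothetical backend signature (an assumption for this sketch):
 * fill ids[0..max-1] and return the number of ids written,
 * or -1 if the backend is unavailable or fails. */
typedef int (*list_domains_fn)(int *ids, int max);

/* Toy stand-in for the hypervisor backend: reports domains 0, 2 and 7. */
static int toy_hypervisor_list(int *ids, int max)
{
    static const int running[] = { 0, 2, 7 };
    int n = 0;
    for (size_t i = 0; i < sizeof(running) / sizeof(running[0]) && n < max; i++)
        ids[n++] = running[i];
    return n;
}

/* Toy stand-in for a slower fallback backend: only knows about domain 0. */
static int toy_xenstore_list(int *ids, int max)
{
    if (max < 1)
        return 0;
    ids[0] = 0;
    return 1;
}

/* Toy backend that always fails, used to exercise the fallback path. */
static int toy_failing_list(int *ids, int max)
{
    (void)ids;
    (void)max;
    return -1;
}

/* Try each backend in the order given and return the result of the
 * first one that succeeds; return -1 if every backend fails. */
static int unified_list_domains(const list_domains_fn *backends, size_t nbackends,
                                int *ids, int max)
{
    for (size_t i = 0; i < nbackends; i++) {
        int ret = backends[i](ids, max);
        if (ret >= 0)
            return ret;
    }
    return -1;
}
```

Calling unified_list_domains() with { toy_hypervisor_list, toy_xenstore_list }
uses the hypervisor result and never touches the fallback; replacing the
first entry with toy_failing_list falls through to the second backend.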
Yes, this works. I tried with the reordering in
xenUnifiedListDomains, both with and without my patch to make
sure that libvirt used the Hypervisor interface instead of
XenStore. Do you want a revised patch for this? In that case, I
think that XenStore should be last in the priority list, both because
of its performance issues, which my patch does not help, and because
it behaves... differently.
/Jonas
--
Jonas Eriksson
Consultant at AS/EAB/FLJ/IL
Phone: +46 8 58086281
Combitech AB
Älvsjö, Sweden