[libvirt] xen:/// vs. xen://FQDN/ vs xen+unix:/// discrepancy

Hello,

I regularly observe the problem that, depending on the libvirt URL, I get different information:

root@xen4# virsh -c xen://xen4.domain.name/ list
 Id Name                 State
----------------------------------

root@xen4# virsh -c xen:/// list
 Id Name                 State
----------------------------------
  0 Domain-0             running
  1 dos4                 idle

root@xen4# virsh -c xen+unix:/// list
 Id Name                 State
----------------------------------

root@xen4# xm li
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  2044     2     r-----    419.7
dos4                                         1     4     1     -b----     33.4

Has somebody observed the same problem and/or can give me any advice on where to look for the problem?

I'm running a Debian-Lenny based OS with libvirt-0.8.2 on Xen-3.4.3 with a 2.6.32 kernel.

Sincerely
Philipp Hahn
-- 
Philipp Hahn           Open Source Software Engineer      hahn@univention.de
Univention GmbH        Linux for Your Business        fon: +49 421 22 232- 0
Mary-Somerville-Str.1  28359 Bremen                   fax: +49 421 22 232-99
                                                    http://www.univention.de

On Thu, Jul 22, 2010 at 12:44:53PM +0200, Philipp Hahn wrote:
Hello,
I regularly observe the problem, that depending on the libvirt-URL I get different information:
root@xen4# virsh -c xen://xen4.domain.name/ list
 Id Name                 State
----------------------------------
This will go to the libvirtd daemon, which will in turn open a 'xen:///' connection
root@xen4# virsh -c xen:/// list
 Id Name                 State
----------------------------------
  0 Domain-0             running
  1 dos4                 idle
This uses a combination of talking to the hypervisor, xenstore and xend.
root@xen4# virsh -c xen+unix:/// list
 Id Name                 State
----------------------------------
This goes to the libvirtd daemon, using UNIX sockets, which will in turn open a 'xen:///' connection.

Assuming the libvirtd daemon you've connected to is on the same host as the virsh process, then since they all ultimately end up using 'xen:///', the data should be identical in all cases.

To verify that libvirtd is actually opening the same drivers as virsh, you can edit /etc/libvirt/libvirtd.conf and set

log_filters="1:libvirt 1:xen"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"

and restart the libvirtd daemon. Then try the 'xen+unix:///' URI again.

After that use

export LIBVIRT_LOG_FILTERS="1:libvirt 1:xen"
export LIBVIRT_LOG_OUTPUTS="1:file:/var/log/libvirt/libvirtd.log"

and run 'virsh -c xen:///'. It should then be possible to compare the two sets of logs and check that both show the same Xen drivers.

Daniel
-- 
|: Red Hat, Engineering, London    -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org        -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Hello,

thank you for your answer.

On Thursday, 22 July 2010 at 14:51:22, Daniel P. Berrange wrote:
root@xen4# virsh -c xen://xen4.domain.name/ list
root@xen4# virsh -c xen:/// list
root@xen4# virsh -c xen+unix:/// list
...
Assuming the libvirtd daemon you've connected to is on the same host as virsh process, then since they all ultimately end up using 'xen:///', the data should be identical in all cases.
Yes, I run all those commands locally on the same host (xen4).

The problem only occurs after some time of normal operation: just after starting xend, libvirtd, ... everything is fine. Then, some time later, after starting and stopping some domains, the system gets unresponsive and the problem described above occurs. Only then do the different connection types provide different data.

To me it looks like some connection between libvirtd / virsh and Xen gets broken, so no data on running domains can be retrieved any more until I restart libvirtd. After that everything is back to normal until the next hiccup.
To verify that libvirtd is actually opening the same drivers as virsh you can edit /etc/libvirt/libvirtd.conf and set
log_filters="1:libvirt 1:xen"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"
I've now enabled that and will see if I can gather more data.

Sincerely
Philipp Hahn

Hello,

On Friday, 23 July 2010 at 18:02:27, Philipp Hahn wrote:
On Thursday, 22 July 2010 at 14:51:22, Daniel P. Berrange wrote:
root@xen4# virsh -c xen://xen4.domain.name/ list
root@xen4# virsh -c xen:/// list
root@xen4# virsh -c xen+unix:/// list
...
To me it looks like some connection between libvirtd / virsh and xen gets broken, so no data on running domains can be retrieved any more, until I restart libvirtd. After that everything is back to normal until the next hiccup.
Today I encountered the bug again: "virsh list" was not showing any domain, while "virsh -c xen+unix:///" did show "Domain 0" and my other domain. I attached gdb, "thread apply all bt" showed no thread still executing xenHypervisorInit(), but I found the following inconsistent state:
in_init == 1
hypervisor_version == -1
sys_interface_version == -1
dom_interface_version == -1
Notice in_init == 1. Currently there are two error paths where that can happen:

1. ret = open(XEN_HYPERVISOR_SOCKET, O_RDWR);
2. if (VIR_ALLOC(ipt) < 0) {

In both cases in_init remains 1, which disables further useful logging:
#define virXenError(code, ...)                                             \
        if (in_init == 0)                                                  \
            virReportErrorHelper(NULL, VIR_FROM_XEN, code, __FILE__,       \
                                 __FUNCTION__, __LINE__, __VA_ARGS__)
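The effect of a flag that stays set can be reproduced in isolation. The following is a minimal C sketch and not libvirt code: `report_error`, `init_buggy`, `init_fixed` and the `reported` counter are hypothetical names invented for illustration, and only the `in_init` guard mirrors the macro quoted above. It shows how an early return leaves the flag at 1 and silences every later report, and how routing all exits through a single label avoids that.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the state discussed above; not libvirt code. */
static int in_init = 0;
static int reported = 0;   /* counts messages that were actually emitted */

/* Same shape as the quoted macro: silent while in_init is set. */
static void report_error(const char *msg)
{
    if (in_init == 0) {
        fprintf(stderr, "error: %s\n", msg);
        reported++;
    }
}

/* Buggy variant: the early return leaves in_init at 1. */
static int init_buggy(int fail)
{
    in_init = 1;
    if (fail)
        return -1;          /* in_init is never reset on this path */
    in_init = 0;
    return 0;
}

/* Fixed variant: every exit passes through the same label. */
static int init_fixed(int fail)
{
    int ret = -1;

    in_init = 1;
    if (fail)
        goto done;
    ret = 0;
done:
    in_init = 0;            /* reset on success and on failure alike */
    return ret;
}
```

After `init_buggy(1)` a call to `report_error()` emits nothing, while after `init_fixed(1)` the same call is reported as expected.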
This leads me to the following questions:

1. Why is error reporting deactivated during init? Since in_init == 1 is valid only during the call to xenHypervisorInit(), explicitly using that macro there for logging seems very wrong.

2. xenHypervisorInit() also allocates three regex_t structures, which are only freed when one of them cannot be compiled, but they remain in memory when determining the Xen hypervisor version fails. On the other hand, these three structures are used in xenHypervisorMakeCapabilitiesInternal(). This isn't checked directly, but only indirectly, because in the error case no one will ever call it with a valid virConnectPtr. Should the structures be freed in all error cases?

The attached patch is a first try at improving that situation.
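As an illustration of the second question, here is a minimal sketch of freeing compiled expressions on every error path, using only POSIX `regcomp()` and `regfree()`. It is not the attached patch and not libvirt code: the patterns, the `compiled` array, `n_compiled`, `free_patterns()` and `init_patterns()` are hypothetical, and the `later_step_ok` parameter merely stands in for a later step such as version detection.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical patterns; the real driver compiles different expressions. */
static const char *patterns[3] = { "^flags", "pae", "^xen-[0-9]+" };
static regex_t compiled[3];
static int n_compiled = 0;      /* how many entries of compiled[] are live */

/* Release whatever has been compiled so far; safe to call repeatedly. */
static void free_patterns(void)
{
    while (n_compiled > 0)
        regfree(&compiled[--n_compiled]);
}

/*
 * Compile all patterns, then run a later step that may fail.
 * later_step_ok is a stand-in for e.g. hypervisor version detection.
 */
static int init_patterns(int later_step_ok)
{
    size_t i;

    for (i = 0; i < 3; i++) {
        /* A failed regcomp() leaves nothing to free for that entry. */
        if (regcomp(&compiled[i], patterns[i], REG_EXTENDED | REG_NOSUB) != 0)
            goto error;
        n_compiled++;
    }
    if (!later_step_ok)
        goto error;
    return 0;

error:
    free_patterns();            /* one cleanup path for all failures */
    return -1;
}
```

Tracking the number of live entries means the cleanup is correct whether the failure happens during compilation or afterwards.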
To verify that libvirtd is actually opening the same drivers as virsh you can edit /etc/libvirt/libvirtd.conf and set
log_filters="1:libvirt 1:xen"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"
I've now enabled that and see if I can gather more data.
This didn't work because of the bug described above.

Sincerely
Philipp Hahn

Philipp Hahn wrote:
Hello,
On Friday, 23 July 2010 at 18:02:27, Philipp Hahn wrote:
On Thursday, 22 July 2010 at 14:51:22, Daniel P. Berrange wrote:
root@xen4# virsh -c xen://xen4.domain.name/ list
root@xen4# virsh -c xen:/// list
root@xen4# virsh -c xen+unix:/// list
...
To me it looks like some connection between libvirtd / virsh and xen gets broken, so no data on running domains can be retrieved any more, until I restart libvirtd. After that everything is back to normal until the next hiccup.
Today I encountered the bug again: "virsh list" was not showing any domain, while "virsh -c xen+unix:///" did show "Domain 0" and my other domain.
Are you using xen unstable or Xen 4.1 (i.e. something greater than Xen 4.0)? If so, the sysctl and domctl versions have changed, which essentially breaks the xen_hypervisor sub-driver. I ran into this problem while trying to get the libxl and xen-unified drivers to play nicely together - wasted a bunch of my testing time :-(.

I'll send a patch to handle the new sysctl and domctl versions.

Regards,
Jim
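To picture why a changed interface version can break a driver, consider the following sketch. It assumes, purely for illustration, that a driver detects the interface version by probing a fixed list of candidates; whether and how the xen_hypervisor sub-driver does this is not stated in this thread. `probe_fn`, `detect_interface_version()` and `fake_probe()` are hypothetical names, and the candidate numbers used with them are arbitrary rather than real sysctl or domctl versions.

```c
#include <stddef.h>

/* Hypothetical probe callback: returns 1 if `version` is accepted. */
typedef int (*probe_fn)(int version, void *opaque);

/*
 * Try each candidate version in order and return the first one the probe
 * accepts, or -1 if none of the known candidates is accepted.
 */
static int detect_interface_version(const int *candidates, size_t n,
                                    probe_fn probe, void *opaque)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (probe(candidates[i], opaque) == 1)
            return candidates[i];
    }
    return -1;
}

/* Test double standing in for a real hypercall: accepts exactly one version. */
static int fake_probe(int version, void *opaque)
{
    return version == *(const int *)opaque;
}
```

Under this assumption, a hypervisor that only accepts a version missing from the candidate list yields -1, and the list has to be extended before detection succeeds again.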

Hello Jim,

On Thursday, 17 March 2011 at 21:11:34, Jim Fehlig wrote:
Philipp Hahn wrote:
Today I encountered the bug again: "virsh list" was not showing any domain, while "virsh -c xen+unix:///" did show "Domain 0" and my other domain.
Are you using xen unstable or Xen 4.1 (i.e. something greater than Xen 4.0)?
No, this was with Xen 3.4.3.
If so, the sysctl and domctl versions have changed, which essentially breaks the xen_hypervisor sub-driver. I ran into this problem while trying to get the libxl and xen-unified drivers to play nicely together - wasted a bunch of my testing time :-(. I'll send a patch to handle the new sysctl and domctl versions.
Good to hear your XenLight work is making progress.

Sincerely
Philipp Hahn
participants (3)
- Daniel P. Berrange
- Jim Fehlig
- Philipp Hahn