On Thu, Sep 13, 2007 at 07:34:52PM +0100, Richard W.M. Jones wrote:
Shuveb Hussain wrote:
>Hi,
>
>I observed this while using the python bindings and accessing a remote
>host with libvirt:
>
>>>>import libvirt
>>>>c = libvirt.open('xen://veetee/')
>>>>c.getInfo()
>['i686', 2021, 2, 1864, 1, 1, 2, 1]
>>>>c.getInfo()
>['i686', 2021, 2, 1864, 1, 1, 2, 1]
>
># remove network cable from remote machine now
>>>>c.getInfo()
># blocks forever....
>
>What is the problem here and is there a solution to this? I am running
>FC7 and here is the version info from virsh:
>virsh # version
>Compiled against library: libvir 0.3.2
>Using library: libvir 0.3.2
>Using API: Xen 3.0.1
>Running hypervisor: Xen 3.1.0
>
>I observed this for more than 10 mins, it was still hung.
This is simply a TCP issue, and nothing to do with libvirt or the remote
protocol.
I repeated your experiment using a virsh shell and the nodeinfo command,
which essentially does the same thing. After yanking the network cable
I observed that the sendto(2) syscall succeeded and the recvfrom(2)
syscall then blocked:
sendto(4,
"\27\3\1\1\20\246\325\207<\320\0230E<\352\4x\310E\1O*g\204!\254\n\234O
N\23\310"..., 277, 0, NULL, 0) = 277
recvfrom(4, [... strace hangs here ...]
On the wire I could see using tcpdump that TCP was repeatedly trying to
send the request packet and getting no response:
19:25:17.108067 IP oirase.55065 > amd.16514: P 1474:1623(149) ack 1082
win 107 <nop,nop,timestamp 703462318 117574265>
19:25:17.108360 IP oirase.55065 > amd.16514: P 1623:1900(277) ack 1082
win 107 <nop,nop,timestamp 703462319 117574265>
19:25:17.308306 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082
win 107 <nop,nop,timestamp 703462519 117574265>
19:25:17.710212 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082
win 107 <nop,nop,timestamp 703462921 117574265>
19:25:18.514030 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082
win 107 <nop,nop,timestamp 703463725 117574265>
19:25:20.121667 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082
win 107 <nop,nop,timestamp 703465333 117574265>
19:25:23.336940 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082
win 107 <nop,nop,timestamp 703468549 117574265>
19:25:29.766483 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082
win 107 <nop,nop,timestamp 703474981 117574265>
19:25:42.625568 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082
win 107 <nop,nop,timestamp 703487845 117574265>
19:26:08.344739 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082
win 107 <nop,nop,timestamp 703513573 117574265>
19:32:42.572441 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082
win 107 <nop,nop,timestamp 703907941 117574265>
[etc]
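
The spacing of those retransmissions can be checked with a few lines of Python. This is only a sketch: the timestamps are copied by hand from the tcpdump lines above for the 1474:1900 segment, the helper name retransmit_gaps is made up for illustration, and the final 19:32:42 line is left out because the gap before it is much larger than the rest.

```python
from datetime import datetime

# Timestamps of the retransmitted 1474:1900 segment, copied from the
# tcpdump output quoted above (the 19:32:42 line is deliberately omitted).
stamps = [
    "19:25:17.308306",
    "19:25:17.710212",
    "19:25:18.514030",
    "19:25:20.121667",
    "19:25:23.336940",
    "19:25:29.766483",
    "19:25:42.625568",
    "19:26:08.344739",
]


def retransmit_gaps(timestamps):
    """Return the delay in seconds between consecutive timestamps."""
    times = [datetime.strptime(t, "%H:%M:%S.%f") for t in timestamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


gaps = retransmit_gaps(stamps)
for gap in gaps:
    print(f"{gap:.3f} s")
```

Each gap comes out at roughly twice the previous one, which is the usual TCP retransmission backoff: the kernel keeps retrying with longer and longer waits, and the blocked recvfrom(2) only returns once TCP finally gives up on the connection.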
On the broader issue, libvirt calls are synchronous -- this is done to
reduce the complexity of the interface and implementation. If you need
them to be asynchronous, use a separate thread (or process) to make the
calls.
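
As a rough sketch of the separate-thread approach, assuming nothing about libvirt beyond the fact that the call blocks: the helper call_with_timeout and the stand-in slow_call below are hypothetical names made up for this example and are not part of the libvirt bindings.

```python
import queue
import threading
import time


def call_with_timeout(fn, timeout, *args):
    """Run fn(*args) in a daemon thread.

    Returns the result, re-raises any exception fn raised, or raises
    TimeoutError if nothing came back within `timeout` seconds.
    """
    result = queue.Queue(maxsize=1)

    def worker():
        try:
            result.put((True, fn(*args)))
        except Exception as exc:
            result.put((False, exc))

    # A daemon thread will not keep the interpreter alive if the call
    # never returns.
    threading.Thread(target=worker, daemon=True).start()

    try:
        ok, value = result.get(timeout=timeout)
    except queue.Empty:
        raise TimeoutError(f"no reply within {timeout} seconds") from None
    if ok:
        return value
    raise value


def slow_call(delay):
    """Stand-in for a blocking call such as c.getInfo()."""
    time.sleep(delay)
    return "done"
```

With the bindings from the original report this would be used as call_with_timeout(c.getInfo, 5). Note that the timeout only unblocks the caller: the worker thread is still stuck in the underlying call until TCP gives up, so it is probably safest to treat that connection as unusable afterwards.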
Is there a configuration knob in the RPC layer to lower the
timeout delay? Some calls are slow, but we should not reach a 2 minute
timeout; that is far too long, I think.
Daniel
--
Red Hat Virtualization group
http://redhat.com/virtualization/
Daniel Veillard | virtualization library
http://libvirt.org/
veillard(a)redhat.com | libxml GNOME XML XSLT toolkit
http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine
http://rpmfind.net/