[Libvir] Network blocking issue

Hi, I observed this while using the python bindings and accessing a remote host with libvirt:
    import libvirt
    c = libvirt.open('xen://veetee/')
    c.getInfo()
    ['i686', 2021, 2, 1864, 1, 1, 2, 1]
    c.getInfo()
    ['i686', 2021, 2, 1864, 1, 1, 2, 1]
    # remove network cable from remote machine now
    c.getInfo()  # blocks forever....
What is the problem here and is there a solution to this? I am running FC7 and here is the version info from virsh:
    virsh # version
    Compiled against library: libvir 0.3.2
    Using library: libvir 0.3.2
    Using API: Xen 3.0.1
    Running hypervisor: Xen 3.1.0
I observed this for more than 10 mins, it was still hung.
Thanks in advance!
--
Shuveb Hussain.
Money has nothing to do with Happiness, But, Poverty has a lot to do with Sorrow.
Company: www.binarykarma.com
Blog: www.binarykarma.org

Shuveb Hussain wrote:
I observed this for more than 10 mins, it was still hung.
This is simply a TCP issue, and nothing to do with libvirt or the remote protocol.
I repeated your experiment using a virsh shell and the nodeinfo command, which essentially does the same thing. After yanking the network cable I observed that the sendto(2) syscall succeeded and the recvfrom(2) syscall blocked without returning:
    sendto(4, "\27\3\1\1\20\246\325\207<\320\0230E<\352\4x\310E\1O*g\204!\254\n\234O N\23\310"..., 277, 0, NULL, 0) = 277
    recvfrom(4, [... strace hangs here ...]
On the wire I could see using tcpdump that TCP was repeatedly trying to send the request packet and getting no response:
    19:25:17.108067 IP oirase.55065 > amd.16514: P 1474:1623(149) ack 1082 win 107 <nop,nop,timestamp 703462318 117574265>
    19:25:17.108360 IP oirase.55065 > amd.16514: P 1623:1900(277) ack 1082 win 107 <nop,nop,timestamp 703462319 117574265>
    19:25:17.308306 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 win 107 <nop,nop,timestamp 703462519 117574265>
    19:25:17.710212 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 win 107 <nop,nop,timestamp 703462921 117574265>
    19:25:18.514030 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 win 107 <nop,nop,timestamp 703463725 117574265>
    19:25:20.121667 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 win 107 <nop,nop,timestamp 703465333 117574265>
    19:25:23.336940 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 win 107 <nop,nop,timestamp 703468549 117574265>
    19:25:29.766483 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 win 107 <nop,nop,timestamp 703474981 117574265>
    19:25:42.625568 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 win 107 <nop,nop,timestamp 703487845 117574265>
    19:26:08.344739 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 win 107 <nop,nop,timestamp 703513573 117574265>
    19:32:42.572441 IP oirase.55065 > amd.16514: P 1474:1900(426) ack 1082 win 107 <nop,nop,timestamp 703907941 117574265>
    [etc]
On the broader issue, libvirt calls are synchronous -- this is done to reduce the complexity of the interface and implementation. If you need them to be asynchronous, use a separate thread (or process) to make the calls.
Rich.
--
Emerging Technologies, Red Hat - http://et.redhat.com/~rjones/
Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 03798903
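The separate-thread suggestion can be sketched roughly as follows. This is a minimal illustration, not libvirt code: `call_with_timeout`, `fake_get_info` and `fake_hung_call` are hypothetical names, and the two fake functions stand in for a working and a hung `c.getInfo()` so the example runs without a hypervisor. It also assumes the binding does not hold the Python interpreter lock for the duration of the blocking call, which has not been checked for this libvirt version.

```python
import queue
import threading
import time


def call_with_timeout(func, timeout, *args, **kwargs):
    """Run func in a daemon thread and wait at most `timeout` seconds for it."""
    results = queue.Queue(maxsize=1)

    def worker():
        # Hand either the return value or the exception back to the caller.
        try:
            results.put((True, func(*args, **kwargs)))
        except Exception as exc:
            results.put((False, exc))

    threading.Thread(target=worker, daemon=True).start()
    try:
        ok, value = results.get(timeout=timeout)
    except queue.Empty:
        raise TimeoutError(f"call did not return within {timeout} s") from None
    if not ok:
        raise value
    return value


def fake_get_info():
    # Stand-in for c.getInfo() on a healthy connection.
    return ['i686', 2021, 2, 1864, 1, 1, 2, 1]


def fake_hung_call():
    # Stand-in for c.getInfo() after the cable has been pulled.
    time.sleep(5)


print(call_with_timeout(fake_get_info, timeout=1.0))
try:
    call_with_timeout(fake_hung_call, timeout=0.2)
except TimeoutError as exc:
    print("gave up:", exc)
```

Note that the worker thread is only abandoned, not cancelled: the underlying call stays blocked, so the connection object should not be reused after a timeout. A separate process has the advantage that it can be killed outright.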

On Thu, Sep 13, 2007 at 07:34:52PM +0100, Richard W.M. Jones wrote:
On the broader issue, libvirt calls are synchronous -- this is done to reduce the complexity of the interface and implementation. If you need them to be asynchronous, use a separate thread (or process) to make the calls.
Is there a configuration knob in the RPC layer to lower the timeout delay? Some calls are slow, but we should not reach a 2-minute timeout; that's very, very long, I think.
Daniel
--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

Daniel Veillard wrote:
Is there a configuration knob in the RPC layer to lower the timeout delay? Some calls are slow, but we should not reach a 2-minute timeout; that's very, very long, I think.
Migrations might take some time.
In any case the RPC code just does 'sendto' followed by 'recvfrom'. There is no timeout to adjust on the client side.
Shuveb's problem is that TCP doesn't gracefully handle the case where the ethernet cable is pulled out. There may be a socket option which helps for this.
Rich.
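To make the "send, then blocking receive, with no timeout" behaviour concrete, here is a small sketch on a plain local socket pair. It does not touch libvirt, whose socket is not exposed to applications; `recv_with_timeout` is a hypothetical helper name, and the silent peer simply plays the role of the unreachable host.

```python
import socket


def recv_with_timeout(sock, nbytes, seconds):
    """Receive up to nbytes, or return None if nothing arrives in time."""
    sock.settimeout(seconds)  # without this call, recv() would block indefinitely
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None


client, server = socket.socketpair()

# The request goes out fine, just as sendto(2) succeeded in the strace output.
client.sendall(b"request")

# The peer never answers, so a bounded receive gives up instead of hanging.
print(recv_with_timeout(client, 4096, 0.2))

# Once the peer does reply, the same call returns the data.
server.sendall(b"reply")
print(recv_with_timeout(client, 4096, 1.0))

client.close()
server.close()
```

A receive timeout like this bounds how long the application waits, but it says nothing about whether the peer is actually gone, which is the trade-off discussed in the rest of the thread.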

On Mon, Sep 17, 2007 at 03:49:37PM +0100, Richard W.M. Jones wrote:
Shuveb's problem is that TCP doesn't gracefully handle the case where the ethernet cable is pulled out. There may be a socket option which helps for this.
That depends on your definition of graceful.
Shuveb's definition is that he wants the connection to fail & give an error back to the app.
My definition is that TCP should keep retrying until I plug the cable back in, so I don't get unnecessary failures if I'm just switching cables around. Likewise if there are temporary outages anywhere else in the link between the client & server.
Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
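The "keep retrying" behaviour is visible in the tcpdump trace posted earlier in the thread. The sketch below takes the timestamps of the 277-byte request and its first eight retransmissions from that trace and computes the gaps between them; `retransmit_gaps` is a hypothetical helper name. The final 19:32:42 line is left out because the `[etc]` suggests the trace was trimmed around it.

```python
from datetime import datetime

# Original send of the request, then its retransmissions, copied from the trace.
TIMESTAMPS = [
    "19:25:17.108360",
    "19:25:17.308306",
    "19:25:17.710212",
    "19:25:18.514030",
    "19:25:20.121667",
    "19:25:23.336940",
    "19:25:29.766483",
    "19:25:42.625568",
    "19:26:08.344739",
]


def retransmit_gaps(stamps):
    """Return the delay in seconds between each consecutive pair of timestamps."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


gaps = retransmit_gaps(TIMESTAMPS)
for gap in gaps:
    print(f"{gap:8.3f} s")
```

Each gap is about twice the previous one, starting near 0.2 s and reaching roughly 25.7 s by the eighth retry. This doubling is TCP's exponential retransmission backoff, and it is why the sender goes quiet for longer and longer stretches without ever reporting an error to the application in the time Shuveb waited.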

Daniel P. Berrange wrote:
My definition is that TCP should keep retrying until I plug the cable back in, so I don't get unnecessary failures if I'm just switching cables around. Likewise if there are temporary outages anywhere else in the link between the client & server.
Yes actually I agree with you on that one. On the other hand there is no way for Shuveb to set TCP socket options on the socket other than making a private copy of the libvirt code and hacking it. So a patch to add yet another query string flag to the remote URI or to expose the remote socket somehow might be acceptable.
Rich.
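For reference, this is roughly what setting such socket options looks like on a plain TCP socket that the application owns. It is only a sketch under stated assumptions: libvirt does not expose its socket, `enable_dead_peer_detection` is a hypothetical helper, and the numeric values are arbitrary examples. `TCP_USER_TIMEOUT` is a Linux-specific option that postdates the kernels discussed in this thread, so the code skips any option the platform does not provide.

```python
import socket


def enable_dead_peer_detection(sock, idle=30, interval=10, probes=3,
                               user_timeout_ms=60000):
    """Turn on keepalive and, where available, tighter TCP timeouts.

    Returns the names of the options that were actually applied.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = ["SO_KEEPALIVE"]

    optional = [
        ("TCP_KEEPIDLE", idle),                # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),           # seconds between probes
        ("TCP_KEEPCNT", probes),               # unanswered probes before giving up
        ("TCP_USER_TIMEOUT", user_timeout_ms), # max ms that sent data may stay unacknowledged
    ]
    for name, value in optional:
        option = getattr(socket, name, None)
        if option is None:
            continue  # this platform or Python build does not define the option
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
        except OSError:
            continue  # defined but rejected by the kernel
        applied.append(name)
    return applied


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(enable_dead_peer_detection(sock))
sock.close()
```

Keepalive probes are generally sent only on an otherwise idle connection. In the case from this thread a request is already in flight and unacknowledged, so it is the retransmission limit (or `TCP_USER_TIMEOUT`, where it exists) rather than keepalive that would bound the wait.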

Hi,
On 9/17/07, Richard W.M. Jones <rjones@redhat.com> wrote:
Yes actually I agree with you on that one. On the other hand there is no way for Shuveb to set TCP socket options on the socket other than making a private copy of the libvirt code and hacking it. So a patch to add yet another query string flag to the remote URI or to expose the remote socket somehow might be acceptable.
One big advantage of SANs is that local storage is avoided and storage management is simplified. Most importantly, nodes on which VMs run become like non-fixed resources. Domains can be chucked out from a node to take it down for maintenance, for example. There would be no issues when nodes are added. And with Avahi, life feels good. :-)
But when a node goes away suddenly, the application is left high and dry at a point where it was making a call to the remote node. I understand it is a TCP issue. But there must be a way to reach a middle path. I removed the cable for a good 15 mins and still the call remained blocked. The request never timed out at all. I am just worried about this bit a lot.
--
Shuveb Hussain.

On Mon, Sep 17, 2007 at 03:57:03PM +0100, Richard W.M. Jones wrote:
Yes actually I agree with you on that one. On the other hand there is no way for Shuveb to set TCP socket options on the socket other than making a private copy of the libvirt code and hacking it. So a patch to add yet another query string flag to the remote URI or to expose the remote socket somehow might be acceptable.
Expectations will vary according to environments. But 15 minutes is *way* over what I would accept in a distributed system environment. On the other hand, ssh resuming the connection fine after I suspend my laptop for 30 minutes is a feature. That's why I ask for a way to set the timeout; I don't think we will be able to find a value which will please everybody.
I'm not too fond of using an extension to the URI; I think this goes beyond what URIs were designed for. The alternatives are either giving access to the socket descriptor (beware, this may generate a future Windows portability problem) or providing a specific tuning API (not nice but relatively safe). Since calls are synchronous, the app may change the connection timeout before issuing some calls like Create or Migrate.
Daniel

On Mon, Sep 17, 2007 at 03:35:19PM -0400, Daniel Veillard wrote:
Expectations will vary according to environments. But 15 minutes is *way* over what I would accept in a distributed system environment. On the other hand, ssh resuming the connection fine after I suspend my laptop for 30 minutes is a feature. That's why I ask for a way to set the timeout; I don't think we will be able to find a value which will please everybody.
I'm not in favour of providing a configurable timeout to individual apps using libvirt because it'll lead to inconsistent behaviour between apps, i.e. scenarios where one libvirt app will continue to work while another will prematurely fail. It'll also mean that every app using libvirt will have to find a way to expose this setting to the user.
I'm not too fond of using an extension to the URI; I think this goes beyond what URIs were designed for. The alternatives are either giving access to the socket descriptor (beware, this may generate a future Windows portability problem) or providing a specific tuning API (not nice but relatively safe). Since calls are synchronous, the app may change the connection timeout before issuing some calls like Create or Migrate.
Yep, a timeout isn't really part of an address, which is what we're using URIs for. Exposing new APIs for this to the apps isn't nice because of the need for all apps to then expose it to the user, as I mention above.
Other possibilities are a libvirt client side config file or environment variables, but the latter isn't too nice.
I'd probably go for config file unless people have other ideas.
Dan.

On Mon, Sep 17, 2007 at 08:51:26PM +0100, Daniel P. Berrange wrote:
Other possibilities are a libvirt client side config file or environment variables but the latter isn't too nice.
I'd probably go for config file unless people have other ideas.
Well, a config file forces the same timeout for the whole duration of the program and for every connection; I'm not sure it's the right thing to do in this case.
Daniel
participants (4)
- Daniel P. Berrange
- Daniel Veillard
- Richard W.M. Jones
- Shuveb Hussain