[libvirt] [PATCH 4/4] remote: Detect 'nc' version incompatibilities

This ugly thing is a shell script to detect availability of the -q option
for 'nc': debian and suse based distros need this flag to ensure the remote
nc will exit on EOF, so it will go away when we close the tunnel. If it
doesn't go away, a useless 'nc' process is left sitting on the remote host.

Fedora's 'nc' doesn't have this option, so we can't blindly pass -q.
More info here:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=564034

We also detect the -U option, since some nc versions don't support it, which
has caused quite some user confusion. Connecting to a box and using the
incorrect nc:

virsh --connect qemu+ssh://root@debhost/system?netcat=/bin/nc.traditional
error: server closed connection: /bin/nc.traditional: invalid option -- U
nc -h for help
openbsd nc is required
error: failed to connect to the hypervisor

Test with Fedora 12, RHEL 5.4, and Debian Lenny
---
 src/remote/remote_driver.c |   52 ++++++++++++++++++++++++++++++++++++++------
 1 files changed, 45 insertions(+), 7 deletions(-)

diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c
index 7f92fd0..c3258d3 100644
--- a/src/remote/remote_driver.c
+++ b/src/remote/remote_driver.c
@@ -732,8 +732,49 @@ doRemoteOpen (virConnectPtr conn,
     }
 
     case trans_ssh: {
-        int j, nr_args = 6;
+        int j, nr_args = 0;
+        char *nc_command;
+
+        const char *nc_bin = netcat ? netcat : "nc";
+        const char *sock_path = (sockname ? sockname :
+                                 (flags & VIR_CONNECT_RO
+                                  ? LIBVIRTD_PRIV_UNIX_SOCKET_RO
+                                  : LIBVIRTD_PRIV_UNIX_SOCKET));
+
+        /*
+         * Build 'nc' command run on the remote host
+         *
+         * This ugly thing is a shell script to detect availability of
+         * the -q option for 'nc': debian and suse based distros need this
+         * flag to ensure the remote nc will exit on EOF, so it will go away
+         * when we close the tunnel. If it doesn't go away, a useless 'nc'
+         * process is left sitting on the remote host.
+         *
+         * Fedora's 'nc' doesn't have this option, and apparently defaults
+         * to the desired behavior.
+         */
+        const char *nc_detect_template = (
+            "NCOUT=$(%s -U 2>&1);"
+            "echo \"$NCOUT\" | grep -q 'invalid option';"
+            "if [ $? -eq 0 ] ; then"
+            "   echo \"$NCOUT\" >&2;"
+            "   echo openbsd 'nc' is required >&2;"
+            "   exit 1;"
+            "fi;"
+            ""
+            "%s -q 2>&1 | grep -q 'requires an argument';"
+            "if [ $? -eq 0 ] ; then"
+            "   CMD='-q 0';"
+            "else"
+            "   CMD='';"
+            "fi;"
+            "%s $CMD -U %s;");
+
+        if (virAsprintf(&nc_command, nc_detect_template,
+                        nc_bin, nc_bin, nc_bin, sock_path) < 0)
+            goto out_of_memory;
 
+        nr_args += 4; /* ssh $hostname $netcat_command NULL*/
         if (username) nr_args += 2; /* For -l username */
         if (no_tty) nr_args += 5; /* For -T -o BatchMode=yes -e none */
         if (port) nr_args += 2; /* For -p port */
@@ -765,12 +806,9 @@ doRemoteOpen (virConnectPtr conn,
             cmd_argv[j++] = strdup ("none");
         }
         cmd_argv[j++] = strdup (priv->hostname);
-        cmd_argv[j++] = strdup (netcat ? netcat : "nc");
-        cmd_argv[j++] = strdup ("-U");
-        cmd_argv[j++] = strdup (sockname ? sockname :
-                                (flags & VIR_CONNECT_RO
-                                 ? LIBVIRTD_PRIV_UNIX_SOCKET_RO
-                                 : LIBVIRTD_PRIV_UNIX_SOCKET));
+
+        cmd_argv[j++] = nc_command;
+
         cmd_argv[j++] = 0;
         assert (j == nr_args);
         for (j = 0; j < (nr_args-1); j++)
-- 
1.6.5.2
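
Written out as a standalone function, the probing logic in nc_detect_template reads roughly as follows. This is only a sketch: the name detect_nc_flags is hypothetical and not part of the patch, and the function prints the chosen flag instead of running nc, so it can be tried against a stand-in command.

```shell
# Sketch of the two probes from nc_detect_template.
# detect_nc_flags is a hypothetical name, not part of the patch.
# $1 is the netcat binary (or any stand-in command) to probe.
detect_nc_flags() {
    nc_bin=$1

    # Probe 1: a netcat without Unix socket support rejects -U outright.
    ncout=$("$nc_bin" -U 2>&1) || true
    if printf '%s\n' "$ncout" | grep -q 'invalid option'; then
        printf '%s\n' "$ncout" >&2
        echo "openbsd 'nc' is required" >&2
        return 1
    fi

    # Probe 2: a netcat that knows -q complains that its argument is
    # missing; one that does not know -q prints some other error.
    if "$nc_bin" -q 2>&1 | grep -q 'requires an argument'; then
        echo '-q 0'
    else
        echo ''
    fi
}
```

Both probes rely on matching error text, so a netcat build that words its messages differently would slip through either check.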

On Fri, Feb 12, 2010 at 10:32:17AM -0500, Cole Robinson wrote:
> This ugly thing is a shell script to detect availability of the -q option for 'nc': debian and suse based distros need this flag to ensure the remote nc will exit on EOF, so it will go away when we close the tunnel. If it doesn't go away, a useless 'nc' process is left sitting on the remote host.
>
> Fedora's 'nc' doesn't have this option, so we can't blindly pass -q. More info here:

I don't really like this approach. Shouldn't it be sufficient to just explicitly SIGKILL the ssh client, rather than relying on the exit-on-EOF behaviour of nc?

Daniel
-- 
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
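
For reference, the local half of that suggestion looks like this in shell terms. It is a sketch only: sleep stands in for the ssh client, and kill_client and client_status are hypothetical names. In libvirt itself the equivalent would be a kill() on the child PID from C.

```shell
# kill_client: send SIGKILL to a child process and reap it.
# Hypothetical helper; sets client_status to the child's wait status.
kill_client() {
    pid=$1
    kill -KILL "$pid"
    client_status=0
    # wait reports a status above 128 when the child died from a signal.
    wait "$pid" 2>/dev/null || client_status=$?
}

# Stand-in for the ssh client: a background process that would outlive us.
sleep 30 &
kill_client $!
echo "client exited with status $client_status"
```

This only shows that the local client goes away. Whether the remote nc follows is a separate question.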

On 02/15/2010 06:11 AM, Daniel P. Berrange wrote:
> On Fri, Feb 12, 2010 at 10:32:17AM -0500, Cole Robinson wrote:
>> This ugly thing is a shell script to detect availability of the -q option for 'nc': debian and suse based distros need this flag to ensure the remote nc will exit on EOF, so it will go away when we close the tunnel. If it doesn't go away, a useless 'nc' process is left sitting on the remote host.
>>
>> Fedora's 'nc' doesn't have this option, so we can't blindly pass -q. More info here:
>
> I don't really like this approach. Shouldn't it be sufficient to just explicitly SIGKILL the ssh client, rather than relying on the exit-on-EOF behaviour of nc?

kill() helps prevent virt-manager from hanging, but it doesn't address the dangling 'nc' process on the remote host that requires -q. Every connection will leave an 'nc' process hanging.

- Cole

On Mon, Feb 15, 2010 at 09:39:31AM -0500, Cole Robinson wrote:
> On 02/15/2010 06:11 AM, Daniel P. Berrange wrote:
>> On Fri, Feb 12, 2010 at 10:32:17AM -0500, Cole Robinson wrote:
>>> This ugly thing is a shell script to detect availability of the -q option for 'nc': debian and suse based distros need this flag to ensure the remote nc will exit on EOF, so it will go away when we close the tunnel. If it doesn't go away, a useless 'nc' process is left sitting on the remote host.
>>>
>>> Fedora's 'nc' doesn't have this option, so we can't blindly pass -q. More info here:
>>
>> I don't really like this approach. Shouldn't it be sufficient to just explicitly SIGKILL the ssh client, rather than relying on the exit-on-EOF behaviour of nc?
>
> kill() helps prevent virt-manager from hanging, but it doesn't address the dangling 'nc' process on the remote host that requires -q. Every connection will leave an 'nc' process hanging.

It does when I try it. Killing the SSH client connection results in SIGHUP for the process running on the remote host & nc exits on SIGHUP.

Daniel

On 02/15/2010 09:47 AM, Daniel P. Berrange wrote:
> On Mon, Feb 15, 2010 at 09:39:31AM -0500, Cole Robinson wrote:
>> On 02/15/2010 06:11 AM, Daniel P. Berrange wrote:
>>> On Fri, Feb 12, 2010 at 10:32:17AM -0500, Cole Robinson wrote:
>>>> This ugly thing is a shell script to detect availability of the -q option for 'nc': debian and suse based distros need this flag to ensure the remote nc will exit on EOF, so it will go away when we close the tunnel. If it doesn't go away, a useless 'nc' process is left sitting on the remote host.
>>>>
>>>> Fedora's 'nc' doesn't have this option, so we can't blindly pass -q. More info here:
>>>
>>> I don't really like this approach. Shouldn't it be sufficient to just explicitly SIGKILL the ssh client, rather than relying on the exit-on-EOF behaviour of nc?
>>
>> kill() helps prevent virt-manager from hanging, but it doesn't address the dangling 'nc' process on the remote host that requires -q. Every connection will leave an 'nc' process hanging.
>
> It does when I try it. Killing the SSH client connection results in SIGHUP for the process running on the remote host & nc exits on SIGHUP.

Connecting to a debian system? I just tried using the attached patch on top of latest git (not my patches):

Running virsh --connect URI to a debian lenny system, then invoking 'quit', leaves 'nc -U' processes running. 'quit' doesn't hang like it does with current git, but the nc process isn't closed on the remote host.

Any recommendations?

Thanks,
Cole
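
One way to check for the leftover processes described here is to count them in the remote host's process list. A minimal sketch, assuming the listing comes from something like `ps -eo args` run on the remote host; count_stale_nc is a hypothetical name, and the pattern only recognises a command line that starts with nc (optionally with a path or a suffix such as nc.openbsd) followed by -U.

```shell
# count_stale_nc: read a process listing (one command line per row) on
# stdin and print how many rows look like an 'nc -U <socket>' relay.
count_stale_nc() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so the exit status is deliberately ignored.
    grep -c -E '(^|/)nc(\.[a-z]+)? -U ' || true
}

# Possible use against a remote host (hostname is a placeholder):
#   ssh somehost "ps -eo args" | count_stale_nc
```

Running the count before and after a virsh session would show whether that session left a relay behind.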

On Mon, Feb 15, 2010 at 12:00:50PM -0500, Cole Robinson wrote:
> On 02/15/2010 09:47 AM, Daniel P. Berrange wrote:
>> On Mon, Feb 15, 2010 at 09:39:31AM -0500, Cole Robinson wrote:
>>> On 02/15/2010 06:11 AM, Daniel P. Berrange wrote:
>>>> On Fri, Feb 12, 2010 at 10:32:17AM -0500, Cole Robinson wrote:
>>>>> This ugly thing is a shell script to detect availability of the -q option for 'nc': debian and suse based distros need this flag to ensure the remote nc will exit on EOF, so it will go away when we close the tunnel. If it doesn't go away, a useless 'nc' process is left sitting on the remote host.
>>>>>
>>>>> Fedora's 'nc' doesn't have this option, so we can't blindly pass -q. More info here:
>>>>
>>>> I don't really like this approach. Shouldn't it be sufficient to just explicitly SIGKILL the ssh client, rather than relying on the exit-on-EOF behaviour of nc?
>>>
>>> kill() helps prevent virt-manager from hanging, but it doesn't address the dangling 'nc' process on the remote host that requires -q. Every connection will leave an 'nc' process hanging.
>>
>> It does when I try it. Killing the SSH client connection results in SIGHUP for the process running on the remote host & nc exits on SIGHUP.
>
> Connecting to a debian system? I just tried using the attached patch on top of latest git (not my patches):
>
> Running virsh --connect URI to a debian lenny system, then invoking 'quit', leaves 'nc -U' processes running. 'quit' doesn't hang like it does with current git, but the nc process isn't closed on the remote host.

Ok, what's going on is this

 1. Login & manually run nc

      # ssh somehost
      $ nc -U /var/run/libvirt/libvirt-sock-ro

    In this case the shell gets a controlling TTY. When the SSH client quits, the kernel's TTY notices EOF, and everything running under the TTY gets SIGHUP

 2. Remote command execution

      ssh somehost "nc -U /var/run/libvirt/libvirt-sock-ro"

    There is no controlling TTY. When the SSH client quits, nc sees the EOF, but decides to carry on running. No SIGHUP is around, since there's no TTY to trigger one.

There's no obvious way to ensure that remote SSH command execution sees SIGHUP like an SSH shell session would.

The situation we're in with 'nc' is rather a trainwreck, and I'm inclined to say that it is time we took it out of the equation completely, by adding our own command

    /usr/libexec/libvirt-ssh-helper

We can execute

    ssh somehost "test -e /usr/libexec/libvirt-ssh-helper && /usr/libexec/libvirt-ssh-helper /var/run/libvirt/libvirt-sock-ro || nc -U /var/run/libvirt/libvirt-sock-ro"

The latter bit just being backwards compatibility code for old libvirt server deployments lacking the ssh-helper.

Daniel
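
The fallback proposed above can also be written as an explicit if/else. That form avoids one quirk of `a && b || c`: the nc branch also runs when the helper exists but exits non-zero. A sketch, assuming the helper path from the message (the helper is only a proposal at this point in the thread) and a hypothetical function name pick_relay, which prints the command it would run rather than running it. It tests for an executable file with -x where the message uses `test -e`.

```shell
# pick_relay: print the relay command to run for a given socket.
# $1 = helper path (proposed in the thread, hypothetical here)
# $2 = libvirtd socket path
pick_relay() {
    helper=$1
    sock=$2
    if [ -x "$helper" ]; then
        # Helper is installed: use it.
        echo "$helper $sock"
    else
        # Old server without the helper: fall back to nc.
        echo "nc -U $sock"
    fi
}

pick_relay /usr/libexec/libvirt-ssh-helper /var/run/libvirt/libvirt-sock-ro
```

On a host without the helper, the last line prints the nc form.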

Cole Robinson wrote:
> This ugly thing is a shell script to detect availability of the -q option for 'nc': debian and suse based distros need this flag to ensure the remote nc will exit on EOF, so it will go away when we close the tunnel. If it doesn't go away, a useless 'nc' process is left sitting on the remote host.

I've spent time trying to make this code more flexible with other types of relays, e.g. socat which can be invoked with

    socat - GOPEN:sockname LIBVIRTD_PRIV_UNIX_SOCKET

but haven't come up with anything I would consider upstreamable. Jonas Eriksson also made a brave attempt [1] :-). Any suggestions for improving this code to handle other relays at runtime?

Regards,
Jim

[1] http://www.redhat.com/archives/libvir-list/2009-August/msg00068.html
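
One shape such runtime handling could take is a small lookup keyed on the relay's name. This is a sketch, not something proposed in the thread: relay_command is a hypothetical name, and the socat branch assumes the invocation has the form `socat - GOPEN:<socket path>`.

```shell
# relay_command: print the command line for a given relay binary.
# $1 = relay binary (name or full path), $2 = libvirtd socket path
relay_command() {
    relay=$1
    sock=$2
    # Dispatch on the basename so full paths such as /usr/bin/socat work.
    case "${relay##*/}" in
        socat) echo "$relay - GOPEN:$sock" ;;
        *)     echo "$relay -U $sock" ;;
    esac
}

relay_command socat /var/run/libvirt/libvirt-sock
relay_command /bin/nc /var/run/libvirt/libvirt-sock
```

Each new relay would need its own branch, which is part of why this approach gets unwieldy.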

On 02/15/2010 07:40 PM, Jim Fehlig wrote:
> Cole Robinson wrote:
>> This ugly thing is a shell script to detect availability of the -q option for 'nc': debian and suse based distros need this flag to ensure the remote nc will exit on EOF, so it will go away when we close the tunnel. If it doesn't go away, a useless 'nc' process is left sitting on the remote host.
>
> I've spent time trying to make this code more flexible with other types of relays, e.g. socat which can be invoked with
>
>     socat - GOPEN:sockname LIBVIRTD_PRIV_UNIX_SOCKET
>
> but haven't come up with anything I would consider upstreamable. Jonas Eriksson also made a brave attempt [1] :-). Any suggestions for improving this code to handle other relays at runtime?

If we go Dan's route of just creating a libvirt helper under /usr/libexec, we shouldn't need to handle other nc type utils AIUI. Aside from that, the only choice would be to make the shell script even hairier to detect available binaries on the remote system.

- Cole
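
Detecting available binaries on the remote side would probably come down to a loop over candidates with `command -v`. A sketch under that assumption; first_available is a hypothetical name and the candidate list is only an example.

```shell
# first_available: print the first candidate that resolves to a command
# on this system, or fail if none does.
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Example candidate list; prints nothing if neither relay is installed.
first_available nc socat || echo "no relay found" >&2
```

Finding a binary is only half the problem: the script would still need per-relay argument handling on top of this.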