[libvirt] [PATCH] tests: avoid test failure on rawhide gnutls

I hit a VERY weird testsuite failure on rawhide, which included
_binary_ output to stderr, followed by a hang waiting for me
to type something! (Here, using ^@ for NUL):

$ ./commandtest
TEST: commandtest
WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
.WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
.WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
.8^@^@^@8^@^@^@^A^@^@^@^Bay^A^@^@^@)PRIVATE-GNOME-KEYRING-PKCS11-PROTOCOL-V-1

I finally traced it to the fact that gnome-keyring, called via
gnutls_global_init which is in turn called by virNetTLSInit, opens
an internal fd that it expects to communicate to via a
pthread_atfork handler (never mind that it violates POSIX by
using non-async-signal-safe functions in that handler:
https://bugzilla.redhat.com/show_bug.cgi?id=772320).

Our problem stems from the fact that we pulled the rug out from
under the library's expectations by closing an fd that it had
just opened. While we aren't responsible for fixing the bugs
in that pthread_atfork handler, we can at least avoid the bugs
by not closing the fd in the first place.

* tests/commandtest.c (mymain): Avoid closing fds that were opened
by virInitialize.
---

Pushing under the build-breaker rule. It cost me the better part of a
morning to track this one down, so I left a super-long comment to help
the next person to read the file understand what we're fighting against.

 tests/commandtest.c |   20 ++++++++++++++++++--
 1 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/tests/commandtest.c b/tests/commandtest.c
index efc48fe..b4b6044 100644
--- a/tests/commandtest.c
+++ b/tests/commandtest.c
@@ -784,6 +784,22 @@ mymain(void)
     setpgid(0, 0);
     setsid();

+    /* Our test expects particular fd values; to get that, we must not
+     * leak fds that we inherited from a lazy parent.  At the same
+     * time, virInitialize may open some fds (perhaps via third-party
+     * libraries that it uses), and we must not kill off an fd that
+     * this process opens as it might break expectations of a
+     * pthread_atfork handler, as well as interfering with our tests
+     * trying to ensure we aren't leaking to our children.  The
+     * solution is to do things in two phases - reserve the fds we
+     * want by overwriting any externally inherited fds, then
+     * initialize, then clear the slots for testing.  */
+    if ((fd = open("/dev/null", O_RDONLY)) < 0 ||
+        dup2(fd, 3) < 0 ||
+        dup2(fd, 4) < 0 ||
+        dup2(fd, 5) < 0 ||
+        (fd > 5 && VIR_CLOSE(fd) < 0))
+        return EXIT_FAILURE;

     /* Prime the debug/verbose settings from the env vars,
      * since we're about to reset 'environ' */
@@ -791,8 +807,8 @@ mymain(void)
     virTestGetVerbose();

     virInitialize();
-    /* Kill off any inherited fds that might interfere with our
-     * testing.  */
+
+    /* Phase two of killing interfering fds; see above.  */
     fd = 3;
     VIR_FORCE_CLOSE(fd);
     fd = 4;
-- 
1.7.7.5
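For readers without the libvirt tree at hand, the two-phase idea in the patch can be sketched with plain POSIX calls. This is only an illustration under stated assumptions, not the code in tests/commandtest.c: fake_library_init() is a hypothetical stand-in for virInitialize(), plain close() replaces VIR_CLOSE/VIR_FORCE_CLOSE, and the helper names are made up for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for virInitialize(): a library init routine
 * that opens an fd of its own and expects it to stay open. */
static int lib_fd = -1;

static int fake_library_init(void)
{
    lib_fd = open("/dev/null", O_WRONLY);
    return lib_fd < 0 ? -1 : 0;
}

/* Phase one: occupy slots 3..5.  dup2() replaces whatever an inherited
 * fd in a slot referred to, and nothing opened during initialization
 * can land in an occupied slot. */
static int reserve_slots(void)
{
    int fd = open("/dev/null", O_RDONLY);

    if (fd < 0)
        return -1;
    for (int slot = 3; slot <= 5; slot++) {
        if (dup2(fd, slot) < 0)
            return -1;
    }
    if (fd > 5 && close(fd) < 0)
        return -1;
    return 0;
}

/* Phase two: free the slots again.  Only the placeholders are closed,
 * never an fd that the library opened. */
static void release_slots(void)
{
    for (int slot = 3; slot <= 5; slot++)
        close(slot);
}

/* Returns 0 once the slots are free and the library's fd is intact. */
static int two_phase_demo(void)
{
    if (reserve_slots() < 0 || fake_library_init() < 0)
        return -1;
    release_slots();

    /* The library's fd is outside the reserved range and still open. */
    assert(lib_fd < 3 || lib_fd > 5);
    assert(fcntl(lib_fd, F_GETFD) != -1);

    /* Slots 3..5 are closed, as the test expects. */
    for (int slot = 3; slot <= 5; slot++)
        assert(fcntl(slot, F_GETFD) == -1);
    return 0;
}
```

Calling two_phase_demo() from a main() is enough to try it. The ordering is the point: if the slots were closed before initialization, the library could be handed fd 3, 4 or 5, and closing those afterwards would close the library's fd.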

On Fri, Jan 06, 2012 at 02:27:42PM -0700, Eric Blake wrote:
I hit a VERY weird testsuite failure on rawhide, which included _binary_ output to stderr, followed by a hang waiting for me to type something! (Here, using ^@ for NUL):
$ ./commandtest
TEST: commandtest
WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
.WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
.WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
.8^@^@^@8^@^@^@^A^@^@^@^Bay^A^@^@^@)PRIVATE-GNOME-KEYRING-PKCS11-PROTOCOL-V-1
I finally traced it to the fact that gnome-keyring, called via gnutls_global_init which is in turn called by virNetTLSInit, opens an internal fd that it expects to communicate to via a pthread_atfork handler (never mind that it violates POSIX by using non-async-signal-safe functions in that handler: https://bugzilla.redhat.com/show_bug.cgi?id=772320).
Our problem stems from the fact that we pulled the rug out from under the library's expectations by closing an fd that it had just opened. While we aren't responsible for fixing the bugs in that pthread_atfork handler, we can at least avoid the bugs by not closing the fd in the first place.
* tests/commandtest.c (mymain): Avoid closing fds that were opened by virInitialize.
---
Pushing under the build-breaker rule. It cost me the better part of a morning to track this one down, so I left a super-long comment to help the next person to read the file understand what we're fighting against.
ACK, nasty one to debug!

Daniel
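The failure mode discussed in this message can be reproduced in miniature. The sketch below is hypothetical and is not gnome-keyring's code: fake_lib_init() stands in for a library that opens an fd and registers a pthread_atfork handler that writes to it, and the program then closes that fd behind the library's back, as the old test effectively did. The handler here only calls write(), which is async-signal-safe, so it does not reproduce the separate POSIX violation mentioned in the bug report.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int handler_fd = -1;    /* fd owned by the hypothetical library */
static int handler_errno = 0;  /* errno observed by the atfork handler */

/* Runs in the parent just before fork(); it assumes that handler_fd
 * is still the fd the library opened. */
static void fake_lib_prepare(void)
{
    handler_errno = 0;
    if (write(handler_fd, "x", 1) < 0)
        handler_errno = errno;
}

static int fake_lib_init(void)
{
    handler_fd = open("/dev/null", O_WRONLY);
    if (handler_fd < 0)
        return -1;
    return pthread_atfork(fake_lib_prepare, NULL, NULL) == 0 ? 0 : -1;
}

/* Forks a child that exits at once; returns the errno recorded by the
 * handler for this fork, or -1 if fork() itself failed. */
static int fork_once(void)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);
    while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
        ;
    return handler_errno;
}

/* Returns 0 if the handler worked before the close and saw EBADF after. */
static int atfork_demo(void)
{
    int before, after;

    if (fake_lib_init() < 0)
        return -1;
    before = fork_once();   /* fd intact: the write succeeds */
    close(handler_fd);      /* pull the rug out from under the library */
    after = fork_once();    /* the handler's write now fails */
    return (before == 0 && after == EBADF) ? 0 : -1;
}
```

Calling atfork_demo() from a main() exercises both forks; older toolchains may need -pthread at link time. The second fork is where the "Bad file descriptor" class of error shows up, and if the fd number had been reused by something else in between, the handler would write its data to that unrelated file instead.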

Hi,

On Sun, Jan 8, 2012 at 5:55 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
On Fri, Jan 06, 2012 at 02:27:42PM -0700, Eric Blake wrote:
I hit a VERY weird testsuite failure on rawhide, which included _binary_ output to stderr, followed by a hang waiting for me to type something! (Here, using ^@ for NUL):
$ ./commandtest
TEST: commandtest
WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
.WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
.WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
I failed to create a kvm domain using qemu:///session because qemu-kvm fails with this error when executed. I tried to track it down, but it's quite hard to follow it in gdb. I saw bind() failing too, so I turned off selinux and everything worked again.

-- 
Marc-André Lureau

On 02/02/2012 12:24 PM, Marc-André Lureau wrote:
Hi,
On Sun, Jan 8, 2012 at 5:55 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
On Fri, Jan 06, 2012 at 02:27:42PM -0700, Eric Blake wrote:
I hit a VERY weird testsuite failure on rawhide, which included _binary_ output to stderr, followed by a hang waiting for me to type something! (Here, using ^@ for NUL):
$ ./commandtest
TEST: commandtest
WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
.WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
.WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
This was a bug in the testsuite,
I failed to create a kvm domain using qemu:///session because qemu-kvm fails with this error when executed. I tried to track it down, but it's quite hard to follow it in gdb. I saw bind() failing too, so I turned off selinux and everything worked again.
But you are describing what sounds like a bug in either libvirt itself, or more likely, in the SELinux policy for forbidding something that libvirt needs for qemu:///session to work correctly. Alas, qemu:///session doesn't get much good testing; so I haven't hit this if only because I haven't tried using it lately. Can you file a bugzilla report with the actual AVC created when running SELinux in permissive mode?

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
participants (3)
- Daniel P. Berrange
- Eric Blake
- Marc-André Lureau