
On 19.03.2014 12:10, Carlos Rodrigues wrote:
Hello Michal,
I am using libvirt 1.1.3, perl-Sys-Virt 1.1.3 and perl-5.16 on Fedora 19 x86_64.
The zombie processes appear after opening a libvirt connection with qemu-tls; the Perl module is an XS binding for the libvirt library.
Here is my running example showing the zombie processes:
$ perl test-chldhandle-bug-fixed.pl & sleep 15 && echo && ps axf | grep perl && echo
[2] 12427
init... pid=12427
while...
fork 1
end... pid=12430
receive chld
fork 2
end... pid=12431
receive chld
2014-03-19 11:06:38.712+0000: 12427: info : libvirt version: 1.1.3.1, package: 2.fc19 (Unknown, 2014-03-17-15:02:00, cmar-laptop.lan)
2014-03-19 11:06:38.712+0000: 12427: warning : virNetTLSContextCheckCertificate:1140 : Certificate check failed Certificate [session] owner does not match the hostname 10.10.4.249
connection open
fork 3
end... pid=12432
fork 4
end... pid=12440

12427 pts/2 S 0:00 | \_ perl test-chldhandle-bug-fixed.pl
12432 pts/2 Z 0:00 | | \_ [perl] <defunct>
12440 pts/2 Z 0:00 | | \_ [perl] <defunct>
12442 pts/2 S+ 0:00 | \_ grep --color=auto perl

Aha! It seems this only happens when using tls; I was unable to reproduce it with tcp or unix sockets. With tcp I can see SIGCHLD being delivered, while with tls it is not. That makes me wonder whether either libvirt or gnutls silently sets a signal mask and does not restore it. Because if I take a look at the signal mask, I can clearly see SIGCHLD blocked (from /proc/$pid/status):

SigPnd: 0000000000000000
ShdPnd: 0000000000010000
SigBlk: 0000000008011000
SigIgn: 0000000000001080
SigCgt: 0000000180010000

What we can see here is that SigBlk (the bitmask of blocked signals) contains 0x8011000, which is SIGPIPE, SIGCHLD and SIGWINCH. Right, why would libvirt care about SIGWINCH anyway? Git grepping for it leads us to virNetClientSetTLSSession(). There I can clearly see that we add just those three signals to a mask, set this mask just prior to calling poll(), and then restore the old one afterwards. Oh wait, we do not! pthread_sigmask(SIG_BLOCK, ...) just adds new signals to the mask; it does not overwrite the old one. So yes, this is clearly a libvirt bug. If I use SIG_SETMASK there, I no longer get any zombies. I'll post the patch shortly.

Michal
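
For readers who want to check the two observations above, here is a minimal sketch in Python (not the libvirt C code, and not the actual patch). It assumes a Unix system and Linux x86 signal numbering; the helper names decode_sigmask and restore_works are hypothetical and exist only for this illustration. The first helper decodes a SigBlk value from /proc/$pid/status, the second shows that "restoring" a saved mask with SIG_BLOCK leaves the temporarily blocked signals blocked, while SIG_SETMASK puts the original mask back.

```python
import signal

def decode_sigmask(hex_mask):
    """Return the set of signal numbers set in a /proc/<pid>/status mask.

    Bit (n - 1) of the mask corresponds to signal number n.
    """
    mask = int(hex_mask, 16)
    return {n for n in range(1, 65) if mask & (1 << (n - 1))}

def restore_works(how):
    """Block three signals, then try to restore the saved mask using `how`.

    Returns True if the thread's mask afterwards equals the original one.
    """
    temp = {signal.SIGPIPE, signal.SIGCHLD, signal.SIGWINCH}
    # pthread_sigmask returns the mask that was in effect before the call.
    original = signal.pthread_sigmask(signal.SIG_BLOCK, temp)
    try:
        signal.pthread_sigmask(how, original)  # the "restore" step under test
        # Blocking an empty set changes nothing and reports the current mask.
        current = signal.pthread_sigmask(signal.SIG_BLOCK, set())
        return current == original
    finally:
        # Always leave the process with its real original mask.
        signal.pthread_sigmask(signal.SIG_SETMASK, original)

if __name__ == "__main__":
    # On Linux x86, 13 = SIGPIPE, 17 = SIGCHLD, 28 = SIGWINCH.
    print(sorted(decode_sigmask("0000000008011000")))  # [13, 17, 28]
    print("SIG_BLOCK restores mask:  ", restore_works(signal.SIG_BLOCK))
    print("SIG_SETMASK restores mask:", restore_works(signal.SIG_SETMASK))
```

With SIG_BLOCK the saved mask is merged into the current one, so SIGCHLD stays blocked and the parent never gets to reap its children, which matches the zombies seen above. The second check assumes the process did not already have all three signals blocked when it started.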