Thank you Michal, this is good news for me.
I'll wait for this patch.
Regards,
--
Carlos Rodrigues
Senior Software Engineer
Eurotux Informática, S.A. |
www.eurotux.com
(t) +351 253 680 300 (m) +351 911 926 110
On Wed, 2014-03-19 at 18:27 +0100, Michal Privoznik wrote:
On 19.03.2014 12:10, Carlos Rodrigues wrote:
> Hello Michal,
>
> I am using libvirt 1.1.3 and perl-Sys-Virt 1.1.3 and perl-5.16 on Fedora
> 19 x86_64
>
> The zombie processes appear after opening a libvirt connection with
> qemu-tls; the Perl module is an XS binding for the libvirt library.
>
> Here is my example run showing the zombie processes:
>
> $ perl test-chldhandle-bug-fixed.pl & sleep 15 && echo && ps axf | grep perl && echo
> [2] 12427
> init... pid=12427
> while...
> fork 1
> end... pid=12430
> receive chld
> fork 2
> end... pid=12431
> receive chld
> 2014-03-19 11:06:38.712+0000: 12427: info : libvirt version: 1.1.3.1, package: 2.fc19 (Unknown, 2014-03-17-15:02:00, cmar-laptop.lan)
> 2014-03-19 11:06:38.712+0000: 12427: warning : virNetTLSContextCheckCertificate:1140 : Certificate check failed Certificate [session] owner does not match the hostname 10.10.4.249
> connection open
> fork 3
> end... pid=12432
> fork 4
> end... pid=12440
>
> 12427 pts/2 S 0:00 | \_ perl test-chldhandle-bug-fixed.pl
> 12432 pts/2 Z 0:00 | | \_ [perl] <defunct>
> 12440 pts/2 Z 0:00 | | \_ [perl] <defunct>
> 12442 pts/2 S+ 0:00 | \_ grep --color=auto perl
Aha! It seems this only happens when using tls; I was unable to
reproduce it with tcp or unix sockets. With tcp I can see SIGCHLD
being delivered, while with tls it is not. That makes me wonder
whether either libvirt or gnutls silently sets a signal mask and does
not restore it. If I take a look at the signal mask, I can clearly see
that SIGCHLD is blocked (from /proc/$pid/status):
SigPnd: 0000000000000000
ShdPnd: 0000000000010000
SigBlk: 0000000008011000
SigIgn: 0000000000001080
SigCgt: 0000000180010000
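As a rough cross-check of the SigBlk line above, here is a minimal Python sketch that maps each set bit of such a mask to a signal name. It is not part of libvirt or of the test script, and decode_sigmask is a made-up helper name. It assumes the usual Linux convention that bit N-1 of the mask stands for signal number N:

```python
import signal


def decode_sigmask(hex_mask):
    """Return the names of the signals whose bits are set in a
    /proc/<pid>/status style mask (hypothetical helper)."""
    mask = int(hex_mask, 16)
    names = []
    for signum in range(1, 65):
        if mask & (1 << (signum - 1)):  # bit N-1 stands for signal N
            try:
                names.append(signal.Signals(signum).name)
            except ValueError:
                # Signal number without a symbolic name on this platform.
                names.append("signal %d" % signum)
    return names


if __name__ == "__main__":
    print(decode_sigmask("0000000008011000"))
```

With the Linux x86_64 signal numbering (SIGPIPE=13, SIGCHLD=17, SIGWINCH=28) this should list exactly SIGPIPE, SIGCHLD and SIGWINCH for the value shown.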
What we can see here is that SigBlk (the bitmask of blocked signals)
contains 0x8011000, which is SIGPIPE, SIGCHLD and SIGWINCH. Right, why
would libvirt care about SIGWINCH anyway? Git grepping for it leads us to
virNetClientSetTLSSession(). There I can clearly see that we add just
those three signals to a mask, set this mask just prior to calling
poll(), and then restore the old one afterwards. Oh wait, we don't!
pthread_sigmask(SIG_BLOCK,...) just adds new signals to the mask, it
does not overwrite the old one, so "restoring" the old mask that way
never unblocks anything. So yes, this is clearly a libvirt bug.
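The difference between the two restore modes can be seen outside libvirt. The following is only a sketch under assumptions: it uses Python's signal.pthread_sigmask wrapper rather than the actual C code in virNetClientSetTLSSession(), and leaked_signals is a made-up helper name. It blocks the same three signals, "restores" the saved mask with the given mode, and reports which signals are still blocked afterwards:

```python
import signal

# The three signals mentioned above.
TLS_SIGNALS = {signal.SIGPIPE, signal.SIGCHLD, signal.SIGWINCH}


def leaked_signals(restore_how):
    """Block TLS_SIGNALS, restore the saved mask using restore_how, and
    return the signals that remain blocked (hypothetical helper)."""
    # Start from a state where none of the three signals is blocked,
    # remembering the caller's mask so it can be put back at the end.
    start = signal.pthread_sigmask(signal.SIG_UNBLOCK, TLS_SIGNALS)
    try:
        # SIG_BLOCK adds to the mask and returns the previous mask.
        old = signal.pthread_sigmask(signal.SIG_BLOCK, TLS_SIGNALS)
        # ... the poll() call would sit here ...
        signal.pthread_sigmask(restore_how, old)
        # Blocking an empty set changes nothing; it just reads the mask.
        now = signal.pthread_sigmask(signal.SIG_BLOCK, set())
        return now - old
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, start)


if __name__ == "__main__":
    for how, label in ((signal.SIG_BLOCK, "SIG_BLOCK"),
                       (signal.SIG_SETMASK, "SIG_SETMASK")):
        leaked = sorted(signal.Signals(s).name for s in leaked_signals(how))
        print("restore with %s leaves blocked: %s" % (label, leaked))
```

Restoring with SIG_BLOCK leaves all three signals blocked, which matches the SigBlk value seen in /proc/$pid/status, while restoring with SIG_SETMASK leaves none behind.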
If I use SIG_SETMASK there, I am no longer getting any zombies. I'll
post the patch shortly.
Michal