On Mon, Jul 17, 2017 at 12:41:41 +0000, Caoxinhua wrote:
First: all of these backtraces are from our private libvirt version.
I find that the open-source version also has this problem.
1. Backtrace of the child
Does your 'private' version contain any changes to the logging code?
(gdb) bt
#0 __lll_lock_wait_private () at
../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1 0x00007fcd26f7299c in _L_lock_2462 () at tzset.c:621
#2 0x00007fcd26f727d7 in __tz_convert (timer=0x7fcd2727fec0 <tzset_lock>,
timer@entry=0x7fcd1a694068, use_localtime=use_localtime@entry=0,
tp=tp@entry=0x7fcd1a694070) at tzset.c:624
#3 0x00007fcd26f7075a in __GI___gmtime_r (t=t@entry=0x7fcd1a694068,
tp=tp@entry=0x7fcd1a694070) at gmtime.c:28
#4 0x00007fcd2a06b41d in virTimeLocalOffsetFromUTC (offset=offset@entry=0x7fcd1a6940e0)
at util/virtime.c:385
#5 0x00007fcd2a06b560 in virTimeStringNowRaw (buf=buf@entry=0x7fcd1a6941a0
"\023") at util/virtime.c:222
So this is fishy. virTimeStringNowRaw is specifically designed not to
call any time conversion functions, so it should not call
virTimeLocalOffsetFromUTC at all. Those functions should only ever be
called from the main process.
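To illustrate why that matters: in a child forked from a multi-threaded process, any libc function that takes an internal lock (here tzset_lock, via gmtime_r) can block forever if another thread held that lock at fork time. A minimal sketch of breaking a UTC timestamp into calendar fields with integer arithmetic only, so that no libc time conversion is involved, could look like the following. The names `utc_fields` and `epoch_to_utc` are hypothetical and this is not the libvirt implementation; it only shows the general technique.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper, not libvirt code: calendar fields of a UTC timestamp. */
struct utc_fields {
    int year, mon, mday;   /* mon is 1..12, mday is 1..31 */
    int hour, min, sec;
};

/*
 * Convert seconds since the Unix epoch to UTC calendar fields using only
 * integer arithmetic. No libc time function is called, so no timezone
 * lock is taken. Leap seconds are ignored, as in POSIX time.
 */
static void epoch_to_utc(int64_t secs, struct utc_fields *out)
{
    assert(out != NULL);

    int64_t days = secs / 86400;
    int64_t rem = secs % 86400;
    if (rem < 0) {          /* normalise timestamps before 1970 */
        rem += 86400;
        days--;
    }
    out->hour = (int)(rem / 3600);
    out->min = (int)(rem % 3600 / 60);
    out->sec = (int)(rem % 60);

    /* Days-to-civil-date conversion on a calendar starting 0000-03-01. */
    days += 719468;
    int64_t era = (days >= 0 ? days : days - 146096) / 146097;
    int64_t doe = days - era * 146097;                 /* day of 400-year era */
    int64_t yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    int64_t doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    int64_t mp = (5 * doy + 2) / 153;                  /* 0 = March */

    out->mday = (int)(doy - (153 * mp + 2) / 5 + 1);
    out->mon = (int)(mp < 10 ? mp + 3 : mp - 9);
    out->year = (int)(yoe + era * 400 + (out->mon <= 2 ? 1 : 0));
}
```

A local-time offset, if one is needed, would have to be computed once in the parent before fork and passed in as a plain integer.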
#6 0x00007fcd2a034e7e in virLogVMessage (source=<optimized
out>, priority=VIR_LOG_INFO, filename=0x7fcd2a28c04a
"security/security_dac.c", linenr=556, funcname=0x7fcd2a28c940
<__func__.22722> "virSecurityDACSetOwnershipInternal", metadata=0x0,
fmt=fmt@entry=0x7fcd2a28c2f8 "Setting DAC user and group on '%s' to
'%ld:%ld'", vargs=vargs@entry=0x7fcd1a694220) at util/virlog.c:585
#7 0x00007fcd2a035427 in virLogMessage (source=source@entry=0x7fcd2a519d90
<virLogSelf>, priority=priority@entry=VIR_LOG_INFO,
filename=filename@entry=0x7fcd2a28c04a "security/security_dac.c",
linenr=linenr@entry=556, funcname=funcname@entry=0x7fcd2a28c940 <__func__.22722>
"virSecurityDACSetOwnershipInternal", metadata=metadata@entry=0x0,
fmt=fmt@entry=0x7fcd2a28c2f8 "Setting DAC user and group on '%s' to
'%ld:%ld'") at util/virlog.c:515
This is called even from VIR_DEBUG, thus your patch wouldn't fix this.
[...]
We can see that the 16th thread of libvirtd is waiting for a message
from the child process, but the child process is deadlocked in
virTimeLocalOffsetFromUTC.
This shouldn't ever happen. The logging code should not call this
function, and it does not appear to do so in the upstream repo.
If we use VIR_DEBUG and our log level is info, then VIR_DEBUG will
not call virTimeLocalOffsetFromUTC, which avoids the deadlock.
It will call it if you enable debug logging. The only reason such a
change would fix it is that the message was not displayed.
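A rough sketch of that point, using hypothetical names rather than the actual libvirt logging code: a message below the configured threshold is dropped before any timestamp is formatted, so the problematic path is merely skipped, and it is reached again as soon as the threshold is lowered to debug.

```c
#include <stdbool.h>

/* Hypothetical priorities for illustration; a lower value is more verbose. */
enum log_priority {
    LOG_PRIO_DEBUG = 1,
    LOG_PRIO_INFO,
    LOG_PRIO_WARN,
    LOG_PRIO_ERROR
};

/* Counts how often the (potentially lock-taking) timestamp step is reached. */
static int timestamps_formatted;

/*
 * Returns true if the message would be emitted. The timestamp step is
 * only reached for messages at or above the threshold.
 */
static bool log_message(enum log_priority prio, enum log_priority threshold)
{
    if (prio < threshold)
        return false;           /* filtered: timestamp code never runs */

    timestamps_formatted++;     /* stands in for formatting the timestamp */
    return true;
}
```

With the threshold at info, a debug message never reaches the timestamp step; with the threshold at debug, the same message does, so downgrading the message only hides the deadlock.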