At 11/30/2011 11:17 PM, Eric Blake wrote:
On 11/30/2011 04:40 AM, Daniel P. Berrange wrote:
>> Hmm, I suspect this may be caused by
>>
>> commit fd7e85ac6af833845aa0eb2526158c319800a0ae
>> Author: Jiri Denemark <jdenemar(a)redhat.com>
>> Date: Tue Oct 11 15:05:52 2011 +0200
>>
>> virsh: Always run event loop
>>
>> Since virsh already implements event loop, it has to also run it. So far
>> the event loop was only running during virsh console command.
>>
>
> Ahhh, I had discounted that because the date was Oct 11, and I figured
> we would have seen it by now. But that's just your original commit date,
> the merge date was Oct 24th. So yeah, I reckon this must be the culprit
I've been seeing this fairly often (about 25% of my 'make check' runs)
since at least last Wednesday, on F16 when compiling at -O0; and the
analysis of a race caused by not locking around ctl->quit seems like a
reasonable culprit.
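
To illustrate the kind of race being discussed, here is a minimal sketch of a quit flag shared between a main thread and an event loop thread, with every access made under a mutex. It is only an illustration of the general pattern, not the actual virsh code or fix: struct loop_ctl, loop_ctl_set_quit(), loop_ctl_get_quit(), loop_thread() and run_and_stop_loop() are all hypothetical names, and the 1 ms poll stands in for one iteration of a real event loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical control block: a quit flag plus the mutex that guards it. */
struct loop_ctl {
    pthread_mutex_t lock;
    bool quit;
};

/* Writer side: set the flag while holding the lock. */
static void loop_ctl_set_quit(struct loop_ctl *ctl)
{
    pthread_mutex_lock(&ctl->lock);
    ctl->quit = true;
    pthread_mutex_unlock(&ctl->lock);
}

/* Reader side: read the flag while holding the same lock. */
static bool loop_ctl_get_quit(struct loop_ctl *ctl)
{
    bool quit;

    pthread_mutex_lock(&ctl->lock);
    quit = ctl->quit;
    pthread_mutex_unlock(&ctl->lock);
    return quit;
}

/* Stand-in for the event loop: poll until asked to quit. */
static void *loop_thread(void *opaque)
{
    struct loop_ctl *ctl = opaque;
    const struct timespec tick = { 0, 1000000 };   /* 1 ms */

    while (!loop_ctl_get_quit(ctl))
        nanosleep(&tick, NULL);
    return NULL;
}

/* Start the loop thread, request quit, and join it.
 * Returns the result of pthread_join() (0 on success). */
int run_and_stop_loop(void)
{
    struct loop_ctl ctl;
    pthread_t thr;
    int rc;

    rc = pthread_mutex_init(&ctl.lock, NULL);
    assert(rc == 0);
    ctl.quit = false;

    rc = pthread_create(&thr, NULL, loop_thread, &ctl);
    assert(rc == 0);

    loop_ctl_set_quit(&ctl);
    rc = pthread_join(thr, NULL);

    pthread_mutex_destroy(&ctl.lock);
    return rc;
}
```

Because both the write and the read go through the same mutex, the loop thread is guaranteed to observe the flag once it has been set, regardless of optimization level.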

This may be one reason. I think there is also a bug in the test driver
that can cause 'make check' to hang. Here is the backtrace from a hung
'make check':
(gdb) info threads
  2 Thread 0x7f22e81da700 (LWP 25242) 0x0000003bdce0dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
* 1 Thread 0x7f22e81db800 (LWP 25227) 0x0000003bdce0804d in pthread_join () from /lib64/libpthread.so.0
(gdb) bt
#0 0x0000003bdce0804d in pthread_join () from /lib64/libpthread.so.0
#1 0x000000000041c193 in vshDeinit (ctl=0x7fff6098df20) at virsh.c:17506
#2 0x0000000000425251 in main (argc=<value optimized out>, argv=0x7fff6098e0d8) at virsh.c:17852
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f22e81da700 (LWP 25242))]#0 0x0000003bdce0dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0 0x0000003bdce0dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x0000003bdce09328 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x0000003bdce091f7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007f22e84f6e9c in testDriverLock (timer=<value optimized out>, opaque=0x2192e10) at test/test_driver.c:127
#4 testDomainEventFlush (timer=<value optimized out>, opaque=0x2192e10) at test/test_driver.c:5515
#5 0x00007f22e8459a08 in virEventPollDispatchTimeouts () at util/event_poll.c:440
#6 virEventPollRunOnce () at util/event_poll.c:633
#7 0x00007f22e8458a27 in virEventRunDefaultImpl () at util/event.c:247
#8 0x000000000041bf83 in vshEventLoop (opaque=0x7fff6098df20) at virsh.c:17044
#9 0x00007f22e846a862 in virThreadHelper (data=<value optimized out>) at util/threads-pthread.c:157
#10 0x0000003bdce077f1 in start_thread () from /lib64/libpthread.so.0
#11 0x0000003bdc2e570d in clone () from /lib64/libc.so.6
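
According to the backtrace, lt-virsh is deadlocked: thread 1 (main) is
blocked in pthread_join() inside vshDeinit(), waiting for the event loop
thread to exit, while that thread (thread 2) is blocked in
testDriverLock(), called from testDomainEventFlush(), waiting for the
test driver mutex.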
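
The shape of this hang can be reproduced in a small standalone sketch. The backtrace does not show which thread holds the test driver lock, so the sketch ASSUMES the joining thread still holds it; that is a guess for illustration, not something confirmed above. driver_lock, event_loop_thread() and join_while_holding_lock() are hypothetical names, and pthread_mutex_timedlock() is used in place of a plain pthread_mutex_lock() only so that the example gives up after about a second instead of hanging forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for the mutex taken by testDriverLock(). */
static pthread_mutex_t driver_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the event loop thread: its timer callback needs the
 * driver lock. A timed lock is used so the sketch reports the stall
 * instead of hanging. */
static void *event_loop_thread(void *opaque)
{
    int *lock_rc = opaque;
    struct timespec deadline;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += 1;   /* give up after about one second */

    *lock_rc = pthread_mutex_timedlock(&driver_lock, &deadline);
    if (*lock_rc == 0)
        pthread_mutex_unlock(&driver_lock);
    return NULL;
}

/* ASSUMPTION: the joining thread holds driver_lock while it waits.
 * Returns the result of the worker's lock attempt: 0 on success,
 * otherwise an errno value such as ETIMEDOUT. */
int join_while_holding_lock(void)
{
    pthread_t thr;
    int lock_rc = 0;
    int rc;

    rc = pthread_mutex_lock(&driver_lock);
    assert(rc == 0);

    rc = pthread_create(&thr, NULL, event_loop_thread, &lock_rc);
    assert(rc == 0);

    /* The worker cannot get the lock until we release it, and we do
     * not release it until the worker has exited. */
    pthread_join(thr, NULL);
    pthread_mutex_unlock(&driver_lock);
    return lock_rc;
}
```

Under that assumption, the worker's lock attempt times out with ETIMEDOUT. With an untimed pthread_mutex_lock(), both threads would wait on each other indefinitely, which matches the two stacks shown above.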
Thanks
Wen Congyang