Hi Michal,
This problem is triggerred by libvirt python's example event-test.py. the original examples has resouce leak issue
at the remove_handle and remove_timer.
with "python -u event-test.py" run this example and "systemctl restart libvirtd.service" will trigger resource leak problem.
with lsof -p <event-test.pid> can see socket handler's number increased , after restart libvirtd.serivce each time.
the reason is remove_handle and remove_timer do not return the remove handle information to libvirt-python's framework.
little patch was apply to this example, to fix this problem.
Now, run this example again and restart libvirtd.service , call sequence virNetSocketRemoveIOCallback->virNetSocketEventFree
can be observed , the no-recursive mutex, lock with recursive issue can be seen.
you can check the detail stack trace and our comments about the lock's issue in function virNetSocketEventFree in the following.
====================================================================
def add_timer(self, interval, cb, opaque):
timerID = self.nextTimerID + 1
self.nextTimerID = self.nextTimerID + 1
h = self.virEventLoopPureTimer(timerID, interval, cb, opaque)
self.timers.append(h)
- self.timers_opaque[timerID] = opaque
self.interrupt()
debug("Add timer %d interval %d" % (timerID, interval))
return timerID
def remove_handle(self, handleID):
handles = []
- opaque = None
for h in self.handles:
if h.get_id() == handleID:
self.poll.unregister(h.get_fd())
- opaque = self.opaques[handleID]
- del self.opaques[handleID]
debug("Remove handle %d fd %d" % (handleID, h.get_fd()))
else:
handles.append(h)
self.handles = handles
self.interrupt()
- return opaque
# Stop firing the periodic timer
def remove_timer(self, timerID):
timers = []
- opaque = None
for h in self.timers:
if h.get_id() != timerID:
timers.append(h)
- else:
- opaque = self.timers_opaque[timerID]
debug("Remove timer %d" % timerID)
self.timers = timers
self.interrupt()
- return opaque
====================================================================================
>>On 07/15/2017 05:00 PM, Peng Hao wrote:
>> virNetSocketRemoveIOCallback get sock's ObjectLock and will call
>> virNetSocketEventFree. virNetSocketEventFree may be free sock
>> object and virNetSocketRemoveIOCallback will access a null pointer
>> in release sock's ObjectLock.
>>
>> Signed-off-by: Liu Yun <liu.yunh@zte.com.cn>
>> Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
>> ---
>> src/rpc/virnetsocket.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>
>
>I don't think this can work.
>
>> diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c
>> index d228c8a..8b550e8 100644
>> --- a/src/rpc/virnetsocket.c
>> +++ b/src/rpc/virnetsocket.c
>> @@ -2140,14 +2140,12 @@ static void virNetSocketEventFree(void *opaque)
>> virFreeCallback ff;
>> void *eopaque;
>>
>> - virObjectLock(sock);
>> ff = sock->ff;
>> eopaque = sock->opaque;
>> sock->func = NULL;
>> sock->ff = NULL;
>> sock->opaque = NULL;
>> - virObjectUnlock(sock);
>
>I think we need the lock here. This function is called from the event
>loop thread. So even if virNetSocketUpdateIOCallback() locks the @socket
>this code can see it unlocked. Or locked. But the crucial part is it's
>modifying the object and thus should have lock held.
>
I have check the code , in default implementation of eventPoll, virEventPollRunOnce always dispatch and clear in one thread loop,
so, the lock in the virNetSocketEventFree may be unnessary.
>> -
>> +
>> if (ff)
>> ff(eopaque);
>>
>> @@ -2207,6 +2205,7 @@ void virNetSocketUpdateIOCallback(virNetSocketPtr sock,
>>
>> void virNetSocketRemoveIOCallback(virNetSocketPtr sock)
>> {
>> + virObjectRef(sock);
This should be mistake when generate the patch. The correct one is
+ virObjectUnref(sock);
>> virObjectLock(sock);
>
>I think this is what actually fixes your problem. However, I also think
>it introduces uneven ratio of ref:unref calls.
>
>>
>> if (sock->watch < 0) {
>> @@ -2220,6 +2219,7 @@ void virNetSocketRemoveIOCallback(virNetSocketPtr sock)
>> sock->watch = -1;
>>
>> virObjectUnlock(sock);
>> + virObjectRef(sock);
This should be mistake when generate the patch. The correct one is
+ virObjectUnref(sock);
>
>It definitely does so because you ref twice. Anyway, do you perhaps have
>a backtrace to share?
#0 __lll_lock_wait ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007fde6207cd02 in _L_lock_791 () from /lib64/libpthread.so.0
#2 0x00007fde6207cc08 in __GI___pthread_mutex_lock (
mutex=mutex@entry=0x119c3e0) at pthread_mutex_lock.c:64
#3 0x00007fde5a97ee15 in virMutexLock (m=m@entry=0x119c3e0)
at util/virthread.c:89
#4 0x00007fde5a9608ae in virObjectLock (anyobj=anyobj@entry=0x119c3d0)
at util/virobject.c:323
#5 0x00007fde5aaa752c in virNetSocketEventFree (opaque=0x119c3d0)
at rpc/virnetsocket.c:2134
#6 0x00007fde5ae57f87 in libvirt_virEventRemoveHandleFunc (
watch=<optimized out>) at libvirt-override.c:5496
#7 0x00007fde5aaaac69 in virNetSocketRemoveIOCallback (sock=0x119c3d0)
at rpc/virnetsocket.c:2212
#8 0x00007fde5aa96d76 in virNetClientMarkClose (client=0x119c650, reason=0)
at rpc/virnetclient.c:779
#9 0x00007fde5aa970eb in virNetClientIncomingEvent (sock=0x119c3d0, events=9,
opaque=0x119c650) at rpc/virnetclient.c:1985
#10 0x00007fde5ae4b347 in libvirt_virEventInvokeHandleCallback (
self=<optimized out>, args=<optimized out>) at libvirt-override.c:5718
#11 0x00007fde6236faa4 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#12 0x00007fde623710bd in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
---Type <return> to continue, or q <return> to quit---
#13 0x00007fde6236f76f in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#14 0x00007fde6236f860 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#15 0x00007fde6236f860 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#16 0x00007fde6236f860 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#17 0x00007fde623710bd in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#18 0x00007fde622fe05d in function_call () from /lib64/libpython2.7.so.1.0
#19 0x00007fde622d90b3 in PyObject_Call () from /lib64/libpython2.7.so.1.0
#20 0x00007fde6236c2f7 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#21 0x00007fde6236f860 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#22 0x00007fde6236f860 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
#23 0x00007fde623710bd in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0
#24 0x00007fde622fdf68 in function_call () from /lib64/libpython2.7.so.1.0
#25 0x00007fde622d90b3 in PyObject_Call () from /lib64/libpython2.7.so.1.0
#26 0x00007fde622e80a5 in instancemethod_call ()
>
>Michal
BestRegards
LiuYun