Hi Michal,

Thanks for your review. The problem occured in a python applicatin using libvirt-python, which has no ref().  If we unref() first in virKeepAliveTimer, we may get a segfault in virObjectUnlock() when cleanup, so I suppose that my patch is safer, :-) Here is the backtrace:


#0  0x00007fd8f79970e8 in virClassIsDerivedFrom (klass=0xdeadbeef, parent=0x7fd8e8001b80) at util/virobject.c:169

169        if (klass->magic == parent->magic)

Missing separate debuginfos, use: debuginfo-install python-2.7.5-16.el7.x86_64

(gdb) bt

#0  0x00007fd8f79970e8 in virClassIsDerivedFrom (klass=0xdeadbeef, parent=0x7fd8e8001b80) at util/virobject.c:169

#1  0x00007fd8f799742e in virObjectIsClass (anyobj=anyobj@entry=0x7fd8e800b9c0, klass=<optimized out>) at util/virobject.c:365

#2  0x00007fd8f79974e4 in virObjectUnlock (anyobj=0x7fd8e800b9c0) at util/virobject.c:338

#3  0x00007fd8f7ac477e in virKeepAliveTimer (timer=<optimized out>, opaque=0x7fd8e800b9c0) at rpc/virkeepalive.c:177

#4  0x00007fd8f7e5c9cf in libvirt_virEventInvokeTimeoutCallback () from /usr/lib64/python2.7/site-packages/libvirtmod.so

#5  0x00007fd8ff64db94 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0

#6  0x00007fd8ff64f1ad in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0

#7  0x00007fd8ff64d85f in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0

#8  0x00007fd8ff64d950 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0

#9  0x00007fd8ff64d950 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0

#10 0x00007fd8ff64f1ad in PyEval_EvalCodeEx () from /lib64/libpython2.7.so.1.0

#11 0x00007fd8ff5dc098 in function_call () from /lib64/libpython2.7.so.1.0

#12 0x00007fd8ff5b7073 in PyObject_Call () from /lib64/libpython2.7.so.1.0

#13 0x00007fd8ff5c6085 in instancemethod_call () from /lib64/libpython2.7.so.1.0

#14 0x00007fd8ff5b7073 in PyObject_Call () from /lib64/libpython2.7.so.1.0

#15 0x00007fd8ff648ff7 in PyEval_CallObjectWithKeywords () from /lib64/libpython2.7.so.1.0

#16 0x00007fd8ff67d7e2 in t_bootstrap () from /lib64/libpython2.7.so.1.0

#17 0x00007fd8ff358df3 in start_thread () from /lib64/libpthread.so.0

#18 0x00007fd8fe97d3ed in clone () from /lib64/libc.so.6



---

Best wishes

Yi Wang

Original Mail
Sender:  <mprivozn@redhat.com>;
To: WangYi10129963; <libvir-list@redhat.com>;
CC:  <jdenemar@redhat.com>;LiuJianJun10033482;
Date: 2017/04/20 20:40
Subject: Re: [libvirt] [PATCH] [PATCH] rpc: fix keep alive timer segfault


On 04/18/2017 03:55 AM, Yi Wang wrote:
> ka maybe have been freeed in virObjectUnref, application using
> virKeepAliveTimer will segfault when unlock ka. We should keep
> ka's refs positive before using it.

> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
> ---
>  src/rpc/virkeepalive.c | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)

> diff --git a/src/rpc/virkeepalive.c b/src/rpc/virkeepalive.c
> index c9faf88..4f666fd 100644
> --- a/src/rpc/virkeepalive.c
> +++ b/src/rpc/virkeepalive.c
> @@ -160,17 +160,17 @@ virKeepAliveTimer(int timer ATTRIBUTE_UNUSED, void *opaque)
>      bool dead;
>      void *client;

> +    virObjectRef(ka);
>      virObjectLock(ka);

>      client = ka->client;
>      dead = virKeepAliveTimerInternal(ka, &msg);

> +    virObjectUnlock(ka);
> +
>      if (!dead && !msg)
>          goto cleanup;

> -    virObjectRef(ka);
> -    virObjectUnlock(ka);
> -
>      if (dead) {
>          ka->deadCB(client);
>      } else if (ka->sendCB(client, msg) < 0) {
> @@ -178,11 +178,8 @@ virKeepAliveTimer(int timer ATTRIBUTE_UNUSED, void *opaque)
>          virNetMessageFree(msg);
>      }

> -    virObjectLock(ka);
> -    virObjectUnref(ka);
> -
>   cleanup:
> -    virObjectUnlock(ka);
> +    virObjectUnref(ka);
>  }




Something doesn't feel right here. Firstly, there should always be at 
least one reference for this callback to use obtained in 
virKeepAliveStart(). So we should be able to do virObjectLock(ka) safely 
here. If we are not, something else is messing up the ref counting.

Secondly, it shouldn't matter if we do ref() followed by lock() or the 
other way around. The first seems racy; well decreasing chance to hit a 
race but nor resolving it.

Do you have a reproducer or something? backtrace perhaps? We can 
investigate further.

Michal