On Mon, Nov 13, 2017 at 09:07:12PM +0800, xinhua.Cao wrote:
在 2017/11/12 22:45, John Ferlan 写道:
>
> On 11/11/2017 03:20 AM, xinhua.Cao wrote:
>>
>> 在 2017/11/11 5:43, John Ferlan 写道:
>>> On 11/06/2017 12:47 AM, xinhua.Cao wrote:
>>>> base on commit fe8f1c8b8650c8ab5e7d27338f4538582651bd14, we solve
>>>> libvirt coredump problem, but it introduce a memory leak sense:
>>>>
>>>> 1. one client register a domain event such as reboot event
>>>> 2. and client was terminated unexpectly, such as kill -9,
>>> unexpectedly
>>>
>>>> then this client object will not free at libvirtd service program.
>>>>
>>>> remoteDispatchConnectDomainEventCallbackRegisterAny reference the
>>>> client,
>>>> but when client was terminated before it call deRegisterAny,the
>>>> reference
>>>> of client will not reduced to zero. so the memory leak take place.
>>> What's the difference between the path that would terminate normally and
>>> the one that terminates abnormally. It seems to me there might be (an)
>>> extra Ref(s) on the client which prevents virNetServerClientDispose from
>>> calling it's @privateDataFreeFunc which would call remoteClientFreeFunc
>>> since remoteClientInitHook worked properly.
>>>
>>> That's what needs to be found.
>> If client terminate normally, client would call
>> virConnectDomainEventDeregisterAny to
>> clear eventCallbacks. Then libvirt daemon will call
>> remoteDispatchConnectDomainEventDeregisterAny
>> -> remoteEventCallbackFree -> virObjectUnref(callback->client) to free
>> eventCallbacks and unRef client.
>> just think if client don't call virConnectDomainEventDeregisterAny and
>> be killed. then there is no way to call call
>> remoteDispatchConnectDomainEventDeregisterAny -> remoteEventCallbackFree
>> -> virObjectUnref(callback->client)
>> so client's ref can't down to zero. so remoteClientFreeFunc can't be
>> called. then client object memory leak.
> So the client/sample code is not calling Deregister because it has no
> "handling" of an abnormal exit. That is no signal handler to call
> Deregister in the event some outside force kills the client/sample code
> while it's processing.
>
> So a different solution would be to add signal handling to the code in
> order to handle the kill and make the Deregister (and close) calls.
signal handler can handle such as kill -15 signal, But can't handle kill
-9 signal.
so we also need to resolve it in libvirt
Basically we need to handle bad behaving clients in the daemon anyway. I tried
running libvirtd under valgrind and there was no extra leak with a client which
didn't deregister the callback. How did you encounter this? I really want to
find the place that I think needs to be changed, but without a reproducer this
is just a guessing game.