On 2020/7/24 21:11, Daniel Henrique Barboza wrote:
On 7/19/20 11:09 PM, Bihong Yu wrote:
>
>
> On 2020/7/18 5:14, Daniel Henrique Barboza wrote:
>>
>>
>> On 7/17/20 8:10 AM, Bihong Yu wrote:
>>>> From c328ff62b11d58553fd2032a85fd3295e009b3d3 Mon Sep 17 00:00:00 2001
>>> From: Bihong Yu <yubihong(a)huawei.com>
>>> Date: Fri, 17 Jul 2020 16:55:12 +0800
>>> Subject: [PATCH] qemu: clear residual QMP caps processes during QEMU driver
>>> initialization
>>>
>>> In some cases, the QMP capabilities processes possible residue. So we need
to
>>> clear the residual QMP caps processes during starting libvirt.
>>
>> Which cases are you referring to? Do you have a reproducer for this behavior? I
>> tried it up starting and stopping libvirtd, fetching capabilities, starting vms
>> and so on, and wasn't able to see any 'qmp.pid' file hanging around.
>
> Yes, I have a reproducer for this behavior. I refer case is that send kill signal
> to libvirtd when libvirtd is fetching capabilities before libvirtd calling
> qemuProcessQMPFree() and qemuProcessQMPStop().
>
> When libvirtd restart, it can not find the temporary qmp directory and has no way
> to clear the residual QMP caps processes.
I see. The reproducer is an abnormal termination of the Libvirt daemon (I'm assuming
SIGKILL - SIGTERM would have been handled).
The problem is that processes in general can't handle SIGKILL with cleanup functions
(there are gimmicks, but one process can't avoid to not get killed instantly with
a SIGKILL by itself). I don't believe Libvirt can contemplate each cleanup scenario
where a program is terminated with SIGKILL.
You found one case where a kill -9 left stuff behind, but I guarantee you that there
are many places and situations where this will occur. Basically each temporary file
Libvirt creates are susceptible to this.
Yes, I see. But, this abnormal termination will cause the QMP caps processes residue,
and could residue many processes if reproduce this abnormal termination. They will
waste the system resources.
In addition, g_mkdtemp() is used only in one place where the qemuProcessQMPInit()
function use to make qmp-XXXXXX temporary directory. If libvirtd exit abnormally,
libvirtd will lost control of the temporary directory.
In my opition:
a. The temporary directory will cause residual processes not only temporary directory
itself.
b. Libvirtd use g_mkdtemp() only in one place, and will lost control of the temporary
directory if libvirtd exit abnormally.
So I think it's necessary to clear the residual QMP caps processes and the temporary
directory.
Your proposed fix doesn't help in the case you're trying to cover either. What
if
I land a SIGKILL while the code is trying to cleanup the leftovers of the previous
SIGKILL? Now imagine that each place where Libvirt creates a temporary file would
need a cleanup code like this, and every one of them is susceptible to this same
behavior.
In my opinion, the other temporary file will be reuse or overwrite if libvirtd
restart and it will not cause process residual.
Thanks,
Bihong Yu
I am not sure why you needed to kill the process with SIGKILL instead of SIGTERM,
but I'd rather try to understand and fix what lead you to send a -9 to libvirtd
instead.
Thanks,
DHB