On 1/22/21 12:09 PM, Nikolay Shirokovskiy wrote:
> On Fri, Jan 22, 2021 at 12:45 PM Michal Privoznik <mprivozn@redhat.com>
> wrote:
>
>> If libvirtd is sent SIGTERM while it is still initializing, it
>> may crash. The following scenario was observed (using 'stress' to
>> slow down CPU so much that the window where the problem exists is
>> bigger):
>>
>> 1) The main thread is already executing virNetDaemonRun() and is
>> in virEventRunDefaultImpl().
>> 2) The thread that's supposed to run daemonRunStateInit() is
>> spawned already, but daemonRunStateInit() is in its very early
>> stage (in the stack trace I see it's executing
>> virIdentityGetSystem()).
>>
>> If SIGTERM (or any other signal that we don't override handler
>> for) arrives at this point, the main thread jumps out from
>> virEventRunDefaultImpl() and enters virStateShutdownPrepare()
>> (via shutdownPrepareCb which was set earlier). This iterates
>> through stateShutdownPrepare() callbacks of state drivers and
>> reaching qemuStateShutdownPrepare() eventually only to
>> dereference qemu_driver. But since thread 2) has not been
>> scheduled/not proceeded yet, qemu_driver was not allocated yet.
>>
>> Solution is simple - just check if qemu_driver is not NULL. But
>> doing so only in qemuStateShutdownPrepare() would push the
>> problem down to virStateShutdownWait(), well
>> qemuStateShutdownWait(). Therefore, duplicate the trick there
>> too.
>>
>
> I guess this is a partial solution. Initialization may be in a state when
> qemu_driver is initialized but qemu_driver->workerPool is still NULL
> for example.
Yes.
>
> Maybe we'd better delay shutdown until initialization is finished?
I'm not exactly sure how to achieve that. Do you have a hint? Also, part
of qemu driver state init is autostarting domains (which may take ages).