
On 21.12.2017 17:59, John Ferlan wrote:
[...]
Patch looks good to me too. But still original "libvirtd: fix crash on termination" fixes another issue and if applied fixes "virt-manager issue" as well as John figured out.
Not sure there's enough coffee in the house this morning to make me "awake enough" for all this stuff, especially just before a holiday break. But perhaps better now than after the new year... sooo....
Finally I'm back on track with this (sorry for it taking so long), although you're right about this, it's not the correct fix, it's just a byproduct of your patch, however, the whole thing about closing connections and releasing memory is a bit of a mess. For example, right now, we only dispose of service objs (but we don't close them, we do that in virNetServerClose), but we both close and dispose of the client objs. Another thing, we toggle the service to stop accepting any new connections, but that's irrelevant at that point because we've closed them already by that time as a product of calling virNetDaemonClose->virNetServerClose->virNetServerServiceClose - so that
virNetServerServiceClose could call virNetServerServiceToggle(,false), even though it probably doesn't matter at this point.
Makes me wonder why virNetServerUpdateServicesLocked wasn't called in virNetServerDispose instead of open coding the nservices loop (sigh).
should be removed...I drifted a bit here, anyway, we need to make a clear distinction when we're releasing memory and when we're shutting down some kind of service - sometimes both can be done at the same time (see below), however it's very clear we can't do it here. Your issue is that Close currently only closes the sockets (this reflects my point of "shutting down a service") but does nothing with the threadpool connected to the server, thus leaving the worker threads to continue running and executing APIs the results of which they shouldn't even be able to return back to the client, since we're shutting down. Now the thing with threadpool is that both memory release and teardown are merged into one "object disposal" operation and therefore done as part of virNetServerDispose. Since I understand a removal from the hash table as a memory release operation, we should not be calling virHashRemoveAll from virNetDaemonClose. Now, I see 2 options:
1) split virThreadPoolFree into 2 functions, one which only broadcasts the "die" message and joins the threads (or waits for them in this case...) and the other releasing the resources - can't say I'm a fan of this one
Kind of a "virThreadPoolStop" type operation...
2) call virThreadPoolFree from virNetServerClose instead...
None of those approaches is ideal, but I can't seem to think off anything better at the moment.
I like 2 better, but it doesn't fix the problem of a long running thread (such as GetAllDomainStats), it just moves the cheese. Although I have a feeling the virStateShutdown series Nikolay is promoting may solve that issue.
No, no. The problem will be fixed as I mentined in a different reply. It just can not be solved by solely virHashRemoveAll with current unref ordering in libvirtd.c.
It's really a conundrum w/r/t how much time to spend on this especially if the short/long term goal is a shims for libvirtd (e.g. libvirt_qemu) which will move the cheese even more.
I'm going to move away from this for now, maybe a fresh look will help. Right now I'm not sure I can focus on this with the other threads I'm involved in.
John
I'm open to discuss any other suggestions. Erik
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list