This RFC combines a couple of earlier patch postings into one central
"stream" of patches, so that each can be discussed for its relative
importance or need in fixing the problem.
Although a bit long-winded, I think I've captured enough history that
anyone so inclined can walk through it and understand the maze of
twisty patches it takes to (hopefully) resolve the issue.
The first two patches were presented previously, but not accepted:
https://www.redhat.com/archives/libvir-list/2017-November/msg00296.html
https://www.redhat.com/archives/libvir-list/2017-November/msg00297.html
However, since that time it seems "some form" of those patches is
necessary, most importantly to make sure the virObjectUnref for @srv and
@srvAdm occurs *prior to* the virNetDaemonClose(dmn); at cleanup (IOW: out
of order for a reason). Doing that also requires that any program started
on the servers has its virObjectUnref prior to daemon close.
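To make the ordering concrete, here is a toy model of the cleanup
sequence. It is only a sketch and not libvirt code: the toy_* names, the
sole_owner_at_close flag, and the assumption that the daemon takes its own
reference on each server when the server is added are all invented for
illustration. It shows the caller dropping its own references to the
servers first, so that by the time the daemon is closed the daemon's
references are the only ones left.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy refcounted object; a stand-in for illustration, NOT virObject. */
typedef struct {
    int refs;
    bool disposed;              /* set when the last reference is dropped */
} toy_obj;

static void toy_ref(toy_obj *o) { o->refs++; }

static void toy_unref(toy_obj *o)
{
    if (--o->refs == 0)
        o->disposed = true;
}

/* Toy daemon that holds its own reference on each server (assumption). */
typedef struct {
    toy_obj self;
    toy_obj *servers[2];
    size_t nservers;
    bool sole_owner_at_close;   /* hypothetical flag, recorded by close */
} toy_daemon;

void toy_daemon_add(toy_daemon *d, toy_obj *srv)
{
    toy_ref(srv);
    d->servers[d->nservers++] = srv;
}

/* Close only records whether anyone else still references the servers. */
void toy_daemon_close(toy_daemon *d)
{
    d->sole_owner_at_close = true;
    for (size_t i = 0; i < d->nservers; i++) {
        if (d->servers[i]->refs != 1)
            d->sole_owner_at_close = false;
    }
}

/* Dropping the last daemon reference releases the servers it holds. */
void toy_daemon_unref(toy_daemon *d)
{
    if (--d->self.refs > 0)
        return;
    for (size_t i = 0; i < d->nservers; i++)
        toy_unref(d->servers[i]);
    d->self.disposed = true;
}

/* Cleanup in the order described above: servers first, then close. */
bool toy_cleanup_in_order(void)
{
    toy_obj srv = { .refs = 1 };
    toy_obj srvAdm = { .refs = 1 };
    toy_daemon dmn = { .self = { .refs = 1 } };

    toy_daemon_add(&dmn, &srv);
    toy_daemon_add(&dmn, &srvAdm);

    toy_unref(&srvAdm);         /* caller's references go first ... */
    toy_unref(&srv);
    toy_daemon_close(&dmn);     /* ... then the daemon is closed ... */
    toy_daemon_unref(&dmn);     /* ... and finally released */

    return dmn.sole_owner_at_close && srv.disposed && srvAdm.disposed;
}
```

In this model, moving the two toy_unref() calls after toy_daemon_close()
would leave sole_owner_at_close false, which is the toy equivalent of the
caller still holding the servers while the daemon is being closed.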
The 3rd and 4th patches are a result of discussions held in
mid-December related to libvirtd crashes/hangs and some possible
adjustments to help. Discussion starts here:
https://www.redhat.com/archives/libvir-list/2017-December/msg00515.html
This led to suggestions to move the toggling of services from Dispose
to Close *and* to split the virThreadPoolFree into a Drain function
that could also be called during the Close function rather than waiting
for the Dispose to occur.
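The idea of splitting a Drain step out of Free can be sketched with a
small pthread-based pool. This is a hedged illustration only, not the
virThreadPool implementation: the toy_pool type and toy_pool_* functions
are hypothetical, and having the workers finish the queued jobs before
exiting is a choice made for this sketch. Drain stops and joins the
workers but leaves the pool memory valid, so it can be called at close
time, while Free drains (if nobody did yet) and then releases memory.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy pool for illustration; a "job" is just a counter increment. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    pthread_t *workers;
    size_t nworkers;
    size_t pending;             /* queued jobs not yet picked up */
    size_t done;                /* jobs completed */
    bool quit;
} toy_pool;

static void *toy_worker(void *opaque)
{
    toy_pool *pool = opaque;

    pthread_mutex_lock(&pool->lock);
    for (;;) {
        while (!pool->quit && pool->pending == 0)
            pthread_cond_wait(&pool->cond, &pool->lock);
        if (pool->pending == 0)     /* quit requested and queue empty */
            break;
        pool->pending--;
        pool->done++;               /* stand-in for running a job */
    }
    pthread_mutex_unlock(&pool->lock);
    return NULL;
}

toy_pool *toy_pool_new(size_t nworkers)
{
    toy_pool *pool = calloc(1, sizeof(*pool));

    if (!pool)
        return NULL;
    pool->workers = calloc(nworkers ? nworkers : 1, sizeof(*pool->workers));
    if (!pool->workers) {
        free(pool);
        return NULL;
    }
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->cond, NULL);
    for (size_t i = 0; i < nworkers; i++) {
        if (pthread_create(&pool->workers[i], NULL, toy_worker, pool) != 0)
            break;
        pool->nworkers++;
    }
    return pool;
}

void toy_pool_submit(toy_pool *pool)
{
    pthread_mutex_lock(&pool->lock);
    pool->pending++;
    pthread_cond_signal(&pool->cond);
    pthread_mutex_unlock(&pool->lock);
}

/* Close-time step: stop and join the workers; memory stays valid. */
void toy_pool_drain(toy_pool *pool)
{
    pthread_mutex_lock(&pool->lock);
    pool->quit = true;
    pthread_cond_broadcast(&pool->cond);
    pthread_mutex_unlock(&pool->lock);

    for (size_t i = 0; i < pool->nworkers; i++)
        pthread_join(pool->workers[i], NULL);
    pool->nworkers = 0;             /* a second drain becomes a no-op */
}

/* Dispose-time step: drain if nobody did, then release memory. */
void toy_pool_free(toy_pool *pool)
{
    if (!pool)
        return;
    toy_pool_drain(pool);
    pthread_mutex_destroy(&pool->lock);
    pthread_cond_destroy(&pool->cond);
    free(pool->workers);
    free(pool);
}
```

With this split, a Close path can call toy_pool_drain() so that no worker
is still running when other state is torn down, and the later Dispose
path can call toy_pool_free() without having to wait on threads.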
Still, testing showed that those 4 patches alone weren't enough, as
libvirtd ended up just "hung". That is where some patches Nikolay posted
that add a new shutdown state come in, see:
https://www.redhat.com/archives/libvir-list/2017-October/msg01134.html
Those patches languished mainly because their relationship to another
series dealing with libvirtd crashes wasn't clear (at the time). That
other series was partially accepted and pushed:
https://www.redhat.com/archives/libvir-list/2017-October/msg01347.html
and followup discussion starting here:
https://www.redhat.com/archives/libvir-list/2017-November/msg00023.html
The 9th patch can be used to test that the first 8 do the job. The
details on how I set up the test environment are in the patch. If the
sequence is run without the first 8 patches applied, you will end up
with a couple of different hang scenarios. So if you're compelled to see
what the big deal is, then apply this one alone and have fun playing.
The 10th patch is the one patch from the partially pushed series that
wasn't pushed as it was not deemed necessary. It's presented here mainly
for completeness.
John Ferlan (5):
  libvirtd: Alter refcnt processing for domain server objects
  libvirtd: Alter refcnt processing for server program objects
  netserver: Toggle service off during close
  qemu: Introduce virTheadPoolDrain
  APPLY ONLY FOR TESTING PURPOSES

Nikolay Shirokovskiy (5):
  libvirt: introduce hypervisor driver shutdown function
  qemu: implement state driver shutdown function
  qemu: agent: fix monitor close during first sync
  qemu: monitor: check monitor not closed upon send
  libvirtd: fix crash on termination

daemon/libvirtd.c | 46 ++++++++++++++++++++++++++++++++-----------
src/driver-state.h | 4 ++++
src/libvirt.c | 18 +++++++++++++++++
src/libvirt_internal.h | 1 +
src/libvirt_private.syms | 2 ++
src/qemu/qemu_agent.c | 14 ++++++-------
src/qemu/qemu_driver.c | 44 +++++++++++++++++++++++++++++++++++++++++
src/qemu/qemu_monitor.c | 27 ++++++++++++-------------
src/rpc/virnetdaemon.c | 1 +
src/rpc/virnetserver.c | 5 ++---
src/rpc/virnetserverservice.c | 2 ++
src/util/virthreadpool.c | 19 ++++++++++++------
src/util/virthreadpool.h | 2 ++
13 files changed, 143 insertions(+), 42 deletions(-)
--
2.13.6