On Sat, Jul 07, 2018 at 08:11:05 -0400, John Ferlan wrote:
When virNetDaemonQuit is called from libvirtd's shutdown
handler (daemonShutdownHandler) we need to perform the quit
in multiple steps. The first part is to "request" the quit
and notify the NetServers of the impending quit, which causes
the NetServers to inform their workers that a quit was requested.
Still, because we cannot guarantee a quit will happen, or it's
possible there are no workers pending, use a virNetDaemonQuitTimer
to not only break the event loop but also keep track of how long
we're waiting. If we've waited too long, force an ungraceful exit
so that we don't hang waiting forever or cause some sort of SEGV
because something is still pending and we Unref things.
Signed-off-by: John Ferlan <jferlan(a)redhat.com>
---
src/libvirt_remote.syms | 1 +
src/remote/remote_daemon.c | 1 +
src/rpc/virnetdaemon.c | 68 +++++++++++++++++++++++++++++++++++++-
src/rpc/virnetdaemon.h | 4 +++
4 files changed, 73 insertions(+), 1 deletion(-)
[...]
@@ -855,10 +904,27 @@ virNetDaemonRun(virNetDaemonPtr dmn)
         virObjectLock(dmn);
         virHashForEach(dmn->servers, daemonServerProcessClients, NULL);
+
+        /* HACK: Add a dummy timeout to break event loop */
+        if (dmn->quitRequested && quitTimer == -1)
+            quitTimer = virEventAddTimeout(500, virNetDaemonQuitTimer,
+                                           &quitCount, NULL);
+
+        if (dmn->quitRequested && daemonServerWorkersDone(dmn)) {
+            dmn->quit = true;
+        } else {
+            /* Firing every 1/2 second and quitTimeout in seconds, force
+             * an exit when there are still worker threads running and we
+             * have waited long enough */
+            if (quitCount > dmn->quitTimeout * 2)
+                _exit(EXIT_FAILURE);
If you have a legitimate long-running job which would finish eventually
and e.g. write an updated status XML, this will break things, because
_exit() kills the daemon before the job gets a chance to complete. I'm
not persuaded that this is a systematic solution to some API getting
stuck. The commit message also does not help persuade me that this is a
good idea.
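
To make the concern concrete, here is a minimal standalone sketch of the
deadline logic in the hunk above. It is not libvirt code: simulate_quit(),
deadline_exceeded() and the outcome enum are hypothetical names used only
for illustration, and the worker job is modelled as a plain count of
half-second ticks it still needs after the quit request.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome codes, for illustration only. */
enum quit_outcome { QUIT_GRACEFUL, QUIT_FORCED };

/* Same comparison as in the hunk: the timer fires every 500 ms, so a
 * timeout of quit_timeout_s seconds corresponds to quit_timeout_s * 2
 * timer ticks. */
static bool deadline_exceeded(int quit_count, int quit_timeout_s)
{
    return quit_count > quit_timeout_s * 2;
}

/* Simulate the event loop iterations after a quit was requested.
 * job_ticks is the number of half-second ticks the remaining worker
 * job needs (an assumption of this model, not something libvirt knows). */
static enum quit_outcome simulate_quit(int job_ticks, int quit_timeout_s)
{
    assert(job_ticks >= 0 && quit_timeout_s >= 0);

    for (int quit_count = 0; ; quit_count++) {
        /* Stands in for daemonServerWorkersDone(): checked first,
         * as in the patch. */
        if (quit_count >= job_ticks)
            return QUIT_GRACEFUL;

        /* Stands in for the _exit(EXIT_FAILURE) branch. */
        if (deadline_exceeded(quit_count, quit_timeout_s))
            return QUIT_FORCED;
    }
}
```

In this model, with a 30 second timeout, a job that needs 61 ticks still
exits gracefully while one that needs 62 ticks is killed, regardless of
whether it would have finished a moment later.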
NACK