
On Sat, Jul 07, 2018 at 08:11:05 -0400, John Ferlan wrote:
When virNetDaemonQuit is called from libvirtd's shutdown handler (daemonShutdownHandler) we need to perform the quit in multiple steps. The first part is to "request" the quit and notify the NetServer's of the impending quit which causes the NetServers to inform their workers that a quit was requested.
Still because we cannot guarantee a quit will happen or it's possible there's no workers pending, use a virNetDaemonQuitTimer to not only break the event loop but keep track of how long we're waiting and we've waited too long, force an ungraceful exit so that we don't hang waiting forever or cause some sort of SEGV because something is still pending and we Unref things.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/libvirt_remote.syms | 1 + src/remote/remote_daemon.c | 1 + src/rpc/virnetdaemon.c | 68 +++++++++++++++++++++++++++++++++++++- src/rpc/virnetdaemon.h | 4 +++ 4 files changed, 73 insertions(+), 1 deletion(-)
[...]
@@ -855,10 +904,27 @@ virNetDaemonRun(virNetDaemonPtr dmn) virObjectLock(dmn);
virHashForEach(dmn->servers, daemonServerProcessClients, NULL); + + /* HACK: Add a dummy timeout to break event loop */ + if (dmn->quitRequested && quitTimer == -1) + quitTimer = virEventAddTimeout(500, virNetDaemonQuitTimer, + &quitCount, NULL); + + if (dmn->quitRequested && daemonServerWorkersDone(dmn)) { + dmn->quit = true; + } else { + /* Firing every 1/2 second and quitTimeout in seconds, force + * an exit when there are still worker threads running and we + * have waited long enough */ + if (quitCount > dmn->quitTimeout * 2) + _exit(EXIT_FAILURE);
If you have a legitimate long-running job which would finish eventually and e.g. write an updated status XML this will break things. I'm not persuaded that this is a systematic solution to some API getting stuck. The commit message also does not help persuading me that this is a good idea. NACK