On 06/20/2012 03:07 PM, Jim Fehlig wrote:
I'm looking into a libvirtd deadlock on daemon shutdown. The
deadlock
occurs when shutting down virNetServer. If the handling of a job is in
flight in virNetServerHandleJob(), the virNetServer lock is acquired
when freeing job->prog (src/rpc/virnetserver.c:167). But the lock is
already held in virNetServerFree(), which is blocked in
virThreadPoolFree() waiting for all the workers to finish. No progress
can be made.
The attached hack fixes the problem, but I'm not convinced this is an
appropriate fix. Is it necessary to hold the virNetServer lock when
calling virNetServerProgramFree(job->prog)? I notice the lock is not
held in the error path of virNetServerHandleJob().
+++ b/src/rpc/virnetserver.c
@@ -774,7 +774,9 @@ void virNetServerFree(virNetServerPtr srv)
for (i = 0 ; i < srv->nservices ; i++)
virNetServerServiceToggle(srv->services[i], false);
+ virNetServerUnlock(srv);
virThreadPoolFree(srv->workers);
+ virNetServerLock(srv);
As written, this can't be right, because it reads a field of srv outside
the locks. But maybe you meant:
<type> tmp = srv->workers;
srv->workers = NULL;
virNetServerUnlock(srv);
virThreadPoolFree(tmp);
virNetServerLock(srv);
as a possible fix that doesn't violate the safety rules of reading
fields from srv outside a lock.
I hope danpb chimes in on this one, as he is more of an expert when it
comes to the locking rules in virnet*.
--
Eric Blake eblake(a)redhat.com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org