Re: [libvirt] [PATCH] remote: Prevent race when closing a connection

15 Mar 2013

On 03/14/2013 06:26 AM, Viktor Mihajlovski wrote:
...
A race condition can occur when virConnectClose is called parallel
to the execution of the connection close callback in remoteClientCloseFunc.
The race happens if the connection object is destroyed (including
the mutex) while remoteClientCloseFunc is waiting for the connection
mutex. After the destruction of the (non error checking) mutex
remoteClientCloseFunc starts to process the callbacks. However the
operations can occur against a freed (or even worse, reallocated)
object. Another issue is that the closeFreeCallback is invoked
even if it's NULL (this is the case for virsh).
The solution is to clean out the callback pointers in virConnectDispose
before destroying the mutex. This way remoteClientCloseFunc will
return immediately after passing virMutexLock, thus avoiding potential
data corruption. There's still the slight chance that the concluding
virMutexUnlock could do harm on the freed connection object.
See my question below...
...
This could be fixed using an error checking mutex which however has a
much broader scope and impact.
I'd like Dan to weigh in on this one, because he probably has a better
insight into the proper way to break the race.  But here are some
smaller comments:
...
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
---
 src/datatypes.c            |    8 +++++++-
 src/remote/remote_driver.c |    3 ++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/datatypes.c b/src/datatypes.c
index b04e100..2358bdf 100644
--- a/src/datatypes.c
+++ b/src/datatypes.c
@@ -146,8 +146,14 @@ virConnectDispose(void *obj)
virMutexLock(&conn->lock);
-    if (conn->closeFreeCallback)
+    if (conn->closeCallback)
+        conn->closeCallback = NULL;
The if is pointless.  Just blindly set conn->closeCallback to NULL.
...
+
+    if (conn->closeFreeCallback) {
         conn->closeFreeCallback(conn->closeOpaque);
+        conn->closeFreeCallback = NULL;
+        conn->closeOpaque = NULL;
Clearing conn->closeOpaque is pointless; it is only ever used depending
on conn->closeFreeCallback, and leaving it non-NULL doesn't hurt.
...
+    }
virResetError(&conn->err);
diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c
index 3721af9..885120e 100644
--- a/src/remote/remote_driver.c
+++ b/src/remote/remote_driver.c
@@ -358,7 +358,8 @@ static void remoteClientCloseFunc(virNetClientPtr client ATTRIBUTE_UNUSED,
         closeCallback(conn, reason, closeOpaque);
         virMutexLock(&conn->lock);
         conn->closeDispatch = false;
-        if (conn->closeUnregisterCount != closeUnregisterCount)
+        if (conn->closeUnregisterCount != closeUnregisterCount &&
+            closeFreeCallback)
             closeFreeCallback(closeOpaque);
     }
     virMutexUnlock(&conn->lock);
...Wouldn't it be better to stash a copy of the callback pointer while
the mutex is held, but avoid calling the callback until after the mutex
is unlocked?  Something like:

<TYPE> cb = NULL;
void* opaque;
virMutexLock(&conn->lock);
conn->closeDispatch = false;
if (conn->closeUnregisterCount != closeUnregisterCount) {
    cb = closeFreeCallback;
    opaque = closeOpaque;
}
virMutexUnlock(&conn->lock);
if (cb)
    cb(opaque);

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org