
On 03/14/2013 06:26 AM, Viktor Mihajlovski wrote:
A race condition can occur when virConnectClose is called parallel to the execution of the connection close callback in remoteClientCloseFunc.
The race happens if the connection object is destroyed (including the mutex) while remoteClientCloseFunc is waiting for the connection mutex. After the destruction of the (non error checking) mutex remoteClientCloseFunc starts to process the callbacks. However the operations can occur against a freed (or even worse, reallocated) object. Another issue is that the closeFreeCallback is invoked even if it's NULL (this is the case for virsh).
The solution is to clean out the callback pointers in virConnectDispose before destroying the mutex. This way remoteClientCloseFunc will return immediately after passing virMutexLock, thus avoiding potential data corruption. There's still the slight chance that the concluding virMutexUnlock could do harm on the freed connection object.
See my question below...
This could be fixed using an error checking mutex which however has a much broader scope and impact.
I'd like Dan to weigh in on this one, because he probably has a better insight into the proper way to break the race. But here are some smaller comments:
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com> --- src/datatypes.c | 8 +++++++- src/remote/remote_driver.c | 3 ++- 2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/src/datatypes.c b/src/datatypes.c index b04e100..2358bdf 100644 --- a/src/datatypes.c +++ b/src/datatypes.c @@ -146,8 +146,14 @@ virConnectDispose(void *obj)
virMutexLock(&conn->lock);
- if (conn->closeFreeCallback) + if (conn->closeCallback) + conn->closeCallback = NULL;
The if is pointless. Just blindly set conn->closeCallback to NULL.
+ + if (conn->closeFreeCallback) { conn->closeFreeCallback(conn->closeOpaque); + conn->closeFreeCallback = NULL; + conn->closeOpaque = NULL;
Clearing conn->closeOpaque is pointless; it is only ever used depending on conn->closeFreeCallback, and leaving it non-NULL doesn't hurt.
+ }
virResetError(&conn->err);
diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index 3721af9..885120e 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -358,7 +358,8 @@ static void remoteClientCloseFunc(virNetClientPtr client ATTRIBUTE_UNUSED, closeCallback(conn, reason, closeOpaque); virMutexLock(&conn->lock); conn->closeDispatch = false; - if (conn->closeUnregisterCount != closeUnregisterCount) + if (conn->closeUnregisterCount != closeUnregisterCount && + closeFreeCallback) closeFreeCallback(closeOpaque); } virMutexUnlock(&conn->lock);
...Wouldn't it be better to stash a copy of the callback pointer while the mutex is held, but avoid calling the callback until after the mutex is unlocked? Something like: <TYPE> cb = NULL; void* opaque; virMutexLock(&conn->lock); conn->closeDispatch = false; if (conn->closeUnregisterCount != closeUnregisterCount) { cb = closeFreeCallback; opaque = closeOpaque; } virMutexUnlock(&conn->lock); if (cb) cb(opaque); -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org