On 03/31/2013 10:20 AM, Peter Krempa wrote:
The last Viktor's effort to fix the race and memory corruption
unfortunately
wasn't complete in the case the close callback was not registered in an
connection. At that time, the trail of event's that I'll describe later could
still happend and corrupt the memory or cause a crash of the client (including
s/happend/happen/
the daemon in case of a p2p migration).
Consider the following prerequisities and trail of events:
Let's have a remote connection to a hypervisor that doesn't have a close
callback registered and the client is using the event loop. The crash happens in
cooperation of 2 threads. Thread E is the event loop and thread W is the worker
that does some stuff. R denotes the remote client.
1.) W - The client finishes everything and sheds the last reference on the client
2.) W - The virObject stuff invokes virConnectDispose that invokes doRemoteClose
3.) W - the remote close method invokes the REMOTE_PROC_CLOSE RPC method.
4.) W - The thread is preempted at this point.
5.) R - The remote side recieves the close and closes the socket.
s/recieves/receives/
6.) E - poll() wakes up due to the closed socket and invokes the
close callback
7.) E - The event loop is preempted right before remoteClientCloseFunc is called
8.) W - The worker now finishes, and frees the conn object.
9.) E - The remoteClientCloseFunc accesses the now-freed conn object in the
attempt to retrieve pointer for the real close callback.
10.) Kaboom, corrupted memory/segfault.
This patch tries to fix this by introducing a new object that survives the
freeing of the connection object. We can't increase the reference count on the
connection object itself as the connection would never be closed as the
s/itself as/itself or/; s/closed as/closed, as/
connection is closed only when the reference count reaches zero.
The new object - virConnectCloseCallbackData - is a lockable object that keeps
the pointers to the real user registered callback and ensures that the
connection callback is either not called if the connection was already freed or
that the connection isn't freed while this is being called.
---
src/datatypes.c | 55 ++++++++++++++++++++++++++++++++++++--------
src/datatypes.h | 22 ++++++++++++++----
src/libvirt.c | 29 ++++++++++++-----------
src/remote/remote_driver.c | 57 +++++++++++++++++++++++++++-------------------
4 files changed, 112 insertions(+), 51 deletions(-)
So far, this is just a code review. I'm also planning on testing the
patch, and will report back with results later in the day.
@@ -63,14 +65,19 @@ static void virStoragePoolDispose(void *obj);
static int
virDataTypesOnceInit(void)
{
-#define DECLARE_CLASS(basename) \
- if (!(basename ## Class = virClassNew(virClassForObject(), \
+#define DECLARE_CLASS_COMMON(basename, parent) \
+ if (!(basename ## Class = virClassNew(parent, \
#basename, \
sizeof(basename), \
basename ## Dispose))) \
return -1;
+#define DECLARE_CLASS(basename) \
+ DECLARE_CLASS_COMMON(basename, virClassForObject())
+#define DECLARE_CLASS_LOCKABLE(basename) \
+ DECLARE_CLASS_COMMON(basename, virClassForObjectLockable())
Wonder if these definitions are useful enough to rename to VIR_DECLARE_*
and move into virobject.h. But that would be a separate patch, to
adjust all clients that would benefit from it.
+++ b/src/datatypes.h
@@ -93,6 +93,22 @@ extern virClassPtr virStoragePoolClass;
# define VIR_IS_DOMAIN_SNAPSHOT(obj) \
(VIR_IS_SNAPSHOT(obj) && VIR_IS_DOMAIN((obj)->domain))
+
+typedef struct _virConnectCloseCallbackData virConnectCloseCallbackData;
+typedef virConnectCloseCallbackData *virConnectCloseCallbackDataPtr;
+
+/**
+ * Internal structure holding data related to connection close callbacks.
+ */
+struct _virConnectCloseCallbackData {
+ virObjectLockable parent;
+
+ virConnectPtr conn;
+ virConnectCloseFunc callback;
+ void *opaque;
+ virFreeCallback freeCallback;
+};
Would it make sense to move this struct definition to be local to
datatypes.c, and have all other uses of it treat it as opaque?...
+++ b/src/libvirt.c
@@ -20189,24 +20189,27 @@ int virConnectRegisterCloseCallback(virConnectPtr conn,
virObjectRef(conn);
virMutexLock(&conn->lock);
+ virObjectLock(conn->closeCallback);
virCheckNonNullArgGoto(cb, error);
- if (conn->closeCallback) {
+ if (conn->closeCallback->callback) {
...But then you would need to expose an accessor function instead of
directly accessing closeCallback->callback. Okay, probably fine as is.
+++ b/src/remote/remote_driver.c
@@ -337,32 +337,38 @@ enum virDrvOpenRemoteFlags {
VIR_DRV_OPEN_REMOTE_AUTOSTART = (1 << 2), /* Autostart a per-user daemon */
};
+static void
+remoteClientCloseFreeFunc(void *opaque)
+{
+ virConnectCloseCallbackDataPtr cbdata = opaque;
+
+ virObjectUnref(cbdata);
+}
You shouldn't need this; just use virObjectFreeCallback instead.
+remoteClientCloseFunc(virNetClientPtr client ATTRIBUTE_UNUSED,
+ int reason,
+ void *opaque)
{
- virConnectPtr conn = opaque;
+ virConnectCloseCallbackDataPtr cbdata = opaque;
- virMutexLock(&conn->lock);
- if (conn->closeCallback) {
- virConnectCloseFunc closeCallback = conn->closeCallback;
- void *closeOpaque = conn->closeOpaque;
- virFreeCallback closeFreeCallback = conn->closeFreeCallback;
- unsigned closeUnregisterCount = conn->closeUnregisterCount;
+ virObjectLock(cbdata);
- VIR_DEBUG("Triggering connection close callback %p reason=%d",
- conn->closeCallback, reason);
- conn->closeDispatch = true;
- virMutexUnlock(&conn->lock);
- closeCallback(conn, reason, closeOpaque);
- virMutexLock(&conn->lock);
- conn->closeDispatch = false;
- if (conn->closeUnregisterCount != closeUnregisterCount &&
- closeFreeCallback)
- closeFreeCallback(closeOpaque);
+ if (cbdata->callback) {
+ VIR_DEBUG("Triggering connection close callback %p reason=%d,
opaque=%p",
+ cbdata->callback, reason, cbdata->opaque);
+ cbdata->callback(cbdata->conn, reason, cbdata->opaque);
+
+ if (cbdata->freeCallback)
+ cbdata->freeCallback(cbdata->opaque);
+ cbdata->callback = NULL;
+ cbdata->freeCallback = NULL;
}
- virMutexUnlock(&conn->lock);
+ virObjectUnlock(cbdata);
+
+ /* free the connection reference that comes along with the callback
+ * registration */
+ virObjectUnref(cbdata->conn);
}
Took me a couple reads, but it looks right. The old code had to
temporarily drop connection lock while using the callback; the new code
never even grabs connection lock because all the callback data is
encapsulated in a separate object.
/* helper macro to ease extraction of arguments from the URI */
@@ -765,9 +771,11 @@ doRemoteOpen(virConnectPtr conn,
goto failed;
}
+ virObjectRef(conn->closeCallback);
+
virNetClientSetCloseCallback(priv->client,
remoteClientCloseFunc,
- conn, NULL);
+ conn->closeCallback, remoteClientCloseFreeFunc);
Here's where virObjectFreeCallback comes in handy.
On the surface, the code seems reasonable. If my testing passes, and
you fix up the typos and the unneeded helper function, then I don't mind
giving:
ACK.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org