
On 03/31/2013 10:20 AM, Peter Krempa wrote:
The last Viktor's effort to fix the race and memory corruption unfortunately wasn't complete in the case the close callback was not registered in an connection. At that time, the trail of event's that I'll describe later could still happend and corrupt the memory or cause a crash of the client (including
s/happend/happen/
the daemon in case of a p2p migration).
Consider the following prerequisities and trail of events: Let's have a remote connection to a hypervisor that doesn't have a close callback registered and the client is using the event loop. The crash happens in cooperation of 2 threads. Thread E is the event loop and thread W is the worker that does some stuff. R denotes the remote client.
1.) W - The client finishes everything and sheds the last reference on the client 2.) W - The virObject stuff invokes virConnectDispose that invokes doRemoteClose 3.) W - the remote close method invokes the REMOTE_PROC_CLOSE RPC method. 4.) W - The thread is preempted at this point. 5.) R - The remote side recieves the close and closes the socket.
s/recieves/receives/
6.) E - poll() wakes up due to the closed socket and invokes the close callback 7.) E - The event loop is preempted right before remoteClientCloseFunc is called 8.) W - The worker now finishes, and frees the conn object. 9.) E - The remoteClientCloseFunc accesses the now-freed conn object in the attempt to retrieve pointer for the real close callback. 10.) Kaboom, corrupted memory/segfault.
This patch tries to fix this by introducing a new object that survives the freeing of the connection object. We can't increase the reference count on the connection object itself as the connection would never be closed as the
s/itself as/itself or/; s/closed as/closed, as/
connection is closed only when the reference count reaches zero.
The new object - virConnectCloseCallbackData - is a lockable object that keeps the pointers to the real user registered callback and ensures that the connection callback is either not called if the connection was already freed or that the connection isn't freed while this is being called. --- src/datatypes.c | 55 ++++++++++++++++++++++++++++++++++++-------- src/datatypes.h | 22 ++++++++++++++---- src/libvirt.c | 29 ++++++++++++----------- src/remote/remote_driver.c | 57 +++++++++++++++++++++++++++------------------- 4 files changed, 112 insertions(+), 51 deletions(-)
So far, this is just a code review. I'm also planning on testing the patch, and will report back with results later in the day.
@@ -63,14 +65,19 @@ static void virStoragePoolDispose(void *obj); static int virDataTypesOnceInit(void) { -#define DECLARE_CLASS(basename) \ - if (!(basename ## Class = virClassNew(virClassForObject(), \ +#define DECLARE_CLASS_COMMON(basename, parent) \ + if (!(basename ## Class = virClassNew(parent, \ #basename, \ sizeof(basename), \ basename ## Dispose))) \ return -1; +#define DECLARE_CLASS(basename) \ + DECLARE_CLASS_COMMON(basename, virClassForObject()) +#define DECLARE_CLASS_LOCKABLE(basename) \ + DECLARE_CLASS_COMMON(basename, virClassForObjectLockable())
Wonder if these definitions are useful enough to rename to VIR_DECLARE_* and move into virobject.h. But that would be a separate patch, to adjust all clients that would benefit from it.
+++ b/src/datatypes.h @@ -93,6 +93,22 @@ extern virClassPtr virStoragePoolClass; # define VIR_IS_DOMAIN_SNAPSHOT(obj) \ (VIR_IS_SNAPSHOT(obj) && VIR_IS_DOMAIN((obj)->domain))
+ +typedef struct _virConnectCloseCallbackData virConnectCloseCallbackData; +typedef virConnectCloseCallbackData *virConnectCloseCallbackDataPtr; + +/** + * Internal structure holding data related to connection close callbacks. + */ +struct _virConnectCloseCallbackData { + virObjectLockable parent; + + virConnectPtr conn; + virConnectCloseFunc callback; + void *opaque; + virFreeCallback freeCallback; +};
Would it make sense to move this struct definition to be local to datatypes.c, and have all other uses of it treat it as opaque?...
+++ b/src/libvirt.c @@ -20189,24 +20189,27 @@ int virConnectRegisterCloseCallback(virConnectPtr conn, virObjectRef(conn);
virMutexLock(&conn->lock); + virObjectLock(conn->closeCallback);
virCheckNonNullArgGoto(cb, error);
- if (conn->closeCallback) { + if (conn->closeCallback->callback) {
...But then you would need to expose an accessor function instead of directly accessing closeCallback->callback. Okay, probably fine as is.
+++ b/src/remote/remote_driver.c @@ -337,32 +337,38 @@ enum virDrvOpenRemoteFlags { VIR_DRV_OPEN_REMOTE_AUTOSTART = (1 << 2), /* Autostart a per-user daemon */ };
+static void +remoteClientCloseFreeFunc(void *opaque) +{ + virConnectCloseCallbackDataPtr cbdata = opaque; + + virObjectUnref(cbdata); +}
You shouldn't need this; just use virObjectFreeCallback instead.
+remoteClientCloseFunc(virNetClientPtr client ATTRIBUTE_UNUSED, + int reason, + void *opaque) { - virConnectPtr conn = opaque; + virConnectCloseCallbackDataPtr cbdata = opaque;
- virMutexLock(&conn->lock); - if (conn->closeCallback) { - virConnectCloseFunc closeCallback = conn->closeCallback; - void *closeOpaque = conn->closeOpaque; - virFreeCallback closeFreeCallback = conn->closeFreeCallback; - unsigned closeUnregisterCount = conn->closeUnregisterCount; + virObjectLock(cbdata);
- VIR_DEBUG("Triggering connection close callback %p reason=%d", - conn->closeCallback, reason); - conn->closeDispatch = true; - virMutexUnlock(&conn->lock); - closeCallback(conn, reason, closeOpaque); - virMutexLock(&conn->lock); - conn->closeDispatch = false; - if (conn->closeUnregisterCount != closeUnregisterCount && - closeFreeCallback) - closeFreeCallback(closeOpaque); + if (cbdata->callback) { + VIR_DEBUG("Triggering connection close callback %p reason=%d, opaque=%p", + cbdata->callback, reason, cbdata->opaque); + cbdata->callback(cbdata->conn, reason, cbdata->opaque); + + if (cbdata->freeCallback) + cbdata->freeCallback(cbdata->opaque); + cbdata->callback = NULL; + cbdata->freeCallback = NULL; } - virMutexUnlock(&conn->lock); + virObjectUnlock(cbdata); + + /* free the connection reference that comes along with the callback + * registration */ + virObjectUnref(cbdata->conn); }
Took me a couple reads, but it looks right. The old code had to temporarily drop connection lock while using the callback; the new code never even grabs connection lock because all the callback data is encapsulated in a separate object.
/* helper macro to ease extraction of arguments from the URI */ @@ -765,9 +771,11 @@ doRemoteOpen(virConnectPtr conn, goto failed; }
+ virObjectRef(conn->closeCallback); + virNetClientSetCloseCallback(priv->client, remoteClientCloseFunc, - conn, NULL); + conn->closeCallback, remoteClientCloseFreeFunc);
Here's where virObjectFreeCallback comes in handy. On the surface, the code seems reasonable. If my testing passes, and you fix up the typos and the unneeded helper function, then I don't mind giving: ACK. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org