[libvirt] [PATCH 1/14] Adding recursive locks

This patch adds recursive locks necessary due to the processing of network filter XML that can reference other network filters, including references that cause looks. Loops in the XML are prevented but their detection requires recursive locks.

On 03/18/2010 09:15 AM, Stefan Berger wrote:
This patch adds recursive locks necessary due to the processing of network filter XML that can reference other network filters, including references that cause looks. Loops in the XML are prevented but their detection requires recursive locks.
ACK. I had to double-check MSDN to make sure that threads-win32 already creates recursive mutex by default. -- Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org

On Thu, Mar 18, 2010 at 10:41:58AM -0600, Eric Blake wrote:
On 03/18/2010 09:15 AM, Stefan Berger wrote:
This patch adds recursive locks necessary due to the processing of network filter XML that can reference other network filters, including references that cause looks. Loops in the XML are prevented but their detection requires recursive locks.
ACK. I had to double-check MSDN to make sure that threads-win32 already creates recursive mutex by default.
I don't supposed you read if its possible to make it non-recursive, since the orginal impl is supposed to be non-recursive ! Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On 03/18/2010 10:51 AM, Daniel P. Berrange wrote:
On Thu, Mar 18, 2010 at 10:41:58AM -0600, Eric Blake wrote:
On 03/18/2010 09:15 AM, Stefan Berger wrote:
This patch adds recursive locks necessary due to the processing of network filter XML that can reference other network filters, including references that cause looks. Loops in the XML are prevented but their detection requires recursive locks.
ACK. I had to double-check MSDN to make sure that threads-win32 already creates recursive mutex by default.
I don't supposed you read if its possible to make it non-recursive, since the orginal impl is supposed to be non-recursive !
Unfortunately, I don't see a way: http://msdn.microsoft.com/en-us/library/ms684266%28VS.85%29.aspx states: After a thread obtains ownership of a mutex, it can specify the same mutex in repeated calls to the wait-functions without blocking its execution. This prevents a thread from deadlocking itself while waiting for a mutex that it already owns. To release its ownership under such circumstances, the thread must call ReleaseMutex once for each time that the mutex satisfied the conditions of a wait function. with no mention of any way to parameterize it to be non-recursive. But what does a true non-recursive mutex buy you? The only difference between recursive and true non-recursive is whether you declare that an attempt to relock a mutex that you already own is a fatal deadlock error, rather than incrementing a counter for matching unlocks. It's just that non-recursive mutexes typically have faster implementations. For that matter, do we even need the distinction? Maybe ALL our code should be using recursive mutexes by default, by changing virMutexInit to be recursive no matter what, and not worry about introducing virMutexInitRecursive. Looking more closely at virMutexInit in the pthreads version, we use pthread_mutex_init(,NULL), which requests PTHREAD_MUTEX_DEFAULT. But POSIX states: http://www.opengroup.org/onlinepubs/9699919799/functions/pthread_mutexattr_s... PTHREAD_MUTEX_DEFAULT Attempting to recursively lock a mutex of this type results in undefined behavior. Attempting to unlock a mutex of this type which was not locked by the calling thread results in undefined behavior. Attempting to unlock a mutex of this type which is not locked results in undefined behavior. An implementation may map this mutex to one of the other mutex types. That is, our current implementation of virMutexInit is NOT a true non-recursive mutex, so much as a mutex that is unspecified whether it is recursive or not. -- Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org

On 03/18/2010 11:04 AM, Eric Blake wrote:
But what does a true non-recursive mutex buy you? The only difference between recursive and true non-recursive is whether you declare that an attempt to relock a mutex that you already own is a fatal deadlock error, rather than incrementing a counter for matching unlocks. It's just that non-recursive mutexes typically have faster implementations.
Actually, it DOES buy something. virCondWait DEPENDS on getting a true non-recursive function (PTHREAD_MUTEX_NORMAL or PTHREAD_MUTEX_ERRORCHECK, although the latter has better guaranteed behavior in the case of deadlock), because POSIX is clear that: It is advised that an application should not use a PTHREAD_MUTEX_RECURSIVE mutex with condition variables because the implicit unlock performed for a pthread_cond_timedwait() or pthread_cond_wait() may not actually release the mutex (if it had been locked multiple times). If this happens, no other thread can satisfy the condition of the predicate.
For that matter, do we even need the distinction? Maybe ALL our code should be using recursive mutexes by default, by changing virMutexInit to be recursive no matter what, and not worry about introducing virMutexInitRecursive. Looking more closely at virMutexInit in the pthreads version, we use pthread_mutex_init(,NULL), which requests PTHREAD_MUTEX_DEFAULT.
That is, our current implementation of virMutexInit is NOT a true non-recursive mutex, so much as a mutex that is unspecified whether it is recursive or not.
And that means we have a bug in threads-pthread.c - we should be explicitly requesting a pthread_mutexattr with PTHREAD_MUTEX_ERRORCHECK rather than relying on NULL. On the other hand, the win32 virCondWait is coded to correctly deal with the fact that Windows mutexes are always recursive (that is, windows events do not suffer from the pthread deadlock where a cond_wait on a recursively-held mutex fails to release the mutex). So, in a way, while the pthread virMutexInit must be non-recursive, I don't see any problem with the win32 being recursive. But we do need the distinction between virMutexInit and virMutexInitRecursive for pthreads, even though the win32 implementation can be the same for both. -- Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org

On Thu, Mar 18, 2010 at 11:16:09AM -0600, Eric Blake wrote:
On 03/18/2010 11:04 AM, Eric Blake wrote:
But what does a true non-recursive mutex buy you? The only difference between recursive and true non-recursive is whether you declare that an attempt to relock a mutex that you already own is a fatal deadlock error, rather than incrementing a counter for matching unlocks. It's just that non-recursive mutexes typically have faster implementations.
Actually, it DOES buy something. virCondWait DEPENDS on getting a true non-recursive function (PTHREAD_MUTEX_NORMAL or PTHREAD_MUTEX_ERRORCHECK, although the latter has better guaranteed behavior in the case of deadlock), because POSIX is clear that:
It is advised that an application should not use a PTHREAD_MUTEX_RECURSIVE mutex with condition variables because the implicit unlock performed for a pthread_cond_timedwait() or pthread_cond_wait() may not actually release the mutex (if it had been locked multiple times). If this happens, no other thread can satisfy the condition of the predicate.
For that matter, do we even need the distinction? Maybe ALL our code should be using recursive mutexes by default, by changing virMutexInit to be recursive no matter what, and not worry about introducing virMutexInitRecursive. Looking more closely at virMutexInit in the pthreads version, we use pthread_mutex_init(,NULL), which requests PTHREAD_MUTEX_DEFAULT.
That is, our current implementation of virMutexInit is NOT a true non-recursive mutex, so much as a mutex that is unspecified whether it is recursive or not.
And that means we have a bug in threads-pthread.c - we should be explicitly requesting a pthread_mutexattr with PTHREAD_MUTEX_ERRORCHECK rather than relying on NULL.
No, we should set PTHREAD_MUTEX_NORMAL - we don't want it returning an error code on failure, because all our code assumes pthread_mutex_lock will not fail. Deadlock is what we want. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On Thu, Mar 18, 2010 at 11:04:17AM -0600, Eric Blake wrote:
On 03/18/2010 10:51 AM, Daniel P. Berrange wrote:
On Thu, Mar 18, 2010 at 10:41:58AM -0600, Eric Blake wrote:
On 03/18/2010 09:15 AM, Stefan Berger wrote:
This patch adds recursive locks necessary due to the processing of network filter XML that can reference other network filters, including references that cause looks. Loops in the XML are prevented but their detection requires recursive locks.
ACK. I had to double-check MSDN to make sure that threads-win32 already creates recursive mutex by default.
I don't supposed you read if its possible to make it non-recursive, since the orginal impl is supposed to be non-recursive !
Unfortunately, I don't see a way:
http://msdn.microsoft.com/en-us/library/ms684266%28VS.85%29.aspx
states:
After a thread obtains ownership of a mutex, it can specify the same mutex in repeated calls to the wait-functions without blocking its execution. This prevents a thread from deadlocking itself while waiting for a mutex that it already owns. To release its ownership under such circumstances, the thread must call ReleaseMutex once for each time that the mutex satisfied the conditions of a wait function.
with no mention of any way to parameterize it to be non-recursive.
But what does a true non-recursive mutex buy you? The only difference between recursive and true non-recursive is whether you declare that an attempt to relock a mutex that you already own is a fatal deadlock error, rather than incrementing a counter for matching unlocks. It's just that non-recursive mutexes typically have faster implementations.
For that matter, do we even need the distinction? Maybe ALL our code should be using recursive mutexes by default, by changing virMutexInit to be recursive no matter what, and not worry about introducing virMutexInitRecursive.
Having the mutex be non-recursive is a good thing, because more or less any code which causes recursion with our thread locking rules is broken by design and/or impl. Thus any deadlocks we encounter from non-recursive mutexes are highlighting bugs we'd not otherwise see. I'm not even convinced that this network filter stuff should need to have recursive mutexes, but we can leave that to address another day Regards, Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On 03/18/2010 01:04 PM, Eric Blake wrote:
On 03/18/2010 10:51 AM, Daniel P. Berrange wrote:
On Thu, Mar 18, 2010 at 10:41:58AM -0600, Eric Blake wrote:
On 03/18/2010 09:15 AM, Stefan Berger wrote:
This patch adds recursive locks necessary due to the processing of network filter XML that can reference other network filters, including references that cause looks. Loops in the XML are prevented but their detection requires recursive locks.
ACK. I had to double-check MSDN to make sure that threads-win32 already creates recursive mutex by default.
I don't supposed you read if its possible to make it non-recursive, since the orginal impl is supposed to be non-recursive !
Unfortunately, I don't see a way:
http://msdn.microsoft.com/en-us/library/ms684266%28VS.85%29.aspx
states:
After a thread obtains ownership of a mutex, it can specify the same mutex in repeated calls to the wait-functions without blocking its execution. This prevents a thread from deadlocking itself while waiting for a mutex that it already owns. To release its ownership under such circumstances, the thread must call ReleaseMutex once for each time that the mutex satisfied the conditions of a wait function.
with no mention of any way to parameterize it to be non-recursive.
But what does a true non-recursive mutex buy you? The only difference between recursive and true non-recursive is whether you declare that an attempt to relock a mutex that you already own is a fatal deadlock error, rather than incrementing a counter for matching unlocks. It's just that non-recursive mutexes typically have faster implementations.
For that matter, do we even need the distinction? Maybe ALL our code should be using recursive mutexes by default, by changing virMutexInit to be recursive no matter what, and not worry about introducing virMutexInitRecursive.
Not speaking in particular about libvirt code, but in general a non-recursive mutex can protect you against accidentally modifying a data structure inside a function that's called by some other function that's in the middle of modifying the same data structure. So it's useful not for any sort of concurrency resolution, but as an assertion (very important IMO) on top of the normal uses of a recursive lock. Since any occurrence of a non-recursive lock failing due to the lock already being held by the same thread will, by definition, result in a dead-lock, we could achieve the same thing (with better error reporting) in the case of Windows by adding a simple atomic counter that's incremented/decremented along with the lock, and logs an error message (and optionally somehow attempts to abort the operation?) if the counter ever goes higher than 1.
Looking more closely at virMutexInit in the pthreads version, we use pthread_mutex_init(,NULL), which requests PTHREAD_MUTEX_DEFAULT. But POSIX states:
http://www.opengroup.org/onlinepubs/9699919799/functions/pthread_mutexattr_s...
PTHREAD_MUTEX_DEFAULT
Attempting to recursively lock a mutex of this type results in undefined behavior. Attempting to unlock a mutex of this type which was not locked by the calling thread results in undefined behavior. Attempting to unlock a mutex of this type which is not locked results in undefined behavior. An implementation may map this mutex to one of the other mutex types.
That is, our current implementation of virMutexInit is NOT a true non-recursive mutex, so much as a mutex that is unspecified whether it is recursive or not.
Eww. That seems a bit problematic. This has been a very productive discussion, eh? ;-)

On 03/18/2010 06:04 PM, Eric Blake wrote:
ACK. I had to double-check MSDN to make sure that threads-win32 already creates recursive mutex by default.
I don't supposed you read if its possible to make it non-recursive, since the orginal impl is supposed to be non-recursive !
Unfortunately, I don't see a way:
You use an event instead (automatically reset, initially set): hEvent = CreateEvent (NULL, FALSE, TRUE, NULL); To acquire the mutex you use WaitForSingleObject, to release the mutex you use SetEvent. Paolo

"Daniel P. Berrange" <berrange@redhat.com> wrote on 03/18/2010 12:51:55 PM:
Please respond to "Daniel P. Berrange"
On Thu, Mar 18, 2010 at 10:41:58AM -0600, Eric Blake wrote:
On 03/18/2010 09:15 AM, Stefan Berger wrote:
This patch adds recursive locks necessary due to the processing of network filter XML that can reference other network filters,
references that cause looks. Loops in the XML are prevented but
including their
detection requires recursive locks.
ACK. I had to double-check MSDN to make sure that threads-win32 already creates recursive mutex by default.
I don't supposed you read if its possible to make it non-recursive, since the orginal impl is supposed to be non-recursive !
From what I remember from some msdn page is that the win32 mutexes are all recursive. So, the call that I made for creating a recursive mutex by calling the virMutexInit() function is not correct -- it should be the other way around... Something to fix some other day ?
Stefan
Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/:| |: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org:| |: http://autobuild.org -o- http://search.cpan.org/~danberr/:| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
participants (5)
-
Daniel P. Berrange
-
Eric Blake
-
Laine Stump
-
Paolo Bonzini
-
Stefan Berger