On 10/13/2010 09:07 AM, Daniel P. Berrange wrote:
On Wed, Oct 06, 2010 at 05:56:36PM -0400, Stefan Berger wrote:
> V2:
> remove the locks from qemudVMFilterRebuild & umlVMFilterRebuild
>
> This is from a bug report and conversation on IRC where Soren reported
> that while a filter update is occurring on one or more VMs (due to a
> rule having been edited for example), a deadlock can occur when a VM
> referencing a filter is started.
>
> The problem is caused by the two locking sequences of
>
> qemu_driver, qemu_domain, filter   # for the VM start operation
> filter, qemu_driver, qemu_domain   # for the filter update operation
>
> that obviously don't lock in the same order. The problem is the 2nd lock
> sequence. Here the qemu_driver lock is being grabbed in
> qemu_driver:qemudVMFilterRebuild()
>
> The following solution is based on the idea of trying to re-arrange the
> 2nd sequence of locks as follows:
>
> qemu_driver, filter, qemu_driver, qemu_domain
>
> and making the qemu driver recursively lockable so that a second lock
> can occur. This would then lead to the following net locking sequence:
>
> qemu_driver, filter, qemu_domain
>
> where the 2nd qemu_driver lock has been (logically) eliminated.
>
> The 2nd part of the idea is that the sequence of locks (filter,
> qemu_domain) and (qemu_domain, filter) becomes interchangeable if all
> code paths where filter AND qemu_domain are locked have a preceding
> qemu_driver lock that basically blocks their concurrent execution.
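
For illustration only, the re-arranged locking described in the patch text can be sketched in Python. This is not libvirt code: the three lock objects and the functions vm_start, filter_update, and vm_filter_rebuild are hypothetical stand-ins, and threading.RLock plays the role of the recursively lockable driver lock. The sketch follows the sequence as described in the text (with the recursive second driver lock), and shows that the two paths can take the domain and filter locks in opposite orders without deadlocking, as long as both hold the driver lock first.

```python
import threading

# Hypothetical stand-ins for the three locks named in the discussion.
driver_lock = threading.RLock()   # recursive: the owning thread may re-acquire it
domain_lock = threading.Lock()
filter_lock = threading.Lock()

completed = []  # records which operations finished


def vm_start():
    # VM start path: qemu_driver, qemu_domain, filter
    with driver_lock:
        with domain_lock:
            with filter_lock:
                completed.append("vm_start")


def vm_filter_rebuild():
    # Stand-in for the rebuild callback that takes the driver lock itself.
    with driver_lock:             # second, recursive acquisition by the same thread
        with domain_lock:
            completed.append("filter_update")


def filter_update():
    # Filter update path: qemu_driver, filter, qemu_driver (recursive), qemu_domain
    with driver_lock:
        with filter_lock:
            vm_filter_rebuild()


def run_concurrently(rounds=200):
    """Run both paths in parallel; return False if a thread appears stuck."""
    for _ in range(rounds):
        threads = [
            threading.Thread(target=vm_start, daemon=True),
            threading.Thread(target=filter_update, daemon=True),
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join(timeout=5)
        if any(t.is_alive() for t in threads):
            return False
    return True


if __name__ == "__main__":
    print("no deadlock observed:", run_concurrently())
```

Replacing the outer driver_lock in filter_update with nothing (so that it starts with filter_lock) would reintroduce the original ordering, where the two threads can block each other.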

I'm not entirely happy with this locking scheme, because the
broken lock ordering problem is still present. We're just
relying on the fact that we retain the global lock to protect
us against the ongoing lock ordering issue.

That said, I don't see any immediately obvious alternative
to solve this problem. So I guess I'll ack this approach
for now. One day this will likely need re-visiting...

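
The concern about the remaining ordering problem can be made concrete with a small check. This is only a sketch: the helper names ordered_pairs and inversions are made up for illustration, and the lock sequences are copied from the patch description. It lists the lock pairs that two acquisition sequences take in opposite order.

```python
from itertools import combinations


def ordered_pairs(sequence):
    """Return every (earlier, later) lock pair implied by an acquisition order."""
    return set(combinations(sequence, 2))


def inversions(seq_a, seq_b):
    """Return the pairs that seq_a takes in one order and seq_b in the reverse."""
    pairs_b = ordered_pairs(seq_b)
    return sorted((x, y) for (x, y) in ordered_pairs(seq_a) if (y, x) in pairs_b)


vm_start = ["qemu_driver", "qemu_domain", "filter"]
update_before = ["filter", "qemu_driver", "qemu_domain"]   # original update path
update_after = ["qemu_driver", "filter", "qemu_domain"]    # net sequence after the patch

if __name__ == "__main__":
    print(inversions(vm_start, update_before))
    print(inversions(vm_start, update_after))
```

With the original update path, two pairs are inverted. With the net sequence one inverted pair remains (qemu_domain and filter), and it is harmless only while every such path holds the driver lock first.
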
Ok, I'll push it.
Stefan