20.04.2016 14:08, Pavel Hrdina wrote:
On Sat, Apr 09, 2016 at 07:14:30PM +0300, Maxim Nestratov wrote:
> Below are backtraces of two deadlocked threads:
>
> thread #1:
> virDomainConfVMNWFilterTeardown
> virNWFilterTeardownFilter
> lock updateMutex <------------
> _virNWFilterTeardownFilter
> try to lock interface <----------
>
> thread #2:
> learnIPAddressThread
> lock interface <-------
> virNWFilterInstantiateFilterLate
> try to lock updateMutex <----------
>
> The problem is fixed by unlocking interface before calling
> virNWFilterInstantiateFilterLate to avoid updateMutex and interface ordering
> deadlocks. Otherwise we are going to instantiate the filter while holding
> interface lock, which will try to lock updateMutex, and if some other thread
> instantiating a filter in parallel is holding updateMutex and is trying to
> lock interface, both will deadlock.
> Also it is safe to unlock interface before virNWFilterInstantiateFilterLate
> because learnIPAddressThread stopped capturing packets and applied necessary
> rules on the interface, while instantiating a new filter doesn't require a
> locked interface.
>
> Signed-off-by: Maxim Nestratov <mnestratov(a)virtuozzo.com>
> ---
The nwfilter code is complex. This seems to be fixing a small corner case
because virDomainConfVMNWFilterTeardown should terminate that
learnIPAddressThread, but it doesn't wait until that thread is terminated. I'm
not sure whether it's safe to unlock the iface. Do you have a reproducer
for this deadlock?

Pavel
No, I don't think this is only a small corner case involving
virDomainConfVMNWFilterTeardown and learnIPAddressThread. It is the more
common case of learnIPAddressThread running alongside *any*
virNWFilterInstantiateFilter call. Simply waiting for learnIPAddressThread
to complete in virDomainConfVMNWFilterTeardown would have been a
corner-case fix.

I don't have a reproducer in a distributable form. The issue was caught by
our testing system, which performed a kind of start/stop stress test. A VM
had a network interface with a
<filterref filter='no-mac-spoofing-no-ip-spoofing-no-promisc'/> section and
no IP address specified. To me this is a rather exotic configuration;
nevertheless, a deadlock should not happen.

I don't insist on my approach, but it seems pretty safe to call
virNWFilterInstantiateFilterLate with the interface unlocked, because the
function itself locks it when necessary.

Maxim
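
For illustration only, the lock-ordering argument in this thread can be
sketched with plain pthreads. This is not libvirt code: update_mutex,
iface_lock, teardown_path, learn_path and run_demo are hypothetical
stand-ins for updateMutex, the interface lock and the two call paths from
the backtraces. The sketch assumes, as stated above, that the instantiate
step takes the interface lock itself when it needs it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ITERATIONS 10000

/* Hypothetical stand-ins for updateMutex and the per-interface lock. */
static pthread_mutex_t update_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t iface_lock = PTHREAD_MUTEX_INITIALIZER;

/* Both counters are only modified while update_mutex is held. */
static int teardowns;
static int instantiations;

/* Teardown-like path: update_mutex first, then the interface lock. */
static void *teardown_path(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&update_mutex);
        pthread_mutex_lock(&iface_lock);
        teardowns++;
        pthread_mutex_unlock(&iface_lock);
        pthread_mutex_unlock(&update_mutex);
    }
    return NULL;
}

/* Learning-like path with the proposed ordering: the interface lock is
 * released before update_mutex is requested, so this thread never holds
 * the interface lock while waiting for update_mutex. */
static void *learn_path(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&iface_lock);
        /* ... packet capture and rule setup would happen here ... */
        pthread_mutex_unlock(&iface_lock);

        /* "Instantiate" step: same order as teardown_path. */
        pthread_mutex_lock(&update_mutex);
        pthread_mutex_lock(&iface_lock);
        instantiations++;
        pthread_mutex_unlock(&iface_lock);
        pthread_mutex_unlock(&update_mutex);
    }
    return NULL;
}

/* Runs both paths concurrently; returns 0 if both finished all rounds. */
int run_demo(void)
{
    pthread_t t1, t2;

    teardowns = 0;
    instantiations = 0;

    if (pthread_create(&t1, NULL, teardown_path, NULL) != 0)
        return -1;
    if (pthread_create(&t2, NULL, learn_path, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    assert(teardowns == ITERATIONS);
    assert(instantiations == ITERATIONS);
    return 0;
}
```

Calling run_demo() from a main() and building with -pthread should
complete without hanging. If learn_path instead kept iface_lock held while
requesting update_mutex, the two threads would take the locks in opposite
orders, which is the pattern shown in the backtraces.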