
20.04.2016 14:08, Pavel Hrdina wrote:
On Sat, Apr 09, 2016 at 07:14:30PM +0300, Maxim Nestratov wrote:
Below are the backtraces of two deadlocked threads:

thread #1:
virDomainConfVMNWFilterTeardown
virNWFilterTeardownFilter
lock updateMutex <------------
_virNWFilterTeardownFilter
try to lock interface <----------

thread #2:
learnIPAddressThread
lock interface <-------
virNWFilterInstantiateFilterLate
try to lock updateMutex <----------

The problem is fixed by unlocking the interface before calling virNWFilterInstantiateFilterLate, which avoids updateMutex and interface ordering deadlocks. Otherwise we instantiate the filter while holding the interface lock, which in turn tries to lock updateMutex; if some other thread instantiating a filter in parallel is holding updateMutex and is trying to lock the interface, both will deadlock. It is also safe to unlock the interface before virNWFilterInstantiateFilterLate, because by then learnIPAddressThread has stopped capturing packets and has applied the necessary rules on the interface, while instantiating a new filter doesn't require a locked interface.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
---

The nwfilter code is complex. This seems to be fixing a small corner case, because virDomainConfVMNWFilterTeardown should terminate that learnIPAddressThread, but it doesn't wait until that thread is terminated. I'm not sure whether it's safe to unlock the iface. Do you have a reproducer for this deadlock?

Pavel

No, I don't think it's only about a small corner case with virDomainConfVMNWFilterTeardown and learnIPAddressThread. It's the more common case of learnIPAddressThread and *any* virNWFilterInstantiateFilter call. We would have had a corner-case fix if we just waited for learnIPAddressThread completion in virDomainConfVMNWFilterTeardown.

I don't have a reproducer in a distributable form. The issue was caught by our testing system, which performed a kind of stress start/stop test. A VM had a network interface with a <filterref filter='no-mac-spoofing-no-ip-spoofing-no-promisc'/> section without an IP address specified. For me it's a rather exotic configuration; nevertheless, a deadlock should not happen.

I don't insist on my approach, but it seems pretty safe to call virNWFilterInstantiateFilterLate with the interface unlocked, simply because the function itself locks it when necessary.

Maxim
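
For readers following the lock-ordering argument, here is a minimal sketch in C with plain pthreads. It is not libvirt code: update_mutex, iface_mutex, teardown_path, learn_path, run_both_paths and ROUNDS are all hypothetical stand-ins, and the model assumes only what the thread states, namely that teardown takes updateMutex and then the interface lock, and that the instantiate step takes updateMutex and locks the interface itself when necessary. With the interface lock released before that step, both paths acquire the locks in the same order.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for updateMutex and the per-interface lock. */
static pthread_mutex_t update_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t iface_mutex = PTHREAD_MUTEX_INITIALIZER;

enum { ROUNDS = 1000 };

/* Both counters are only modified while update_mutex is held. */
static int filters_torn_down;
static int filters_instantiated;

/* Models the teardown path: updateMutex first, then the interface. */
static void *teardown_path(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        pthread_mutex_lock(&update_mutex);
        pthread_mutex_lock(&iface_mutex);
        filters_torn_down++;
        pthread_mutex_unlock(&iface_mutex);
        pthread_mutex_unlock(&update_mutex);
    }
    return NULL;
}

/* Models the learning thread with the proposed ordering applied. */
static void *learn_path(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        /* Work that needs the interface (capture, apply rules). */
        pthread_mutex_lock(&iface_mutex);
        pthread_mutex_unlock(&iface_mutex); /* released BEFORE instantiating */

        /* Instantiate step: same order as teardown, so no lock inversion. */
        pthread_mutex_lock(&update_mutex);
        pthread_mutex_lock(&iface_mutex);
        filters_instantiated++;
        pthread_mutex_unlock(&iface_mutex);
        pthread_mutex_unlock(&update_mutex);
    }
    return NULL;
}

/* Runs both paths concurrently; returns the total completed operations,
 * or -1 if a thread could not be created. */
static int run_both_paths(void)
{
    pthread_t a, b;

    if (pthread_create(&a, NULL, teardown_path, NULL) != 0)
        return -1;
    if (pthread_create(&b, NULL, learn_path, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return filters_torn_down + filters_instantiated;
}
```

If learn_path instead kept iface_mutex held across the instantiate step, it would wait for update_mutex while teardown_path, already holding update_mutex, waited for iface_mutex. That is the inversion shown in the two backtraces.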