On Tue, Jan 26, 2016 at 01:31:25PM +0300, Maxim Nestratov wrote:
26.01.2016 12:29, Martin Kletzander пишет:
> On Mon, Jan 25, 2016 at 12:16:12PM +0300, Maxim Nestratov wrote:
>> A pretty nasty deadlock occurs while trying to rename a VM in parallel
>> with virDomainObjListNumOfDomains.
>> The short description of the problem is as follows:
>>
>> Thread #1:
>>
>> qemuDomainRename:
>> ------> aquires domain lock by qemuDomObjFromDomain
>> ---------> waits for domain list lock in any of the listed
>> functions:
>> - virDomainObjListFindByName
>> - virDomainObjListRenameAddNew
>> - virDomainObjListRenameRemove
>>
>> Thread #2:
>>
>> virDomainObjListNumOfDomains:
>> ------> aquires domain list lock
>> ---------> waits for domain lock in virDomainObjListCount
>>
>
> Ouch, the rename API should hold the list lock all the time it's doing
> something and not acquire it when any domain is locked.
>
> But you are duplicating a lot of code,
Not that much
I meant to edit that out after reading the whole e-mail, my apologies.
>
>> The patch establishes a single order of taking locks: driver->domains
>> list
>> first, then a particular VM. This is done by implementing a set of
>> virDomainObjListXxxLocked functions working with driver->domains that
>> assume
>> that list lock is already aquired by calling functions.
>>
>> Signed-off-by: Maxim Nestratov <mnestratov(a)virtuozzo.com>
>> ---
>> src/conf/virdomainobjlist.c | 74
>> +++++++++++++++++++++++++++++++++++++--------
>> src/conf/virdomainobjlist.h | 9 ++++++
>> src/libvirt_private.syms | 4 +++
>> src/qemu/qemu_driver.c | 40 +++++++++++++++++++++---
>> 4 files changed, 110 insertions(+), 17 deletions(-)
>>
>> diff --git a/src/conf/virdomainobjlist.c b/src/conf/virdomainobjlist.c
>> index 7568f93..89e28a8 100644
>> --- a/src/conf/virdomainobjlist.c
>> +++ b/src/conf/virdomainobjlist.c
>> @@ -132,21 +132,19 @@ virDomainObjPtr
>> virDomainObjListFindByID(virDomainObjListPtr doms,
>>
>>
>> static virDomainObjPtr
>> -virDomainObjListFindByUUIDInternal(virDomainObjListPtr doms,
>> +virDomainObjListFindByUUIDInternalLocked(virDomainObjListPtr doms,
>> const unsigned char *uuid,
>> bool ref)
>
> Indentation is off for renamed functions.
Ok. I'll fix it if we decide to go on with my approach.
>
>> {
>> char uuidstr[VIR_UUID_STRING_BUFLEN];
>> virDomainObjPtr obj;
>>
>> - virObjectLock(doms);
>> virUUIDFormat(uuid, uuidstr);
>>
>> obj = virHashLookup(doms->objs, uuidstr);
>> - if (ref) {
>> + if (ref)
>> virObjectRef(obj);
>> - virObjectUnlock(doms);
>> - }
>> +
>> if (obj) {
>> virObjectLock(obj);
>> if (obj->removing) {
>> @@ -156,8 +154,19 @@
>> virDomainObjListFindByUUIDInternal(virDomainObjListPtr doms,
>> obj = NULL;
>> }
>> }
>> - if (!ref)
>> - virObjectUnlock(doms);
>> + return obj;
>> +}
>> +
>> +static virDomainObjPtr
>> +virDomainObjListFindByUUIDInternal(virDomainObjListPtr doms,
>> + const unsigned char *uuid,
>> + bool ref)
>> +{
>> + virDomainObjPtr obj;
>> +
>> + virObjectLock(doms);
>> + obj = virDomainObjListFindByUUIDInternalLocked(doms, uuid, ref);
>> + virObjectUnlock(doms);
>
> This won't work for ref == true as in that case the unlock is not the
> last thing done in virDomainObjListFindByUUIDInternalLocked().
I still think that this won't cause any problem as far as list lock is
aquired.
Otherwise we would have seen them in virDomainObjListNumOfDomains
That's kind of why there's the difference between ref=true and
ref=false. The thing is that ref=false is prone to deadlocks even
though they are not as easy to reproduce. We'll see what the decision
will be.
Martin