On Thu, Apr 23, 2015 at 14:02:25 +0200, Michal Privoznik wrote:
On 23.04.2015 13:32, zhang bo wrote:
> On 2015/4/23 19:06, Daniel P. Berrange wrote:
>
>> On Thu, Apr 23, 2015 at 07:00:21PM +0800, zhang bo wrote:
>>> The reason for the problem is that:
>>> 1 guestA locks vm while creating each tapDev(virNetDevTapCreate) in
qemuBuildCommandLine(), for 10seconds
>>> 2 guestB calls qemuMigrationPrepareAny->*virDomainObjListAdd* to get its
vm object, which locks 'doms'
>>> and waits for the vm lock.
>>> 3 doms will be locked until guestA unlock its vm, we say that's 10
seconds.
>>> 4 guestC calls qemuDomainMigrateFinish3->virDomainObjListFindByName,
which tries to lock doms. because it's
>>
>> Ok, this is the real core problem - FindByName has a bad impl that requires
>> iterating over every single guest. Unfortunately due to the design of the
>> migration API we can't avoid this call, but we could add a second hash
table
>> of name -> virDomainObj so we make it O(1) and lock-less.
>>
Agreed. This will solve more than this.
>
> I got a question: shall we add an object (similar to doms) and lock it while
searching the vm
> in the new hash table? If so, the problem may still exist.
No. Currently, we use a hash table, where domain's UUID is the key, and
pointer to internal domain representation (or domain object as it's
called) is the value. Looking up a key in hash table is O(1).
However, if we look up a domain by name, it's practically lookup by
value, because domain name is stored in the domain object. So we have to
iterate over all hash table items, and lock each one just to compare.
However, if an object is locked already, we must wait until it's
unlocked again. During this wait we still hold the hash table lock.
I don't think this is trivial though. So let me see if I can provide
some patches.
A simple workaround currently would be to iterate the hash table and
store domain pointers that you increase the reference count on in a new
array, then drop the hash table lock and start iterating it after you
drop it.
That's the way I'm planing on fixing the listAllDomains API which has
the same problem.
Michal
Peter