On 2015/4/23 19:06, Daniel P. Berrange wrote:
On Thu, Apr 23, 2015 at 07:00:21PM +0800, zhang bo wrote:
> The reason for the problem is that:
> 1 guestA locks its vm while creating each tapDev (virNetDevTapCreate) in
> qemuBuildCommandLine(), for 10 seconds
> 2 guestB calls qemuMigrationPrepareAny->*virDomainObjListAdd* to get its vm
> object, which locks 'doms' and waits for the vm lock.
> 3 doms stays locked until guestA unlocks its vm, say 10 seconds.
> 4 guestC calls qemuDomainMigrateFinish3->virDomainObjListFindByName, which tries
> to lock doms. Because it's
Ok, this is the real core problem - FindByName has a bad impl that requires
iterating over every single guest. Unfortunately due to the design of the
migration API we can't avoid this call, but we could add a second hash table
of name -> virDomainObj so we make it O(1) and lock-less.
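
To make the suggestion concrete, here is a rough sketch of a second index keyed by name, kept in sync with the existing one. It is a toy model in Python, not libvirt code: DomainObj, DomainIndex and their methods are hypothetical names, the UUID-keyed dict merely stands in for the existing table, and locking is left out on purpose.

```python
import uuid


class DomainObj:
    """Hypothetical stand-in for a per-guest object."""

    def __init__(self, name):
        self.name = name
        self.uuid = str(uuid.uuid4())


class DomainIndex:
    """Hypothetical stand-in for 'doms' with two indexes kept in sync."""

    def __init__(self):
        self.by_uuid = {}  # models the existing table
        self.by_name = {}  # the proposed second table: name -> object

    def add(self, dom):
        if dom.name in self.by_name:
            raise ValueError("domain name already in use: " + dom.name)
        # Both tables must be updated together, or lookups will disagree.
        self.by_uuid[dom.uuid] = dom
        self.by_name[dom.name] = dom
        return dom

    def remove(self, dom):
        self.by_uuid.pop(dom.uuid, None)
        self.by_name.pop(dom.name, None)

    def find_by_name_scan(self, name):
        # Models the current behaviour: visit every guest until one matches.
        for dom in self.by_uuid.values():
            if dom.name == name:
                return dom
        return None

    def find_by_name(self, name):
        # Models the proposal: one hash lookup, no iteration over guests.
        return self.by_name.get(name)
```

The cost is that every add, remove (and any rename) has to touch both tables; the lookup itself no longer depends on the number of guests.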
I have a question: should we add a lock object (similar to doms) and hold it
while searching for the vm in the new hash table? If so, the problem may still
exist.
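
One way to think about this, as a toy model only (Python, not libvirt code; Dom, DomList and lookup_while_other_vm_busy are hypothetical names): a lock on the name table may be tolerable if it is held only for the hash lookup itself and is released before anyone waits on a per-vm lock. In C the caller would presumably also need to take a reference on the object before dropping the table lock; Python's garbage collector hides that step here.

```python
import threading


class Dom:
    """Hypothetical per-guest object with its own lock."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()


class DomList:
    """Hypothetical name table protected by a short-lived lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._by_name = {}

    def add(self, dom):
        with self._lock:
            self._by_name[dom.name] = dom

    def find_by_name(self, name):
        # The table lock covers only the dict lookup.
        with self._lock:
            dom = self._by_name.get(name)
        # The table lock is already released; the caller locks dom.lock.
        return dom


def lookup_while_other_vm_busy():
    doms = DomList()
    guest_a = Dom("guestA")
    guest_c = Dom("guestC")
    doms.add(guest_a)
    doms.add(guest_c)
    found = []

    def finish_migration():
        dom = doms.find_by_name("guestC")
        with dom.lock:
            found.append(dom.name)

    # guestA's vm lock is held for the whole lookup, as during tap creation.
    with guest_a.lock:
        worker = threading.Thread(target=finish_migration)
        worker.start()
        worker.join(timeout=5)
        still_blocked = worker.is_alive()
    return found, still_blocked
```

In this model the guestC lookup completes while guestA's vm lock is still held, because nothing waits for a vm lock while holding the table lock. Whether the real add path can be restructured the same way is a separate question.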
> now locked by guestB, guestC blocks here, and it can't be
> unpaused for at least 10 seconds.
> 5 then come guestD, guestE, guestF, etc.; the downtime adds up, to even
> 50 seconds or more.
> 6 the command 'virsh list' is blocked as well.
>
> Thus, we think the problem must be solved.
Regards,
Daniel