
On Thu, Apr 23, 2015 at 07:00:21PM +0800, zhang bo wrote:
The reason for the problem is that:
1. guestA locks its vm while creating each tapDev (virNetDevTapCreate) in qemuBuildCommandLine(), for 10 seconds.
2. guestB calls qemuMigrationPrepareAny->*virDomainObjListAdd* to get its vm object, which locks 'doms' and waits for the vm lock.
3. doms stays locked until guestA unlocks its vm, say for 10 seconds.
4. guestC calls qemuDomainMigrateFinish3->virDomainObjListFindByName, which tries to lock doms. Because it's

Ok, this is the real core problem: FindByName has a bad impl that requires iterating over every single guest. Unfortunately, due to the design of the migration API we can't avoid this call, but we could add a second hash table of name -> virDomainObj so we make it O(1) and lock-less.

now locked by guestB, guestC blocks here, and it can't be unpaused for at least 10 seconds.
5. Then come guestD, guestE, guestF, etc.; the downtime adds up, to even 50 seconds or more.
6. The command 'virsh list' is blocked as well.
Thus, we think the problem must be solved.
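
To make the "second hash table of name -> virDomainObj" idea concrete, here is a minimal standalone sketch in C. It is not libvirt code: DomObj, DomList, domListAdd, domListFindByName and the NAME_BUCKETS size are all hypothetical names invented for illustration, and locking is left out entirely. It only shows how indexing by name turns the lookup into a single bucket probe instead of a walk over every guest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NAME_BUCKETS 64 /* hypothetical fixed bucket count */

/* Hypothetical stand-in for a domain object; not the libvirt struct. */
typedef struct DomObj {
    char name[64];
    int id;
    struct DomObj *nextByName; /* chain of objects in the same bucket */
} DomObj;

/* Hypothetical domain list that carries an index keyed by name. */
typedef struct {
    DomObj *byName[NAME_BUCKETS];
} DomList;

/* djb2-style string hash, reduced to a bucket index. */
static unsigned long hashName(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % NAME_BUCKETS;
}

/* Create an object and link it into the bucket for its name. */
DomObj *domListAdd(DomList *list, const char *name, int id)
{
    assert(list && name);
    DomObj *d = calloc(1, sizeof(*d));
    if (!d)
        return NULL;
    strncpy(d->name, name, sizeof(d->name) - 1);
    d->id = id;
    unsigned long b = hashName(d->name);
    d->nextByName = list->byName[b];
    list->byName[b] = d;
    return d;
}

/* Look up by name: only the one matching bucket is scanned. */
DomObj *domListFindByName(const DomList *list, const char *name)
{
    assert(list && name);
    for (DomObj *d = list->byName[hashName(name)]; d; d = d->nextByName) {
        if (strcmp(d->name, name) == 0)
            return d;
    }
    return NULL;
}
```

With an index like this, a lookup by name costs O(1) on average regardless of how many guests exist. Whether it can also avoid waiting on per-guest locks depends on how the list lock and the object locks interact, which this sketch does not model.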
Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|