On Thu, Apr 23, 2015 at 07:00:21PM +0800, zhang bo wrote:
> The reason for the problem is that:
> 1 guestA locks its vm while creating each tapDev (virNetDevTapCreate) in
>   qemuBuildCommandLine(), for 10 seconds.
> 2 guestB calls qemuMigrationPrepareAny->*virDomainObjListAdd* to get its vm object,
>   which locks 'doms' and waits for the vm lock.
> 3 doms will be locked until guestA unlocks its vm; we say that's 10 seconds.
> 4 guestC calls qemuDomainMigrateFinish3->virDomainObjListFindByName, which tries to
>   lock doms. Because it's now locked by guestB, guestC blocks here, and it can't be
>   unpaused for at least 10 seconds.

Ok, this is the real core problem - FindByName has a bad impl that requires
iterating over every single guest. Unfortunately due to the design of the
migration API we can't avoid this call, but we could add a second hash table
of name -> virDomainObj so we make it O(1) and lock-less.

> 5 Then come guestD, guestE, guestF, etc.; the downtime adds up, to even 50
>   seconds or more.
> 6 The command 'virsh list' is blocked as well.
> Thus, we think the problem must be solved.

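To make the second hash table idea a bit more concrete, here is a rough
sketch. It is standalone C, not libvirt code: struct dom, struct name_index
and the name_index_* functions are made-up names for illustration only, the
bucket count is arbitrary, and locking is left out entirely. The only point
is that a lookup by name touches a single hash bucket instead of walking
(and locking) every guest in turn.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_BUCKETS 64   /* arbitrary size chosen for this sketch */

/* Hypothetical stand-in for virDomainObj; only the name matters here. */
struct dom {
    const char *name;
    struct dom *next;     /* chain within one hash bucket */
};

/* Hypothetical name -> object index, kept alongside the existing table. */
struct name_index {
    struct dom *buckets[NAME_BUCKETS];
};

/* djb2 string hash, reduced to a bucket number. */
size_t name_hash(const char *name)
{
    size_t h = 5381;

    assert(name != NULL);
    while (*name)
        h = h * 33 + (unsigned char)*name++;
    return h % NAME_BUCKETS;
}

/* Find a domain by name. Walks one bucket only and never touches any
 * unrelated domain object. Returns NULL if the name is not present. */
struct dom *name_index_find(struct name_index *idx, const char *name)
{
    struct dom *d;

    for (d = idx->buckets[name_hash(name)]; d != NULL; d = d->next) {
        if (strcmp(d->name, name) == 0)
            return d;
    }
    return NULL;
}

/* Add a domain. Returns 0 on success, -1 if the name already exists. */
int name_index_add(struct name_index *idx, struct dom *d)
{
    size_t b = name_hash(d->name);

    if (name_index_find(idx, d->name) != NULL)
        return -1;
    d->next = idx->buckets[b];
    idx->buckets[b] = d;
    return 0;
}

/* Remove a domain by name. Returns the removed entry, or NULL. */
struct dom *name_index_remove(struct name_index *idx, const char *name)
{
    struct dom **p = &idx->buckets[name_hash(name)];

    for (; *p != NULL; p = &(*p)->next) {
        if (strcmp((*p)->name, name) == 0) {
            struct dom *d = *p;

            *p = d->next;
            d->next = NULL;
            return d;
        }
    }
    return NULL;
}
```

The index would have to be updated wherever a guest is added to or removed
from the existing table, so the two never disagree. With a fixed bucket
count the lookup is only O(1) on average while the table stays sparsely
filled; a real version would presumably reuse an existing hash table
implementation instead of this toy one.
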
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|