Hi,
I started a VM in a KVM environment (libvirt 1.2.6, qemu 1.5.1).
I found that the startup thread holds the vm lock for too long.
This causes other VMs to stay paused (on both the source and destination
hosts) for too long during migration.
Steps to Reproduce:
1. Define and start three VMs (VMA, VMB, VMC) on the source host,
each with 16 NICs. XML configuration for each NIC:
<interface type='bridge'>
<source bridge='br0'/>
<model type='virtio'/>
</interface>
2. Migrate the three VMs from the source host to the destination host concurrently.
3. On the destination host, the three VMs may perform the following operations:
1) VMA: runs qemuProcessStart() (holding the vm lock until it finishes) -> ... ->
virNetDevTapCreateInBridgePort(). This function takes about 0.28s per NIC,
so 16 NICs take about 4s in total.
The following log shows the time spent creating NICs on my host; the per-NIC
time seems to vary between hosts.
2014-07-03 08:40:41.283+0000: 47007: info : remoteDispatchAuthList:2781 : Bypass
polkit auth for privileged client pid:47635,uid:0
2014-07-03 08:40:41.285+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:41.560+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:41.852+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:42.144+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:42.464+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:42.756+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:43.076+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:43.372+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:43.680+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:43.972+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:44.268+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:44.560+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:44.720+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:44.804+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:44.888+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:45.040+0000: 47009: info : virNetDevProbeVnetHdr:203 : Enabling
IFF_VNET_HDR
2014-07-03 08:40:45.369+0000: 47009: warning : qemuDomainObjTaint:1670 : Domain id=7
name='suse11sp3_test_7' uuid=94c4ac0b-3a6a-41aa-b16c-80aa7adbc6b8 is tainted:
high-privileges
2014-07-03 08:40:45.708+0000: 47009: info : virSecurityDACSetOwnership:227 : Setting
DAC user and group on '/home/wcj/DTS/suse11sp3_test_7' to '0:0'
2) VMB: runs qemuMigrationPrepareAny() -> virDomainObjListAdd(). This function
acquires the driver->doms lock, and then virHashSearch() waits for VMA's vm lock,
which is held by VMA's qemuProcessStart() thread.
3) VMC: runs qemuDomainMigrateFinish3() -> virDomainObjListFindByName().
This operation waits for the driver->doms lock, which is held by VMB's
qemuMigrationPrepareAny() thread.
Since VMC is in the finish phase, it is paused on both the source host and the
destination host (the whole lock chain is sketched below).
In the worst case, VMC may stay paused for about 4s during migration,
and the pause time increases as more VMs are migrated concurrently.
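To make the chain easier to see, here is a minimal standalone sketch of the
contention using plain pthreads. It is not libvirt code; the lock names, sleep
times and thread bodies are just stand-ins for VMA's per-domain lock, the
driver->doms lock and the three code paths described above.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Stand-ins for VMA's per-domain lock and the driver->doms list lock. */
static pthread_mutex_t vma_vm_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t doms_lock   = PTHREAD_MUTEX_INITIALIZER;

/* VMA: qemuProcessStart() keeps the vm lock while creating 16 tap devices. */
static void *vma_start(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&vma_vm_lock);
    for (int i = 0; i < 16; i++)
        usleep(280000);               /* ~0.28s per virNetDevTapCreateInBridgePort() */
    pthread_mutex_unlock(&vma_vm_lock);
    return NULL;
}

/* VMB: qemuMigrationPrepareAny() -> virDomainObjListAdd() takes the list lock,
 * then virHashSearch() blocks on VMA's vm lock. */
static void *vmb_prepare(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&doms_lock);
    pthread_mutex_lock(&vma_vm_lock); /* waits ~4s for VMA's startup thread */
    pthread_mutex_unlock(&vma_vm_lock);
    pthread_mutex_unlock(&doms_lock);
    return NULL;
}

/* VMC: qemuDomainMigrateFinish3() -> virDomainObjListFindByName() needs the
 * list lock, so it queues behind VMB while still paused on both hosts. */
static void *vmc_finish(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&doms_lock);
    pthread_mutex_unlock(&doms_lock);
    printf("VMC can only resume now\n");
    return NULL;
}

int main(void)
{
    pthread_t a, b, c;
    pthread_create(&a, NULL, vma_start, NULL);
    usleep(100000);                   /* let VMA take its vm lock first */
    pthread_create(&b, NULL, vmb_prepare, NULL);
    usleep(100000);                   /* let VMB take doms_lock first */
    pthread_create(&c, NULL, vmc_finish, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_join(c, NULL);
    return 0;
}

Compiled with gcc -pthread, the VMC thread only prints after roughly the full
4s, which mirrors the pause seen during migration.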
virNetDevTapCreateInBridgePort(), which runs while the vm lock is held, takes
quite a long time.
IMHO it would be nice to do some lock optimization for this case. (I think it
makes little sense to optimize only the time of creating the net devices:
even if that were reduced to 0.1s per NIC, a VM with 16 NICs would still hold
the vm lock for more than 1.6s.)
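To illustrate what I mean by lock optimization (this is only a rough idea, not
a patch, and the helper name below is hypothetical): the slow per-NIC setup
could run with the domain lock temporarily dropped, so threads walking
driver->doms are only blocked for the short lock/unlock windows instead of the
whole ~4s, at the cost of re-validating the domain state after the lock is
re-acquired. Sketched with the same plain-pthreads model as above:

#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t vm_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: do the slow per-NIC work with the vm lock dropped. */
static void create_tap_devices_unlocked(int nnics)
{
    pthread_mutex_unlock(&vm_lock);   /* let contenders make progress */
    for (int i = 0; i < nnics; i++)
        usleep(280000);               /* slow per-NIC setup, ~0.28s each */
    pthread_mutex_lock(&vm_lock);     /* re-take the lock... */
    /* ...and the caller must re-check the domain state here, since another
     * thread may have modified it while the lock was dropped. */
}

static void *startup_thread(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&vm_lock);
    create_tap_devices_unlocked(16);
    pthread_mutex_unlock(&vm_lock);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, startup_thread, NULL);
    pthread_join(t, NULL);
    return 0;
}

Whether dropping the lock is actually safe at the point where
virNetDevTapCreateInBridgePort() is called, I'm not sure; that's part of why
I'm asking.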
Any ideas?