[libvirt] [PATCH] qemu: lock: unlock vm during qemuBuildCommandLine

The function qemuBuildCommandLine() may take a long time, for example if we
configure tens of vifs for the guest: each may cost hundreds of milliseconds
to create its tap device, seconds in total. Thus, unlock the vm before
calling it.

Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Zhou Yimin <zhouyimin@huawei.com>
---
 src/qemu/qemu_process.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 753afe8..d1aaaec 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -4628,14 +4628,18 @@ int qemuProcessStart(virConnectPtr conn,
     }
 
     VIR_DEBUG("Building emulator command line");
+    virObjectUnlock(vm);
     if (!(cmd = qemuBuildCommandLine(conn, driver, vm->def, priv->monConfig,
                                      priv->monJSON, priv->qemuCaps,
                                      migrateFrom, stdin_fd, snapshot, vmop,
                                      &buildCommandLineCallbacks, false,
                                      qemuCheckFips(),
                                      priv->autoNodeset,
-                                     &nnicindexes, &nicindexes)))
+                                     &nnicindexes, &nicindexes))) {
+        virObjectLock(vm);
         goto cleanup;
+    }
+    virObjectLock(vm);
 
     /* now that we know it is about to start call the hook if present */
     if (virHookPresent(VIR_HOOK_DRIVER_QEMU)) {
--
1.7.12.4

On Thu, Apr 23, 2015 at 11:44:49 +0800, zhang bo wrote:
The function qemuBuildCommandLine() may take a long time, for example if we configure tens of vifs for the guest: each may cost hundreds of milliseconds to create its tap device, seconds in total. Thus, unlock the vm before calling it.
[...]
Why do you need to unlock the object? The VM is starting at this point, so you won't be able to save any time since APIs will either be blocked by a job or by the fact that the VM was not started yet. Would you care to explain what the benefits of doing this are?

NACK unless you provide a convincing example where this would help.

Peter

On 23.04.2015 10:30, Peter Krempa wrote:
On Thu, Apr 23, 2015 at 11:44:49 +0800, zhang bo wrote:
The function qemuBuildCommandLine() may take a long time, for example if we configure tens of vifs for the guest: each may cost hundreds of milliseconds to create its tap device, seconds in total. Thus, unlock the vm before calling it.
[...]
Why do you need to unlock the object? The VM is starting at this point so you won't be able to save any time since APIs will either be blocked by a job or by the fact that the VM was not started.
Not true. We still allow QUERY jobs, or APIs that lock a domain object but don't necessarily take a job. For instance, in one thread I'm doing virDomainGetOSType(), virDomainGetInfo(), ... (essentially 'virsh dominfo') while the other just starts the domain. I don't see a reason why virDomainGetOSType() should wait for the domain to be started up. Domain state should have no effect on the guest OS type, should it?

On the other hand, I don't think we can just lock and unlock the domain object as we please. qemuBuildCommandLine() is a very complex function and as such it calls many others. Some of them may rely on the fact that the object is locked by the caller.

Michal

On Thu, Apr 23, 2015 at 11:19:34AM +0200, Michal Privoznik wrote:
On 23.04.2015 10:30, Peter Krempa wrote:
On Thu, Apr 23, 2015 at 11:44:49 +0800, zhang bo wrote:
The function qemuBuildCommandLine() may take a long time, for example if we configure tens of vifs for the guest: each may cost hundreds of milliseconds to create its tap device, seconds in total. Thus, unlock the vm before calling it.
[...]
Why do you need to unlock the object? The VM is starting at this point so you won't be able to save any time since APIs will either be blocked by a job or by the fact that the VM was not started.
Not true. We still allow QUERY jobs, or APIs that lock a domain object but don't necessarily take a job. For instance, if in one thread I'm doing virDomainGetOSType(), virDomainGetInfo(), ... (essentially 'virsh dominfo') while the other just starts the domain. I don't see a reason why virDomainGetOSType() should wait for the domain to be started up. Domain state should have no effect on the guest OS type, should it?
On the other hand, I don't think we can just lock and unlock the domain object as we please. qemuBuildCommandLine is a very complex function and as such it calls many others. Some of them may rely on the fact, that the object is locked by caller.
Yeah, I'm pretty wary of allowing unlocking of the config while in the build command line function. I think I'd rather it were refactored so that it is always fast and has no side effects on external system state. I.e., we should allocate all the TAP devices upfront in a separate function, and then just pass the list of TAP device file descriptors into qemuBuildCommandLine so it can generate the CLI args fast. Then we can reliably determine that the code which allocates TAP devices is safe to run unlocked.

Regards,
Daniel
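For illustration, here is a minimal sketch of that split in C. Every name in it (net_def, create_tap_device, prepare_tap_fds, build_command_line) is a hypothetical stand-in, not libvirt's internal API; it only mirrors the shape Daniel describes: do the slow, host-visible work first, then let the builder do nothing but string formatting.

#include <stddef.h>

/* Hypothetical stand-ins, only to keep the sketch self-contained. */
struct net_def { const char *ifname; };
int create_tap_device(const char *ifname);   /* the slow, host-visible step */

/* Phase 1: allocate every tap device up front. This is the part that takes
 * hundreds of milliseconds per vif, so it is the part worth isolating and
 * making safe to run without holding the vm lock. */
static int
prepare_tap_fds(const struct net_def *nets, size_t nnets, int *tapfds)
{
    for (size_t i = 0; i < nnets; i++) {
        tapfds[i] = create_tap_device(nets[i].ifname);
        if (tapfds[i] < 0)
            return -1;    /* caller closes whatever was already opened */
    }
    return 0;
}

/* Phase 2: the command-line builder only formats the fds it is handed, so it
 * has no side effects on host state and stays fast under the vm lock. */
char *build_command_line(const int *tapfds, size_t ntapfds);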

On Thu, Apr 23, 2015 at 11:19:34 +0200, Michal Privoznik wrote:
On 23.04.2015 10:30, Peter Krempa wrote:
On Thu, Apr 23, 2015 at 11:44:49 +0800, zhang bo wrote:
The function qemuBuildCommandLine() may take a long time, for example if we configure tens of vifs for the guest: each may cost hundreds of milliseconds to create its tap device, seconds in total. Thus, unlock the vm before calling it.
[...]
Why do you need to unlock the object? The VM is starting at this point so you won't be able to save any time since APIs will either be blocked by a job or by the fact that the VM was not started.
Not true. We still allow QUERY jobs, or APIs that lock a domain object but don't necessarily take a job. For instance, if in one thread I'm doing virDomainGetOSType(), virDomainGetInfo(), ... (essentially 'virsh dominfo') while the other just starts the domain. I don't see a reason why virDomainGetOSType() should wait for the domain to be started up. Domain state should have no effect on the guest OS type, should it?
OK, that is technically right. I wanted to point out that most of the stats are available only for online machines, or it shouldn't be much of a problem to delay the delivery.

Your example certainly doesn't illustrate why the delay to format the command line should be a problem when using libvirt. Or are you polling virDomainGetOSType() every millisecond?

I am curious to see why the delay would be a problem.
On the other hand, I don't think we can just lock and unlock the domain object as we please. qemuBuildCommandLine is a very complex function and as such it calls many others. Some of them may rely on the fact, that the object is locked by caller.
Well, you definitely can't, since almost everything in there accesses vm->def. Locking semantics would be broken on the very next line after @vm was unlocked, by dereferencing it.

Peter

On 23.04.2015 11:32, Peter Krempa wrote:
On Thu, Apr 23, 2015 at 11:19:34 +0200, Michal Privoznik wrote:
On 23.04.2015 10:30, Peter Krempa wrote:
On Thu, Apr 23, 2015 at 11:44:49 +0800, zhang bo wrote:
The function qemuBuildCommandLine() may take a long time, for example if we configure tens of vifs for the guest: each may cost hundreds of milliseconds to create its tap device, seconds in total. Thus, unlock the vm before calling it.
[...]
Why do you need to unlock the object? The VM is starting at this point so you won't be able to save any time since APIs will either be blocked by a job or by the fact that the VM was not started.
Not true. We still allow QUERY jobs, or APIs that lock a domain object but don't necessarily take a job. For instance, if in one thread I'm doing virDomainGetOSType(), virDomainGetInfo(), ... (essentially 'virsh dominfo') while the other just starts the domain. I don't see a reason why virDomainGetOSType() should wait for the domain to be started up. Domain state should have no effect on the guest OS type, should it?
OK, that is technically right. I wanted to point out that most of the stats are available only for online machines, or it shouldn't be much of a problem to delay the delivery.
Your example certainly doesn't illustrate why the delay to format the command line should be a problem when using libvirt. Or are you polling virDomainGetOSType every millisecond?
I am curious to see why the delay would be a problem.
Yep, so am I. So far we don't really care much about libvirt response time (otherwise our critical sections would be much shorter), but maybe it's an issue for somebody.
On the other hand, I don't think we can just lock and unlock the domain object as we please. qemuBuildCommandLine is a very complex function and as such it calls many others. Some of them may rely on the fact, that the object is locked by caller.
Well, you definitely can't since almost everything in there is accessing vm->def. Locking semantics would be broken right in the line after @vm was unlocked by dereferencing it.
Well, anything that would change a domain definition should grab a MODIFY job. But such jobs are serialized, so even if we unlock the domain we should be okay to still access vm->def. What I am more worried about are all the small functions that interact with the system or something. Currently they are serialized by the @vm lock, but once we remove it ...

Michal

On 2015/4/23 17:46, Michal Privoznik wrote:
On 23.04.2015 11:32, Peter Krempa wrote:
On Thu, Apr 23, 2015 at 11:19:34 +0200, Michal Privoznik wrote:
On 23.04.2015 10:30, Peter Krempa wrote:
On Thu, Apr 23, 2015 at 11:44:49 +0800, zhang bo wrote:
The function qemuBuildCommandLine() may take a long time, for example if we configure tens of vifs for the guest: each may cost hundreds of milliseconds to create its tap device, seconds in total. Thus, unlock the vm before calling it.
[...]
Why do you need to unlock the object? The VM is starting at this point so you won't be able to save any time since APIs will either be blocked by a job or by the fact that the VM was not started.
Not true. We still allow QUERY jobs, or APIs that lock a domain object but don't necessarily take a job. For instance, if in one thread I'm doing virDomainGetOSType(), virDomainGetInfo(), ... (essentially 'virsh dominfo') while the other just starts the domain. I don't see a reason why virDomainGetOSType() should wait for the domain to be started up. Domain state should have no effect on the guest OS type, should it?
OK, that is technically right. I wanted to point out that most of the stats are available only for online machines, or it shouldn't be much of a problem to delay the delivery.
Your example certainly doesn't illustrate why the delay to format the command line should be a problem when using libvirt. Or are you polling virDomainGetOSType every millisecond?
I am curious to see why the delay would be a problem.
Yep, I'm too. So far we don't really care much about libvirt response time (otherwise our critical sections would be much shorter), but maybe it's an issue for somebody.
The specific scenario is this: when migrating multiple guests simultaneously, the downtime of each guest adds up, possibly to an unacceptable value.
1. Suppose the downtime is 2 seconds when we migrate only 1 guest at a time.
2. Suppose it costs 0.5 seconds to create each tap device, and each guest has 20 vifs; that's 10 seconds spent in qemuBuildCommandLine().
3. Now we migrate 10 guests simultaneously: the downtime of the guests will vary from seconds to even 100 seconds.

The reason for the problem is that:
1. Guest A locks its vm object while creating each tap device (virNetDevTapCreate()) in qemuBuildCommandLine(), for 10 seconds.
2. Guest B calls qemuMigrationPrepareAny -> *virDomainObjListAdd* to get its vm object, which locks 'doms' and waits for the vm lock.
3. 'doms' will stay locked until guest A unlocks its vm; we say that's 10 seconds.
4. Guest C calls qemuDomainMigrateFinish3 -> virDomainObjListFindByName, which tries to lock 'doms'. Because it is now locked by guest B, guest C blocks here, and it can't be unpaused for at least 10 seconds.
5. Then come guests D, E, F, etc.; the downtime adds up, to even 50 seconds or more.
6. The command 'virsh list' is blocked as well.

Thus, we think the problem must be solved.
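To make the lock dependency above concrete, here is a simplified model in generic C with pthreads. The structures and functions are illustrative stand-ins, not libvirt's real code; they only mirror the cascade described in steps 1-6.

#include <pthread.h>
#include <stddef.h>

struct vm {
    pthread_mutex_t lock;            /* the per-domain "vm" lock */
    const char *name;
};

struct vm_list {
    pthread_mutex_t lock;            /* the "doms" list lock */
    struct vm *vms[64];
    size_t nvms;
};

/* Guest A: starting up; holds its vm lock for the whole slow command-line
 * build (roughly 0.5 s per tap device x 20 vifs = ~10 s). */
static void guest_a_start(struct vm *a)
{
    pthread_mutex_lock(&a->lock);
    /* ... build the command line, including tap device creation ... */
    pthread_mutex_unlock(&a->lock);
}

/* Guest B: migration prepare adds its domain to the list. While holding the
 * list lock it ends up waiting on a busy per-domain lock (step 2 above),
 * modelled here as having to lock each existing domain. */
static void guest_b_prepare(struct vm_list *list, struct vm *b)
{
    pthread_mutex_lock(&list->lock);
    for (size_t i = 0; i < list->nvms; i++) {
        pthread_mutex_lock(&list->vms[i]->lock);   /* blocks ~10 s behind guest A */
        pthread_mutex_unlock(&list->vms[i]->lock);
    }
    list->vms[list->nvms++] = b;
    pthread_mutex_unlock(&list->lock);
}

/* Guest C's migration finish, 'virsh list', and any by-name lookup only need
 * the list lock, but guest B holds it, so they all wait the same ~10 s. */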
On the other hand, I don't think we can just lock and unlock the domain object as we please. qemuBuildCommandLine is a very complex function and as such it calls many others. Some of them may rely on the fact, that the object is locked by caller.
Well, you definitely can't since almost everything in there is accessing vm->def. Locking semantics would be broken right in the line after @vm was unlocked by dereferencing it.
Well, anything that would change a domain definition should grab a MODIFY job. But such jobs are serialized, so even if we unlock the domain we should be okay to still access vm->def. What I am more worried about are all the small functions that interact with the system or something. Currently they are serialized by the @vm lock, but once we remove it ...
Michal
After the discussion above, maybe it's better to move virNetDevTapCreate() prior to qemuBuildCommandLine(). What do you think about that? If that's OK, I'd like to post patch v2 here.

On Thu, Apr 23, 2015 at 07:00:21PM +0800, zhang bo wrote:
The reason for the problem is that:
1. Guest A locks its vm object while creating each tap device (virNetDevTapCreate()) in qemuBuildCommandLine(), for 10 seconds.
2. Guest B calls qemuMigrationPrepareAny -> *virDomainObjListAdd* to get its vm object, which locks 'doms' and waits for the vm lock.
3. 'doms' will stay locked until guest A unlocks its vm; we say that's 10 seconds.
4. Guest C calls qemuDomainMigrateFinish3 -> virDomainObjListFindByName, which tries to lock 'doms'. Because it is
Ok, this is the real core problem - FindByName has a bad impl that requires iterating over every single guest. Unfortunately due to the design of the migration API we can't avoid this call, but we could add a second hash table of name -> virDomainObj so we make it O(1) and lock-less.
now locked by guest B, guest C blocks here, and it can't be unpaused for at least 10 seconds.
5. Then come guests D, E, F, etc.; the downtime adds up, to even 50 seconds or more.
6. The command 'virsh list' is blocked as well.
Thus, we think the problem must be solved.
Regards,
Daniel

On 2015/4/23 19:06, Daniel P. Berrange wrote:
On Thu, Apr 23, 2015 at 07:00:21PM +0800, zhang bo wrote:
[...]
Ok, this is the real core problem - FindByName has a bad impl that requires iterating over every single guest. Unfortunately due to the design of the migration API we can't avoid this call, but we could add a second hash table of name -> virDomainObj so we make it O(1) and lock-less.
I got a question: shall we add an object (similar to doms) and lock it while searching the vm in the new hash table? If so, the problem may still exist.

On 23.04.2015 13:32, zhang bo wrote:
On 2015/4/23 19:06, Daniel P. Berrange wrote:
On Thu, Apr 23, 2015 at 07:00:21PM +0800, zhang bo wrote:
[...]
Ok, this is the real core problem - FindByName has a bad impl that requires iterating over every single guest. Unfortunately due to the design of the migration API we can't avoid this call, but we could add a second hash table of name -> virDomainObj so we make it O(1) and lock-less.
Agreed. This will solve more than just this problem.
I got a question: shall we add an object (similar to doms) and lock it while searching the vm in the new hash table? If so, the problem may still exist.
No. Currently we use a hash table where the domain's UUID is the key and a pointer to the internal domain representation (or domain object, as it's called) is the value. Looking up a key in a hash table is O(1).

However, if we look up a domain by name, it's practically a lookup by value, because the domain name is stored in the domain object. So we have to iterate over all hash table items and lock each one just to compare. Moreover, if an object is already locked, we must wait until it's unlocked again, and during this wait we still hold the hash table lock.

I don't think this is trivial though. So let me see if I can provide some patches.

Michal

On Thu, Apr 23, 2015 at 14:02:25 +0200, Michal Privoznik wrote:
On 23.04.2015 13:32, zhang bo wrote:
On 2015/4/23 19:06, Daniel P. Berrange wrote:
On Thu, Apr 23, 2015 at 07:00:21PM +0800, zhang bo wrote:
[...]
Ok, this is the real core problem - FindByName has a bad impl that requires iterating over every single guest. Unfortunately due to the design of the migration API we can't avoid this call, but we could add a second hash table of name -> virDomainObj so we make it O(1) and lock-less.
Agreed. This will solve more than this.
I got a question: shall we add an object (similar to doms) and lock it while searching the vm in the new hash table? If so, the problem may still exist.
No. Currently, we use a hash table, where domain's UUID is the key, and pointer to internal domain representation (or domain object as it's called) is the value. Looking up a key in hash table is O(1).
However, if we look up a domain by name, it's practically lookup by value, because domain name is stored in the domain object. So we have to iterate over all hash table items, and lock each one just to compare. However, if an object is locked already, we must wait until it's unlocked again. During this wait we still hold the hash table lock.
I don't think this is trivial though. So let me see if I can provide some patches.
A simple workaround currently would be to iterate the hash table and store the domain pointers you increase the reference count on in a new array, then drop the hash table lock and start iterating the array only after you drop it. That's the way I'm planning on fixing the listAllDomains API, which has the same problem.
Michal
Peter
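A self-contained sketch of that pattern in generic C. The list and domain types below are made up (the real list is a hash table and the real primitives are virObjectRef/virObjectLock/virObjectUnref); only the idea follows Peter's suggestion: take references while the list lock is held, drop the list lock, and lock each object afterwards.

#include <pthread.h>
#include <stddef.h>

struct dom {
    pthread_mutex_t lock;
    int refs;
};

struct dom_list {
    pthread_mutex_t lock;
    struct dom *objs[256];    /* fixed size only to keep the sketch short */
    size_t count;
};

static void dom_ref(struct dom *d)   { __atomic_add_fetch(&d->refs, 1, __ATOMIC_SEQ_CST); }
static void dom_unref(struct dom *d) { __atomic_sub_fetch(&d->refs, 1, __ATOMIC_SEQ_CST); }

static void
iterate_domains(struct dom_list *list, void (*cb)(struct dom *))
{
    struct dom *snapshot[256];
    size_t i, n;

    /* Hold the list lock only long enough to take references. */
    pthread_mutex_lock(&list->lock);
    for (i = 0; i < list->count; i++) {
        dom_ref(list->objs[i]);
        snapshot[i] = list->objs[i];
    }
    n = list->count;
    pthread_mutex_unlock(&list->lock);

    /* Lock each domain only after the list lock is gone: a domain that is
     * busy (e.g. starting up for 10 s) now delays only this caller. */
    for (i = 0; i < n; i++) {
        pthread_mutex_lock(&snapshot[i]->lock);
        cb(snapshot[i]);
        pthread_mutex_unlock(&snapshot[i]->lock);
        dom_unref(snapshot[i]);
    }
}

The trade-off is that the snapshot can go slightly stale: a domain removed right after the list lock is dropped is still visited, which is exactly why the reference count is needed.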

On Thu, Apr 23, 2015 at 02:02:25PM +0200, Michal Privoznik wrote:
On 23.04.2015 13:32, zhang bo wrote:
On 2015/4/23 19:06, Daniel P. Berrange wrote:
On Thu, Apr 23, 2015 at 07:00:21PM +0800, zhang bo wrote:
[...]
Ok, this is the real core problem - FindByName has a bad impl that requires iterating over every single guest. Unfortunately due to the design of the migration API we can't avoid this call, but we could add a second hash table of name -> virDomainObj so we make it O(1) and lock-less.
Agreed. This will solve more than this.
I got a question: shall we add an object (similar to doms) and lock it while searching the vm in the new hash table? If so, the problem may still exist.
No. Currently, we use a hash table, where domain's UUID is the key, and pointer to internal domain representation (or domain object as it's called) is the value. Looking up a key in hash table is O(1).
However, if we look up a domain by name, it's practically lookup by value, because domain name is stored in the domain object. So we have to iterate over all hash table items, and lock each one just to compare. However, if an object is locked already, we must wait until it's unlocked again. During this wait we still hold the hash table lock.
I don't think this is trivial though. So let me see if I can provide some patches.
I'd just create a 2nd hash table in the virDomainObjList object that either maps name -> virDomainObjPtr, or maps name -> uuid.

Regards,
Daniel
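For illustration, a hedged sketch of that idea. virHashLookup(), virObjectLock() and virObjectUnlock() are existing libvirt primitives, but the objsName field and the exact shape of the lookup below are assumptions made for the sketch, not the actual patch:

/* Assumed second table inside virDomainObjList, maintained next to the
 * existing UUID-keyed table whenever a domain is added or removed:
 *   virHashTablePtr objs;      existing: UUID string -> virDomainObjPtr
 *   virHashTablePtr objsName;  proposed: name -> virDomainObjPtr
 */

virDomainObjPtr
virDomainObjListFindByName(virDomainObjListPtr doms,
                           const char *name)
{
    virDomainObjPtr obj;

    virObjectLock(doms);
    obj = virHashLookup(doms->objsName, name);   /* single hash probe, O(1) */
    if (obj)
        virObjectLock(obj);                      /* lock only the one match */
    virObjectUnlock(doms);
    return obj;
}

Every add and remove of a domain then has to update both tables under the list lock, which is cheap compared with the per-object locking scan it replaces.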
participants (4): Daniel P. Berrange, Michal Privoznik, Peter Krempa, zhang bo