On 04/09/2013 07:02 AM, Peter Krempa wrote:
This patch fixes crash of the daemon that happens due to the
following race
condition:
Let's have two threads in the libvirtd daemon's qemu driver:
A - thread executing a API call to get information about a domain
B - thread executing undefine on the same domain
You mixed up the two threads here, compared to your description below.
In the description, A is undefining the domain, while B is attempting to
get information.
Assume following serialization of operations done by the threads:
1) A has the lock on the domain object and is executing some code prior to
virDomainObjListRemove()
2) B takes the lock on the domain object list, looks up the domain object
pointer and blocks in the attempt to lock the domain object as A is holding the
lock
3) A reaches virDomainObjListRemove() and unlocks the lock on the domain object
4) A blocks on the attempt to get the domain list lock
5) B is able to lock the domain object now and unlocks the domain list
6) A is now able to lock the domain list, and shed the last reference on the
s/shed/sheds/
tomain object, this triggers the freeing function.
s/tomain/domain/
6) B starts executing the code on the pointer that is being freed
7) The libvirtd daemon crashes while attempting to access invalid pointer in
thread B.
Yep, that sequence matches what the reproducer in 2/2 was exposing.
This patch fixes the race by acquiring a reference on the domain object before
unlocking it in virDomainObjListRemove() and re-locks the object prior to
removing and freeing it. This ensures that no thread that does not hold a
reference on the domain object expects the pointer to be valid and the monitor
code expects the object to vanish.
Double negative; I would write:
This ensures that no thread holds a lock on the domain object at the
time it is removed from the list, and that doing a list lookup will
never find a domain that is about to vanish.
This is a minimal fix of the problem, but a better solution will be to switch to
full reference counting for domain objects.
---
src/conf/domain_conf.c | 4 ++++
1 file changed, 4 insertions(+)
Commit message needs help, but the code is correct.
ACK
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 03e5740..cafef0c 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -2238,10 +2238,14 @@ void virDomainObjListRemove(virDomainObjListPtr doms,
char uuidstr[VIR_UUID_STRING_BUFLEN];
virUUIDFormat(dom->def->uuid, uuidstr);
+ virObjectRef(dom);
virObjectUnlock(dom);
virObjectLock(doms);
+ virObjectLock(dom);
virHashRemoveEntry(doms->objs, uuidstr);
+ virObjectUnlock(dom);
+ virObjectUnref(dom);
virObjectUnlock(doms);
}
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org