
On 04/09/2013 07:02 AM, Peter Krempa wrote:
This patch fixes crash of the daemon that happens due to the following race condition:
Let's have two threads in the libvirtd daemon's qemu driver:
A - thread executing a API call to get information about a domain
B - thread executing undefine on the same domain
You mixed up the two threads here, compared to your description below. In the description, A is undefining the domain, while B is attempting to get information.
Assume following serialization of operations done by the threads:
1) A has the lock on the domain object and is executing some code prior to virDomainObjListRemove()
2) B takes the lock on the domain object list, looks up the domain object pointer and blocks in the attempt to lock the domain object as A is holding the lock
3) A reaches virDomainObjListRemove() and unlocks the lock on the domain object
4) A blocks on the attempt to get the domain list lock
5) B is able to lock the domain object now and unlocks the domain list
6) A is now able to lock the domain list, and shed the last reference on the
s/shed/sheds/
tomain object, this triggers the freeing function.
s/tomain/domain/
6) B starts executing the code on the pointer that is being freed
7) The libvirtd daemon crashes while attempting to access invalid pointer in thread B.
Yep, that sequence matches what the reproducer in 2/2 was exposing.
This patch fixes the race by acquiring a reference on the domain object before unlocking it in virDomainObjListRemove() and re-locks the object prior to removing and freeing it. This ensures that no thread that does not hold a reference on the domain object expects the pointer to be valid and the monitor code expects the object to vanish.
Double negative; I would write: This ensures that no thread holds a lock on the domain object at the time it is removed from the list, and that doing a list lookup will never find a domain that is about to vanish.
This is a minimal fix of the problem, but a better solution will be to switch to full reference counting for domain objects.
---
 src/conf/domain_conf.c | 4 ++++
 1 file changed, 4 insertions(+)
Commit message needs help, but the code is correct. ACK
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 03e5740..cafef0c 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -2238,10 +2238,14 @@ void virDomainObjListRemove(virDomainObjListPtr doms,
     char uuidstr[VIR_UUID_STRING_BUFLEN];
 
     virUUIDFormat(dom->def->uuid, uuidstr);
+    virObjectRef(dom);
     virObjectUnlock(dom);
 
     virObjectLock(doms);
+    virObjectLock(dom);
     virHashRemoveEntry(doms->objs, uuidstr);
+    virObjectUnlock(dom);
+    virObjectUnref(dom);
     virObjectUnlock(doms);
 }
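
For anyone who wants to look at the lock/reference ordering in isolation, here is a minimal standalone sketch of the same pattern. It is not libvirt code: SketchObj, SketchList and all sketch_* functions are hypothetical stand-ins (assumptions for illustration) for virDomainObj, virDomainObjList and the virObject helpers, built on plain pthreads and C11 atomics, with a one-slot "list" in place of the hash table.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a lockable, reference-counted domain object. */
typedef struct {
    pthread_mutex_t lock;
    atomic_int refs;
} SketchObj;

/* Hypothetical stand-in for the domain list; it owns one reference
 * on the object stored in its single slot. */
typedef struct {
    pthread_mutex_t lock;
    SketchObj *slot;
} SketchList;

/* Counts disposed objects so a caller can observe when the free happened. */
static atomic_int sketch_freed;

static SketchObj *sketch_obj_new(void)
{
    SketchObj *obj = malloc(sizeof(*obj));
    if (!obj)
        return NULL;
    pthread_mutex_init(&obj->lock, NULL);
    atomic_init(&obj->refs, 1);
    return obj;
}

static void sketch_obj_ref(SketchObj *obj)
{
    atomic_fetch_add(&obj->refs, 1);
}

/* Drops one reference; the object is freed when the last one goes away. */
static void sketch_obj_unref(SketchObj *obj)
{
    int prev = atomic_fetch_sub(&obj->refs, 1);
    assert(prev > 0);
    if (prev == 1) {
        pthread_mutex_destroy(&obj->lock);
        free(obj);
        atomic_fetch_add(&sketch_freed, 1);
    }
}

/* What the "get information" thread does: list lock first, then the
 * object lock, then release the list. Returns the object locked, or NULL. */
static SketchObj *sketch_list_find_locked(SketchList *list)
{
    SketchObj *obj;

    pthread_mutex_lock(&list->lock);
    obj = list->slot;
    if (obj)
        pthread_mutex_lock(&obj->lock);
    pthread_mutex_unlock(&list->lock);
    return obj;
}

/* Same ordering as the patched function. The caller holds obj->lock. */
static void sketch_list_remove(SketchList *list, SketchObj *obj)
{
    sketch_obj_ref(obj);               /* keep obj alive across the unlocked window */
    pthread_mutex_unlock(&obj->lock);  /* required to take the list lock first */

    pthread_mutex_lock(&list->lock);   /* no new lookups can start from here on */
    pthread_mutex_lock(&obj->lock);    /* waits for a thread that got in meanwhile */
    if (list->slot == obj) {
        list->slot = NULL;
        sketch_obj_unref(obj);         /* the list's reference; ours keeps obj valid */
    }
    pthread_mutex_unlock(&obj->lock);
    sketch_obj_unref(obj);             /* our reference; may free the object */
    pthread_mutex_unlock(&list->lock);
}
```

A caller would obtain the object with sketch_list_find_locked() and pass it, still locked, to sketch_list_remove(). The point of the sketch is only the ordering: the extra reference covers the window where the object is unlocked, and re-taking the object lock under the list lock waits out any thread that looked the object up in that window.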
-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org