On 04/08/2013 09:43 PM, Eric Blake wrote:
> Thanks; I can confirm under valgrind that we have a use after free, with
> all sorts of nasty heap corruption potential, after instrumenting my
> source a bit more:
>
>
> Once again, I'm trying to ascertain how far back this issue appears.
> This time, it appears the problem is more recent. I initially suspected
> commit d1c7b00b (Feb 2013, v1.0.3), since that rearranged the locks
> inside virDomainObjListRemove, but even after instrumenting that
> function, I was unable to get a crash; instead, I got the expected
> lookup failure:
>
> # virsh undefine fedora-local& sleep .1; virsh dominfo fedora-local
> [1] 25898
> Domain fedora-local has been undefined
> error: failed to get domain 'fedora-local'
> error: Domain not found: no domain with matching name 'fedora-local'
> [1]+ Done virsh undefine fedora-local
>
> It's too late for me now to search any more tonight, but on the bright
> side, that narrows down the search, and we have at most two releases
> affected (rather than all the way back to 0.10.0 on the race that
> originally spawned this thread).
My 'git bisect' is complete. Commit d1c7b00b introduced the bug, but
the bug was latent until commit a9e97e0 removed qemuDriverLock.
Furthermore, applying danpb's/Peter's fix [1] on top of a9e97e0
(basically fixing the latent bug of d1c7b00b) once again avoids the crash.
[1] https://www.redhat.com/archives/libvir-list/2013-April/msg00706.html
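
A minimal sketch of how a bisect like this could be automated with
'git bisect run', reusing the undefine/dominfo race from the transcript
above. The function names and the saved domain XML path are hypothetical,
and the sketch assumes the crash takes down libvirtd, so that a follow-up
'virsh list' fails. Rebuilding and restarting the daemon at each bisect
step is left out.

```shell
# Sketch only: function names and the XML path are hypothetical.

# Translate the reproducer's status into a `git bisect run` exit code:
# 0 marks the commit good, 1 marks it bad, 125 skips it. Statuses above
# 127 would abort the bisect, so every other failure is folded into 1.
bisect_code() {
    case "$1" in
        0)   echo 0 ;;
        125) echo 125 ;;
        *)   echo 1 ;;
    esac
}

# One attempt at the race. Assumes a freshly built libvirtd is already
# running and that the domain XML was saved beforehand to the path in $2.
race_once() {
    dom=$1
    xml=$2
    virsh define "$xml" >/dev/null 2>&1 || return 125
    virsh undefine "$dom" >/dev/null 2>&1 &
    sleep .1
    virsh dominfo "$dom" >/dev/null 2>&1   # "Domain not found" is expected
    wait
    virsh list --all >/dev/null 2>&1       # non-zero if the daemon is gone
}
```

With these saved in a file such as race.sh (name assumed), each step could
run: race_once fedora-local /tmp/fedora-local.xml; exit "$(bisect_code $?)".
A single pass may not hit the race window, so a real run would likely loop
race_once several times before declaring a commit good.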
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org