On Fri, May 17, 2013 at 10:05:25PM -0600, Eric Blake wrote:
I found root cause on two nasty races today, but ran out of time to
write the patches until next week.
1. We have a TOCTTOU race when moving processes between cgroups.
virCgroupMoveTask can fail if listing the threads in the parent group
picks up multiple threads, then one of those threads exits before we
then write to the emulator subgroup to relocate those threads; the chain
reaction from this failure is that the domain fails to start. qemu
spawns short-lived threads all the time (possibly due to glibc using
threads under the hood to implement aio as qemu first probes a disk
image); with a 3-second-delay hook script, I was able to hit this race
about 5% of the time when repeatedly starting multiple domains in
parallel. The fix should be ignoring any ESRCH failures when writing
during virCgroupAddTAskStrController, but I plan to test that before
posting my patch. Present at least since 0.10.2 (Fedora 18); probably
earlier (but I didn't check how far back).
It is worse than that - you can't simply ignore ESRCH. In the same way
that existing processes can exit, new processes can come into live.
The virCgroupMoveTask method doesn't take any of this into account.
The real problem is that we should not be trying to move QEMU after it
has been started - we should aim to get it into the right place right
away.
2. We have a (rare) deadlock due to the use of a
non-async-signal-safe
getpwuid_r in between fork and exec. I was extremely surprised when I
hit it in real life while trying to debug the first race; but sure
enough, the backtrace confirms that glibc grabs a mutex inside
getpwuid_r, and if some other thread in the parent process is
interrupted in the middle of getpwuid_r while we are forking a child in
our thread, then our child is hung; and since the parent depends on a
handshake with the child, the parent's worker thread was also hung. I
think the fix to this is to call getpwuid_r prior to fork, and change
virSetUIDGID to take the list of supplemental groups as a parameter
rather than trying to learn it on the fly. Again, something I plan to
test once I have the patch written. Present at least since commit
f42cf7cb (Dec 2010) with the introduction of virSetUIDGID, and probably
further back.
Hmm, interesting. Nasty issue.
Daniel
--
|:
http://berrange.com -o-
http://www.flickr.com/photos/dberrange/ :|
|:
http://libvirt.org -o-
http://virt-manager.org :|
|:
http://autobuild.org -o-
http://search.cpan.org/~danberr/ :|
|:
http://entangle-photo.org -o-
http://live.gnome.org/gtk-vnc :|