
On Fri, Nov 15, 2013 at 10:37:53AM -0700, Eric Blake wrote:
On 11/15/2013 09:20 AM, Daniel P. Berrange wrote:
The glibc setxid is supposed to be async signal safe, but libc developers confirm that it is not. This causes a problem when libvirt_lxc starts the FUSE thread and then runs clone() to start the container. If the clone() is done before the FUSE thread has completely started up, then the container will hang in setxid after clone().
The fix is to avoid creating any threads until after the container has been clone()'d. By avoiding any threads in the parent, the child is no longer required to run in an async signal safe context, and we thus avoid the glibc bug.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
---
 src/lxc/lxc_controller.c | 11 +++++++++--
 src/lxc/lxc_fuse.c       | 21 +++++++++++++++------
 src/lxc/lxc_fuse.h       |  1 +
 3 files changed, 25 insertions(+), 8 deletions(-)
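
For readers who want the ordering in the commit message spelled out, here is a minimal sketch of "create the child first, start threads afterwards". It is not the libvirt code: fork() stands in for clone() to keep the example portable, and helper_thread() and spawn_child_then_start_thread() are hypothetical names invented for illustration.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the FUSE worker thread. */
static void *helper_thread(void *opaque)
{
    int *started = opaque;
    *started = 1;
    return NULL;
}

/*
 * Hypothetical helper: creates the child while the parent is still
 * single-threaded, and only then starts a thread in the parent.
 * Returns 0 if the child exited cleanly and the thread ran, -1 otherwise.
 */
int spawn_child_then_start_thread(void)
{
    pthread_t tid;
    int started = 0;
    int status = 0;

    /* Step 1: create the child before any thread exists in the parent.
     * fork() is used here as a portable stand-in for clone(). */
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Child: this is where container setup (including any
         * setuid()/setgid() calls) would happen in a real controller. */
        _exit(0);
    }

    /* Step 2: only now is it safe to start threads in the parent. */
    if (pthread_create(&tid, NULL, helper_thread, &started) != 0) {
        waitpid(child, NULL, 0); /* reap the child before bailing out */
        return -1;
    }
    if (pthread_join(tid, NULL) != 0) {
        waitpid(child, NULL, 0);
        return -1;
    }

    /* Reap the child and check that it exited with status 0. */
    if (waitpid(child, &status, 0) != child)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return started ? 0 : -1;
}
```

Calling spawn_child_then_start_thread() from a small main() and checking for a 0 return is enough to exercise it; older toolchains may need -pthread when linking.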
I can review the code, but I'd feel better if field testing also confirmed that this resolves the problem before you push it.
ACK.
I was able to reproduce the problem one time in 3 without the patch and it appears gone after applying it.

Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|