On 11/09/2013 03:42 AM, Rich Felker wrote:
On Fri, Nov 08, 2013 at 01:30:09PM +0800, Daniel P. Berrange wrote:
> On Thu, Nov 07, 2013 at 09:15:43PM +0800, Gao feng wrote:
>> I hit a problem where the container blocks in seteuid/setegid,
>> which are called in lxcContainerSetID, on a UP system with
>> libvirt compiled with --with-fuse=yes.
>>
>> I looked into the glibc code and found that setxid in glibc
>> calls futex() to wait for the other threads to change their
>> setxid_futex to 0 (see setxid_mark_thread in glibc).
>>
>> The process created by the clone system call does not
>> share memory with the other threads, and its copy of
>> memory does not change until we call execl (COW).
>>
>> So if the process created by clone runs before the fuse
>> thread has started, the fuse thread's new setxid_futex
>> will never be seen in this process, and it will block
>> forever.
>>
>> Maybe this problem should be fixed in glibc, but I am sending
>> this patch as a quick fix.
>
> Can you show a stack trace of the threads/processes deadlocking?

I think this is a symptom of setxid not being async-signal-safe like
it's required to be. I'm not sure if we have a bug tracker entry for
that; if not, it should be added. But if clone() is being used except
in a fork-like manner, this is probably invalid application usage too.

I posted a patch to the glibc community, but I can't find it in the
mailing list archive. The patch is attached. Do you think this glibc
patch is needed, or should we just add a bug tracker entry for the man page?
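
To make the copy-on-write point above concrete, here is a minimal sketch.
It is not the glibc code and not the libvirt code: it uses fork() instead
of clone(), a plain int named flag as a hypothetical stand-in for
setxid_futex, and a pipe only to order the two processes. It shows that a
store made by the parent after the child was created never reaches the
child's copy of memory, so a child waiting for that value to become 0
would wait forever.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a per-thread flag such as setxid_futex. */
static volatile int flag = 1;

/*
 * Fork a child, clear the flag in the parent afterwards, and report the
 * value the child observes. Returns the child's view of the flag (0 or 1),
 * or -1 on error.
 */
int flag_seen_by_child(void)
{
    int go[2];
    if (pipe(go) != 0)
        return -1;

    flag = 1;
    pid_t pid = fork();
    if (pid < 0) {
        close(go[0]);
        close(go[1]);
        return -1;
    }

    if (pid == 0) {
        char c;
        close(go[1]);
        /* Wait until the parent has updated its own copy of the flag. */
        if (read(go[0], &c, 1) != 1)
            _exit(2);
        /* Exit status carries the value visible in the child's memory. */
        _exit(flag);
    }

    close(go[0]);
    flag = 0;                      /* changes only the parent's pages (COW) */
    ssize_t n = write(go[1], "x", 1);
    (void)n;                       /* on failure the child sees EOF and exits 2 */
    close(go[1]);

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    int seen = WEXITSTATUS(status);
    return seen <= 1 ? seen : -1;
}
```

Calling flag_seen_by_child() from a small main() and printing the result
should show that the child still sees the value from before the parent's
store.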