On 2/14/22 15:01, Daniel P. Berrangé wrote:
On Mon, Feb 14, 2022 at 02:32:55PM +0100, Michal Privoznik wrote:
> I've rewritten our virMutexes to check for failures [1] and saw couple
> of problems. In most cases we are destroying a locked mutex which these
> patches aim to fix. Then we have virLogLock() which is locked in
> virFork() and then unlocked from child (a different process) which is
> problematic. Pthread doesn't like it when one thread locks a mutex only
> so that another one unlocks it. Semaphores don't care.
Despite behaviuor being undefined, in practice it works with any
impl we have in supported platforms.
Indeed, that's why I haven't sent the patch that rewrites virLogMutex.
The key problem we're addressing is that the child process needs
to inherit a known state for the mutex. Whether that state is
locked or unlocked doesn't matter, as long as it has a bulletproof
guarantee of what that status is.
We call virLogLock before fork() to guarantee a locked state,
by blocking on any other threads in the parent that might be
temporarily holding it.
Again, I understand that. It's just that the following code returns an
error:
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
pthread_mutex_t l;
pthread_mutexattr_t attr;
pthread_mutexattr_init(&attr);
pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
pthread_mutex_init(&l, &attr);
pthread_mutex_lock(&l);
if (fork() == 0) {
/* child */
int rc;
if ((rc = pthread_mutex_unlock(&l)) != 0) {
fprintf(stderr, "unlock error: %s\n", strerror(rc));
abort();
}
}
return 0;
}
Switching to a semaphore works again. I'm not advocating for that
though, it's needless because apparently pthread does the right thing on
all platforms we care about.
The pthread_atfork() handler can be used todo the same trick
if you don't control the place where fork() is called. The
man page explicitly says
https://linux.die.net/man/3/pthread_atfork
"The expected usage is that the prepare handler acquires
all mutex locks and the other two fork handlers release
them."
Which is a terrible advice, let me say. This assumes that there is a
global list of mutexes (which there certainly is not in projects of our
size), but more importantly - this is so fragile to lock ordering than
anything else.
Michal