
On 2/14/22 15:01, Daniel P. Berrangé wrote:
On Mon, Feb 14, 2022 at 02:32:55PM +0100, Michal Privoznik wrote:
I've rewritten our virMutexes to check for failures [1] and saw couple of problems. In most cases we are destroying a locked mutex which these patches aim to fix. Then we have virLogLock() which is locked in virFork() and then unlocked from child (a different process) which is problematic. Pthread doesn't like it when one thread locks a mutex only so that another one unlocks it. Semaphores don't care.
Despite behaviuor being undefined, in practice it works with any impl we have in supported platforms.
Indeed, that's why I haven't sent the patch that rewrites virLogMutex.
The key problem we're addressing is that the child process needs to inherit a known state for the mutex. Whether that state is locked or unlocked doesn't matter, as long as it has a bulletproof guarantee of what that status is.
We call virLogLock before fork() to guarantee a locked state, by blocking on any other threads in the parent that might be temporarily holding it.
Again, I understand that. It's just that the following code returns an error: #include <pthread.h> #include <stdlib.h> #include <unistd.h> #include <stdio.h> #include <string.h> int main(int argc, char *argv[]) { pthread_mutex_t l; pthread_mutexattr_t attr; pthread_mutexattr_init(&attr); pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK); pthread_mutex_init(&l, &attr); pthread_mutex_lock(&l); if (fork() == 0) { /* child */ int rc; if ((rc = pthread_mutex_unlock(&l)) != 0) { fprintf(stderr, "unlock error: %s\n", strerror(rc)); abort(); } } return 0; } Switching to a semaphore works again. I'm not advocating for that though, it's needless because apparently pthread does the right thing on all platforms we care about.
The pthread_atfork() handler can be used todo the same trick if you don't control the place where fork() is called. The man page explicitly says
https://linux.die.net/man/3/pthread_atfork
"The expected usage is that the prepare handler acquires all mutex locks and the other two fork handlers release them."
Which is a terrible advice, let me say. This assumes that there is a global list of mutexes (which there certainly is not in projects of our size), but more importantly - this is so fragile to lock ordering than anything else. Michal