On 03/02/2018 11:12 AM, Daniel P. Berrangé wrote:
On Fri, Mar 02, 2018 at 04:52:23PM +0000, Daniel P. Berrangé wrote:
> On Thu, Mar 01, 2018 at 04:42:36PM -0700, Jim Fehlig wrote:
>> Locks held by virtlockd are dropped on re-exec.
>>
>> virtlockd 94306 POSIX 5.4G WRITE 0 0 0 /tmp/test.qcow2
>> virtlockd 94306 POSIX 5B WRITE 0 0 0 /run/virtlockd.pid
>> virtlockd 94306 POSIX 5B WRITE 0 0 0 /run/virtlockd.pid
>>
>> Acquire locks in PostExecRestart code path.
>
> This is really strange and should *not* be happening. POSIX locks
> are supposed to be preserved across execve() if the FD has CLOEXEC
> unset, and you don't fork() before the exec.
[snip]
> So I wonder what we've screwed up to cause the locks to get
> released - reacquiring them definitely isn't desirable as we
> should not lose them in the first place!
This is really very strange. The problem seems to be the existence of
threads at the time of execve().
If you spawn a thread and the thread exits, and you then execve, the
locks are preserved.
If you spawn a thread and the thread is still running, and you then
execve, the locks are lost.
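For anyone wanting to check this from the outside, here is a minimal
sketch (not taken from lock.c; the helper names are made up for
illustration). It takes a whole-file fcntl() write lock, clears
FD_CLOEXEC so the FD would stay open across execve(), and probes the
lock from a forked child with F_GETLK. The probe has to run in another
process because F_GETLK never reports locks owned by the caller.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: take a whole-file POSIX write lock on fd.
 * Returns 0 on success, -1 on error (non-blocking, F_SETLK). */
static int take_write_lock(int fd)
{
    struct flock fl = {
        .l_type = F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 0,            /* 0 means "through end of file" */
    };
    return fcntl(fd, F_SETLK, &fl);
}

/* Hypothetical helper: clear FD_CLOEXEC so fd survives execve(). */
static int clear_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC);
}

/* Hypothetical helper: ask, from a forked child, whether another
 * process holds a lock that conflicts with a write lock on path.
 * Returns 1 if locked, 0 if not, -1 on error. */
static int lock_held_by_other(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: POSIX locks are not inherited across fork(). */
        int fd = open(path, O_RDWR);
        if (fd < 0)
            _exit(2);
        struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
        if (fcntl(fd, F_GETLK, &fl) < 0)
            _exit(2);
        _exit(fl.l_type == F_UNLCK ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    int code = WEXITSTATUS(status);
    return code == 2 ? -1 : code;
}
```

Calling lock_held_by_other() before and after the execve() in the demo
(or just watching lslocks) should show whether the lock survived.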
Indeed you are correct. I'm seeing the same behavior with the below
modifications to your demo. The lock is preserved after execve when BREAK_FLOCK
is 0, but removed when BREAK_FLOCK is 1.
--- lock.c 2018-03-02 15:10:59.200154182 -0700
+++ lock-thr.c 2018-03-02 15:14:30.501441105 -0700
@@ -4,6 +4,15 @@
#include <pthread.h>
#include <unistd.h>
+#define BREAK_FLOCK 1
+
+static void *thr_func(void *arg)
+{
+#if BREAK_FLOCK == 1
+    while (1)
+#endif
+        sleep(5);
+}
int main(int argc, char **argv) {
@@ -33,6 +42,13 @@
         sleep(50);
     } else {
+        pthread_t thr;
+
+        if (pthread_create(&thr, NULL, thr_func, NULL) != 0) {
+            fprintf(stderr, "pthread_create failed\n");
+            abort();
+        }
+
         int fd = open("lock.txt", O_WRONLY|O_CREAT|O_TRUNC, 0755);
         if (fd < 0)
             abort();
This behaviour makes no sense at all to me. Why should it matter if
the thread exits itself, or is force exited during execve()? I wonder
if it is even possibly a kernel bug.
I'll attach the reproducer to an internal bug (sorry!), but will report back
here with any findings.
Regards,
Jim