On Thu, Sep 27, 2018 at 10:14 AM +0200, Michal Privoznik <mprivozn(a)redhat.com> wrote:
On 09/27/2018 09:57 AM, Marc Hartmayer wrote:
> On Thu, Sep 27, 2018 at 09:01 AM +0200, Bjoern Walk <bwalk(a)linux.ibm.com> wrote:
>> Michal Privoznik <mprivozn(a)redhat.com> [2018-09-19, 11:45AM +0200]:
>>> On 09/19/2018 11:17 AM, Bjoern Walk wrote:
>>>> Bjoern Walk <bwalk(a)linux.ibm.com> [2018-09-12, 01:17PM +0200]:
>>>>> Michal Privoznik <mprivozn(a)redhat.com> [2018-09-12, 11:32AM +0200]:
>>>
>>>>>
>>>>
>>>> Still seeing the same timeout. Is this expected behaviour?
>>>>
>>>
>>> Nope. I wonder if something has locked the path and forgot to unlock it
>>> (however, virtlockd should have unlocked all the paths owned by PID on
>>> connection close), or something is still holding the lock and connection
>>> opened.
>>>
>>> Can you see the timeout even when you turn off the selinux driver
>>> (security_driver="none" in qemu.conf)? I tried to reproduce the issue
>>> yesterday and was unsuccessful. Do you have any steps to reproduce?
>>
>> So, I haven't been able to actually dig into the debugging but we have
>> been able to reproduce this behaviour on x86 (both with SELinux and
>> DAC). Looks like a general problem, if a problem at all, because from
>> what I can see in the code, the 60 seconds timeout is actually intended,
>> or not? The security manager does try for 60 seconds to acquire the lock
>> and only then bails out. Why is this?
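As a point of reference, the bounded retry Bjoern describes has roughly the following shape. This is only an illustrative, self-contained sketch, not the code behind virSecurityManagerMetadataLock/virTimeBackOffWait; try_lock(), the 1 ms initial delay and the 60-second budget are stand-ins:

/* Illustrative sketch of a bounded retry loop; NOT the libvirt code.
 * try_lock(), the initial 1 ms delay and the 60 s budget are stand-ins. */
#include <stdbool.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static bool try_lock(void)
{
    return false;                   /* pretend the lock is always busy */
}

static bool lock_with_timeout(unsigned int timeout_secs)
{
    time_t deadline = time(NULL) + timeout_secs;
    useconds_t delay = 1000;        /* start with 1 ms ...             */

    while (time(NULL) < deadline) {
        if (try_lock())
            return true;            /* got the lock                    */
        usleep(delay);              /* ... and back off between tries  */
        if (delay < 500 * 1000)
            delay *= 2;
    }
    return false;                   /* give up only after the timeout  */
}

int main(void)
{
    if (!lock_with_timeout(60))     /* the 60 seconds being discussed  */
        fprintf(stderr, "timed out waiting for metadata lock\n");
    return 0;
}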
>
> Backtrace of libvirtd where it’s hanging (in thread A)
>
> (gdb) bt
> #0 read () from /lib64/libpthread.so.0
> #1 read (__nbytes=1024, __buf=0x7f84e401afd0, __fd=31) at /usr/include/bits/unistd.h:44
> #2 saferead (fd=fd@entry=31, buf=0x7f84e401afd0, count=count@entry=1024) at util/virfile.c:1061
> #3 saferead_lim (fd=31, max_len=max_len@entry=1024, length=length@entry=0x7f84f3fa8240) at util/virfile.c:1345
> #4 virFileReadHeaderFD (fd=<optimized out>, maxlen=maxlen@entry=1024, buf=buf@entry=0x7f84f3fa8278) at util/virfile.c:1376
> #5 virProcessRunInMountNamespace () at util/virprocess.c:1149
> #6 virSecuritySELinuxTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_selinux.c:1106
> #7 virSecurityManagerTransactionCommit (mgr=0x7f848c0ffd60, pid=pid@entry=23977) at security/security_manager.c:324
> #8 virSecurityStackTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_stack.c:166
> #9 virSecurityManagerTransactionCommit (mgr=0x7f848c183760, pid=pid@entry=23977) at security/security_manager.c:324
> #10 qemuSecuritySetAllLabel (driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, stdin_path=stdin_path@entry=0x0) at qemu/qemu_security.c:56
> #11 qemuProcessLaunch (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, incoming=incoming@entry=0x0, snapshot=snapshot@entry=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at qemu/qemu_process.c:6329
> #12 qemuProcessStart (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, updatedCPU=updatedCPU@entry=0x0, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, migrateFrom=migrateFrom@entry=0x0, migrateFd=-1, migratePath=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at qemu/qemu_process.c:6816
This is just waiting for the forked child to finish. Nothing suspicious here.
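For context on that frame: virProcessRunInMountNamespace() forks a helper child to do the work inside the guest's mount namespace, and the parent then blocks reading from a pipe until the child reports back, which is the read()/saferead() at the top of the trace above. A stripped-down sketch of that fork-and-wait-on-a-pipe pattern (not the libvirt implementation itself):

/* Minimal fork-and-wait-on-a-pipe pattern.  This is only the shape of the
 * mechanism, not virProcessRunInMountNamespace() itself. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) < 0) {
        perror("pipe");
        return 1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 1;
    }

    if (pid == 0) {                           /* child: do the actual work ... */
        close(fds[0]);
        sleep(5);                             /* stands in for the real work   */
        const char msg[] = "done";
        if (write(fds[1], msg, sizeof(msg)) < 0)
            _exit(1);
        _exit(0);                             /* ... then report back and exit */
    }

    close(fds[1]);
    char buf[16];
    /* The parent blocks here until the child writes or exits -- this is the
     * read()/saferead() frame at the top of the libvirtd thread backtrace. */
    ssize_t n = read(fds[0], buf, sizeof(buf));
    if (n > 0)
        printf("child reported: %s\n", buf);

    close(fds[0]);
    waitpid(pid, NULL, 0);
    return 0;
}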
>
> …
>
> #25 in start_thread () from /lib64/libpthread.so.0
> #26 in clone () from /lib64/libc.so.6
>
>
> Backtrace of the forked process where the process is trying to get the
> metadata lock for 60 seconds.
>
> #0 0x00007f8508572730 in nanosleep () from target:/lib64/libc.so.6
> #1 0x00007f850859dff4 in usleep () from target:/lib64/libc.so.6
> #2 0x00007f850c26efea in virTimeBackOffWait (var=var@entry=0x7f84f3fa81a0) at util/virtime.c:453
> #3 0x00007f850c3108d8 in virSecurityManagerMetadataLock (mgr=0x7f848c183850, paths=<optimized out>, npaths=<optimized out>) at security/security_manager.c:1345
> #4 0x00007f850c30e44b in virSecurityDACTransactionRun (pid=pid@entry=23977, opaque=opaque@entry=0x7f84e4021450) at security/security_dac.c:226
> #5 0x00007f850c250021 in virProcessNamespaceHelper (opaque=0x7f84e4021450, cb=0x7f850c30e330 <virSecurityDACTransactionRun>, pid=23977, errfd=33) at util/virprocess.c:1100
> #6 virProcessRunInMountNamespace () at util/virprocess.c:1140
> #7 0x00007f850c30e55c in virSecurityDACTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_dac.c:565
> #8 0x00007f850c30eeca in virSecurityManagerTransactionCommit (mgr=0x7f848c183850, pid=pid@entry=23977) at security/security_manager.c:324
> #9 0x00007f850c30b36b in virSecurityStackTransactionCommit (mgr=<optimized out>, pid=23977) at security/security_stack.c:166
> #10 0x00007f850c30eeca in virSecurityManagerTransactionCommit (mgr=0x7f848c183760, pid=pid@entry=23977) at security/security_manager.c:324
> #11 0x00007f84e0586bf2 in qemuSecuritySetAllLabel (driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, stdin_path=stdin_path@entry=0x0) at qemu/qemu_security.c:56
> #12 0x00007f84e051b7fd in qemuProcessLaunch (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, incoming=incoming@entry=0x0, snapshot=snapshot@entry=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at qemu/qemu_process.c:6329
> #13 0x00007f84e051ee2e in qemuProcessStart (conn=conn@entry=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50, updatedCPU=updatedCPU@entry=0x0, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, migrateFrom=migrateFrom@entry=0x0, migrateFd=-1, migratePath=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at qemu/qemu_process.c:6816
> #14 0x00007f84e057f5cd in qemuDomainObjStart (conn=0x7f84c0003e40, driver=driver@entry=0x7f848c108c60, vm=0x7f848c1c3a50, flags=flags@entry=0, asyncJob=QEMU_ASYNC_JOB_START) at qemu/qemu_driver.c:7296
> #15 0x00007f84e057fc19 in qemuDomainCreateWithFlags (dom=0x7f84e4001620, flags=0) at qemu/qemu_driver.c:7349
> #16 0x00007f850c426d57 in virDomainCreate (domain=domain@entry=0x7f84e4001620) at libvirt-domain.c:6534
> #17 0x000055a82b2f1cae in remoteDispatchDomainCreate (server=0x55a82c9e16d0, msg=0x55a82ca53e00, args=<optimized out>, rerr=0x7f84f3fa8960, client=0x55a82ca5d180) at remote/remote_daemon_dispatch_stubs.h:4434
> #18 remoteDispatchDomainCreateHelper (server=0x55a82c9e16d0, client=0x55a82ca5d180, msg=0x55a82ca53e00, rerr=0x7f84f3fa8960, args=<optimized out>, ret=0x7f84e4003800) at remote/remote_daemon_dispatch_stubs.h:4410
> #19 0x00007f850c338734 in virNetServerProgramDispatchCall (msg=0x55a82ca53e00, client=0x55a82ca5d180, server=0x55a82c9e16d0, prog=0x55a82ca4be70) at rpc/virnetserverprogram.c:437
> #20 virNetServerProgramDispatch (prog=0x55a82ca4be70, server=server@entry=0x55a82c9e16d0, client=0x55a82ca5d180, msg=0x55a82ca53e00) at rpc/virnetserverprogram.c:304
> #21 0x00007f850c33ec3c in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x55a82c9e16d0) at rpc/virnetserver.c:144
> #22 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x55a82c9e16d0) at rpc/virnetserver.c:165
> #23 0x00007f850c26db20 in virThreadPoolWorker (opaque=opaque@entry=0x55a82c9ebe30) at util/virthreadpool.c:167
> #24 0x00007f850c26ce2c in virThreadHelper (data=<optimized out>) at util/virthread.c:206
> #25 0x00007f8508872594 in start_thread () from target:/lib64/libpthread.so.0
> #26 0x00007f85085a5e6f in clone () from target:/lib64/libc.so.6
Right, so this is just waiting for virtlockd to lock the paths.
virtlockd is obviously unable to do that (as I suggested in my previous
e-mail - is perhaps some other process holding the lock?).
The other domain is holding the lock (obviously).
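That is consistent with how POSIX advisory locking behaves: as long as one process holds the exclusive lock on the image, any further locker is refused and has to retry or give up. A small self-contained demonstration with plain fcntl() locks (the kind of locking virtlockd builds on); the path below is made up for the demo:

/* Two processes contending for the same advisory file lock: the second
 * attempt fails while the first process still holds the lock.  The path
 * is made up; it stands in for the shared disk image. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

static int try_lock(int fd)
{
    struct flock fl = {
        .l_type = F_WRLCK,          /* exclusive lock ...            */
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 0,                 /* ... over the whole file       */
    };
    return fcntl(fd, F_SETLK, &fl); /* non-blocking attempt          */
}

int main(void)
{
    const char *path = "/tmp/demo-image.lock";
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0 || try_lock(fd) < 0) {
        perror("first lock");
        return 1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 1;
    }
    if (pid == 0) {                 /* "second domain": same path, new process */
        int fd2 = open(path, O_RDWR);
        if (fd2 < 0 || try_lock(fd2) < 0)
            printf("second locker refused: %s\n", strerror(errno));
        _exit(0);
    }

    waitpid(pid, NULL, 0);          /* lock is released when we exit */
    return 0;
}

On Linux the second attempt typically fails with EAGAIN ("Resource temporarily unavailable"), which is essentially what the forked child above keeps retrying against until the 60 seconds expire.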
Can you please enable debug logs for virtlockd, reproduce and share the
log with me? Setting this in /etc/libvirt/virtlockd.conf should be
enough:

log_level=1
log_filters="4:event 3:rpc"
log_outputs="1:file:/var/log/virtlockd.log"

Thanks,
Michal

Sure, but I don’t think this is needed. We’re trying to start two
domains which have disks backed by the same image. So the current
behavior is probably intended (apart from the problem with the later
API calls). In my opinion, the timeout is just way too high.
--
Kind regards / Beste Grüße
Marc Hartmayer
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294