On 09/27/2018 09:57 AM, Marc Hartmayer wrote:
On Thu, Sep 27, 2018 at 09:01 AM +0200, Bjoern Walk
<bwalk(a)linux.ibm.com> wrote:
> Michal Privoznik <mprivozn(a)redhat.com> [2018-09-19, 11:45AM +0200]:
>> On 09/19/2018 11:17 AM, Bjoern Walk wrote:
>>> Bjoern Walk <bwalk(a)linux.ibm.com> [2018-09-12, 01:17PM +0200]:
>>>> Michal Privoznik <mprivozn(a)redhat.com> [2018-09-12, 11:32AM
+0200]:
>>
>>>>
>>>
>>> Still seeing the same timeout. Is this expected behaviour?
>>>
>>
>> Nope. I wonder if something has locked the path and forgot to unlock it
>> (however, virtlockd should have unlocked all the paths owned by PID on
>> connection close), or something is still holding the lock and connection
>> opened.
>>
>> Can you see the timeout even when you turn off the selinux driver
>> (security_driver="none' in qemu.conf)? I tried to reproduce the issue
>> yesterday and was unsuccessful. Do you have any steps to reproduce?
>
> So, I haven't been able to actually dig into the debugging but we have
> been able to reproduce this behaviour on x86 (both with SELinux and
> DAC). Looks like a general problem, if a problem at all, because from
> what I can see in the code, the 60 seconds timeout is actually intended,
> or not? The security manager does try for 60 seconds to acquire the lock
> and only then bails out. Why is this?
Backtrace of libvirtd where it’s hanging (in thread A)
(gdb) bt
#0 read () from /lib64/libpthread.so.0
#1 read (__nbytes=1024, __buf=0x7f84e401afd0, __fd=31) at /usr/include/bits/unistd.h:44
#2 saferead (fd=fd@entry=31, buf=0x7f84e401afd0, count=count@entry=1024) at
util/virfile.c:1061
#3 saferead_lim (fd=31, max_len=max_len@entry=1024, length=length@entry=0x7f84f3fa8240)
at util/virfile.c:1345
#4 virFileReadHeaderFD (fd=<optimized out>, maxlen=maxlen@entry=1024,
buf=buf@entry=0x7f84f3fa8278) at util/virfile.c:1376
#5 virProcessRunInMountNamespace () at util/virprocess.c:1149
#6 virSecuritySELinuxTransactionCommit (mgr=<optimized out>, pid=23977) at
security/security_selinux.c:1106
#7 virSecurityManagerTransactionCommit (mgr=0x7f848c0ffd60, pid=pid@entry=23977) at
security/security_manager.c:324
#8 virSecurityStackTransactionCommit (mgr=<optimized out>, pid=23977) at
security/security_stack.c:166
#9 virSecurityManagerTransactionCommit (mgr=0x7f848c183760, pid=pid@entry=23977) at
security/security_manager.c:324
#10 qemuSecuritySetAllLabel (driver=driver@entry=0x7f848c108c60,
vm=vm@entry=0x7f848c1c3a50, stdin_path=stdin_path@entry=0x0) at _security.c:56
#11 in qemuProcessLaunch (conn=conn@entry=0x7f84c0003e40,
driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50,
asyncJob=a=QEMU_ASYNC_JOB_START, incoming=incoming@entry=0x0, snapshot=snapshot@entry=0x0,
vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=1 qemu/qemu_process.c:6329
#12 in qemuProcessStart (conn=conn@entry=0x7f84c0003e40,
driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50,
updatedCPU==0x0, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START,
migrateFrom=migrateFrom@entry=0x0, migrateFd=-1, migratePath=0x0,
snapshot=0x=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at
qemu/qemu_process.c:6816
This is just waiting for forked child to finish. Nothing suspicious here.
…
#25 in start_thread () from /lib64/libpthread.so.0
#26 in clone () from /lib64/libc.so.6
Backtrace of the forked process where the process is trying to get the
meta data lock for 60 seconds.>
#0 0x00007f8508572730 in nanosleep () from target:/lib64/libc.so.6
#1 0x00007f850859dff4 in usleep () from target:/lib64/libc.so.6
#2 0x00007f850c26efea in virTimeBackOffWait (var=var@entry=0x7f84f3fa81a0) at
util/virtime.c:453
#3 0x00007f850c3108d8 in virSecurityManagerMetadataLock (mgr=0x7f848c183850,
paths=<optimized out>, npaths=<optimized out>) at
security/security_manager.c:1345
#4 0x00007f850c30e44b in virSecurityDACTransactionRun (pid=pid@entry=23977,
opaque=opaque@entry=0x7f84e4021450) at security/security_dac.c:226
#5 0x00007f850c250021 in virProcessNamespaceHelper (opaque=0x7f84e4021450,
cb=0x7f850c30e330 <virSecurityDACTransactionRun>, pid=23977, errfd=33) at
util/virprocess.c:1100
#6 virProcessRunInMountNamespace () at util/virprocess.c:1140
#7 0x00007f850c30e55c in virSecurityDACTransactionCommit (mgr=<optimized out>,
pid=23977) at security/security_dac.c:565
#8 0x00007f850c30eeca in virSecurityManagerTransactionCommit (mgr=0x7f848c183850,
pid=pid@entry=23977) at security/security_manager.c:324
#9 0x00007f850c30b36b in virSecurityStackTransactionCommit (mgr=<optimized out>,
pid=23977) at security/security_stack.c:166
#10 0x00007f850c30eeca in virSecurityManagerTransactionCommit (mgr=0x7f848c183760,
pid=pid@entry=23977) at security/security_manager.c:324
#11 0x00007f84e0586bf2 in qemuSecuritySetAllLabel (driver=driver@entry=0x7f848c108c60,
vm=vm@entry=0x7f848c1c3a50, stdin_path=stdin_path@entry=0x0) at qemu/qemu_security.c:56
#12 0x00007f84e051b7fd in qemuProcessLaunch (conn=conn@entry=0x7f84c0003e40,
driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50,
asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, incoming=incoming@entry=0x0,
snapshot=snapshot@entry=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at
qemu/qemu_process.c:6329
#13 0x00007f84e051ee2e in qemuProcessStart (conn=conn@entry=0x7f84c0003e40,
driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50,
updatedCPU=updatedCPU@entry=0x0, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START,
migrateFrom=migrateFrom@entry=0x0, migrateFd=-1, migratePath=0x0, snapshot=0x0,
vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at qemu/qemu_process.c:6816
#14 0x00007f84e057f5cd in qemuDomainObjStart (conn=0x7f84c0003e40,
driver=driver@entry=0x7f848c108c60, vm=0x7f848c1c3a50, flags=flags@entry=0,
asyncJob=QEMU_ASYNC_JOB_START) at qemu/qemu_driver.c:7296
#15 0x00007f84e057fc19 in qemuDomainCreateWithFlags (dom=0x7f84e4001620, flags=0) at
qemu/qemu_driver.c:7349
#16 0x00007f850c426d57 in virDomainCreate (domain=domain@entry=0x7f84e4001620) at
libvirt-domain.c:6534
#17 0x000055a82b2f1cae in remoteDispatchDomainCreate (server=0x55a82c9e16d0,
msg=0x55a82ca53e00, args=<optimized out>, rerr=0x7f84f3fa8960,
client=0x55a82ca5d180) at remote/remote_daemon_dispatch_stubs.h:4434
#18 remoteDispatchDomainCreateHelper (server=0x55a82c9e16d0, client=0x55a82ca5d180,
msg=0x55a82ca53e00, rerr=0x7f84f3fa8960, args=<optimized out>, ret=0x7f84e4003800)
at remote/remote_daemon_dispatch_stubs.h:4410
#19 0x00007f850c338734 in virNetServerProgramDispatchCall (msg=0x55a82ca53e00,
client=0x55a82ca5d180, server=0x55a82c9e16d0, prog=0x55a82ca4be70) at
rpc/virnetserverprogram.c:437
#20 virNetServerProgramDispatch (prog=0x55a82ca4be70, server=server@entry=0x55a82c9e16d0,
client=0x55a82ca5d180, msg=0x55a82ca53e00) at rpc/virnetserverprogram.c:304
#21 0x00007f850c33ec3c in virNetServerProcessMsg (msg=<optimized out>,
prog=<optimized out>, client=<optimized out>, srv=0x55a82c9e16d0) at
rpc/virnetserver.c:144
#22 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x55a82c9e16d0) at
rpc/virnetserver.c:165
#23 0x00007f850c26db20 in virThreadPoolWorker (opaque=opaque@entry=0x55a82c9ebe30) at
util/virthreadpool.c:167
#24 0x00007f850c26ce2c in virThreadHelper (data=<optimized out>) at
util/virthread.c:206
#25 0x00007f8508872594 in start_thread () from target:/lib64/libpthread.so.0
#26 0x00007f85085a5e6f in clone () from target:/lib64/libc.so.6
Right, so this is just waiting for virtlockd to lock the paths.
virtlockd is obviously unable to do that (as I suggested in my previous
e-mail - is perhaps some other process holding the lock?).
Can you please enable debug logs for virtlockd, reproduce and share the
log with me? Setting this in /etc/libvirt/virtlockd.conf should be enough:
log_level=1
log_filters="4:event 3:rpc"
log_outputs="1:file:/var/log/virtlockd.lod"
Thanks,
Michal