On Thu, Sep 27, 2018 at 09:01 AM +0200, Bjoern Walk <bwalk(a)linux.ibm.com> wrote:
Michal Privoznik <mprivozn(a)redhat.com> [2018-09-19, 11:45AM
+0200]:
> On 09/19/2018 11:17 AM, Bjoern Walk wrote:
> > Bjoern Walk <bwalk(a)linux.ibm.com> [2018-09-12, 01:17PM +0200]:
> >> Michal Privoznik <mprivozn(a)redhat.com> [2018-09-12, 11:32AM +0200]:
>
> >>
> >
> > Still seeing the same timeout. Is this expected behaviour?
> >
>
> Nope. I wonder if something has locked the path and forgot to unlock it
> (however, virtlockd should have unlocked all the paths owned by PID on
> connection close), or something is still holding the lock and connection
> opened.
>
> Can you see the timeout even when you turn off the selinux driver
> (security_driver="none' in qemu.conf)? I tried to reproduce the issue
> yesterday and was unsuccessful. Do you have any steps to reproduce?
So, I haven't been able to actually dig into the debugging but we have
been able to reproduce this behaviour on x86 (both with SELinux and
DAC). Looks like a general problem, if a problem at all, because from
what I can see in the code, the 60 seconds timeout is actually intended,
or not? The security manager does try for 60 seconds to acquire the lock
and only then bails out. Why is this?
Backtrace of libvirtd where it’s hanging (in thread A)
(gdb) bt
#0 read () from /lib64/libpthread.so.0
#1 read (__nbytes=1024, __buf=0x7f84e401afd0, __fd=31) at /usr/include/bits/unistd.h:44
#2 saferead (fd=fd@entry=31, buf=0x7f84e401afd0, count=count@entry=1024) at
util/virfile.c:1061
#3 saferead_lim (fd=31, max_len=max_len@entry=1024, length=length@entry=0x7f84f3fa8240) at
util/virfile.c:1345
#4 virFileReadHeaderFD (fd=<optimized out>, maxlen=maxlen@entry=1024,
buf=buf@entry=0x7f84f3fa8278) at util/virfile.c:1376
#5 virProcessRunInMountNamespace () at util/virprocess.c:1149
#6 virSecuritySELinuxTransactionCommit (mgr=<optimized out>, pid=23977) at
security/security_selinux.c:1106
#7 virSecurityManagerTransactionCommit (mgr=0x7f848c0ffd60, pid=pid@entry=23977) at
security/security_manager.c:324
#8 virSecurityStackTransactionCommit (mgr=<optimized out>, pid=23977) at
security/security_stack.c:166
#9 virSecurityManagerTransactionCommit (mgr=0x7f848c183760, pid=pid@entry=23977) at
security/security_manager.c:324
#10 qemuSecuritySetAllLabel (driver=driver@entry=0x7f848c108c60,
vm=vm@entry=0x7f848c1c3a50, stdin_path=stdin_path@entry=0x0) at _security.c:56
#11 in qemuProcessLaunch (conn=conn@entry=0x7f84c0003e40,
driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50,
asyncJob=a=QEMU_ASYNC_JOB_START, incoming=incoming@entry=0x0, snapshot=snapshot@entry=0x0,
vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=1 qemu/qemu_process.c:6329
#12 in qemuProcessStart (conn=conn@entry=0x7f84c0003e40,
driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50,
updatedCPU==0x0, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START,
migrateFrom=migrateFrom@entry=0x0, migrateFd=-1, migratePath=0x0,
snapshot=0x=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at
qemu/qemu_process.c:6816
…
#25 in start_thread () from /lib64/libpthread.so.0
#26 in clone () from /lib64/libc.so.6
Backtrace of the forked process where the process is trying to get the
meta data lock for 60 seconds.
#0 0x00007f8508572730 in nanosleep () from target:/lib64/libc.so.6
#1 0x00007f850859dff4 in usleep () from target:/lib64/libc.so.6
#2 0x00007f850c26efea in virTimeBackOffWait (var=var@entry=0x7f84f3fa81a0) at
util/virtime.c:453
#3 0x00007f850c3108d8 in virSecurityManagerMetadataLock (mgr=0x7f848c183850,
paths=<optimized out>, npaths=<optimized out>) at
security/security_manager.c:1345
#4 0x00007f850c30e44b in virSecurityDACTransactionRun (pid=pid@entry=23977,
opaque=opaque@entry=0x7f84e4021450) at security/security_dac.c:226
#5 0x00007f850c250021 in virProcessNamespaceHelper (opaque=0x7f84e4021450,
cb=0x7f850c30e330 <virSecurityDACTransactionRun>, pid=23977, errfd=33) at
util/virprocess.c:1100
#6 virProcessRunInMountNamespace () at util/virprocess.c:1140
#7 0x00007f850c30e55c in virSecurityDACTransactionCommit (mgr=<optimized out>,
pid=23977) at security/security_dac.c:565
#8 0x00007f850c30eeca in virSecurityManagerTransactionCommit (mgr=0x7f848c183850,
pid=pid@entry=23977) at security/security_manager.c:324
#9 0x00007f850c30b36b in virSecurityStackTransactionCommit (mgr=<optimized out>,
pid=23977) at security/security_stack.c:166
#10 0x00007f850c30eeca in virSecurityManagerTransactionCommit (mgr=0x7f848c183760,
pid=pid@entry=23977) at security/security_manager.c:324
#11 0x00007f84e0586bf2 in qemuSecuritySetAllLabel (driver=driver@entry=0x7f848c108c60,
vm=vm@entry=0x7f848c1c3a50, stdin_path=stdin_path@entry=0x0) at qemu/qemu_security.c:56
#12 0x00007f84e051b7fd in qemuProcessLaunch (conn=conn@entry=0x7f84c0003e40,
driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50,
asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, incoming=incoming@entry=0x0,
snapshot=snapshot@entry=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at
qemu/qemu_process.c:6329
#13 0x00007f84e051ee2e in qemuProcessStart (conn=conn@entry=0x7f84c0003e40,
driver=driver@entry=0x7f848c108c60, vm=vm@entry=0x7f848c1c3a50,
updatedCPU=updatedCPU@entry=0x0, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START,
migrateFrom=migrateFrom@entry=0x0, migrateFd=-1, migratePath=0x0, snapshot=0x0,
vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17) at qemu/qemu_process.c:6816
#14 0x00007f84e057f5cd in qemuDomainObjStart (conn=0x7f84c0003e40,
driver=driver@entry=0x7f848c108c60, vm=0x7f848c1c3a50, flags=flags@entry=0,
asyncJob=QEMU_ASYNC_JOB_START) at qemu/qemu_driver.c:7296
#15 0x00007f84e057fc19 in qemuDomainCreateWithFlags (dom=0x7f84e4001620, flags=0) at
qemu/qemu_driver.c:7349
#16 0x00007f850c426d57 in virDomainCreate (domain=domain@entry=0x7f84e4001620) at
libvirt-domain.c:6534
#17 0x000055a82b2f1cae in remoteDispatchDomainCreate (server=0x55a82c9e16d0,
msg=0x55a82ca53e00, args=<optimized out>, rerr=0x7f84f3fa8960,
client=0x55a82ca5d180) at remote/remote_daemon_dispatch_stubs.h:4434
#18 remoteDispatchDomainCreateHelper (server=0x55a82c9e16d0, client=0x55a82ca5d180,
msg=0x55a82ca53e00, rerr=0x7f84f3fa8960, args=<optimized out>, ret=0x7f84e4003800)
at remote/remote_daemon_dispatch_stubs.h:4410
#19 0x00007f850c338734 in virNetServerProgramDispatchCall (msg=0x55a82ca53e00,
client=0x55a82ca5d180, server=0x55a82c9e16d0, prog=0x55a82ca4be70) at
rpc/virnetserverprogram.c:437
#20 virNetServerProgramDispatch (prog=0x55a82ca4be70, server=server@entry=0x55a82c9e16d0,
client=0x55a82ca5d180, msg=0x55a82ca53e00) at rpc/virnetserverprogram.c:304
#21 0x00007f850c33ec3c in virNetServerProcessMsg (msg=<optimized out>,
prog=<optimized out>, client=<optimized out>, srv=0x55a82c9e16d0) at
rpc/virnetserver.c:144
#22 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x55a82c9e16d0) at
rpc/virnetserver.c:165
#23 0x00007f850c26db20 in virThreadPoolWorker (opaque=opaque@entry=0x55a82c9ebe30) at
util/virthreadpool.c:167
#24 0x00007f850c26ce2c in virThreadHelper (data=<optimized out>) at
util/virthread.c:206
#25 0x00007f8508872594 in start_thread () from target:/lib64/libpthread.so.0
#26 0x00007f85085a5e6f in clone () from target:/lib64/libc.so.6
However, an actual bug is that while the security manager waits for the
lock acquire the whole libvirtd hangs, but from what I understood and
Marc explained to me, this is more of a pathological error in libvirt
behaviour with various domain locks.
Since thread A owns the lock for the virDomainObjList other operations
like 'virsh list' won’t return for about 60s.
Hope this information will help :)
--
IBM Systems
Linux on Z & Virtualization Development
--------------------------------------------------
IBM Deutschland Research & Development GmbH
Schönaicher Str. 220, 71032 Böblingen
Phone: +49 7031 16 1819
--------------------------------------------------
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294
--
Kind regards / Beste Grüße
Marc Hartmayer
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294