[libvirt] Call to virDomainIsActive hangs forever

Hi ! I'm submitting my messages on this mailing list to request a bit of help on a case that I have where a Python application makes a call to virDomainIsActive, and the call never returns. I have tried to investigate, but as there are no debug symbols for libvirt on Debian Stretch, i can only have the following GDB backtrace: (gdb) bt #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f49026f5b76 in virCondWait () from /usr/lib/libvirt.so.0 #2 0x00007f4902808bab in ?? () from /usr/lib/libvirt.so.0 #3 0x00007f490280a433 in virNetClientSendWithReply () from /usr/lib/libvirt.so.0 #4 0x00007f490280abe2 in virNetClientProgramCall () from /usr/lib/libvirt.so.0 #5 0x00007f49027e0ea4 in ?? () from /usr/lib/libvirt.so.0 #6 0x00007f49027ea1bb in ?? () from /usr/lib/libvirt.so.0 #7 0x00007f49027b0ef3 in virDomainIsActive () from /usr/lib/libvirt.so.0 #8 0x00007f4902b7fbd0 in libvirt_virDomainIsActive () from /usr/lib/python3/dist-packages/libvirtmod.cpython-35m-x86_64-linux-gnu.so #9 0x0000558eeec696df in PyCFunction_Call () at ../Objects/methodobject.c:109 The libvirt driver used is QEMU, and i have a specific monitoring in place using virtual machine introspection: https://github.com/KVM-VMI/kvm-vmi Now this specific monitoring somehow triggers this bug, and at this point, i don't know if it's a corner case in the libvirt QEMU driver or not. That's why i would like to have your lights on this. libvirt version: 3.0.0-4 -> Could you tell me where i should look in the code ? -> Do you have more information about this virCondWait ? Which condition is it waiting for ? -> How can i get the symbols without having the recompile libvirt and install it system wide, erasing the binaries installed by the package ? Best regards, -- Mathieu Tarral

Hi, On Thursday, 22 March 2018 17:10:49 CET Mathieu Tarral wrote:
I have tried to investigate, but as there are no debug symbols for libvirt on Debian Stretch,
Yes there are: Stretch started the migration to automatically generated packages with debug symbols -- see: https://wiki.debian.org/AutomaticDebugPackages After adding the right repository, for a $pkg binary is a $pkg-dbgsym package, if the automatic generation was enabled for the source of $pkg (which is the case for libvirt). -- Pino Toscano

On Thu, Mar 22, 2018 at 06:10:49PM +0200, Mathieu Tarral wrote:
Hi !
I'm submitting my messages on this mailing list to request a bit of help on a case that I have where a Python application makes a call to virDomainIsActive, and the call never returns.
I have tried to investigate, but as there are no debug symbols for libvirt on Debian Stretch, i can only have the following GDB backtrace:
(gdb) bt #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f49026f5b76 in virCondWait () from /usr/lib/libvirt.so.0 #2 0x00007f4902808bab in ?? () from /usr/lib/libvirt.so.0 #3 0x00007f490280a433 in virNetClientSendWithReply () from /usr/lib/libvirt.so.0 #4 0x00007f490280abe2 in virNetClientProgramCall () from /usr/lib/libvirt.so.0 #5 0x00007f49027e0ea4 in ?? () from /usr/lib/libvirt.so.0 #6 0x00007f49027ea1bb in ?? () from /usr/lib/libvirt.so.0 #7 0x00007f49027b0ef3 in virDomainIsActive () from /usr/lib/libvirt.so.0 #8 0x00007f4902b7fbd0 in libvirt_virDomainIsActive () from /usr/lib/python3/dist-packages/libvirtmod.cpython-35m-x86_64-linux-gnu.so #9 0x0000558eeec696df in PyCFunction_Call () at ../Objects/methodobject.c:109
The libvirt driver used is QEMU, and i have a specific monitoring in place using virtual machine introspection: https://github.com/KVM-VMI/kvm-vmi
Now this specific monitoring somehow triggers this bug, and at this point, i don't know if it's a corner case in the libvirt QEMU driver or not. That's why i would like to have your lights on this.
libvirt version: 3.0.0-4
-> Could you tell me where i should look in the code ?
You're probably looking at virLogManagerDomain* methods located in src/logging/log_manager.c and the wait call is issued from virNetClientIO.
-> Do you have more information about this virCondWait ? Which condition is it waiting for ? -> How can i get the symbols without having the recompile libvirt and install it system wide, erasing the binaries installed by the package ?
To be honest, I think it's always worth debugging a custom built binary from sources, since the debug symbols shipped via distro package are most likely generated with optimizations which makes any kind of interactive debugging painful. The problem is that you're going to built v3.0.0 tag on a new distro, since new GCCs will complain about a lot of stuff, I looked at the code, tried a few things, but honestly I didn't see a path where you could get to virNetClientProgrammCall from virDomainIsActive (since I don't know what the original call was), so unless you post a full backtrace, we can't help you much here. One more thing, yes, you'd have to compile libvirt from git, but you don't have to install it, in fact, I'm always running upstream daemon, but haven't installed it system-wide, on the contrary, I've got a distro-provided version installed in the system which is disabled most of the time. Regards, Erik

On Fri, Mar 23, 2018 at 01:24:46PM +0100, Erik Skultety wrote:
On Thu, Mar 22, 2018 at 06:10:49PM +0200, Mathieu Tarral wrote:
Hi !
I'm submitting my messages on this mailing list to request a bit of help on a case that I have where a Python application makes a call to virDomainIsActive, and the call never returns.
I have tried to investigate, but as there are no debug symbols for libvirt on Debian Stretch, i can only have the following GDB backtrace:
(gdb) bt #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f49026f5b76 in virCondWait () from /usr/lib/libvirt.so.0 #2 0x00007f4902808bab in ?? () from /usr/lib/libvirt.so.0 #3 0x00007f490280a433 in virNetClientSendWithReply () from /usr/lib/libvirt.so.0 #4 0x00007f490280abe2 in virNetClientProgramCall () from /usr/lib/libvirt.so.0 #5 0x00007f49027e0ea4 in ?? () from /usr/lib/libvirt.so.0 #6 0x00007f49027ea1bb in ?? () from /usr/lib/libvirt.so.0 #7 0x00007f49027b0ef3 in virDomainIsActive () from /usr/lib/libvirt.so.0 #8 0x00007f4902b7fbd0 in libvirt_virDomainIsActive () from /usr/lib/python3/dist-packages/libvirtmod.cpython-35m-x86_64-linux-gnu.so #9 0x0000558eeec696df in PyCFunction_Call () at ../Objects/methodobject.c:109
The libvirt driver used is QEMU, and i have a specific monitoring in place using virtual machine introspection: https://github.com/KVM-VMI/kvm-vmi
Now this specific monitoring somehow triggers this bug, and at this point, i don't know if it's a corner case in the libvirt QEMU driver or not. That's why i would like to have your lights on this.
libvirt version: 3.0.0-4
-> Could you tell me where i should look in the code ?
You're probably looking at virLogManagerDomain* methods located in src/logging/log_manager.c and the wait call is issued from virNetClientIO.
-> Do you have more information about this virCondWait ? Which condition is it waiting for ? -> How can i get the symbols without having the recompile libvirt and install it system wide, erasing the binaries installed by the package ?
To be honest, I think it's always worth debugging a custom built binary from sources, since the debug symbols shipped via distro package are most likely generated with optimizations which makes any kind of interactive debugging painful. The problem is that you're going to built v3.0.0 tag on a new distro, since new GCCs will complain about a lot of stuff, I looked at the code, tried a few things, but honestly I didn't see a path where you could get to virNetClientProgrammCall from virDomainIsActive (since I don't know what the original call was), so unless you post a full backtrace, we can't help you much here.
That's easy - virDomainIsActive() calls into the driver APIs. This is client side trace, so will get into the remote driver client, which will then call virNetClientProgramCall. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

I have installed the following debug symbols: - libvirt-daemon-dbgsym_3.0.0-4+deb9u3_amd64.deb - libvirt-daemon-system-dbgsym_3.0.0-4+deb9u3_amd64.deb - libvirt-clients-dbgsym_3.0.0-4+deb9u3_amd64.deb - libvirt0-dbgsym_3.0.0-4+deb9u3_amd64.deb - python3-libvirt-dbgsym_3.0.0-2_amd64.deb And i was able to reproduce my bug. In my Python application, i had 2 threads hanging on a libvirt call. (sorry for the verbosity) Thread 1: (gdb) py-bt Traceback (most recent call first): <built-in method virDomainGetState of module object at remote 0x7f508e6b1278> File "/usr/lib/python3/dist-packages/libvirt.py", line 2551, in state ret = libvirtmod.virDomainGetState(self._o, flags) (gdb) bt pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f508dfe2b86 in virCondWait (c=c@entry=0x55a1fe420728, m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154 #2 0x00007f508e0f5bbb in virNetClientIO (thiscall=0x55a1fe420710, client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894 #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x55a1fdd798f0, expectReply=expectReply@entry=true, nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 #4 0x00007f508e0f7443 in virNetClientSendWithReply (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x55a1fdd798f0) at ../../../src/rpc/virnetclient.c:2144 #5 0x00007f508e0f7bf2 in virNetClientProgramCall (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, serial=serial@entry=108, proc=proc@entry=212, noutfds=noutfds@entry=0, outfds=outfds@entry=0x0, ninfds=0x0, infds=0x0, args_filter=0x7f508e0f13e0 <xdr_remote_domain_get_state_args>, args=0x7ffd1518cfd0, ret_filter=0x7f508e0f1410 <xdr_remote_domain_get_state_ret>, ret=0x7ffd1518cfc8) at ../../../src/rpc/virnetclientprogram.c:329 #6 0x00007f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=212, args_filter=0x7f508e0f13e0 <xdr_remote_domain_get_state_args>, args=0x7ffd1518cfd0 "`k.\376\241U", ret_filter=0x7f508e0f1410 <xdr_remote_domain_get_state_ret>, ret=0x7ffd1518cfc8 "", conn=<optimized out>) at ../../../src/remote/remote_driver.c:6636 #7 0x00007f508e0d7f90 in call (conn=<optimized out>, ret=0x7ffd1518cfc8 "", ret_filter=<optimized out>, args=0x7ffd1518cfd0 "`k.\376\241U", args_filter=<optimized out>, proc_nr=212, flags=0, priv=0x55a1fe5ac460) at ../../../src/remote/remote_driver.c:6658 #8 remoteDomainGetState (domain=0x55a1fe212da0, state=0x7ffd1518d0a4, reason=0x7ffd1518d0a8, flags=0) at ../../../src/remote/remote_driver.c:2458 #9 0x00007f508e08e248 in virDomainGetState (domain=domain@entry=0x55a1fe212da0, state=state@entry=0x7ffd1518d0a4, reason=reason@entry=0x7ffd1518d0a8, flags=0) at ../../../src/libvirt-domain.c:2495 #10 0x00007f508e466f28 in libvirt_virDomainGetState (self=<optimized out>, args=<optimized out>) at libvirt-override.c:2539 #11 0x000055a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109 And Thread 2: (gdb) py-bt <built-in method virDomainIsActive of module object at remote 0x7f508e6b1278> File "/usr/lib/python3/dist-packages/libvirt.py", line 1340, in isActive ret = libvirtmod.virDomainIsActive(self._o) (gdb) bt #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f508dfe2b86 in virCondWait (c=c@entry=0x7f4ff051fdc8, m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154 #2 0x00007f508e0f5bbb in virNetClientIO (thiscall=0x7f4ff051fdb0, client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894 #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4ff051fa10, expectReply=expectReply@entry=true, nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 #4 0x00007f508e0f7443 in virNetClientSendWithReply (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4ff051fa10) at ../../../src/rpc/virnetclient.c:2144 #5 0x00007f508e0f7bf2 in virNetClientProgramCall (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, serial=serial@entry=107, proc=proc@entry=150, noutfds=noutfds@entry=0, outfds=outfds@entry=0x0, ninfds=0x0, infds=0x0, args_filter=0x7f508e0efe70 <xdr_remote_domain_is_active_args>, args=0x7f4fff7fd040, ret_filter=0x7f508e0efe90 <xdr_remote_domain_is_active_ret>, ret=0x7f4fff7fd03c) at ../../../src/rpc/virnetclientprogram.c:329 #6 0x00007f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=150, args_filter=0x7f508e0efe70 <xdr_remote_domain_is_active_args>, args=0x7f4fff7fd040 "`k.\376\241U", ret_filter=0x7f508e0efe90 <xdr_remote_domain_is_active_ret>, ret=0x7f4fff7fd03c "", conn=<optimized out>) at ../../../src/remote/remote_driver.c:6636 #7 0x00007f508e0d71cb in call (conn=<optimized out>, ret=0x7f4fff7fd03c "", ret_filter=<optimized out>, args=0x7f4fff7fd040 "`k.\376\241U", args_filter=<optimized out>, proc_nr=150, flags=0, priv=0x55a1fe5ac460) at ../../../src/remote/remote_driver.c:6658 #8 remoteDomainIsActive (dom=0x55a1fe212da0) at ../../../src/remote/remote_client_bodies.h:2842 #9 0x00007f508e09df03 in virDomainIsActive (dom=dom@entry=0x55a1fe212da0) at ../../../src/libvirt-domain.c:8467 #10 0x00007f508e46cbd0 in libvirt_virDomainIsActive (self=<optimized out>, args=<optimized out>) at build/libvirt.c:1288 #11 0x000055a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109 So i ran GDB on /usr/sbin/libvirtd to know what was happening: (gdb) thread all apply bt Thread 17 (Thread 0x7f241dd40700 (LWP 5100)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x7f24183eeba8, m=m@entry=0x7f24183eeb80) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071a350) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f241dd40700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 16 (Thread 0x7f241fdfd700 (LWP 3518)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x55705073e728, m=m@entry=0x55705073e700) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705073e4b0) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f241fdfd700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 15 (Thread 0x7f24205fe700 (LWP 3517)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x55705073e728, m=m@entry=0x55705073e700) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071e310) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f24205fe700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 14 (Thread 0x7f2420dff700 (LWP 3516)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x55705073e728, m=m@entry=0x55705073e700) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705073e4b0) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f2420dff700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 13 (Thread 0x7f2421600700 (LWP 3515)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x55705073e728, m=m@entry=0x55705073e700) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071e250) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f2421600700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 12 (Thread 0x7f2421e01700 (LWP 3514)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x55705073e728, m=m@entry=0x55705073e700) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705073e4b0) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f2421e01700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 11 (Thread 0x7f2439e14700 (LWP 3513)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x557050729358, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f2448241694 in virThreadPoolWorker (opaque=opaque@entry=0x55705071b8d0) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f2439e14700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 10 (Thread 0x7f243a615700 (LWP 3512)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x557050729358, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f2448241694 in virThreadPoolWorker (opaque=opaque@entry=0x55705071bb10) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243a615700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 9 (Thread 0x7f243ae16700 (LWP 3511)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x557050729358, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f2448241694 in virThreadPoolWorker (opaque=opaque@entry=0x55705071b810) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243ae16700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 8 (Thread 0x7f243b617700 (LWP 3510)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x557050729358, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f2448241694 in virThreadPoolWorker (opaque=opaque@entry=0x55705071b8d0) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243b617700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 7 (Thread 0x7f243be18700 (LWP 3509)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x557050729358, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f2448241694 in virThreadPoolWorker (opaque=opaque@entry=0x55705071bb10) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243be18700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 6 (Thread 0x7f243c619700 (LWP 3508)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x5570507292b8, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071b810) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243c619700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 5 (Thread 0x7f243ce1a700 (LWP 3507)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x5570507292b8, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071b8d0) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243ce1a700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 4 (Thread 0x7f243d61b700 (LWP 3506)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x5570507292b8, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071b990) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243d61b700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 3 (Thread 0x7f243de1c700 (LWP 3505)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x5570507292b8, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071ba50) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243de1c700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 2 (Thread 0x7f243e61d700 (LWP 3504)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x5570507292b8, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071bb10) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243e61d700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97 Thread 1 (Thread 0x7f24489f3940 (LWP 3502)): #0 0x00007f244485367d in poll () at ../sysdeps/unix/syscall-template.S:84 #1 0x00007f24481f304f in poll (__timeout=-1, __nfds=50, __fds=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/poll2.h:46 #2 virEventPollRunOnce () at ../../../src/util/vireventpoll.c:641 #3 0x00007f24481f1c21 in virEventRunDefaultImpl () at ../../../src/util/virevent.c:314 #4 0x00007f244835837d in virNetDaemonRun (dmn=0x55705072a330) at ../../../src/rpc/virnetdaemon.c:818 #5 0x000055704f773b87 in main (argc=<optimized out>, argv=<optimized out>) at ../../../daemon/libvirtd.c:1547 But this doesn't really help me understand why it hangs, or which threads is processing the client requests that are hanging. Does this information provides you any help ? If i should run other commands, please, let me know, libvirt will be hanging forever. Best regards, 2018-03-26 13:16 GMT+03:00 Daniel P. Berrangé <berrange@redhat.com>:
On Fri, Mar 23, 2018 at 01:24:46PM +0100, Erik Skultety wrote:
On Thu, Mar 22, 2018 at 06:10:49PM +0200, Mathieu Tarral wrote:
Hi !
I'm submitting my messages on this mailing list to request a bit of help on a case that I have where a Python application makes a call to virDomainIsActive, and the call never returns.
I have tried to investigate, but as there are no debug symbols for libvirt on Debian Stretch, i can only have the following GDB backtrace:
(gdb) bt #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f49026f5b76 in virCondWait () from /usr/lib/libvirt.so.0 #2 0x00007f4902808bab in ?? () from /usr/lib/libvirt.so.0 #3 0x00007f490280a433 in virNetClientSendWithReply () from /usr/lib/libvirt.so.0 #4 0x00007f490280abe2 in virNetClientProgramCall () from /usr/lib/libvirt.so.0 #5 0x00007f49027e0ea4 in ?? () from /usr/lib/libvirt.so.0 #6 0x00007f49027ea1bb in ?? () from /usr/lib/libvirt.so.0 #7 0x00007f49027b0ef3 in virDomainIsActive () from /usr/lib/libvirt.so.0 #8 0x00007f4902b7fbd0 in libvirt_virDomainIsActive () from /usr/lib/python3/dist-packages/libvirtmod.cpython-35m-x86_64-linux-gnu.so #9 0x0000558eeec696df in PyCFunction_Call () at ../Objects/methodobject.c:109
The libvirt driver used is QEMU, and i have a specific monitoring in place using virtual machine introspection: https://github.com/KVM-VMI/kvm-vmi
Now this specific monitoring somehow triggers this bug, and at this point, i don't know if it's a corner case in the libvirt QEMU driver or not. That's why i would like to have your lights on this.
libvirt version: 3.0.0-4
-> Could you tell me where i should look in the code ?
You're probably looking at virLogManagerDomain* methods located in src/logging/log_manager.c and the wait call is issued from virNetClientIO.
-> Do you have more information about this virCondWait ? Which condition is it waiting for ? -> How can i get the symbols without having the recompile libvirt and install it system wide, erasing the binaries installed by the package ?
To be honest, I think it's always worth debugging a custom built binary from sources, since the debug symbols shipped via distro package are most likely generated with optimizations which makes any kind of interactive debugging painful. The problem is that you're going to built v3.0.0 tag on a new distro, since new GCCs will complain about a lot of stuff, I looked at the code, tried a few things, but honestly I didn't see a path where you could get to virNetClientProgrammCall from virDomainIsActive (since I don't know what the original call was), so unless you post a full backtrace, we can't help you much here.
That's easy - virDomainIsActive() calls into the driver APIs. This is client side trace, so will get into the remote driver client, which will then call virNetClientProgramCall.
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
-- Mathieu Tarral

On Tue, Mar 27, 2018 at 03:30:14PM +0300, Mathieu Tarral wrote:
I have installed the following debug symbols:
- libvirt-daemon-dbgsym_3.0.0-4+deb9u3_amd64.deb - libvirt-daemon-system-dbgsym_3.0.0-4+deb9u3_amd64.deb - libvirt-clients-dbgsym_3.0.0-4+deb9u3_amd64.deb - libvirt0-dbgsym_3.0.0-4+deb9u3_amd64.deb - python3-libvirt-dbgsym_3.0.0-2_amd64.deb
And i was able to reproduce my bug.
In my Python application, i had 2 threads hanging on a libvirt call. (sorry for the verbosity) Thread 1:
(gdb) py-bt Traceback (most recent call first): <built-in method virDomainGetState of module object at remote 0x7f508e6b1278> File "/usr/lib/python3/dist-packages/libvirt.py", line 2551, in state ret = libvirtmod.virDomainGetState(self._o, flags)
(gdb) bt pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f508dfe2b86 in virCondWait (c=c@entry=0x55a1fe420728, m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154 #2 0x00007f508e0f5bbb in virNetClientIO (thiscall=0x55a1fe420710, client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894 #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x55a1fdd798f0, expectReply=expectReply@entry=true, nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 #4 0x00007f508e0f7443 in virNetClientSendWithReply (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x55a1fdd798f0) at ../../../src/rpc/virnetclient.c:2144 #5 0x00007f508e0f7bf2 in virNetClientProgramCall (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, serial=serial@entry=108, proc=proc@entry=212, noutfds=noutfds@entry=0, outfds=outfds@entry=0x0, ninfds=0x0, infds=0x0, args_filter=0x7f508e0f13e0 <xdr_remote_domain_get_state_args>, args=0x7ffd1518cfd0, ret_filter=0x7f508e0f1410 <xdr_remote_domain_get_state_ret>, ret=0x7ffd1518cfc8) at ../../../src/rpc/virnetclientprogram.c:329 #6 0x00007f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=212, args_filter=0x7f508e0f13e0 <xdr_remote_domain_get_state_args>, args=0x7ffd1518cfd0 "`k.\376\241U", ret_filter=0x7f508e0f1410 <xdr_remote_domain_get_state_ret>, ret=0x7ffd1518cfc8 "", conn=<optimized out>) at ../../../src/remote/remote_driver.c:6636 #7 0x00007f508e0d7f90 in call (conn=<optimized out>, ret=0x7ffd1518cfc8 "", ret_filter=<optimized out>, args=0x7ffd1518cfd0 "`k.\376\241U", args_filter=<optimized out>, proc_nr=212, flags=0, priv=0x55a1fe5ac460) at ../../../src/remote/remote_driver.c:6658 #8 remoteDomainGetState (domain=0x55a1fe212da0, state=0x7ffd1518d0a4, reason=0x7ffd1518d0a8, flags=0) at ../../../src/remote/remote_driver.c:2458 #9 0x00007f508e08e248 in virDomainGetState (domain=domain@entry=0x55a1fe212da0, state=state@entry=0x7ffd1518d0a4, reason=reason@entry=0x7ffd1518d0a8, flags=0) at ../../../src/libvirt-domain.c:2495 #10 0x00007f508e466f28 in libvirt_virDomainGetState (self=<optimized out>, args=<optimized out>) at libvirt-override.c:2539 #11 0x000055a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109
That's pretty much as expected - just remote client waiting for a reply from the daemon.
And Thread 2:
(gdb) py-bt <built-in method virDomainIsActive of module object at remote 0x7f508e6b1278> File "/usr/lib/python3/dist-packages/libvirt.py", line 1340, in isActive ret = libvirtmod.virDomainIsActive(self._o)
(gdb) bt
Are you sure this isa different thread ? It looks identical to the first stack trace you give above.
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f508dfe2b86 in virCondWait (c=c@entry=0x7f4ff051fdc8, m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154 #2 0x00007f508e0f5bbb in virNetClientIO (thiscall=0x7f4ff051fdb0, client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894 #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4ff051fa10, expectReply=expectReply@entry=true, nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 #4 0x00007f508e0f7443 in virNetClientSendWithReply (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4ff051fa10) at ../../../src/rpc/virnetclient.c:2144 #5 0x00007f508e0f7bf2 in virNetClientProgramCall (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, serial=serial@entry=107, proc=proc@entry=150, noutfds=noutfds@entry=0, outfds=outfds@entry=0x0, ninfds=0x0, infds=0x0, args_filter=0x7f508e0efe70 <xdr_remote_domain_is_active_args>, args=0x7f4fff7fd040, ret_filter=0x7f508e0efe90 <xdr_remote_domain_is_active_ret>, ret=0x7f4fff7fd03c) at ../../../src/rpc/virnetclientprogram.c:329 #6 0x00007f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=150, args_filter=0x7f508e0efe70 <xdr_remote_domain_is_active_args>, args=0x7f4fff7fd040 "`k.\376\241U", ret_filter=0x7f508e0efe90 <xdr_remote_domain_is_active_ret>, ret=0x7f4fff7fd03c "", conn=<optimized out>) at ../../../src/remote/remote_driver.c:6636 #7 0x00007f508e0d71cb in call (conn=<optimized out>, ret=0x7f4fff7fd03c "", ret_filter=<optimized out>, args=0x7f4fff7fd040 "`k.\376\241U", args_filter=<optimized out>, proc_nr=150, flags=0, priv=0x55a1fe5ac460) at ../../../src/remote/remote_driver.c:6658 #8 remoteDomainIsActive (dom=0x55a1fe212da0) at ../../../src/remote/remote_client_bodies.h:2842 #9 0x00007f508e09df03 in virDomainIsActive (dom=dom@entry=0x55a1fe212da0) at ../../../src/libvirt-domain.c:8467 #10 0x00007f508e46cbd0 in libvirt_virDomainIsActive (self=<optimized out>, args=<optimized out>) at build/libvirt.c:1288 #11 0x000055a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109
Interesting. This is an identical stack trace - so we have 2 python threads both calling virDomainIsActive(). Nothing wrong with that per-se - we support multithreaded usage like this. Curious why it would break. Can you confirm there are no other threads running libvirt code in your python app ? Did you have any thread running the libvirt event loop perhaps ?
So i ran GDB on /usr/sbin/libvirtd to know what was happening:
(gdb) thread all apply bt Thread 17 (Thread 0x7f241dd40700 (LWP 5100)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x7f24183eeba8, m=m@entry=0x7f24183eeb80) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071a350) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f241dd40700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 16 (Thread 0x7f241fdfd700 (LWP 3518)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x55705073e728, m=m@entry=0x55705073e700) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705073e4b0) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f241fdfd700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 15 (Thread 0x7f24205fe700 (LWP 3517)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x55705073e728, m=m@entry=0x55705073e700) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071e310) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f24205fe700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 14 (Thread 0x7f2420dff700 (LWP 3516)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x55705073e728, m=m@entry=0x55705073e700) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705073e4b0) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f2420dff700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 13 (Thread 0x7f2421600700 (LWP 3515)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x55705073e728, m=m@entry=0x55705073e700) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071e250) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f2421600700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 12 (Thread 0x7f2421e01700 (LWP 3514)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x55705073e728, m=m@entry=0x55705073e700) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705073e4b0) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f2421e01700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 11 (Thread 0x7f2439e14700 (LWP 3513)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x557050729358, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f2448241694 in virThreadPoolWorker (opaque=opaque@entry=0x55705071b8d0) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f2439e14700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 10 (Thread 0x7f243a615700 (LWP 3512)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x557050729358, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f2448241694 in virThreadPoolWorker (opaque=opaque@entry=0x55705071bb10) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243a615700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 9 (Thread 0x7f243ae16700 (LWP 3511)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x557050729358, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f2448241694 in virThreadPoolWorker (opaque=opaque@entry=0x55705071b810) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243ae16700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 8 (Thread 0x7f243b617700 (LWP 3510)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x557050729358, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f2448241694 in virThreadPoolWorker (opaque=opaque@entry=0x55705071b8d0) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243b617700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 7 (Thread 0x7f243be18700 (LWP 3509)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x557050729358, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f2448241694 in virThreadPoolWorker (opaque=opaque@entry=0x55705071bb10) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243be18700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 6 (Thread 0x7f243c619700 (LWP 3508)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x5570507292b8, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071b810) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243c619700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 5 (Thread 0x7f243ce1a700 (LWP 3507)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x5570507292b8, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071b8d0) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243ce1a700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 4 (Thread 0x7f243d61b700 (LWP 3506)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x5570507292b8, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071b990) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243d61b700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 3 (Thread 0x7f243de1c700 (LWP 3505)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x5570507292b8, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071ba50) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243de1c700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 2 (Thread 0x7f243e61d700 (LWP 3504)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f2448240b86 in virCondWait (c=c@entry=0x5570507292b8, m=m@entry=0x557050729290) at ../../../src/util/virthread.c:154 #2 0x00007f24482416d3 in virThreadPoolWorker (opaque=opaque@entry=0x55705071bb10) at ../../../src/util/virthreadpool.c:124 #3 0x00007f2448240928 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206 #4 0x00007f2444b1a494 in start_thread (arg=0x7f243e61d700) at pthread_create.c:333 #5 0x00007f244485cacf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Thread 1 (Thread 0x7f24489f3940 (LWP 3502)): #0 0x00007f244485367d in poll () at ../sysdeps/unix/syscall-template.S:84 #1 0x00007f24481f304f in poll (__timeout=-1, __nfds=50, __fds=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/poll2.h:46 #2 virEventPollRunOnce () at ../../../src/util/vireventpoll.c:641 #3 0x00007f24481f1c21 in virEventRunDefaultImpl () at ../../../src/util/virevent.c:314 #4 0x00007f244835837d in virNetDaemonRun (dmn=0x55705072a330) at ../../../src/rpc/virnetdaemon.c:818 #5 0x000055704f773b87 in main (argc=<optimized out>, argv=<optimized out>) at ../../../daemon/libvirtd.c:1547
But this doesn't really help me understand why it hangs, or which threads is processing the client requests that are hanging.
This shows that libvirtd is entirely idle. Thread 1 is the event loop and waiting for I/O. All the other threads are API workers waiting for any API call to process. So all completely normal. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Are you sure this isa different thread ? It looks identical to the first stack trace you give above.
Yes, the first one is calling libvirtmod.virDomainGetState and the second one libvirtmod.virDomainIsActive.
Interesting. This is an identical stack trace - so we have 2 python threads both calling virDomainIsActive(). Nothing wrong with that per-se - we support multithreaded usage like this.
virDomainGetState() and virDomainIsActive()
Can you confirm there are no other threads running libvirt code in your python app ? Did you have any thread running the libvirt event loop perhaps ?
Actually i found 2 others threads in Python app calling libvirt. So, as a recap: Thread 1 - calling virDomainGetState (gdb) py-bt Traceback (most recent call first): <built-in method virDomainGetState of module object at remote 0x7f508e6b1278> File "/usr/lib/python3/dist-packages/libvirt.py", line 2551, in state ret = libvirtmod.virDomainGetState(self._o, flags) (gdb) bt pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f508dfe2b86 in virCondWait (c=c@entry=0x55a1fe420728, m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154 #2 0x00007f508e0f5bbb in virNetClientIO (thiscall=0x55a1fe420710, client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894 #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x55a1fdd798f0, expectReply=expectReply@entry=true, nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 #4 0x00007f508e0f7443 in virNetClientSendWithReply (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x55a1fdd798f0) at ../../../src/rpc/virnetclient.c:2144 #5 0x00007f508e0f7bf2 in virNetClientProgramCall (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, serial=serial@entry=108, proc=proc@entry=212, noutfds=noutfds@entry=0, outfds=outfds@entry=0x0, ninfds=0x0, infds=0x0, args_filter=0x7f508e0f13e0 <xdr_remote_domain_get_state_args>, args=0x7ffd1518cfd0, ret_filter=0x7f508e0f1410 <xdr_remote_domain_get_state_ret>, ret=0x7ffd1518cfc8) at ../../../src/rpc/virnetclientprogram.c:329 #6 0x00007f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=212, args_filter=0x7f508e0f13e0 <xdr_remote_domain_get_state_args>, args=0x7ffd1518cfd0 "`k.\376\241U", ret_filter=0x7f508e0f1410 <xdr_remote_domain_get_state_ret>, ret=0x7ffd1518cfc8 "", conn=<optimized out>) at ../../../src/remote/remote_driver.c:6636 #7 0x00007f508e0d7f90 in call (conn=<optimized out>, ret=0x7ffd1518cfc8 "", ret_filter=<optimized out>, args=0x7ffd1518cfd0 "`k.\376\241U", args_filter=<optimized out>, proc_nr=212, flags=0, priv=0x55a1fe5ac460) at ../../../src/remote/remote_driver.c:6658 #8 remoteDomainGetState (domain=0x55a1fe212da0, state=0x7ffd1518d0a4, reason=0x7ffd1518d0a8, flags=0) at ../../../src/remote/remote_driver.c:2458 #9 0x00007f508e08e248 in virDomainGetState (domain=domain@entry=0x55a1fe212da0, state=state@entry=0x7ffd1518d0a4, reason=reason@entry=0x7ffd1518d0a8, flags=0) at ../../../src/libvirt-domain.c:2495 #10 0x00007f508e466f28 in libvirt_virDomainGetState (self=<optimized out>, args=<optimized out>) at libvirt-override.c:2539 #11 0x000055a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109 Thread 32 - calling virDomainIsActive (gdb) py-bt Traceback (most recent call first): <built-in method virDomainIsActive of module object at remote 0x7f508e6b1278> File "/usr/lib/python3/dist-packages/libvirt.py", line 1340, in isActive ret = libvirtmod.virDomainIsActive(self._o) (gdb) bt #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f508dfe2b86 in virCondWait (c=c@entry=0x7f4ff051fdc8, m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154 #2 0x00007f508e0f5bbb in virNetClientIO (thiscall=0x7f4ff051fdb0, client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894 #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4ff051fa10, expectReply=expectReply@entry=true, nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 #4 0x00007f508e0f7443 in virNetClientSendWithReply (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4ff051fa10) at ../../../src/rpc/virnetclient.c:2144 #5 0x00007f508e0f7bf2 in virNetClientProgramCall (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, serial=serial@entry=107, proc=proc@entry=150, noutfds=noutfds@entry=0, outfds=outfds@entry=0x0, ninfds=0x0, infds=0x0, args_filter=0x7f508e0efe70 <xdr_remote_domain_is_active_args>, args=0x7f4fff7fd040, ret_filter=0x7f508e0efe90 <xdr_remote_domain_is_active_ret>, ret=0x7f4fff7fd03c) at ../../../src/rpc/virnetclientprogram.c:329 #6 0x00007f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=150, args_filter=0x7f508e0efe70 <xdr_remote_domain_is_active_args>, args=0x7f4fff7fd040 "`k.\376\241U", ret_filter=0x7f508e0efe90 <xdr_remote_domain_is_active_ret>, ret=0x7f4fff7fd03c "", conn=<optimized out>) at ../../../src/remote/remote_driver.c:6636 #7 0x00007f508e0d71cb in call (conn=<optimized out>, ret=0x7f4fff7fd03c "", ret_filter=<optimized out>, args=0x7f4fff7fd040 "`k.\376\241U", args_filter=<optimized out>, proc_nr=150, flags=0, priv=0x55a1fe5ac460) at ../../../src/remote/remote_driver.c:6658 #8 remoteDomainIsActive (dom=0x55a1fe212da0) at ../../../src/remote/remote_client_bodies.h:2842 #9 0x00007f508e09df03 in virDomainIsActive (dom=dom@entry=0x55a1fe212da0) at ../../../src/libvirt-domain.c:8467 #10 0x00007f508e46cbd0 in libvirt_virDomainIsActive (self=<optimized out>, args=<optimized out>) at build/libvirt.c:1288 #11 0x000055a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109 Thread 35 - calling virDomainGetXMLDesc (gdb) py-bt <built-in method virDomainGetXMLDesc of module object at remote 0x7f508e6b1278> File "/usr/lib/python3/dist-packages/libvirt.py", line 491, in XMLDesc ret = libvirtmod.virDomainGetXMLDesc(self._o, flags) (gdb) bt #0 pthread_sigmask (how=how@entry=0, newmask=<optimized out>, newmask@entry=0x7f4ffd7f8d10, oldmask=oldmask@entry=0x7f4ffd7f8c90) at ../sysdeps/unix/sysv/linux/pthread_sigmask.c:50 #1 0x00007f508e0f52fa in virNetClientIOEventLoop (client=client@entry=0x55a1fde4d2b0, thiscall=thiscall@entry=0x7f4fe005a350) at ../../../src/rpc/virnetclient.c:1659 #2 0x00007f508e0f5a16 in virNetClientIO (thiscall=0x7f4fe005a350, client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1944 #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4fe0031f50, expectReply=expectReply@entry=true, nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 #4 0x00007f508e0f7443 in virNetClientSendWithReply (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4fe0031f50) at ../../../src/rpc/virnetclient.c:2144 #5 0x00007f508e0f7bf2 in virNetClientProgramCall (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, serial=serial@entry=105, proc=proc@entry=14, noutfds=noutfds@entry=0, outfds=outfds@entry=0x0, ninfds=0x0, infds=0x0, args_filter=0x7f508e0ecba0 <xdr_remote_domain_get_xml_desc_args>, args=0x7f4ffd7f8fe0, ret_filter=0x7f508e0ecbd0 <xdr_remote_domain_get_xml_desc_ret>, ret=0x7f4ffd7f8fd8) at ../../../src/rpc/virnetclientprogram.c:329 #6 0x00007f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=14, args_filter=0x7f508e0ecba0 <xdr_remote_domain_get_xml_desc_args>, args=0x7f4ffd7f8fe0 "`k.\376\241U", ret_filter=0x7f508e0ecbd0 <xdr_remote_domain_get_xml_desc_ret>, ret=0x7f4ffd7f8fd8 "", conn=<optimized out>) at ../../../src/remote/remote_driver.c:6636 #7 0x00007f508e0d7b58 in call (conn=<optimized out>, ret=0x7f4ffd7f8fd8 "", ret_filter=<optimized out>, args=0x7f4ffd7f8fe0 "`k.\376\241U", args_filter=<optimized out>, proc_nr=14, flags=0, priv=0x55a1fe5ac460) at ../../../src/remote/remote_driver.c:6658 #8 remoteDomainGetXMLDesc (dom=<optimized out>, flags=0) at ../../../src/remote/remote_client_bodies.h:2698 #9 0x00007f508e08f5c1 in virDomainGetXMLDesc (domain=domain@entry=0x55a1fe212da0, flags=0) at ../../../src/libvirt-domain.c:2592 #10 0x00007f508e46c8c0 in libvirt_virDomainGetXMLDesc (self=<optimized out>, args=<optimized out>) at build/libvirt.c:1212 #11 0x000055a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109 Thread 36 - calling virDomainQemuMonitorCommand (gdb) py-bt <built-in method virDomainQemuMonitorCommand of module object at remote 0x7f506e317908> File "/usr/lib/python3/dist-packages/libvirt_qemu.py", line 71, in qemuMonitorCommand ret = libvirtmod_qemu.virDomainQemuMonitorCommand(domain._o, cmd, flags) (gdb) #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f508dfe2b86 in virCondWait (c=c@entry=0x7f4fe80da208, m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154 #2 0x00007f508e0f5bbb in virNetClientIO (thiscall=0x7f4fe80da1f0, client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894 #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4fe80da160, expectReply=expectReply@entry=true, nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 #4 0x00007f508e0f7443 in virNetClientSendWithReply (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4fe80da160) at ../../../src/rpc/virnetclient.c:2144 #5 0x00007f508e0f7bf2 in virNetClientProgramCall (prog=prog@entry=0x55a1fe2b5ae0, client=client@entry=0x55a1fde4d2b0, serial=serial@entry=106, proc=proc@entry=1, noutfds=noutfds@entry=0, outfds=outfds@entry=0x0, ninfds=0x0, infds=0x0, args_filter=0x7f508e0f34f0 <xdr_qemu_domain_monitor_command_args>, args=0x7f4ffe7fb1a0, ret_filter=0x7f508e0f3540 <xdr_qemu_domain_monitor_command_ret>, ret=0x7f4ffe7fb198) at ../../../src/rpc/virnetclientprogram.c:329 #6 0x00007f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, flags=flags@entry=1, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=1, args_filter=0x7f508e0f34f0 <xdr_qemu_domain_monitor_command_args>, args=0x7f4ffe7fb1a0 "`k.\376\241U", ret_filter=0x7f508e0f3540 <xdr_qemu_domain_monitor_command_ret>, ret=0x7f4ffe7fb198 "", conn=<optimized out>) at ../../../src/remote/remote_driver.c:6636 #7 0x00007f508e0dda39 in call (conn=<optimized out>, ret=<optimized out>, ret_filter=<optimized out>, args=<optimized out>, args_filter=<optimized out>, proc_nr=1, flags=1, priv=0x55a1fe5ac460) at ../../../src/remote/remote_driver.c:6658 #8 remoteDomainQemuMonitorCommand (domain=0x55a1fe212da0, cmd=<optimized out>, result=0x7f4ffe7fb290, flags=<optimized out>) at ../../../src/remote/remote_driver.c:5993 #9 0x00007f506deaeda5 in virDomainQemuMonitorCommand (domain=domain@entry=0x55a1fe212da0, cmd=0x7f506c9b3690 "mouse_move 385 212", result=result@entry=0x7f4ffe7fb290, flags=1) at ../../../src/libvirt-qemu.c:86 #10 0x00007f506e0b50cc in libvirt_qemu_virDomainQemuMonitorCommand (self=<optimized out>, args=<optimized out>) at libvirt-qemu-override.c:146 #11 0x000055a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109 Thank you. -- Mathieu Tarral

On Tue, Mar 27, 2018 at 04:04:33PM +0300, Mathieu Tarral wrote:
Are you sure this isa different thread ? It looks identical to the first stack trace you give above.
Yes, the first one is calling libvirtmod.virDomainGetState and the second one libvirtmod.virDomainIsActive.
Interesting. This is an identical stack trace - so we have 2 python threads both calling virDomainIsActive(). Nothing wrong with that per-se - we support multithreaded usage like this.
virDomainGetState() and virDomainIsActive()
Opps, yes i see.
Can you confirm there are no other threads running libvirt code in your python app ? Did you have any thread running the libvirt event loop perhaps ?
Actually i found 2 others threads in Python app calling libvirt.
So, as a recap:
(gdb) bt #0 pthread_sigmask (how=how@entry=0, newmask=<optimized out>, newmask@entry=0x7f4ffd7f8d10, oldmask=oldmask@entry=0x7f4ffd7f8c90) at ../sysdeps/unix/sysv/linux/pthread_sigmask.c:50
This is slightly unusual - pthread_sigmask() should complete in a tiny fraction of a second, so seeing it in the stack trace is odd unless you have very fortuitous timing when taking the stack trace.
#1 0x00007f508e0f52fa in virNetClientIOEventLoop (client=client@entry=0x55a1fde4d2b0, thiscall=thiscall@entry=0x7f4fe005a350) at ../../../src/rpc/virnetclient.c:1659 #2 0x00007f508e0f5a16 in virNetClientIO (thiscall=0x7f4fe005a350, client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1944 #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4fe0031f50, expectReply=expectReply@entry=true, nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 #4 0x00007f508e0f7443 in virNetClientSendWithReply (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4fe0031f50) at ../../../src/rpc/virnetclient.c:2144 #5 0x00007f508e0f7bf2 in virNetClientProgramCall (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, serial=serial@entry=105, proc=proc@entry=14, noutfds=noutfds@entry=0, outfds=outfds@entry=0x0, ninfds=0x0, infds=0x0, args_filter=0x7f508e0ecba0 <xdr_remote_domain_get_xml_desc_args>, args=0x7f4ffd7f8fe0, ret_filter=0x7f508e0ecbd0 <xdr_remote_domain_get_xml_desc_ret>, ret=0x7f4ffd7f8fd8) at ../../../src/rpc/virnetclientprogram.c:329 #6 0x00007f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=14, args_filter=0x7f508e0ecba0 <xdr_remote_domain_get_xml_desc_args>, args=0x7f4ffd7f8fe0 "`k.\376\241U", ret_filter=0x7f508e0ecbd0 <xdr_remote_domain_get_xml_desc_ret>, ret=0x7f4ffd7f8fd8 "", conn=<optimized out>) at ../../../src/remote/remote_driver.c:6636 #7 0x00007f508e0d7b58 in call (conn=<optimized out>, ret=0x7f4ffd7f8fd8 "", ret_filter=<optimized out>, args=0x7f4ffd7f8fe0 "`k.\376\241U", args_filter=<optimized out>, proc_nr=14, flags=0, priv=0x55a1fe5ac460) at ../../../src/remote/remote_driver.c:6658 #8 remoteDomainGetXMLDesc (dom=<optimized out>, flags=0) at ../../../src/remote/remote_client_bodies.h:2698 #9 0x00007f508e08f5c1 in virDomainGetXMLDesc (domain=domain@entry=0x55a1fe212da0, flags=0) at ../../../src/libvirt-domain.c:2592 #10 0x00007f508e46c8c0 in libvirt_virDomainGetXMLDesc (self=<optimized out>, args=<optimized out>) at build/libvirt.c:1212 #11 0x000055a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109
Aside from the thing mentioned above I don't see any reason why you would have bad problems here. I don't have much more useful to suggest, other than to try using the very latest libvirt to see if you get the same behaviour. If not, then it would point to a bug in old libvirt, but I don't recall anything that would cause this behaviour you see offhand. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Hi, thanks for your answer Daniel. I think the best way to get more information about this bug is to reproduce it with libvirt master branch. However, i'm facing an issue when i try to run my own daemon: there is a chid process which is failing in a loop. I used sudo strace -f ./libvirtd to watch all child processes, and this is the output: https://gist.github.com/Wenzel/620327a9b3b7e13454108c2e472eaa77 One of them is continously failing, and i don't clearly understand why. Does this log file can help you identify the cause maybe ? I'm here to provide more information if needed. Thank you ! 2018-03-27 16:12 GMT+03:00 Daniel P. Berrangé <berrange@redhat.com>:
On Tue, Mar 27, 2018 at 04:04:33PM +0300, Mathieu Tarral wrote:
Are you sure this isa different thread ? It looks identical to the first stack trace you give above.
Yes, the first one is calling libvirtmod.virDomainGetState and the second one libvirtmod.virDomainIsActive.
Interesting. This is an identical stack trace - so we have 2 python threads both calling virDomainIsActive(). Nothing wrong with that per-se - we support multithreaded usage like this.
virDomainGetState() and virDomainIsActive()
Opps, yes i see.
Can you confirm there are no other threads running libvirt code in your python app ? Did you have any thread running the libvirt event loop perhaps ?
Actually i found 2 others threads in Python app calling libvirt.
So, as a recap:
(gdb) bt #0 pthread_sigmask (how=how@entry=0, newmask=<optimized out>, newmask@entry=0x7f4ffd7f8d10, oldmask=oldmask@entry=0x7f4ffd7f8c90) at ../sysdeps/unix/sysv/linux/pthread_sigmask.c:50
This is slightly unusual - pthread_sigmask() should complete in a tiny fraction of a second, so seeing it in the stack trace is odd unless you have very fortuitous timing when taking the stack trace.
#1 0x00007f508e0f52fa in virNetClientIOEventLoop (client=client@entry=0x55a1fde4d2b0, thiscall=thiscall@entry=0x7f4fe005a350) at ../../../src/rpc/virnetclient.c:1659 #2 0x00007f508e0f5a16 in virNetClientIO (thiscall=0x7f4fe005a350, client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1944 #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4fe0031f50, expectReply=expectReply@entry=true, nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 #4 0x00007f508e0f7443 in virNetClientSendWithReply (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4fe0031f50) at ../../../src/rpc/virnetclient.c:2144 #5 0x00007f508e0f7bf2 in virNetClientProgramCall (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, serial=serial@entry=105, proc=proc@entry=14, noutfds=noutfds@entry=0, outfds=outfds@entry=0x0, ninfds=0x0, infds=0x0, args_filter=0x7f508e0ecba0 <xdr_remote_domain_get_xml_desc_args>, args=0x7f4ffd7f8fe0, ret_filter=0x7f508e0ecbd0 <xdr_remote_domain_get_xml_desc_ret>, ret=0x7f4ffd7f8fd8) at ../../../src/rpc/virnetclientprogram.c:329 #6 0x00007f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=14, args_filter=0x7f508e0ecba0 <xdr_remote_domain_get_xml_desc_args>, args=0x7f4ffd7f8fe0 "`k.\376\241U", ret_filter=0x7f508e0ecbd0 <xdr_remote_domain_get_xml_desc_ret>, ret=0x7f4ffd7f8fd8 "", conn=<optimized out>) at ../../../src/remote/remote_driver.c:6636 #7 0x00007f508e0d7b58 in call (conn=<optimized out>, ret=0x7f4ffd7f8fd8 "", ret_filter=<optimized out>, args=0x7f4ffd7f8fe0 "`k.\376\241U", args_filter=<optimized out>, proc_nr=14, flags=0, priv=0x55a1fe5ac460) at ../../../src/remote/remote_driver.c:6658 #8 remoteDomainGetXMLDesc (dom=<optimized out>, flags=0) at ../../../src/remote/remote_client_bodies.h:2698 #9 0x00007f508e08f5c1 in virDomainGetXMLDesc (domain=domain@entry=0x55a1fe212da0, flags=0) at ../../../src/libvirt-domain.c:2592 #10 0x00007f508e46c8c0 in libvirt_virDomainGetXMLDesc (self=<optimized out>, args=<optimized out>) at build/libvirt.c:1212 #11 0x000055a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109
Aside from the thing mentioned above I don't see any reason why you would have bad problems here.
I don't have much more useful to suggest, other than to try using the very latest libvirt to see if you get the same behaviour. If not, then it would point to a bug in old libvirt, but I don't recall anything that would cause this behaviour you see offhand.
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
-- Mathieu Tarral

Hi Daniel, Thanks for your help, i was able to run the libvirt daemon from master, and after add some debug messages, i understood that the call to virDomainIsActive was actually hanging from the client. It never reached the daemon. Looking back at the call again: https://gist.github.com/Wenzel/624cb8c0746c05bd824d2b3c04b6b9f9 The call is waiting for pthread_sigmask to return. Looking into the libvirt source code (3.0.0) at the location of the call: https://github.com/libvirt/libvirt/blob/v3.0.0/src/rpc/virnetclient.c?utf8=%... Is this supposed to be an instant call, or does it wait for specific conditions/event to return ? Any idea what might go wrong ? Thank you. -- Mathieu Tarral

On Thu, Jun 14, 2018 at 04:31:57PM +0300, Mathieu Tarral wrote:
Hi Daniel,
Thanks for your help, i was able to run the libvirt daemon from master, and after add some debug messages, i understood that the call to virDomainIsActive was actually hanging from the client.
It never reached the daemon.
Looking back at the call again: https://gist.github.com/Wenzel/624cb8c0746c05bd824d2b3c04b6b9f9
The call is waiting for pthread_sigmask to return.
Looking into the libvirt source code (3.0.0) at the location of the call: https://github.com/libvirt/libvirt/blob/v3.0.0/src/rpc/virnetclient.c?utf8=%...
Is this supposed to be an instant call, or does it wait for specific conditions/event to return ?
Yeah this is very bad - pthread_sigmask should be instantaneous, and I can't imagine what would make it hang besides severe memory corruption somewhere :-( Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 03/22/2018 05:10 PM, Mathieu Tarral wrote:
Hi !
I'm submitting my messages on this mailing list to request a bit of help on a case that I have where a Python application makes a call to virDomainIsActive, and the call never returns.
I have tried to investigate, but as there are no debug symbols for libvirt on Debian Stretch, i can only have the following GDB backtrace:
(gdb) bt #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f49026f5b76 in virCondWait () from /usr/lib/libvirt.so.0 #2 0x00007f4902808bab in ?? () from /usr/lib/libvirt.so.0 #3 0x00007f490280a433 in virNetClientSendWithReply () from /usr/lib/libvirt.so.0 #4 0x00007f490280abe2 in virNetClientProgramCall () from /usr/lib/libvirt.so.0 #5 0x00007f49027e0ea4 in ?? () from /usr/lib/libvirt.so.0 #6 0x00007f49027ea1bb in ?? () from /usr/lib/libvirt.so.0 #7 0x00007f49027b0ef3 in virDomainIsActive () from /usr/lib/libvirt.so.0 #8 0x00007f4902b7fbd0 in libvirt_virDomainIsActive () from /usr/lib/python3/dist-packages/libvirtmod.cpython-35m-x86_64-linux-gnu.so #9 0x0000558eeec696df in PyCFunction_Call () at ../Objects/methodobject.c:109
This is just a client waiting for server's reply. You want to attach gdb to corresponding daemon and run 't a a bt' (thread apply all backtrace). Libvirtd is multithreaded and one of the threads will be executing your API. Michal
participants (5)
-
Daniel P. Berrangé
-
Erik Skultety
-
Mathieu Tarral
-
Michal Privoznik
-
Pino Toscano