[libvirt] libvirt 3.9.0 locks (was libvirt 3.5.0 locks)

Hi again. I'm still trying to determine why my libvirtd locks up. Here is another portion of gdb output: https://gist.github.com/vtolstov/ae8c4a67e15b2fbd14bbb95c226fb427
The error looks like this in the logs:
Dec 01 12:35:49 cn04 libvirtd[1171]: 2017-12-01 09:35:49.637+0000: 28063: error : qemuDomainObjBeginJobInternal:4403 : Timed out during operation: cannot acquire state change lock (held by remoteDispatchConnectGetAllDomainStats)
Dec 01 12:35:49 cn04 libvirtd[1171]: 2017-12-01 09:35:49.637+0000: 28063: warning : qemuDomainObjBeginJobInternal:4391 : Cannot start job (destroy, none) for domain 225560-20001; current job is (query, none) owned by (1180 remoteDispatchConnectGetAllDomainStats, 0 <null>) for (30s, 0s)
and virsh list says that the domain is in the shutdown state (but the qemu process has already exited).
Please help me solve this issue.
-- Vasiliy Tolstov, e-mail: v.tolstov@selfip.ru
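When many of these entries pile up, the warning line can be picked apart mechanically to see which thread and API call own the job. The following is a minimal Python sketch, not part of libvirt. The regular expression is an assumption derived only from the single warning line quoted in this message, and the name parse_job_warning is made up for illustration.

```python
import re

# Pattern inferred from the one warning line quoted in this thread;
# other libvirt versions may format the message differently.
JOB_WARNING = re.compile(
    r"Cannot start job \((?P<wanted_job>\w+), (?P<wanted_async>\w+)\) "
    r"for domain (?P<domain>\S+); "
    r"current job is \((?P<current_job>\w+), (?P<current_async>\w+)\) "
    r"owned by \((?P<owner_tid>\d+) (?P<owner_api>\S+), "
    r"(?P<async_tid>\d+) (?P<async_api>\S+)\) "
    r"for \((?P<job_age>\d+)s, (?P<async_age>\d+)s\)"
)


def parse_job_warning(line):
    """Return the fields of a 'Cannot start job' warning as a dict, or None."""
    match = JOB_WARNING.search(line)
    return match.groupdict() if match else None


sample = (
    "Dec 01 12:35:49 cn04 libvirtd[1171]: 2017-12-01 09:35:49.637+0000: 28063: "
    "warning : qemuDomainObjBeginJobInternal:4391 : Cannot start job "
    "(destroy, none) for domain 225560-20001; current job is (query, none) "
    "owned by (1180 remoteDispatchConnectGetAllDomainStats, 0 <null>) "
    "for (30s, 0s)"
)
info = parse_job_warning(sample)
print(info["owner_tid"], info["owner_api"], info["job_age"])
```

The owner thread ID extracted this way (1180 in the sample) is the number to look for among the LWP values in a gdb thread listing.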

On Fri, Dec 01, 2017 at 15:26:35 +0300, Vasiliy Tolstov wrote:
[...]
Looks like qemu did not respond while trying to update vCPU halted state:
Thread 42 (Thread 0x7f84f96fc700 (LWP 1180)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x00007f8503bf62c6 in virCondWait (c=<optimized out>, m=<optimized out>) at ../../../src/util/virthread.c:154
#2 0x00007f84e13c309b in qemuMonitorSend (mon=0x7f8498000da0, msg=0x80) at ../../../src/qemu/qemu_monitor.c:1067
#3 0x00007f84e13d7045 in qemuMonitorJSONCommandWithFd (mon=0x7f8498000da0, cmd=0x7f84b0001d40, scm_fd=-1, reply=0x7f84f96fb908) at ../../../src/qemu/qemu_monitor_json.c:300
#4 0x00007f84e13d8f67 in qemuMonitorJSONCommand (reply=<optimized out>, cmd=<optimized out>, mon=<optimized out>) at ../../../src/qemu/qemu_monitor_json.c:330
#5 qemuMonitorJSONQueryCPUs (mon=0x7f8498000da0, entries=0x80, nentries=0x6dfa1, force=79) at ../../../src/qemu/qemu_monitor_json.c:1481
#6 0x00007f84e13c5712 in qemuMonitorGetCpuHalted (mon=0x7f8498000da0, maxvcpus=4) at ../../../src/qemu/qemu_monitor.c:2054
#7 0x00007f84e138876c in qemuDomainRefreshVcpuHalted (driver=0x7f84d8151b40, vm=0x7f84d81fe080, asyncJob=0) at ../../../src/qemu/qemu_domain.c:7613
#8 0x00007f84e1406ab0 in qemuDomainGetStatsVcpu (driver=0x7f84d8151b40, dom=0x7f84d81fe080, record=0x7f84b0002248, maxparams=0x7f84f96fbb04, privflags=1) at ../../../src/qemu/qemu_driver.c:19508
Is the guest working at this point?
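To make the failure mode concrete, here is a toy Python model of what the trace and the log lines together seem to describe: one thread takes the per-domain job and then blocks waiting for a monitor reply that never comes, so a second thread that wants the job times out. This is only an illustration under that assumption. DomainJob, stats_worker and the timeout values are made up for the sketch and are not libvirt code.

```python
import threading


class DomainJob:
    """Toy per-domain job slot; not libvirt's real data structure."""

    def __init__(self):
        self._cond = threading.Condition()
        self.owner = None

    def begin(self, owner, timeout):
        """Try to take the job; return False if it stays busy past timeout."""
        with self._cond:
            if not self._cond.wait_for(lambda: self.owner is None, timeout):
                return False
            self.owner = owner
            return True

    def end(self):
        """Release the job and wake up any waiters."""
        with self._cond:
            self.owner = None
            self._cond.notify_all()


job = DomainJob()
monitor_reply = threading.Event()  # stands in for the reply qemu never sends
holder_ready = threading.Event()


def stats_worker():
    job.begin("query", timeout=1.0)
    holder_ready.set()
    monitor_reply.wait()  # analogous to the virCondWait frame in the trace
    job.end()


threading.Thread(target=stats_worker, daemon=True).start()
holder_ready.wait()

# A second caller (think "destroy") gives up once its timeout expires.
acquired = job.begin("destroy", timeout=0.2)
held_by = job.owner
print("destroy acquired job:", acquired, "; held by:", held_by)

monitor_reply.set()  # cleanup only, so the worker thread can finish
```

In this model the second caller can only proceed once the holder is woken up, which is why a monitor wait that is never interrupted leaves every later job request timing out.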

Maybe I checked the status too late, but whenever I hit this issue virsh says that the domain is in the shutdown state, and the qemu process is not found.
2017-12-01 16:10 GMT+03:00 Peter Krempa <pkrempa@redhat.com>:
[...]
-- Vasiliy Tolstov, e-mail: v.tolstov@selfip.ru

On 01.12.2017 15:26, Vasiliy Tolstov wrote:
[...]
Hi, Vasiliy.
From your stack trace I guess this could be an issue introduced in libvirt version 3.4.0 by commit aeda1b8c. I tried to address it in [1], but later decided to take a different approach and have not worked on one since then. One can just revert the mentioned patch.
[1] https://www.redhat.com/archives/libvir-list/2017-September/msg00172.html
Nikolay

I know that libvirt 3.2 does not have this issue; it is the last version I used before the problem appeared (after 3.2 I went to 3.5.0).
2017-12-01 16:14 GMT+03:00 Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>:
[...]
-- Vasiliy Tolstov, e-mail: v.tolstov@selfip.ru

2017-12-01 16:14 GMT+03:00 Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>:
[...]
So what can I do next to fix this? (I would be happy if this were fixed before 3.10, but maybe it is too late...)
-- Vasiliy Tolstov, e-mail: v.tolstov@selfip.ru

On 04.12.2017 18:46, Vasiliy Tolstov wrote:
So what can I do next to fix this? (I would be happy if this were fixed before 3.10, but maybe it is too late...)
You can revert aeda1b8c in your downstream build until it is fixed upstream.

2017-12-05 9:23 GMT+03:00 Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>:
So what can I do next to fix this? (I would be happy if this were fixed before 3.10, but maybe it is too late...)
You can revert aeda1b8c in your downstream build until it is fixed upstream.
Thanks!
-- Vasiliy Tolstov, e-mail: v.tolstov@selfip.ru
participants (3)
- Nikolay Shirokovskiy
- Peter Krempa
- Vasiliy Tolstov