[libvirt] [RFC] duplicate suspend/resume lifecycle events with QEMU

Hello,

I'm using the libvirt event mechanism and noticed that several events are reported twice:
$ python examples/event-test.py qemu:///session
Using uri:qemu:///session
myDomainEventCallback1 EVENT: Domain installer(2) Resumed Unpaused
myDomainEventCallback2 EVENT: Domain installer(2) Resumed Unpaused
myDomainEventCallback1 EVENT: Domain installer(2) Resumed Unpaused
myDomainEventCallback2 EVENT: Domain installer(2) Resumed Unpaused
myDomainEventCallback1 EVENT: Domain installer(2) Suspended Paused
myDomainEventCallback2 EVENT: Domain installer(2) Suspended Paused
myDomainEventCallback1 EVENT: Domain installer(2) Suspended Paused
myDomainEventCallback2 EVENT: Domain installer(2) Suspended Paused
myDomainEventCallback1 EVENT: Domain installer(-1) Stopped Destroyed
myDomainEventCallback2 EVENT: Domain installer(-1) Stopped Destroyed
(The Python example program registers two handlers, so each event is printed twice anyway, but each handler receives the suspend/resume events twice itself.) Interestingly enough, only "suspend" and "resume" are duplicated, not "created" or "destroyed". It isn't Python-specific either; virsh shows the same behaviour:
$ virsh event --loop --event lifecycle
event 'lifecycle' for domain installer: Started Booted
event 'lifecycle' for domain installer: Suspended Paused
event 'lifecycle' for domain installer: Resumed Unpaused
event 'lifecycle' for domain installer: Resumed Unpaused
event 'lifecycle' for domain installer: Suspended Paused
event 'lifecycle' for domain installer: Suspended Paused
event 'lifecycle' for domain installer: Stopped Destroyed
After shutting down libvirtd I used `socat stdio unix-connect:...` to connect to the QEMU UNIX socket directly and issued {"execute": "stop"} and {"execute": "cont"} to verify that QEMU sends each event only once. So to me it looks like libvirt is duplicating the event. Enabling DEBUG with log_filters="1:qemu_monitor_json" shows nothing wrong on the QEMU side:
qemuMonitorJSONCommandWithFd:286 : Send command '{"execute":"cont","id":"libvirt-9"}' for write with FD -1
qemuMonitorSend:972 : QEMU_MONITOR_SEND_MSG: mon=0x7f1940000910 msg={"execute":"cont","id":"libvirt-9"} fd=-1
qemuMonitorIOWrite:503 : QEMU_MONITOR_IO_WRITE: mon=0x7f1940000910 buf={"execute":"cont","id":"libvirt-9"} len=37 ret=37 errno=11
qemuMonitorIOProcess:399 : QEMU_MONITOR_IO_PROCESS: mon=0x7f1940000910 buf={"timestamp": {"seconds": 1497326279, "microseconds": 688619}, "event": "RESUME"} {"return": {}, "id": "libvirt-9"} len=118
qemuMonitorJSONIOProcessLine:179 : Line [{"timestamp": {"seconds": 1497326279, "microseconds": 688619}, "event": "RESUME"}]
qemuMonitorJSONIOProcessLine:194 : QEMU_MONITOR_RECV_EVENT: mon=0x7f1940000910 event={"timestamp": {"seconds": 1497326279, "microseconds": 688619}, "event": "RESUME"}
qemuMonitorJSONIOProcessEvent:138 : mon=0x7f1940000910 obj=0x5636fa1f1f00
qemuMonitorEmitEvent:1186 : mon=0x7f1940000910 event=RESUME
qemuMonitorJSONIOProcessEvent:165 : handle RESUME handler=0x7f19535903d0 data=(nil)
qemuMonitorEmitResume:1237 : mon=0x7f1940000910
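(For completeness, the manual check with socat can also be scripted. Below is a minimal Python sketch that speaks QMP to the monitor socket directly; the socket path is only an assumption and has to be adjusted to your setup, and libvirtd must be stopped first so it does not hold the socket, as described above.)

#!/usr/bin/env python
# Minimal sketch of the manual QMP check described above.  Assumptions:
# libvirtd is stopped so it does not own the monitor socket, and SOCK points
# at the QMP monitor socket of the guest (the path below is made up).
import json
import socket

SOCK = "/var/lib/libvirt/qemu/installer.monitor"  # assumed path

sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.connect(SOCK)
reader = sock.makefile("r")

def execute(command):
    sock.sendall((json.dumps({"execute": command}) + "\r\n").encode())

print(reader.readline().strip())  # QMP greeting
execute("qmp_capabilities")       # mandatory handshake before other commands
execute("stop")                   # QEMU should emit exactly one STOP event
execute("cont")                   # QEMU should emit exactly one RESUME event

for line in reader:               # print replies and events as they arrive
    msg = json.loads(line)
    print(msg)
    if msg.get("event") == "RESUME":
        break                     # one STOP and one RESUME means QEMU is fine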
Looking at src/qemu/qemu_monitor_json.c:qemuMonitorJSONIOProcessEvent:
141 qemuMonitorJSONIOProcessEvent(qemuMonitorPtr mon,
142                               virJSONValuePtr obj)
143 {
...
170     qemuMonitorEmitEvent(mon, type, seconds, micros, details);
This seems to send the first event
171     VIR_FREE(details);
172
173     handler = bsearch(type, eventHandlers, ARRAY_CARDINALITY(eventHandlers),
174                       sizeof(eventHandlers[0]), qemuMonitorEventCompare);
175     if (handler) {
176         VIR_DEBUG("handle %s handler=%p data=%p", type,
177                   handler->handler, data);
178         (handler->handler)(mon, data);
and this sends the second instance.
179     }
180     return 0;
181 }
Looking deeper into the GIT source code I see this:
 98 static qemuEventHandler eventHandlers[] = {
...
113     { "RESUME", qemuMonitorJSONHandleResume, },
 544 static void qemuMonitorJSONHandleResume(qemuMonitorPtr mon, virJSONValuePtr data ATTRIBUTE_UNUSED)
 546     qemuMonitorEmitResume(mon);
...
1267 qemuMonitorEmitEvent(qemuMonitorPtr mon, const char *event,
1268                      long long seconds, unsigned int micros,
1269                      const char *details)
...
1274     QEMU_MONITOR_CALLBACK(mon, ret, domainEvent, mon->vm, event, seconds,
1275                           micros, details);
...
1325 qemuMonitorEmitResume(qemuMonitorPtr mon)
1330     QEMU_MONITOR_CALLBACK(mon, ret, domainResume, mon->vm);
but this doesn't yet completely explain why only some events are reported twice. Is there some way to get rid of the duplication (in Python), or at least to distinguish the two events, so I can ignore the second one? (A client-side workaround is sketched below, after the version details.) I've observed this behaviour both with Debian Jessie:
Compiled against library: libvirt 1.2.9
Using library: libvirt 1.2.9
Using API: QEMU 1.2.9
Running hypervisor: QEMU 2.6.0
and with the newer Debian Sid:
Compiled against library: libvirt 3.0.0
Using library: libvirt 3.0.0
Using API: QEMU 3.0.0
Running hypervisor: QEMU 2.8.1
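Regarding the question above about getting rid of the duplicates in Python: below is a minimal client-side workaround sketch (names and structure are assumptions, not a libvirt fix). It simply drops a lifecycle event that repeats the previous (event, detail) pair for the same domain; the duplicates shown above arrive back to back with identical payloads, so this filters them, at the cost of also hiding a genuine repetition of the same transition.

#!/usr/bin/env python
# Client-side workaround sketch: suppress a lifecycle event that repeats the
# previous (event, detail) pair reported for the same domain.
import libvirt

last_event = {}  # domain UUID -> (event, detail) of the last lifecycle event

def lifecycle_cb(conn, dom, event, detail, opaque):
    key = dom.UUIDString()
    if last_event.get(key) == (event, detail):
        return  # back-to-back duplicate, as in the output above -- ignore it
    last_event[key] = (event, detail)
    print("EVENT: Domain %s(%d) %d %d" % (dom.name(), dom.ID(), event, detail))

libvirt.virEventRegisterDefaultImpl()  # must be called before opening the connection
conn = libvirt.openReadOnly("qemu:///session")
conn.domainEventRegisterAny(None, libvirt.VIR_DOMAIN_EVENT_ID_LIFECYCLE,
                            lifecycle_cb, None)
while True:
    libvirt.virEventRunDefaultImpl()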
Thank you in advance.

Philipp
--
Philipp Hahn
Open Source Software Engineer
Univention GmbH                      be open.
Mary-Somerville-Str. 1               D-28359 Bremen
Tel.: +49 421 22232-0                Fax : +49 421 22232-99
hahn@univention.de                   http://www.univention.de/
Geschäftsführer: Peter H. Ganten
HRB 20755 Amtsgericht Bremen
Steuer-Nr.: 71-597-02876

Hello,

On 13.06.2017 at 06:13, Philipp Hahn wrote:
> I'm using the libvirt event mechanism and noticed that several events
> are reported twice:
> ...
> $ virsh event --loop --event lifecycle
> event 'lifecycle' for domain installer: Started Booted
> event 'lifecycle' for domain installer: Suspended Paused
> event 'lifecycle' for domain installer: Resumed Unpaused
> event 'lifecycle' for domain installer: Resumed Unpaused
> event 'lifecycle' for domain installer: Suspended Paused
> event 'lifecycle' for domain installer: Suspended Paused
> event 'lifecycle' for domain installer: Stopped Destroyed
> ...
> Looking at src/qemu/qemu_monitor_json.c:qemuMonitorJSONIOProcessEvent:
That was not the cause. Using gdb shows the real duplicator:

Breakpoint 9, virDomainEventLifecycleNew (id=10, name=0x7fffd825e700 "win8",
    uuid=0x7fffd825e128 "o8r\204\344\372F\263\214oGF\325@=9", type=4, detail=0)
    at ../../../src/conf/domain_event.c:657
657     }
$13 = {parent = {parent = {parent = {u = {dummy_align1 = 7700611138,
          dummy_align2 = 0x1cafe0042, s = {magic = 3405643842, refs = 1}},
        klass = 0x5555558304e0}, eventID = 0, meta = {id = 10,
      name = 0x5555558304a0 "win8", uuid = "o8r\204\344\372F\263\214oGF\325@=9",
      key = 0x555555831270 "6f387284-e4fa-46b3-8c6f-4746d5403d39"}, remoteID = -1,
    dispatch = 0x7ffff77d85c0 <virDomainEventDispatchDefaultFunc>}, dummy = false},
  type = 4, detail = 0}
(gdb) bt
#0  virDomainEventLifecycleNew (id=10, name=0x7fffd825e700 "win8", uuid=0x7fffd825e128 "o8r\204\344\372F\263\214oGF\325@=9", type=4, detail=0) at ../../../src/conf/domain_event.c:657
#1  0x00007fffe52fefef in qemuProcessHandleResume (mon=0x0, vm=0x7fffd825ec80, opaque=0x7fffd8101b90) at ../../../src/qemu/qemu_process.c:773
#2  0x00007fffe531c9c2 in qemuMonitorEmitResume (mon=0x7fffd0000ea0) at ../../../src/qemu/qemu_monitor.c:1330
#3  0x00007fffe532fd0b in qemuMonitorJSONIOProcessEvent (obj=<optimized out>, mon=<optimized out>) at ../../../src/qemu/qemu_monitor_json.c:178
#4  qemuMonitorJSONIOProcessLine (mon=0x0, line=0x7fffe55d50e0 <eventHandlers+224> "}\242\071\345\377\177", msg=0xe) at ../../../src/qemu/qemu_monitor_json.c:207
#5  0x00007fffe532fedb in qemuMonitorJSONIOProcess (mon=0x0, data=0x7ffff774f9b0 <virLogOnce> "H\203\354\bH\215=\245\315G", len=119, msg=0x0) at ../../../src/qemu/qemu_monitor_json.c:249
#6  0x00007fffe531b293 in qemuMonitorIOProcess (mon=<optimized out>) at ../../../src/qemu/qemu_monitor.c:432
#7  qemuMonitorIO (watch=18, fd=-134433088, events=1, opaque=0x7fffd0000ea0) at ../../../src/qemu/qemu_monitor.c:686
#8  0x00007ffff7735446 in virEventPollDispatchHandles (fds=<optimized out>, nfds=<optimized out>) at ../../../src/util/vireventpoll.c:508
#9  virEventPollRunOnce () at ../../../src/util/vireventpoll.c:657
#10 0x00007ffff7733b71 in virEventRunDefaultImpl () at ../../../src/util/virevent.c:314
#11 0x00007ffff788aced in virNetDaemonRun (dmn=0x555555807d60) at ../../../src/rpc/virnetdaemon.c:818
#12 0x000055555556e486 in main (argc=-10000, argv=0x0) at ../../../daemon/libvirtd.c:1547
(gdb) c
This is the QEMU monitor event handler.
Continuing.
[Switching to Thread 0x7fffeee05700 (LWP 11193)]
Breakpoint 9, virDomainEventLifecycleNew (id=10, name=0x7fffd825e700 "win8",
    uuid=0x7fffd825e128 "o8r\204\344\372F\263\214oGF\325@=9", type=4, detail=0)
    at ../../../src/conf/domain_event.c:657
657     }
$14 = {parent = {parent = {parent = {u = {dummy_align1 = 7700611138,
          dummy_align2 = 0x1cafe0042, s = {magic = 3405643842, refs = 1}},
        klass = 0x5555558304e0}, eventID = 0, meta = {id = 10,
      name = 0x7fffbc000a30 "win8", uuid = "o8r\204\344\372F\263\214oGF\325@=9",
      key = 0x7fffbc000b70 "6f387284-e4fa-46b3-8c6f-4746d5403d39"}, remoteID = -1,
    dispatch = 0x7ffff77d85c0 <virDomainEventDispatchDefaultFunc>}, dummy = false},
  type = 4, detail = 0}
(gdb) bt
#0  virDomainEventLifecycleNew (id=10, name=0x7fffd825e700 "win8", uuid=0x7fffd825e128 "o8r\204\344\372F\263\214oGF\325@=9", type=4, detail=0) at ../../../src/conf/domain_event.c:657
#1  0x00007fffe53579dc in qemuDomainResume (dom=0x7fffbc000930) at ../../../src/qemu/qemu_driver.c:1965
#2  0x00007ffff7818acf in virDomainResume (domain=0x7fffbc000930) at ../../../src/libvirt-domain.c:679
#3  0x0000555555595a3e in remoteDispatchDomainResume (server=<optimized out>, msg=<optimized out>, args=<optimized out>, rerr=<optimized out>, client=<optimized out>) at ../../../daemon/remote_dispatch.h:9132
#4  remoteDispatchDomainResumeHelper (server=0x555555806d30, client=0x555555832970, msg=0x555555837910, rerr=0x7fffeee04cc0, args=0x7fffbc000930, ret=0x7fffbc000fc0) at ../../../daemon/remote_dispatch.h:9108
#5  0x00007ffff7890939 in virNetServerProgramDispatchCall (msg=<optimized out>, client=<optimized out>, server=<optimized out>, prog=<optimized out>) at ../../../src/rpc/virnetserverprogram.c:437
#6  virNetServerProgramDispatch (prog=0x55555581be70, server=0x7ffff774f9b0 <virLogOnce>, server@entry=0x555555806d30, client=0x555555832970, msg=0x555555837910) at ../../../src/rpc/virnetserverprogram.c:307
#7  0x00005555555a7378 in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x555555806d30) at ../../../src/rpc/virnetserver.c:148
#8  virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x555555806d30) at ../../../src/rpc/virnetserver.c:169
#9  0x00007ffff7782c98 in virThreadPoolWorker (opaque=opaque@entry=0x5555557faa70) at ../../../src/util/virthreadpool.c:167
#10 0x00007ffff7782068 in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:206
#11 0x00007ffff45b4064 in start_thread (arg=0x7fffeee05700) at pthread_create.c:309
#12 0x00007ffff42e962d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
And this is the code from src/qemu/qemu_driver.c:qemuDomainResume():
1903 static int qemuDomainResume(virDomainPtr dom)
1904 {
...
1942     if (qemuProcessStartCPUs(driver, vm, dom->conn,
1943                              VIR_DOMAIN_RUNNING_UNPAUSED,
1944                              QEMU_ASYNC_JOB_NONE) < 0) {
1945         if (virGetLastError() == NULL)
1946             virReportError(VIR_ERR_OPERATION_FAILED,
1947                            "%s", _("resume operation failed"));
1948         goto endjob;
1949     }
1950     event = virDomainEventLifecycleNewFromObj(vm,
1951                  VIR_DOMAIN_EVENT_RESUMED,
1952                  VIR_DOMAIN_EVENT_RESUMED_UNPAUSED);
1953     }
...
1966 }
My interpretation is this: this is legacy code from the time when libvirt still used HMP to talk to QEMU and did NOT get events back then. Nowadays QEMU emits the event itself over QMP, and libvirt still duplicates it. Can we get rid of them, or at least disable those lifecycle events when QMP is used, so that we get the real state from QEMU and not some desired state from libvirt?

Philipp

On Tue, Jun 13, 2017 at 10:30:59AM +0200, Philipp Hahn wrote:
> Hello,
>
> On 13.06.2017 at 06:13, Philipp Hahn wrote:
> > I'm using the libvirt event mechanism and noticed that several events
> > are reported twice:
> > ...
> That was not the cause. Using gdb shows the real duplicator:
> ...
> My interpretation is this: this is legacy code from the time when libvirt
> still used HMP to talk to QEMU and did NOT get events back then. Nowadays
> QEMU emits the event itself over QMP, and libvirt still duplicates it.
>
> Can we get rid of them, or at least disable those lifecycle events when
> QMP is used, so that we get the real state from QEMU and not some desired
> state from libvirt?
Looks like you got the case precisely. However, I'm not sure we can disable the events based merely on the fact that we use QMP. Please correct me if I'm wrong, because that would make the fix very easy (just a removal, actually) and we could probably do that for more events. But I think older QEMU versions could have lacked that particular event. It's not listed in the output of query-events, so we can't check for support that way. We could check somewhere that we already sent the event and set a flag when we do, but that seems like the last fallback, as it just adds unnecessary code.
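(As a side note for anyone who wants to see what their QEMU advertises: the event list can be fetched with the query-events QMP command. A small Python sketch using the libvirt-qemu monitor-command passthrough follows; the domain name is simply the one from the logs above, the output depends on the QEMU version, and libvirt marks a domain as tainted once this passthrough is used.)

#!/usr/bin/env python
# Sketch: list the QMP events advertised by a running guest's QEMU via
# libvirt's monitor-command passthrough (requires the libvirt-qemu bindings;
# the domain name "installer" is just the one from the logs above).
import json
import libvirt
import libvirt_qemu

conn = libvirt.open("qemu:///session")
dom = conn.lookupByName("installer")

# The returned string is the raw QMP reply, e.g. {"return": [{"name": ...}, ...], ...}
reply = libvirt_qemu.qemuMonitorCommand(dom, '{"execute": "query-events"}', 0)
names = sorted(ev["name"] for ev in json.loads(reply)["return"])
print("\n".join(names))
print("RESUME advertised: %s" % ("RESUME" in names))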
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
participants (2): Martin Kletzander, Philipp Hahn