If a remote call fails during event registration (more than likely
from
a network failure or remote libvirtd restart timed just right), then when
calling the virObjectEventStateDeregisterID we don't want to call the
registered @freecb function because that breaks our contract that we
would only call it after succesfully returning. If the @freecb routine
were called, it could result in a double free from properly coded
applications that free their opaque data on failure to register, as seen
in the following details:
Program terminated with signal 6, Aborted.
#0 0x00007fc45cba15d7 in raise
#1 0x00007fc45cba2cc8 in abort
#2 0x00007fc45cbe12f7 in __libc_message
#3 0x00007fc45cbe86d3 in _int_free
#4 0x00007fc45d8d292c in PyDict_Fini
#5 0x00007fc45d94f46a in Py_Finalize
#6 0x00007fc45d960735 in Py_Main
#7 0x00007fc45cb8daf5 in __libc_start_main
#8 0x0000000000400721 in _start
The double dereference of 'pyobj_cbData' is triggered in the following way:
(1) libvirt_virConnectDomainEventRegisterAny is invoked.
(2) the event is successfully added to the event callback list
(virDomainEventStateRegisterClient in
remoteConnectDomainEventRegisterAny returns 1 which means ok).
(3) when function remoteConnectDomainEventRegisterAny is hit,
network connection disconnected coincidently (or libvirtd is
restarted) in the context of function 'call' then the connection
is lost and the function 'call' failed, the branch
virObjectEventStateDeregisterID is therefore taken.
(4) 'pyobj_conn' is dereferenced the 1st time in
libvirt_virConnectDomainEventFreeFunc.
(5) 'pyobj_cbData' (refered to pyobj_conn) is dereferenced the
2nd time in libvirt_virConnectDomainEventRegisterAny.
(6) the double free error is triggered.
Resolve this by adding an @inhibitFreeCb boolean in order to avoid calling
freecb in virObjectEventStateDeregisterID for any remote call failure in
a remoteConnect*EventRegister* API. For remoteConnect*EventDeregister* calls,
the passed value would be false indicating they should run the freecb if it
exists.
Patch based on the investigation and initial patch posted by:
fangying <fangying1(a)huawei.com>.
Signed-off-by: John Ferlan <jferlan(a)redhat.com>
---
Initial patch found at:
https://www.redhat.com/archives/libvir-list/2017-June/msg00039.html
based on feedback from Daniel Berrange to a previous posting:
https://www.redhat.com/archives/libvir-list/2017-May/msg01089.html
Since no new patch was posted, I posted my idea from review for
consideration.
src/bhyve/bhyve_driver.c | 2 +-
src/conf/domain_event.c | 2 +-
src/conf/object_event.c | 23 +++++++++++++++++------
src/conf/object_event.h | 3 ++-
src/libxl/libxl_driver.c | 2 +-
src/lxc/lxc_driver.c | 2 +-
src/network/bridge_driver.c | 2 +-
src/node_device/node_device_driver.c | 2 +-
src/qemu/qemu_driver.c | 4 ++--
src/remote/remote_driver.c | 32 ++++++++++++++++----------------
src/secret/secret_driver.c | 2 +-
src/storage/storage_driver.c | 2 +-
src/test/test_driver.c | 8 ++++----
src/uml/uml_driver.c | 2 +-
src/vz/vz_driver.c | 2 +-
src/xen/xen_driver.c | 2 +-
16 files changed, 52 insertions(+), 40 deletions(-)
diff --git a/src/bhyve/bhyve_driver.c b/src/bhyve/bhyve_driver.c
index ed2221a..6722dc4 100644
--- a/src/bhyve/bhyve_driver.c
+++ b/src/bhyve/bhyve_driver.c
@@ -1503,7 +1503,7 @@ bhyveConnectDomainEventDeregisterAny(virConnectPtr conn,
if (virObjectEventStateDeregisterID(conn,
privconn->domainEventState,
- callbackID) < 0)
+ callbackID, false) < 0)
return -1;
return 0;
diff --git a/src/conf/domain_event.c b/src/conf/domain_event.c
index 6e471d7..ff4c602 100644
--- a/src/conf/domain_event.c
+++ b/src/conf/domain_event.c
@@ -2301,7 +2301,7 @@ virDomainEventStateDeregister(virConnectPtr conn,
NULL);
if (callbackID < 0)
return -1;
- return virObjectEventStateDeregisterID(conn, state, callbackID);
+ return virObjectEventStateDeregisterID(conn, state, callbackID, false);
Why is this false? virDomainEventStateDeregister is called directly from
public API implementations. For instance from
qemuConnectDomainEventDeregister(). Don't we want to run free callback then?