On Fri, Sep 16, 2022 at 13:30:07 +0100, Daniel P. Berrangé wrote:
On Mon, Sep 05, 2022 at 03:57:03PM +0200, Kristina Hanicova wrote:
> This patch adds the generalized job object into the domain object
> so that it can be used by all drivers without the need to extract
> it from the private data.
>
> Because of this, the job object needs to be created and set
> during the creation of the domain object. This patch also extends
> xmlopt with possible job config containing virDomainJobObj
> callbacks, its private data callbacks and one variable
> (maxQueuedJobs).
>
> This patch includes:
> * addition of virDomainJobObj into virDomainObj (used in the
> following patches)
> * extending xmlopt with job config structure
> * new function for freeing the virDomainJobObj
Since this series was pushed, virsh unit tests are failing on
some platforms:
# ./build/tools/virsh -c test:///default uri
test:///default
Segmentation fault (core dumped)
In fact after checking with valgrind, they should be failing on all
platforms, but most aren't crashing on the use-after-free
$ valgrind ./build/tools/virsh -c test:///default uri
test:///default
==2099625== Invalid read of size 8
==2099625== at 0x4A1FCCB: virDomainObjResetAsyncJob (virdomainjob.c:184)
==2099625== by 0x4A1FDF0: virDomainObjClearJob (virdomainjob.c:224)
==2099625== by 0x4A1FE8D: virDomainJobObjFree (virdomainjob.c:240)
==2099625== by 0x499017D: vir_object_finalize (virobject.c:323)
==2099625== by 0x52F8D31: g_object_unref (in /usr/lib64/libgobject-2.0.so.0.7200.3)
==2099625== by 0x49907D7: virObjectUnref (virobject.c:377)
==2099625== by 0x4F357D2: ??? (in /usr/lib64/libglib-2.0.so.0.7200.3)
==2099625== by 0x4F39772: g_hash_table_unref (in /usr/lib64/libglib-2.0.so.0.7200.3)
==2099625== by 0x499017D: vir_object_finalize (virobject.c:323)
==2099625== by 0x52F8D31: g_object_unref (in /usr/lib64/libgobject-2.0.so.0.7200.3)
==2099625== by 0x49907D7: virObjectUnref (virobject.c:377)
==2099625== by 0x4B6F18A: testDriverDispose (test_driver.c:158)
==2099625== Address 0x69aa1e8 is 392 bytes inside a block of size 464 free'd
==2099625== at 0x48480E4: free (vg_replace_malloc.c:872)
==2099625== by 0x4F4BB8C: g_free (in /usr/lib64/libglib-2.0.so.0.7200.3)
==2099625== by 0x4F67013: g_slice_free1 (in /usr/lib64/libglib-2.0.so.0.7200.3)
==2099625== by 0x530D7D3: g_type_free_instance (in
/usr/lib64/libgobject-2.0.so.0.7200.3)
==2099625== by 0x49907D7: virObjectUnref (virobject.c:377)
==2099625== by 0x4B6F17E: testDriverDispose (test_driver.c:157)
==2099625== by 0x499017D: vir_object_finalize (virobject.c:323)
==2099625== by 0x52F8D31: g_object_unref (in /usr/lib64/libgobject-2.0.so.0.7200.3)
==2099625== by 0x49907D7: virObjectUnref (virobject.c:377)
==2099625== by 0x4B6EE62: testDriverCloseInternal (test_driver.c:1498)
==2099625== by 0x4B6EEE2: testConnectClose (test_driver.c:1543)
==2099625== by 0x4BBE495: virConnectDispose (datatypes.c:174)
==2099625== Block was alloc'd at
==2099625== at 0x484586F: malloc (vg_replace_malloc.c:381)
==2099625== by 0x4F4F278: g_malloc (in /usr/lib64/libglib-2.0.so.0.7200.3)
==2099625== by 0x4F67BA5: g_slice_alloc (in /usr/lib64/libglib-2.0.so.0.7200.3)
==2099625== by 0x4F69BCC: g_slice_alloc0 (in /usr/lib64/libglib-2.0.so.0.7200.3)
==2099625== by 0x5313016: g_type_create_instance (in
/usr/lib64/libgobject-2.0.so.0.7200.3)
==2099625== by 0x52FAE37: ??? (in /usr/lib64/libgobject-2.0.so.0.7200.3)
==2099625== by 0x52FC080: g_object_new_with_properties (in
/usr/lib64/libgobject-2.0.so.0.7200.3)
==2099625== by 0x52FCB20: g_object_new (in /usr/lib64/libgobject-2.0.so.0.7200.3)
==2099625== by 0x4990723: virObjectNew (virobject.c:252)
==2099625== by 0x49E0DD6: virDomainXMLOptionNew (domain_conf.c:1616)
==2099625== by 0x4B6F098: testDriverNew (test_driver.c:463)
==2099625== by 0x4B7709C: testOpenDefault (test_driver.c:1397)
==2099625== by 0x4B7709C: testConnectOpen (test_driver.c:1522)
The root cause here is that the virDomainJobObj now copies the pointer
to virDomainObjPrivateJobCallbacks from 'xmlopt' without ensuring that
the lifetime of the 'xmlopt' objects is longer than the lifetime of
domains.
At least in case of the 'test' driver that pre-condition is not met and
thus it dereferences pointers to freed memory. A simple demonstration of
the lifetime constraint is that the following patch fixes the invalid
read:
diff --git a/src/test/test_driver.c b/src/test/test_driver.c
index 686ff051a8..f574756518 100644
--- a/src/test/test_driver.c
+++ b/src/test/test_driver.c
@@ -154,8 +154,8 @@ testDriverDispose(void *obj)
size_t i;
virObjectUnref(driver->caps);
- virObjectUnref(driver->xmlopt);
virObjectUnref(driver->domains);
+ virObjectUnref(driver->xmlopt);
virNodeDeviceObjListFree(driver->devs);
virObjectUnref(driver->networks);
virObjectUnref(driver->ifaces);
A proper fix will be to not rely on stolen pointers though as doing this
is too fragile in other cases.