On Mon, 2017-11-06 at 15:53 -0500, John Ferlan wrote:
Resolve a storage driver crash as a result of a long running
storageVolCreateXML when the virStorageVolPoolRefreshThread is
run as a result of a storageVolUpload complete and ran the
virStoragePoolObjClearVols without checking if the creation
code was currently processing a buildVol after incrementing
the driver->asyncjob count.
The refreshThread needs to wait until all creation threads
are completed so as to not alter the volume list while the
volume is being built.
Crash from valgrind is as follows (with a bit of editing):
==21309== Invalid read of size 8
==21309== at 0x153E47AF: storageBackendUpdateVolTargetInfo
==21309== by 0x153E4C30: virStorageBackendUpdateVolInfo
==21309== by 0x153E52DE: virStorageBackendVolRefreshLocal
==21309== by 0x153DE29E: storageVolCreateXML
==21309== by 0x562035B: virStorageVolCreateXML
==21309== by 0x147366: remoteDispatchStorageVolCreateXML
...
==21309== Address 0x2590a720 is 64 bytes inside a block of size 336 free'd
==21309== at 0x4C2F2BB: free
==21309== by 0x54CB9FA: virFree
==21309== by 0x55BC800: virStorageVolDefFree
==21309== by 0x55BF1D8: virStoragePoolObjClearVols
==21309== by 0x153D967E: virStorageVolPoolRefreshThread
...
==21309== Block was alloc'd at
==21309== at 0x4C300A5: calloc
==21309== by 0x54CB483: virAlloc
==21309== by 0x55BDC1F: virStorageVolDefParseXML
==21309== by 0x55BDC1F: virStorageVolDefParseNode
==21309== by 0x55BE5A4: virStorageVolDefParse
==21309== by 0x153DDFF1: storageVolCreateXML
==21309== by 0x562035B: virStorageVolCreateXML
==21309== by 0x147366: remoteDispatchStorageVolCreateXML
...
Signed-off-by: John Ferlan <jferlan(a)redhat.com>
---
src/storage/storage_driver.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
index b0edf9f885..5e920fc14c 100644
--- a/src/storage/storage_driver.c
+++ b/src/storage/storage_driver.c
@@ -2257,6 +2257,7 @@ virStorageVolPoolRefreshThread(void *opaque)
virStorageBackendPtr backend;
virObjectEventPtr event = NULL;
+ retry:
storageDriverLock();
if (cbdata->vol_path) {
if (virStorageBackendPloopRestoreDesc(cbdata->vol_path) < 0)
@@ -2270,6 +2271,14 @@ virStorageVolPoolRefreshThread(void *opaque)
if (!(backend = virStorageBackendForType(def->type)))
goto cleanup;
+ /* Some thread is creating a new volume in the pool, we need to retry */
+ if (virStoragePoolObjGetAsyncjobs(obj) > 0) {
+ virStoragePoolObjUnlock(obj);
+ storageDriverUnlock();
+ usleep(100 * 1000);
+ goto retry;
+ }
+
virStoragePoolObjClearVols(obj);
if (backend->refreshPool(NULL, obj) < 0)
VIR_DEBUG("Failed to refresh storage pool");
ACK, does the job here. I'm rather surprised to see you implementing it
with sleep, while you pointed me towards virCondWait yesterday. But using
sleep is a way less intrusive change.
--
Cedric