[libvirt] [PATCH v4 0/4] qemu: Fix deadlock if create qemuProcessReconnect thread failed

v3 patch: https://www.redhat.com/archives/libvir-list/2018-September/msg00950.html --- Changes in v4: - Split up v3 into four parts Wang Yechao (4): qemu: Split up qemuDomainRemoveInactive qemu: Create a qemuDomainRemoveInactiveLocked qemu: Create a qemuDomainRemoveInactiveJobLocked qemu: Fix deadlock if create qemuProcessReconnect thread failed src/qemu/qemu_domain.c | 79 ++++++++++++++++++++++++++++++++++++++++--------- src/qemu/qemu_domain.h | 3 ++ src/qemu/qemu_process.c | 2 +- 3 files changed, 69 insertions(+), 15 deletions(-) -- 1.8.3.1

Split up qemuDomainRemoveInactive into qemuDomainRemoveInactiveCommon and virDomainObjListRemove Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn> --- src/qemu/qemu_domain.c | 38 +++++++++++++++++++++++--------------- 1 file changed, 23 insertions(+), 15 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 2fd8a2a..191113a 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -8344,23 +8344,13 @@ qemuDomainSnapshotDiscardAllMetadata(virQEMUDriverPtr driver, } -/** - * qemuDomainRemoveInactive: - * - * The caller must hold a lock to the vm. - */ -void -qemuDomainRemoveInactive(virQEMUDriverPtr driver, - virDomainObjPtr vm) +static void +qemuDomainRemoveInactiveCommon(virQEMUDriverPtr driver, + virDomainObjPtr vm) { char *snapDir; virQEMUDriverConfigPtr cfg; - if (vm->persistent) { - /* Short-circuit, we don't want to remove a persistent domain */ - return; - } - cfg = virQEMUDriverGetConfig(driver); /* Remove any snapshot metadata prior to removing the domain */ @@ -8379,13 +8369,31 @@ qemuDomainRemoveInactive(virQEMUDriverPtr driver, } qemuExtDevicesCleanupHost(driver, vm->def); - virDomainObjListRemove(driver->domains, vm); - virObjectUnref(cfg); } /** + * qemuDomainRemoveInactive: + * + * The caller must hold a lock to the vm. + */ +void +qemuDomainRemoveInactive(virQEMUDriverPtr driver, + virDomainObjPtr vm) +{ + if (vm->persistent) { + /* Short-circuit, we don't want to remove a persistent domain */ + return; + } + + qemuDomainRemoveInactiveCommon(driver, vm); + + virDomainObjListRemove(driver->domains, vm); +} + + +/** * qemuDomainRemoveInactiveJob: * * Just like qemuDomainRemoveInactive but it tries to grab a -- 1.8.3.1

Create a qemuDomainRemoveInactiveLocked which is a copy of qemuDomainRemoveInactive except that instead of calling virDomainObjListRemove it calls virDomainObjListRemoveLocked. Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn> --- src/qemu/qemu_domain.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 191113a..22436d2 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -8394,6 +8394,28 @@ qemuDomainRemoveInactive(virQEMUDriverPtr driver, /** + * qemuDomainRemoveInactiveLocked: + * + * The caller must hold a lock to the vm and must hold the + * lock on driver->domains in order to call the remove obj + * from locked list method. + */ +static void +qemuDomainRemoveInactiveLocked(virQEMUDriverPtr driver, + virDomainObjPtr vm) +{ + if (vm->persistent) { + /* Short-circuit, we don't want to remove a persistent domain */ + return; + } + + qemuDomainRemoveInactiveCommon(driver, vm); + + virDomainObjListRemoveLocked(driver->domains, vm); +} + + +/** * qemuDomainRemoveInactiveJob: * * Just like qemuDomainRemoveInactive but it tries to grab a -- 1.8.3.1

Create a qemuDomainRemoveInactiveJobLocked which copies qemuDomainRemoveInactiveJob except of course calling virDomainObjListRemoveLocked. Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn> --- src/qemu/qemu_domain.c | 21 +++++++++++++++++++++ src/qemu/qemu_domain.h | 3 +++ 2 files changed, 24 insertions(+) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 22436d2..b20d430 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -8438,6 +8438,27 @@ qemuDomainRemoveInactiveJob(virQEMUDriverPtr driver, } +/** + * qemuDomainRemoveInactiveJobLocked: + * + * Similar to qemuDomainRemoveInactiveJob, except that the caller must + * also hold the lock @driver->domains + */ +void +qemuDomainRemoveInactiveJobLocked(virQEMUDriverPtr driver, + virDomainObjPtr vm) +{ + bool haveJob; + + haveJob = qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) >= 0; + + qemuDomainRemoveInactiveLocked(driver, vm); + + if (haveJob) + qemuDomainObjEndJob(driver, vm); +} + + void qemuDomainSetFakeReboot(virQEMUDriverPtr driver, virDomainObjPtr vm, diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 914c9a6..ecbe2e8 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -704,6 +704,9 @@ void qemuDomainRemoveInactive(virQEMUDriverPtr driver, void qemuDomainRemoveInactiveJob(virQEMUDriverPtr driver, virDomainObjPtr vm); +void qemuDomainRemoveInactiveJobLocked(virQEMUDriverPtr driver, + virDomainObjPtr vm); + void qemuDomainSetFakeReboot(virQEMUDriverPtr driver, virDomainObjPtr vm, bool value); -- 1.8.3.1

qemuProcessReconnectHelper has hold lock on doms, if create qemuProcessReconnect thread failed, calls qemuDomainRemoveInactiveJob will lead to deadlock. Modify qemuProcessReconnectHelper to call qemuDomainRemoveInactiveJobLocked. Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn> --- src/qemu/qemu_process.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index f9a01da..711db10 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -8033,7 +8033,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj, */ qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED, QEMU_ASYNC_JOB_NONE, 0); - qemuDomainRemoveInactiveJob(src->driver, obj); + qemuDomainRemoveInactiveJobLocked(src->driver, obj); virDomainObjEndAPI(&obj); virNWFilterUnlockFilterUpdates(); -- 1.8.3.1

On 9/21/18 12:35 AM, Wang Yechao wrote:
v3 patch: https://www.redhat.com/archives/libvir-list/2018-September/msg00950.html
--- Changes in v4: - Split up v3 into four parts
Wang Yechao (4): qemu: Split up qemuDomainRemoveInactive qemu: Create a qemuDomainRemoveInactiveLocked qemu: Create a qemuDomainRemoveInactiveJobLocked qemu: Fix deadlock if create qemuProcessReconnect thread failed
src/qemu/qemu_domain.c | 79 ++++++++++++++++++++++++++++++++++++++++--------- src/qemu/qemu_domain.h | 3 ++ src/qemu/qemu_process.c | 2 +- 3 files changed, 69 insertions(+), 15 deletions(-)
Thanks for splitting. I cleaned up the commit messages, merged patches 2&3 (must be able successfully build after each patch), and pushed the series. Reviewed-by: John Ferlan <jferlan@redhat.com> John
participants (2)
-
John Ferlan
-
Wang Yechao