[PATCH 0/2] Properly fix backup job termination

Peter Krempa (2):
  backup: Store 'apiFlags' in private section of virDomainBackupDef
  qemuBackupJobTerminate: Fix job termination for inactive VMs

 src/conf/backup_conf.h  |  2 ++
 src/qemu/qemu_backup.c  | 33 ++++++++++++++++++---------------
 src/qemu/qemu_process.c |  9 ++++++---
 3 files changed, 26 insertions(+), 18 deletions(-)

--
2.29.2

[PATCH 1/2] backup: Store 'apiFlags' in private section of virDomainBackupDef

'qemuBackupJobTerminate' needs the API flags to see whether
VIR_DOMAIN_BACKUP_BEGIN_REUSE_EXTERNAL was used. Unfortunately when called
via qemuProcessReconnect()->qemuProcessStop() early (e.g. if the qemu
process died while we were reconnecting) the job is cleared temporarily so
that other APIs can be called. This would mean that we couldn't clean up
the files in some cases.

Save the 'apiFlags' inside the backup object and set it from the
'qemuDomainJobObj' 'apiFlags' member when reconnecting to a VM.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 src/conf/backup_conf.h  | 2 ++
 src/qemu/qemu_backup.c  | 4 +++-
 src/qemu/qemu_process.c | 9 ++++++---
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/src/conf/backup_conf.h b/src/conf/backup_conf.h
index bda2bdcfe4..2902f39fb7 100644
--- a/src/conf/backup_conf.h
+++ b/src/conf/backup_conf.h
@@ -99,6 +99,8 @@ struct _virDomainBackupDef {
     unsigned long long pull_tmp_total;
 
     char *errmsg; /* error message of failed sub-blockjob */
+
+    unsigned int apiFlags; /* original flags used when starting the job */
 };
 
 typedef enum {
diff --git a/src/qemu/qemu_backup.c b/src/qemu/qemu_backup.c
index 6ce29c28e1..f6096f643f 100644
--- a/src/qemu/qemu_backup.c
+++ b/src/qemu/qemu_backup.c
@@ -560,7 +560,7 @@ qemuBackupJobTerminate(virDomainObjPtr vm,
     qemuDomainObjPrivatePtr priv = vm->privateData;
     size_t i;
 
-    if (!(priv->job.apiFlags & VIR_DOMAIN_BACKUP_BEGIN_REUSE_EXTERNAL) &&
+    if (!(priv->backup->apiFlags & VIR_DOMAIN_BACKUP_BEGIN_REUSE_EXTERNAL) &&
         (priv->backup->type == VIR_DOMAIN_BACKUP_TYPE_PULL ||
          (priv->backup->type == VIR_DOMAIN_BACKUP_TYPE_PUSH &&
           jobstatus != QEMU_DOMAIN_JOB_STATUS_COMPLETED))) {
@@ -766,6 +766,8 @@ qemuBackupBegin(virDomainObjPtr vm,
     if (def->type == VIR_DOMAIN_BACKUP_TYPE_PULL)
         pull = true;
 
+    def->apiFlags = flags;
+
     /* we'll treat this kind of backup job as an asyncjob as it uses some of the
      * infrastructure for async jobs. We'll allow standard modify-type jobs
      * as the interlocking of conflicting operations is handled on the block
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 89ede27751..971a270793 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -96,6 +96,7 @@
 #include "virthreadjob.h"
 #include "virutil.h"
 #include "storage_source.h"
+#include "backup_conf.h"
 
 #define VIR_FROM_THIS VIR_FROM_QEMU
 
@@ -8315,12 +8316,14 @@ qemuProcessReconnect(void *opaque)
     g_clear_object(&data->identity);
     VIR_FREE(data);
 
+    cfg = virQEMUDriverGetConfig(driver);
+    priv = obj->privateData;
+
     qemuDomainObjRestoreJob(obj, &oldjob);
     if (oldjob.asyncJob == QEMU_ASYNC_JOB_MIGRATION_IN)
         stopFlags |= VIR_QEMU_PROCESS_STOP_MIGRATED;
-
-    cfg = virQEMUDriverGetConfig(driver);
-    priv = obj->privateData;
+    if (oldjob.asyncJob == QEMU_ASYNC_JOB_BACKUP && priv->backup)
+        priv->backup->apiFlags = oldjob.apiFlags;
 
     /* expect that libvirt might have crashed during VM start, so prevent
      * cleanup of transient disks */
--
2.29.2
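The idea behind this patch can be illustrated outside of libvirt. The following is only a minimal sketch, assuming heavily simplified stand-in types: every `sketch_*` name and `SKETCH_REUSE_EXTERNAL` is hypothetical and not part of libvirt. It shows why a copy of the flags kept in the backup object survives the job being cleared during reconnect.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for VIR_DOMAIN_BACKUP_BEGIN_REUSE_EXTERNAL. */
#define SKETCH_REUSE_EXTERNAL (1u << 0)

/* Simplified job state; the real qemuDomainJobObj has many more members. */
struct sketch_job {
    bool isBackup;
    unsigned int apiFlags;
};

/* Simplified backup definition carrying its own copy of the flags. */
struct sketch_backup {
    unsigned int apiFlags;
};

struct sketch_domain {
    struct sketch_job job;
    struct sketch_backup *backup;
};

/* Save the current job into 'oldjob' and clear it, so the live job no
 * longer holds the flags (a rough model of what reconnecting does). */
static void
sketch_restore_job(struct sketch_domain *dom, struct sketch_job *oldjob)
{
    *oldjob = dom->job;
    dom->job.isBackup = false;
    dom->job.apiFlags = 0;
}

/* Reconnect: copy the saved flags into the backup object if one exists. */
static void
sketch_reconnect(struct sketch_domain *dom)
{
    struct sketch_job oldjob;

    sketch_restore_job(dom, &oldjob);
    if (oldjob.isBackup && dom->backup)
        dom->backup->apiFlags = oldjob.apiFlags;
}

/* The cleanup decision reads the backup object, not the (cleared) job.
 * The real condition also depends on the backup type and job status. */
static bool
sketch_should_remove_files(const struct sketch_domain *dom)
{
    return dom->backup &&
           !(dom->backup->apiFlags & SKETCH_REUSE_EXTERNAL);
}
```

In this model, calling `sketch_reconnect()` on a domain whose job carried `SKETCH_REUSE_EXTERNAL` leaves the live job's flags at zero while the backup object still reports the flag, so `sketch_should_remove_files()` returns false.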

[PATCH 2/2] qemuBackupJobTerminate: Fix job termination for inactive VMs

Commit cb29e4e801d didn't take into account that the VM can be inactive
when it's destroyed. This means that the job would remain active even
after the VM became inactive.

To fix this properly:

1) Remove the bogus VM liveness check and early return (reverts the
   aforementioned commit)
2) Assign the stats only when the stats object is present (properly fixes
   the crash when the VM dies while reconnecting)
3) End the asyncjob only when it was already set (prevents corruption of
   priv->jobs_queued)

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1937598
Fixes: cb29e4e801d
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---

I didn't come up with a reasonable split that would avoid having
'technically' 3 changes in one patch while still making sense.

 src/qemu/qemu_backup.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/src/qemu/qemu_backup.c b/src/qemu/qemu_backup.c
index f6096f643f..f91d632715 100644
--- a/src/qemu/qemu_backup.c
+++ b/src/qemu/qemu_backup.c
@@ -583,27 +583,28 @@ qemuBackupJobTerminate(virDomainObjPtr vm,
         }
     }
 
-    if (!virDomainObjIsActive(vm))
-        return;
-
-    qemuDomainJobInfoUpdateTime(priv->job.current);
+    if (priv->job.current) {
+        qemuDomainJobInfoUpdateTime(priv->job.current);
 
-    g_clear_pointer(&priv->job.completed, qemuDomainJobInfoFree);
-    priv->job.completed = qemuDomainJobInfoCopy(priv->job.current);
+        g_clear_pointer(&priv->job.completed, qemuDomainJobInfoFree);
+        priv->job.completed = qemuDomainJobInfoCopy(priv->job.current);
 
-    priv->job.completed->stats.backup.total = priv->backup->push_total;
-    priv->job.completed->stats.backup.transferred = priv->backup->push_transferred;
-    priv->job.completed->stats.backup.tmp_used = priv->backup->pull_tmp_used;
-    priv->job.completed->stats.backup.tmp_total = priv->backup->pull_tmp_total;
+        priv->job.completed->stats.backup.total = priv->backup->push_total;
+        priv->job.completed->stats.backup.transferred = priv->backup->push_transferred;
+        priv->job.completed->stats.backup.tmp_used = priv->backup->pull_tmp_used;
+        priv->job.completed->stats.backup.tmp_total = priv->backup->pull_tmp_total;
 
-    priv->job.completed->status = jobstatus;
-    priv->job.completed->errmsg = g_strdup(priv->backup->errmsg);
+        priv->job.completed->status = jobstatus;
+        priv->job.completed->errmsg = g_strdup(priv->backup->errmsg);
 
-    qemuDomainEventEmitJobCompleted(priv->driver, vm);
+        qemuDomainEventEmitJobCompleted(priv->driver, vm);
+    }
 
     virDomainBackupDefFree(priv->backup);
     priv->backup = NULL;
-    qemuDomainObjEndAsyncJob(priv->driver, vm);
+
+    if (priv->job.asyncJob == QEMU_ASYNC_JOB_BACKUP)
+        qemuDomainObjEndAsyncJob(priv->driver, vm);
 }
 
 
--
2.29.2
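The three changes described in the commit message can be modelled in isolation. This is only a rough sketch under stated assumptions: the `sketch_*` types and the `jobsEnded` counter are hypothetical stand-ins invented for illustration, and plain `malloc`/`free` replace the GLib and libvirt helpers used in the real function.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the async job type and job statistics. */
enum sketch_async_job { SKETCH_ASYNC_JOB_NONE, SKETCH_ASYNC_JOB_BACKUP };

struct sketch_stats { int status; };
struct sketch_backup { unsigned long long total; };

struct sketch_priv {
    struct sketch_stats *current;    /* may be NULL if the job was cleared */
    struct sketch_stats *completed;  /* owned copy of the final stats */
    struct sketch_backup *backup;    /* owned backup definition */
    enum sketch_async_job asyncJob;
    int jobsEnded;                   /* counts how often the job was ended */
};

static void
sketch_terminate(struct sketch_priv *priv, int jobstatus)
{
    /* No liveness check and no early return: cleanup always runs. */

    /* Record completed stats only when a stats object is present. */
    if (priv->current) {
        free(priv->completed);
        priv->completed = malloc(sizeof(*priv->completed));
        if (priv->completed) {
            *priv->completed = *priv->current;
            priv->completed->status = jobstatus;
        }
    }

    /* The backup definition is always released. */
    free(priv->backup);
    priv->backup = NULL;

    /* End the async job only if a backup job is actually set. */
    if (priv->asyncJob == SKETCH_ASYNC_JOB_BACKUP) {
        priv->asyncJob = SKETCH_ASYNC_JOB_NONE;
        priv->jobsEnded++;
    }
}
```

With `current == NULL` and no async job set, the function in this model still frees the backup definition but neither dereferences the missing stats nor ends a job that was never started, which corresponds to the case the commit message describes for a VM that died while reconnecting.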

On a Thursday in 2021, Peter Krempa wrote:
> Peter Krempa (2):
>   backup: Store 'apiFlags' in private section of virDomainBackupDef
>   qemuBackupJobTerminate: Fix job termination for inactive VMs
>
>  src/conf/backup_conf.h  |  2 ++
>  src/qemu/qemu_backup.c  | 33 ++++++++++++++++++---------------
>  src/qemu/qemu_process.c |  9 ++++++---
>  3 files changed, 26 insertions(+), 18 deletions(-)

Reviewed-by: Ján Tomko <jtomko@redhat.com>

Jano