Re: [libvirt PATCH 27/30] qemu_process: abort snapshot delete when daemon starts

14 Dec 2022

On Thu, Dec 08, 2022 at 14:31:03 +0100, Pavel Hrdina wrote:
...
If the daemon crashes or is restarted while the snapshot delete is in
progress we have to handle it gracefully to not leave any block jobs
active.

For now we will simply abort the snapshot delete operation so user can
start it again.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
---
 src/qemu/qemu_process.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 5de55435d2..cc23b4a799 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3677,6 +3677,37 @@ qemuProcessRecoverMigration(virQEMUDriver *driver,
 }
+static void
+qemuProcessAbortSnapshotDelete(virDomainObj *vm)
+{
+    size_t i;
+    qemuDomainObjPrivate *priv = vm->privateData;
+
+    for (i = 0; i < vm->def->ndisks; i++) {
+        virDomainDiskDef *disk = vm->def->disks[i];
+        g_autoptr(qemuBlockJobData) diskJob = qemuBlockJobDiskGetJob(disk);
+
+        if (!diskJob)
+            continue;
+
+        if (diskJob->type != QEMU_BLOCKJOB_TYPE_COMMIT &&
+            diskJob->type != QEMU_BLOCKJOB_TYPE_ACTIVE_COMMIT) {
+            continue;
+        }
+
+        qemuBlockJobSyncBegin(diskJob);
+
+        qemuDomainObjEnterMonitor(vm);
+        ignore_value(qemuMonitorBlockJobCancel(priv->mon, diskJob->name, false));
+        qemuDomainObjExitMonitor(vm);
+
+        diskJob->state = QEMU_BLOCKJOB_STATE_ABORTING;
+
+        qemuBlockJobSyncEnd(vm, diskJob, VIR_ASYNC_JOB_NONE);
+    }
+}
+
+
 static int
 qemuProcessRecoverJob(virQEMUDriver *driver,
                       virDomainObj *vm,

So assume that you have a VM where you've e.g. added a new disk after it
already had some snapshots, and you now have a running blockjob on the new
disk.

Now you try to delete a snapshot. The unrelated job could get restarted as
this job doesn't distinguish between those situations.

This is because a regular block job doesn't even register as an async job.

The simplest fix will be to call qemuDomainHasBlockjob and refuse the
whole deletion.
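
To illustrate the shape of that check, here is a minimal standalone sketch.
It does not use the real libvirt types: SketchDisk, SketchDomain,
sketchDomainHasBlockjob and sketchSnapshotDeleteValidate are hypothetical
stand-ins invented for this example, and the real check would call
qemuDomainHasBlockjob on the virDomainObj wherever the snapshot deletion is
validated.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libvirt disk and domain objects,
 * reduced to the one fact this check cares about. */
typedef struct {
    const char *dst;     /* disk target name, e.g. "vda" */
    bool hasBlockjob;    /* true if any block job is active on this disk */
} SketchDisk;

typedef struct {
    SketchDisk *disks;
    size_t ndisks;
} SketchDomain;

/* Assumed to mirror the intent of qemuDomainHasBlockjob():
 * report whether any disk of the domain has an active block job. */
static bool
sketchDomainHasBlockjob(const SketchDomain *vm)
{
    size_t i;

    for (i = 0; i < vm->ndisks; i++) {
        if (vm->disks[i].hasBlockjob)
            return true;
    }

    return false;
}

/* Refuse the whole snapshot deletion up front if a block job exists.
 * Returns 0 if the deletion may start, -1 if it is refused. */
static int
sketchSnapshotDeleteValidate(const SketchDomain *vm)
{
    if (sketchDomainHasBlockjob(vm))
        return -1;

    return 0;
}
```

With a check like this in front of the deletion, any commit job found by the
recovery code after a daemon restart can only belong to the snapshot delete
itself, so cancelling it would not touch an unrelated job.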

Once you add that to the appropriate place:

Reviewed-by: Peter Krempa <pkrempa@redhat.com>