[PATCH 0/8] qemu: fixes for blockjobs with <dataStore>

Couple fixes for the <dataStore> feature when used with blockjobs: - fixes for qemuBlockReopenAccess (code driving 'blockdev-reopen' in qemu) for bugs noticed during testing of <dataStore> - the 'auto-read-only' qemu feature doesn't apply od the 'data-file' blockdevs, thus libvirt needs to handle them explicitly - lookup of the node name associated with the <dataStore> volume in the chains (for block threshold event handling) Peter Krempa (8): qemuBlockReopenAccess: Add debug log entry about state of the image qemuBlockReopenAccess: Fix update of 'readonly' state qemuBlockReopenAccess: Don't require backing chain terminator for non-chained images qemu: block: Ensure that <dataStore> is in appropriate state qemu: snapshot: Change 'data-file' to read-only after snapshot qemuDomainVirStorageSourceFindByNodeName: Extract nodename matching qemuDomainVirStorageSourceFindByNodeName: Match also '<dataStore>' sources qemuDomainGetStorageSourceByDevstr: Lookup also '<dataStore>' src/qemu/qemu_block.c | 60 +++++++++++++++++++++++++++++++--------- src/qemu/qemu_blockjob.c | 16 ++++++++++- src/qemu/qemu_domain.c | 38 +++++++++++++++++++++---- src/qemu/qemu_snapshot.c | 12 ++++++-- 4 files changed, 105 insertions(+), 21 deletions(-) -- 2.47.0

Log the node name and current and expected state to simplify debugging. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_block.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/src/qemu/qemu_block.c b/src/qemu/qemu_block.c index dab9ce4dc2..088c128424 100644 --- a/src/qemu/qemu_block.c +++ b/src/qemu/qemu_block.c @@ -3174,6 +3174,10 @@ qemuBlockReopenAccess(virDomainObj *vm, int rc; int ret = -1; + VIR_DEBUG("nodename:'%s' current-ro:'%d requested-ro='%d'", + qemuBlockStorageSourceGetEffectiveNodename(src), + src->readonly, readonly); + if (src->readonly == readonly) return 0; -- 2.47.0

Refactors done in 24b667eeed78d2df (and also 9ec0e28e876b17df9) broke the expected handling of the update of 'readonly' flag of a virStorage. The source is actually set to the proper state but rolled back to the previous state as the 'cleanup' label should have been 'error' and thus not reached on success. Additionally some of the code paths violate the statement in the comment after updating 'readonly' that only 'goto error' must be used. Fixes: 24b667eeed78d2df0376a38a592ed9d8c2744bdc Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_block.c | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/src/qemu/qemu_block.c b/src/qemu/qemu_block.c index 088c128424..a7c8be8d8b 100644 --- a/src/qemu/qemu_block.c +++ b/src/qemu/qemu_block.c @@ -3172,7 +3172,6 @@ qemuBlockReopenAccess(virDomainObj *vm, g_autoptr(virJSONValue) reopenoptions = virJSONValueNewArray(); g_autoptr(virJSONValue) srcprops = NULL; int rc; - int ret = -1; VIR_DEBUG("nodename:'%s' current-ro:'%d requested-ro='%d'", qemuBlockStorageSourceGetEffectiveNodename(src), @@ -3190,39 +3189,39 @@ qemuBlockReopenAccess(virDomainObj *vm, } src->readonly = readonly; - /* from now on all error paths must use 'goto cleanup' */ + /* from now on all error paths must use 'goto error' which restores the original state */ /* based on which is the current 'effecitve' layer we must reopen the * appropriate blockdev */ if (qemuBlockStorageSourceGetFormatNodename(src)) { if (!(srcprops = qemuBlockStorageSourceGetFormatProps(src, src->backingStore))) - return -1; + goto error; } else if (qemuBlockStorageSourceGetSliceNodename(src)) { if (!(srcprops = qemuBlockStorageSourceGetBlockdevStorageSliceProps(src, true, false))) - return -1; + goto error; } else { if (!(srcprops = qemuBlockStorageSourceGetBackendProps(src, QEMU_BLOCK_STORAGE_SOURCE_BACKEND_PROPS_EFFECTIVE_NODE))) - return -1; + goto error; } if (virJSONValueArrayAppend(reopenoptions, &srcprops) < 0) - return -1; + goto error; if (qemuDomainObjEnterMonitorAsync(vm, asyncJob) < 0) - goto cleanup; + goto error; rc = qemuMonitorBlockdevReopen(qemuDomainGetMonitor(vm), &reopenoptions); qemuDomainObjExitMonitor(vm); if (rc < 0) - goto cleanup; + goto error; - ret = 0; + return 0; - cleanup: + error: src->readonly = !readonly; - return ret; + return -1; } -- 2.47.0

Add an exception for image formats not supporting backing images so that they can be reopened RW/RO without the need for adding a terminating virStorageSource as they simply can't have a backing image. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_block.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/qemu/qemu_block.c b/src/qemu/qemu_block.c index a7c8be8d8b..af317a1f1f 100644 --- a/src/qemu/qemu_block.c +++ b/src/qemu/qemu_block.c @@ -3182,7 +3182,7 @@ qemuBlockReopenAccess(virDomainObj *vm, /* If we are lacking the object here, qemu might have opened an image with * a node name unknown to us */ - if (!src->backingStore) { + if (src->format >= VIR_STORAGE_FILE_BACKING && !src->backingStore) { virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", _("can't reopen image with unknown presence of backing store")); return -1; -- 2.47.0

In contrast to normal backing chain members where qemu does honour the 'auto-read-only' property the 'data-file' nodes are not automatically reopened by qemu. Libvirt now has the infrastructure to reopen them explicitly so use it for all transitions of the 'commit' block job. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_block.c | 33 ++++++++++++++++++++++++++++++++- src/qemu/qemu_blockjob.c | 16 +++++++++++++++- 2 files changed, 47 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_block.c b/src/qemu/qemu_block.c index af317a1f1f..35dca8ee7b 100644 --- a/src/qemu/qemu_block.c +++ b/src/qemu/qemu_block.c @@ -3696,6 +3696,15 @@ qemuBlockCommit(virDomainObj *vm, false, false, false) < 0) goto cleanup; + if (baseSource->dataFileStore) { + if (qemuDomainStorageSourceAccessAllow(driver, vm, baseSource->dataFileStore, + false, false, false) < 0) + goto cleanup; + + if (qemuBlockReopenReadWrite(vm, baseSource->dataFileStore, asyncJob) < 0) + goto cleanup; + } + if (top_parent && top_parent != disk->src) { /* While top_parent is topmost image, we don't need to remember its * owner as it will be overwritten upon finishing the commit. Hence, @@ -3703,6 +3712,15 @@ qemuBlockCommit(virDomainObj *vm, if (qemuDomainStorageSourceAccessAllow(driver, vm, top_parent, false, false, false) < 0) goto cleanup; + + if (top_parent->dataFileStore) { + if (qemuDomainStorageSourceAccessAllow(driver, vm, top_parent->dataFileStore, + false, false, false) < 0) + goto cleanup; + + if (qemuBlockReopenReadWrite(vm, top_parent->dataFileStore, asyncJob) < 0) + goto cleanup; + } } if (!(job = qemuBlockJobDiskNewCommit(vm, disk, top_parent, topSource, @@ -3748,12 +3766,25 @@ qemuBlockCommit(virDomainObj *vm, if (rc < 0 && clean_access) { virErrorPtr orig_err; virErrorPreserveLast(&orig_err); + /* Revert access to read-only, if possible. */ + if (baseSource->dataFileStore) { + qemuDomainStorageSourceAccessAllow(driver, vm, baseSource->dataFileStore, + true, false, false); + qemuBlockReopenReadOnly(vm, baseSource->dataFileStore, asyncJob); + } qemuDomainStorageSourceAccessAllow(driver, vm, baseSource, true, false, false); - if (top_parent && top_parent != disk->src) + if (top_parent && top_parent != disk->src) { + if (top_parent->dataFileStore) { + qemuDomainStorageSourceAccessAllow(driver, vm, top_parent->dataFileStore, + true, false, false); + + qemuBlockReopenReadWrite(vm, top_parent->dataFileStore, asyncJob); + } qemuDomainStorageSourceAccessAllow(driver, vm, top_parent, true, false, false); + } virErrorRestore(&orig_err); } diff --git a/src/qemu/qemu_blockjob.c b/src/qemu/qemu_blockjob.c index c35321790e..4e77543fa8 100644 --- a/src/qemu/qemu_blockjob.c +++ b/src/qemu/qemu_blockjob.c @@ -1064,11 +1064,25 @@ qemuBlockJobProcessEventCompletedCommit(virQEMUDriver *driver, return; /* revert access to images */ + if (job->data.commit.base->dataFileStore) { + qemuDomainStorageSourceAccessAllow(driver, vm, job->data.commit.base->dataFileStore, + true, false, false); + qemuBlockReopenReadOnly(vm, job->data.commit.base->dataFileStore, asyncJob); + } qemuDomainStorageSourceAccessAllow(driver, vm, job->data.commit.base, true, false, false); - if (job->data.commit.topparent != job->disk->src) + + if (job->data.commit.topparent != job->disk->src) { + if (job->data.commit.topparent->dataFileStore) { + qemuDomainStorageSourceAccessAllow(driver, vm, job->data.commit.topparent->dataFileStore, + true, false, false); + + qemuBlockReopenReadWrite(vm, job->data.commit.topparent->dataFileStore, asyncJob); + } qemuDomainStorageSourceAccessAllow(driver, vm, job->data.commit.topparent, true, false, true); + } + baseparent->backingStore = NULL; job->data.commit.topparent->backingStore = job->data.commit.base; -- 2.47.0

For the reason outlined in previous commit qemu doesn't do this automatically. Handle it manually after the snapshot. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_snapshot.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c index 5b3aadcbf0..f880d1eeec 100644 --- a/src/qemu/qemu_snapshot.c +++ b/src/qemu/qemu_snapshot.c @@ -1434,12 +1434,14 @@ qemuSnapshotGetTransientDiskDef(virDomainDiskDef *domdisk, * qemuSnapshotDiskUpdateSource: * @vm: domain object * @dd: snapshot disk data object + * @asyncJob: async job type * * Updates disk definition after a successful snapshot. */ static void qemuSnapshotDiskUpdateSource(virDomainObj *vm, - qemuSnapshotDiskData *dd) + qemuSnapshotDiskData *dd, + virDomainAsyncJob asyncJob) { qemuDomainObjPrivate *priv = vm->privateData; virQEMUDriver *driver = priv->driver; @@ -1451,6 +1453,11 @@ qemuSnapshotDiskUpdateSource(virDomainObj *vm, if (qemuSecurityMoveImageMetadata(driver, vm, dd->disk->src, dd->src) < 0) VIR_WARN("Unable to move disk metadata on vm %s", vm->def->name); + /* if the original image has a data-file turn it read-only */ + if (dd->disk->src->dataFileStore) { + ignore_value(qemuBlockReopenReadOnly(vm, dd->disk->src->dataFileStore, asyncJob)); + } + /* unlock the write lock on the original image as qemu will no longer write to it */ virDomainLockImageDetach(driver->lockManager, vm, dd->disk->src); @@ -1470,6 +1477,7 @@ qemuSnapshotDiskUpdateSource(virDomainObj *vm, dd->persistsrc->backingStore = g_steal_pointer(&dd->persistdisk->src); dd->persistdisk->src = g_steal_pointer(&dd->persistsrc); } + } @@ -1498,7 +1506,7 @@ qemuSnapshotDiskCreate(qemuSnapshotDiskContext *snapctxt) virDomainAuditDisk(snapctxt->vm, dd->disk->src, dd->src, "snapshot", rc >= 0); if (rc == 0) - qemuSnapshotDiskUpdateSource(snapctxt->vm, dd); + qemuSnapshotDiskUpdateSource(snapctxt->vm, dd, snapctxt->asyncJob); } if (rc < 0) -- 2.47.0

On Tue, Nov 26, 2024 at 16:16:16 +0100, Peter Krempa wrote:
For the reason outlined in previous commit qemu doesn't do this automatically. Handle it manually after the snapshot.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_snapshot.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c index 5b3aadcbf0..f880d1eeec 100644 --- a/src/qemu/qemu_snapshot.c +++ b/src/qemu/qemu_snapshot.c ... @@ -1470,6 +1477,7 @@ qemuSnapshotDiskUpdateSource(virDomainObj *vm, dd->persistsrc->backingStore = g_steal_pointer(&dd->persistdisk->src); dd->persistdisk->src = g_steal_pointer(&dd->persistsrc); } + }
Looks like an accidental change, which was not supposed to be included in the commit. Jirka

On Thu, Nov 28, 2024 at 10:16:10 +0100, Jiri Denemark wrote:
On Tue, Nov 26, 2024 at 16:16:16 +0100, Peter Krempa wrote:
For the reason outlined in previous commit qemu doesn't do this automatically. Handle it manually after the snapshot.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_snapshot.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c index 5b3aadcbf0..f880d1eeec 100644 --- a/src/qemu/qemu_snapshot.c +++ b/src/qemu/qemu_snapshot.c ... @@ -1470,6 +1477,7 @@ qemuSnapshotDiskUpdateSource(virDomainObj *vm, dd->persistsrc->backingStore = g_steal_pointer(&dd->persistdisk->src); dd->persistdisk->src = g_steal_pointer(&dd->persistsrc); } + }
Looks like an accidental change, which was not supposed to be included in the commit.
Indeed. It was the place where I put the code originally, later realizing that it doesn't make sense :)

Extract the matching of the node name of a single virStorage source so that the logic can be extended in the upcoming patch. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_domain.c | 25 ++++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index d28ff0cd22..150f0736f3 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -2884,6 +2884,25 @@ qemuDomainObjPrivateXMLParseBlockjobChain(xmlNodePtr node, } +/** + * qemuDomainVirStorageSourceMatchNodename: + * @src: storage source to match + * @nodeName to match + * + * Returns true if any of the nodenames of @src matches @nodeName. + */ +static bool +qemuDomainVirStorageSourceMatchNodename(virStorageSource *src, + const char *nodeName) +{ + const char *nodestorage = qemuBlockStorageSourceGetStorageNodename(src); + const char *nodeformat = qemuBlockStorageSourceGetFormatNodename(src); + + return (nodeformat && STREQ(nodeformat, nodeName)) || + (nodestorage && STREQ(nodestorage, nodeName)); +} + + /** * qemuDomainVirStorageSourceFindByNodeName: * @top: backing chain top @@ -2900,11 +2919,7 @@ qemuDomainVirStorageSourceFindByNodeName(virStorageSource *top, virStorageSource *tmp; for (tmp = top; virStorageSourceIsBacking(tmp); tmp = tmp->backingStore) { - const char *nodestorage = qemuBlockStorageSourceGetStorageNodename(tmp); - const char *nodeformat = qemuBlockStorageSourceGetFormatNodename(tmp); - - if ((nodeformat && STREQ(nodeformat, nodeName)) || - (nodestorage && STREQ(nodestorage, nodeName))) + if (qemuDomainVirStorageSourceMatchNodename(tmp, nodeName)) return tmp; } -- 2.47.0

On Tue, Nov 26, 2024 at 16:16:17 +0100, Peter Krempa wrote:
Extract the matching of the node name of a single virStorage source so that the logic can be extended in the upcoming patch.
This is confusing. I was expecting the logic in qemuDomainVirStorageSourceFindByNodeName to be extended in the following patch, but in reality you just needed to reuse the same code in another place. That is the goal of moving the code to a separate function was to avoid code duplication. Jirka

On Thu, Nov 28, 2024 at 10:22:53 +0100, Jiri Denemark wrote:
On Tue, Nov 26, 2024 at 16:16:17 +0100, Peter Krempa wrote:
Extract the matching of the node name of a single virStorage source so that the logic can be extended in the upcoming patch.
This is confusing. I was expecting the logic in qemuDomainVirStorageSourceFindByNodeName to be extended in the following patch, but in reality you just needed to reuse the same code in another place. That is the goal of moving the code to a separate function was to avoid code duplication.
so s/extended/reused/ ?

On Thu, Nov 28, 2024 at 10:24:07 +0100, Peter Krempa wrote:
On Thu, Nov 28, 2024 at 10:22:53 +0100, Jiri Denemark wrote:
On Tue, Nov 26, 2024 at 16:16:17 +0100, Peter Krempa wrote:
Extract the matching of the node name of a single virStorage source so that the logic can be extended in the upcoming patch.
This is confusing. I was expecting the logic in qemuDomainVirStorageSourceFindByNodeName to be extended in the following patch, but in reality you just needed to reuse the same code in another place. That is the goal of moving the code to a separate function was to avoid code duplication.
so s/extended/reused/ ?
Sounds good. Jirka

As the source for the data file is a completely separate virStorageSource including it's own index we need to match it explicitly, so that code such as storage threshold events work properly and separately for the data file. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_domain.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 150f0736f3..4499fed6bc 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -2921,6 +2921,10 @@ qemuDomainVirStorageSourceFindByNodeName(virStorageSource *top, for (tmp = top; virStorageSourceIsBacking(tmp); tmp = tmp->backingStore) { if (qemuDomainVirStorageSourceMatchNodename(tmp, nodeName)) return tmp; + + if (tmp->dataFileStore && + qemuDomainVirStorageSourceMatchNodename(tmp->dataFileStore, nodeName)) + return tmp->dataFileStore; } return NULL; -- 2.47.0

The <dataStore> volumes have their own 'id' so we need to be able to look them up for the given image chain. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_domain.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 4499fed6bc..1fc4e2f33f 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -9260,12 +9260,18 @@ qemuDomainGetStorageSourceByDevstr(const char *devstr, for (n = disk->src; virStorageSourceIsBacking(n); n = n->backingStore) { if (n->id == idx) return n; + + if (n->dataFileStore && n->dataFileStore->id == idx) + return n->dataFileStore; } if (disk->mirror) { for (n = disk->mirror; virStorageSourceIsBacking(n); n = n->backingStore) { if (n->id == idx) return n; + + if (n->dataFileStore && n->dataFileStore->id == idx) + return n->dataFileStore; } } @@ -9281,6 +9287,9 @@ qemuDomainGetStorageSourceByDevstr(const char *devstr, for (n = backupdisk->store; virStorageSourceIsBacking(n); n = n->backingStore) { if (n->id == idx) return n; + + if (n->dataFileStore && n->dataFileStore->id == idx) + return n->dataFileStore; } } } -- 2.47.0

On Tue, Nov 26, 2024 at 16:16:11 +0100, Peter Krempa wrote:
Couple fixes for the <dataStore> feature when used with blockjobs: - fixes for qemuBlockReopenAccess (code driving 'blockdev-reopen' in qemu) for bugs noticed during testing of <dataStore> - the 'auto-read-only' qemu feature doesn't apply od the 'data-file' blockdevs, thus libvirt needs to handle them explicitly - lookup of the node name associated with the <dataStore> volume in the chains (for block threshold event handling)
Peter Krempa (8): qemuBlockReopenAccess: Add debug log entry about state of the image qemuBlockReopenAccess: Fix update of 'readonly' state qemuBlockReopenAccess: Don't require backing chain terminator for non-chained images qemu: block: Ensure that <dataStore> is in appropriate state qemu: snapshot: Change 'data-file' to read-only after snapshot qemuDomainVirStorageSourceFindByNodeName: Extract nodename matching qemuDomainVirStorageSourceFindByNodeName: Match also '<dataStore>' sources qemuDomainGetStorageSourceByDevstr: Lookup also '<dataStore>'
src/qemu/qemu_block.c | 60 +++++++++++++++++++++++++++++++--------- src/qemu/qemu_blockjob.c | 16 ++++++++++- src/qemu/qemu_domain.c | 38 +++++++++++++++++++++---- src/qemu/qemu_snapshot.c | 12 ++++++-- 4 files changed, 105 insertions(+), 21 deletions(-)
After addressing my comments in 5/8 and 6/8... Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
participants (2)
-
Jiri Denemark
-
Peter Krempa