[RFC PATCH 0/7] Fix issue when storage migration is requested but there are no disks to migrate

In certain weird cases and due to very crusty code we'd instruct qemu to migrate storage using the old style storage migration when there wasn't anything to migrate. This was caused by a recent refactor. Note that the series is RFC as I didn't really test it yet, but feel free to provide feedback. Peter Krempa (7): util: xml: Introduce virXMLFormatElementEmpty qemuMigrationCookieNBDXMLFormat: Format empty <nbd/> element qemuMigrationSrcNBDStorageCopy: Return error code on error qemuMigrationSrcNBDStorageCopy: Don't pass migrate_flags qemuMigrationSrcRun: Sanitize setting of cookieFlags and migrate_flags on storage migration qemuMigrationSrcRun: Don't attempt any storage migration if no disks will be migrated qemuMigrationSrcBeginPhase: Don't offer 'nbd' in cookie if there are no disks to migrate src/libvirt_private.syms | 1 + src/qemu/qemu_migration.c | 102 ++++++++++++++++++------------- src/qemu/qemu_migration_cookie.c | 2 +- src/util/virxml.c | 42 ++++++++----- src/util/virxml.h | 7 +++ 5 files changed, 95 insertions(+), 59 deletions(-) -- 2.30.2

Add a helper which will format an XML element with attributes and children, but compared to virXMLFormatElement it also formats an empty element if both buffers are empty. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virxml.c | 42 ++++++++++++++++++++++++++-------------- src/util/virxml.h | 7 +++++++ 3 files changed, 35 insertions(+), 15 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index b3f9c9681a..64002de39a 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -3541,6 +3541,7 @@ virXMLBufferCreate; virXMLCheckIllegalChars; virXMLExtractNamespaceXML; virXMLFormatElement; +virXMLFormatElementEmpty; virXMLNewNode; virXMLNodeContentString; virXMLNodeNameEqual; diff --git a/src/util/virxml.c b/src/util/virxml.c index 117f50f2bf..c2a49cbef2 100644 --- a/src/util/virxml.c +++ b/src/util/virxml.c @@ -1314,6 +1314,32 @@ virXMLValidatorFree(virXMLValidator *validator) } +/* same as virXMLFormatElement but outputs an empty element if @attrBuf and + * @childBuf are both empty */ +void +virXMLFormatElementEmpty(virBuffer *buf, + const char *name, + virBuffer *attrBuf, + virBuffer *childBuf) +{ + virBufferAsprintf(buf, "<%s", name); + + if (attrBuf && virBufferUse(attrBuf) > 0) + virBufferAddBuffer(buf, attrBuf); + + if (childBuf && virBufferUse(childBuf) > 0) { + virBufferAddLit(buf, ">\n"); + virBufferAddBuffer(buf, childBuf); + virBufferAsprintf(buf, "</%s>\n", name); + } else { + virBufferAddLit(buf, "/>\n"); + } + + virBufferFreeAndReset(attrBuf); + virBufferFreeAndReset(childBuf); +} + + /** * virXMLFormatElement * @buf: the parent buffer where the element will be placed @@ -1338,21 +1364,7 @@ virXMLFormatElement(virBuffer *buf, (!childBuf || virBufferUse(childBuf) == 0)) return; - virBufferAsprintf(buf, "<%s", name); - - if (attrBuf && virBufferUse(attrBuf) > 0) - virBufferAddBuffer(buf, attrBuf); - - if (childBuf && virBufferUse(childBuf) > 0) { - virBufferAddLit(buf, ">\n"); - virBufferAddBuffer(buf, childBuf); - virBufferAsprintf(buf, "</%s>\n", name); - } else { - virBufferAddLit(buf, "/>\n"); - } - - virBufferFreeAndReset(attrBuf); - virBufferFreeAndReset(childBuf); + virXMLFormatElementEmpty(buf, name, attrBuf, childBuf); } diff --git a/src/util/virxml.h b/src/util/virxml.h index de171dce12..a81db478f0 100644 --- a/src/util/virxml.h +++ b/src/util/virxml.h @@ -226,6 +226,13 @@ virXMLFormatElement(virBuffer *buf, virBuffer *attrBuf, virBuffer *childBuf); +void +virXMLFormatElementEmpty(virBuffer *buf, + const char *name, + virBuffer *attrBuf, + virBuffer *childBuf); + + struct _virXPathContextNodeSave { xmlXPathContextPtr ctxt; xmlNodePtr node; -- 2.30.2

On Tue, 2021-04-13 at 17:38 +0200, Peter Krempa wrote:
Add a helper which will format an XML element with attributes and children, but compared to virXMLFormatElement it also formats an empty element if both buffers are empty.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virxml.c | 42 ++++++++++++++++++++++++++-------------- src/util/virxml.h | 7 +++++++ 3 files changed, 35 insertions(+), 15 deletions(-)
Reviewed-by: Andrea Bolognani <abologna@redhat.com> -- Andrea Bolognani / Red Hat / Virtualization

Commit 518be41aaa3 refactored qemuMigrationCookieNBDXMLFormat to use virXMLFormatElement which in comparison to the previous code doesn't format the element if it's empty. Unfortunately some crusty bits of our migration code use questionable logic to assert use of the old-style storage migration parameters which breaks if no disks are being migrated and the <nbd/> element is not present. While later patches will fix the code, re-instate formatting of empty <nbd/> for increased compatibility. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_migration_cookie.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/qemu/qemu_migration_cookie.c b/src/qemu/qemu_migration_cookie.c index 41533e4549..52998ddd1b 100644 --- a/src/qemu/qemu_migration_cookie.c +++ b/src/qemu/qemu_migration_cookie.c @@ -790,7 +790,7 @@ qemuMigrationCookieNBDXMLFormat(qemuMigrationCookieNBD *nbd, virBufferAsprintf(&childBuf, " capacity='%llu'/>\n", nbd->disks[i].capacity); } - virXMLFormatElement(buf, "nbd", &attrBuf, &childBuf); + virXMLFormatElementEmpty(buf, "nbd", &attrBuf, &childBuf); } -- 2.30.2

On Tue, 2021-04-13 at 17:38 +0200, Peter Krempa wrote:
- virXMLFormatElement(buf, "nbd", &attrBuf, &childBuf); + virXMLFormatElementEmpty(buf, "nbd", &attrBuf, &childBuf);
The explanation in the commit message is good, but I think having at least a very short nod in the code itself would be nice. Reviewed-by: Andrea Bolognani <abologna@redhat.com> -- Andrea Bolognani / Red Hat / Virtualization

In case the 'nbdURI' schema is not known the code would report an error but wouldn't return failure. Fixes: 49186372dbe Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_migration.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index df88f954ed..1c3e599128 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1154,6 +1154,7 @@ qemuMigrationSrcNBDStorageCopy(virQEMUDriver *driver, virReportError(VIR_ERR_INVALID_ARG, _("Unsupported scheme in disks URI: %s"), uri->scheme); + return -1; } } -- 2.30.2

On Tue, 2021-04-13 at 17:38 +0200, Peter Krempa wrote:
In case the 'nbdURI' schema is not known the code would report an error but wouldn't return failure.
Fixes: 49186372dbe
You can use the full commit ID here. Reviewed-by: Andrea Bolognani <abologna@redhat.com> -- Andrea Bolognani / Red Hat / Virtualization

'migrate_flags' can be updated in the only caller and since qemuMigrationSrcNBDStorageCopy already takes @flags which contains VIR_MIGRATE_NON_SHARED_INC (used to set QEMU_MONITOR_MIGRATE_NON_SHARED_INC) we can completely remove the parameter. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_migration.c | 22 ++++++++-------------- 1 file changed, 8 insertions(+), 14 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 1c3e599128..99b278e218 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1074,14 +1074,11 @@ qemuMigrationSrcNBDStorageCopyOne(virQEMUDriver *driver, * @mig: migration cookie * @host: where are we migrating to * @speed: bandwidth limit in MiB/s - * @migrate_flags: migrate monitor command flags * * Migrate non-shared storage using the NBD protocol to the server running * inside the qemu process on dst and wait until the copy converges. - * On success update @migrate_flags so we don't tell 'migrate' command - * to do the very same operation. On failure, the caller is - * expected to call qemuMigrationSrcNBDCopyCancel to stop all - * running copy operations. + * On failure, the caller is expected to call qemuMigrationSrcNBDCopyCancel + * to stop all running copy operations. * * Returns 0 on success (@migrate_flags updated), * -1 otherwise. @@ -1092,7 +1089,6 @@ qemuMigrationSrcNBDStorageCopy(virQEMUDriver *driver, qemuMigrationCookie *mig, const char *host, unsigned long speed, - unsigned int *migrate_flags, size_t nmigrate_disks, const char **migrate_disks, virConnectPtr dconn, @@ -1104,7 +1100,7 @@ qemuMigrationSrcNBDStorageCopy(virQEMUDriver *driver, int port; size_t i; unsigned long long mirror_speed = speed; - bool mirror_shallow = *migrate_flags & QEMU_MONITOR_MIGRATE_NON_SHARED_INC; + bool mirror_shallow = flags & VIR_MIGRATE_NON_SHARED_INC; int rv; g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); g_autoptr(virURI) uri = NULL; @@ -1201,11 +1197,6 @@ qemuMigrationSrcNBDStorageCopy(virQEMUDriver *driver, qemuMigrationSrcFetchMirrorStats(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT, priv->job.current); - - /* Okay, all disks are ready. Modify migrate_flags */ - *migrate_flags &= ~(QEMU_MONITOR_MIGRATE_NON_SHARED_DISK | - QEMU_MONITOR_MIGRATE_NON_SHARED_INC); - return 0; } @@ -4130,17 +4121,20 @@ qemuMigrationSrcRun(virQEMUDriver *driver, goto error; } - /* This will update migrate_flags on success */ if (qemuMigrationSrcNBDStorageCopy(driver, vm, mig, host, migrate_speed, - &migrate_flags, nmigrate_disks, migrate_disks, dconn, tlsAlias, nbdURI, flags) < 0) { goto error; } + + /* mask out the legacy migration flags if we are using NBD */ + migrate_flags &= ~(QEMU_MONITOR_MIGRATE_NON_SHARED_DISK | + QEMU_MONITOR_MIGRATE_NON_SHARED_INC); + } else { /* Destination doesn't support NBD server. * Fall back to previous implementation. */ -- 2.30.2

On Tue, 2021-04-13 at 17:38 +0200, Peter Krempa wrote:
'migrate_flags' can be updated in the only caller and since qemuMigrationSrcNBDStorageCopy already takes @flags which contains VIR_MIGRATE_NON_SHARED_INC (used to set QEMU_MONITOR_MIGRATE_NON_SHARED_INC) we can completely remove the parameter.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_migration.c | 22 ++++++++-------------- 1 file changed, 8 insertions(+), 14 deletions(-)
Reviewed-by: Andrea Bolognani <abologna@redhat.com> -- Andrea Bolognani / Red Hat / Virtualization

Base the decision on the main API flags (VIR_MIGRATE_NON_SHARED_DISK, QEMU_MONITOR_MIGRATE_NON_SHARED_INC) via a boolean 'storageMigration' rather than juggling everything trhough 'migration_flags'. After this patch 'migration_flags' is updated to contain the legacy storage migration flags only when we'll be about to use it rather than setting it and then resetting it. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_migration.c | 30 +++++++++++++----------------- 1 file changed, 13 insertions(+), 17 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 99b278e218..9ea008836c 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -3995,6 +3995,7 @@ qemuMigrationSrcRun(virQEMUDriver *driver, bool abort_on_error = !!(flags & VIR_MIGRATE_ABORT_ON_ERROR); bool events = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_EVENT); bool bwParam = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_PARAM_BANDWIDTH); + bool storageMigration = flags & (VIR_MIGRATE_NON_SHARED_DISK | QEMU_MONITOR_MIGRATE_NON_SHARED_INC); bool cancel = false; unsigned int waitFlags; g_autoptr(virDomainDef) persistDef = NULL; @@ -4010,20 +4011,14 @@ qemuMigrationSrcRun(virQEMUDriver *driver, spec, spec->destType, spec->fwdType, dconn, NULLSTR(graphicsuri), nmigrate_disks, migrate_disks); - if (flags & VIR_MIGRATE_NON_SHARED_DISK) { - migrate_flags |= QEMU_MONITOR_MIGRATE_NON_SHARED_DISK; + if (storageMigration) { cookieFlags |= QEMU_MIGRATION_COOKIE_NBD; - } - if (flags & VIR_MIGRATE_NON_SHARED_INC) { - migrate_flags |= QEMU_MONITOR_MIGRATE_NON_SHARED_INC; - cookieFlags |= QEMU_MIGRATION_COOKIE_NBD; + if (virQEMUCapsGet(priv->qemuCaps, + QEMU_CAPS_MIGRATION_PARAM_BLOCK_BITMAP_MAPPING)) + cookieFlags |= QEMU_MIGRATION_COOKIE_BLOCK_DIRTY_BITMAPS; } - if (cookieFlags & QEMU_MIGRATION_COOKIE_NBD && - virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATION_PARAM_BLOCK_BITMAP_MAPPING)) - cookieFlags |= QEMU_MIGRATION_COOKIE_BLOCK_DIRTY_BITMAPS; - if (virLockManagerPluginUsesState(driver->lockManager) && !cookieout) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -4100,8 +4095,8 @@ qemuMigrationSrcRun(virQEMUDriver *driver, migParams) < 0) goto error; - if (migrate_flags & (QEMU_MONITOR_MIGRATE_NON_SHARED_DISK | - QEMU_MONITOR_MIGRATE_NON_SHARED_INC)) { + + if (storageMigration) { if (mig->nbd) { const char *host = ""; @@ -4130,16 +4125,17 @@ qemuMigrationSrcRun(virQEMUDriver *driver, nbdURI, flags) < 0) { goto error; } - - /* mask out the legacy migration flags if we are using NBD */ - migrate_flags &= ~(QEMU_MONITOR_MIGRATE_NON_SHARED_DISK | - QEMU_MONITOR_MIGRATE_NON_SHARED_INC); - } else { /* Destination doesn't support NBD server. * Fall back to previous implementation. */ VIR_DEBUG("Destination doesn't support NBD server " "Falling back to previous implementation."); + + if (flags & VIR_MIGRATE_NON_SHARED_DISK) + migrate_flags |= QEMU_MONITOR_MIGRATE_NON_SHARED_DISK; + + if (flags & VIR_MIGRATE_NON_SHARED_INC) + migrate_flags |= QEMU_MONITOR_MIGRATE_NON_SHARED_INC; } } -- 2.30.2

Don't even try to setup storage migration if there are no eligible disks. This also fixes migration from older libvirts which didn't format an empty <nbd/> element in the migration cookie if there weren't any disks to migrate. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_migration.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 9ea008836c..119459342e 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -312,6 +312,22 @@ qemuMigrationAnyCopyDisk(virDomainDiskDef const *disk, } +static bool +qemuMigrationHasAnyStorageMigrationDisks(virDomainDef *def, + const char **migrate_disks, + size_t nmigrate_disks) +{ + size_t i; + + for (i = 0; i < def->ndisks; i++) { + if (qemuMigrationAnyCopyDisk(def->disks[i], nmigrate_disks, migrate_disks)) + return true; + } + + return false; +} + + static int qemuMigrationDstPrecreateStorage(virDomainObj *vm, qemuMigrationCookieNBD *nbd, @@ -4011,6 +4027,11 @@ qemuMigrationSrcRun(virQEMUDriver *driver, spec, spec->destType, spec->fwdType, dconn, NULLSTR(graphicsuri), nmigrate_disks, migrate_disks); + if (storageMigration) + storageMigration = qemuMigrationHasAnyStorageMigrationDisks(vm->def, + migrate_disks, + nmigrate_disks); + if (storageMigration) { cookieFlags |= QEMU_MIGRATION_COOKIE_NBD; -- 2.30.2

Don't try to setup disk migration and the NBD stuff if we end up migrating nothing. The destination side has luckily no setup for the non-NBD cases so omitting the element fully is okay. Signed-off-by: Peter Krempa <pkrempa@redhat.com> --- src/qemu/qemu_migration.c | 38 +++++++++++++++++++++----------------- 1 file changed, 21 insertions(+), 17 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 119459342e..3a622d38cd 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -2365,26 +2365,30 @@ qemuMigrationSrcBeginPhase(virQEMUDriver *driver, return NULL; } } else { - cookieFlags |= QEMU_MIGRATION_COOKIE_NBD; - priv->nbdPort = 0; - } - - if (nmigrate_disks) { - size_t i, j; - /* Check user requested only known disk targets. */ - for (i = 0; i < nmigrate_disks; i++) { - for (j = 0; j < vm->def->ndisks; j++) { - if (STREQ(vm->def->disks[j]->dst, migrate_disks[i])) - break; - } + if (nmigrate_disks) { + size_t i, j; + /* Check user requested only known disk targets. */ + for (i = 0; i < nmigrate_disks; i++) { + for (j = 0; j < vm->def->ndisks; j++) { + if (STREQ(vm->def->disks[j]->dst, migrate_disks[i])) + break; + } - if (j == vm->def->ndisks) { - virReportError(VIR_ERR_INVALID_ARG, - _("disk target %s not found"), - migrate_disks[i]); - return NULL; + if (j == vm->def->ndisks) { + virReportError(VIR_ERR_INVALID_ARG, + _("disk target %s not found"), + migrate_disks[i]); + return NULL; + } } } + + priv->nbdPort = 0; + + if (qemuMigrationHasAnyStorageMigrationDisks(vm->def, + migrate_disks, + nmigrate_disks)) + cookieFlags |= QEMU_MIGRATION_COOKIE_NBD; } } -- 2.30.2

On Tue, 2021-04-13 at 17:38 +0200, Peter Krempa wrote:
In certain weird cases and due to very crusty code we'd instruct qemu to migrate storage using the old style storage migration when there wasn't anything to migrate. This was caused by a recent refactor.
Note that the series is RFC as I didn't really test it yet, but feel free to provide feedback.
Peter Krempa (7): util: xml: Introduce virXMLFormatElementEmpty qemuMigrationCookieNBDXMLFormat: Format empty <nbd/> element qemuMigrationSrcNBDStorageCopy: Return error code on error qemuMigrationSrcNBDStorageCopy: Don't pass migrate_flags qemuMigrationSrcRun: Sanitize setting of cookieFlags and migrate_flags on storage migration qemuMigrationSrcRun: Don't attempt any storage migration if no disks will be migrated qemuMigrationSrcBeginPhase: Don't offer 'nbd' in cookie if there are no disks to migrate
My knowledge of migration code is such that, while your changes look reasonable overall, I don't feel too confident ACKing them myself and would prefer if someone with more expertise in the area would go over them instead. I can, however, offer a Tested-by: Andrea Bolognani <abologna@redhat.com> for the series, as I have verified that it makes the KubeVirt test suite failure that brought this issue to our attention in the first place go away. -- Andrea Bolognani / Red Hat / Virtualization

On 4/13/21 5:38 PM, Peter Krempa wrote:
In certain weird cases and due to very crusty code we'd instruct qemu to migrate storage using the old style storage migration when there wasn't anything to migrate. This was caused by a recent refactor.
Note that the series is RFC as I didn't really test it yet, but feel free to provide feedback.
Peter Krempa (7): util: xml: Introduce virXMLFormatElementEmpty qemuMigrationCookieNBDXMLFormat: Format empty <nbd/> element qemuMigrationSrcNBDStorageCopy: Return error code on error qemuMigrationSrcNBDStorageCopy: Don't pass migrate_flags qemuMigrationSrcRun: Sanitize setting of cookieFlags and migrate_flags on storage migration qemuMigrationSrcRun: Don't attempt any storage migration if no disks will be migrated qemuMigrationSrcBeginPhase: Don't offer 'nbd' in cookie if there are no disks to migrate
src/libvirt_private.syms | 1 + src/qemu/qemu_migration.c | 102 ++++++++++++++++++------------- src/qemu/qemu_migration_cookie.c | 2 +- src/util/virxml.c | 42 ++++++++----- src/util/virxml.h | 7 +++ 5 files changed, 95 insertions(+), 59 deletions(-)
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Michal
participants (3)
-
Andrea Bolognani
-
Michal Privoznik
-
Peter Krempa