[PATCH 0/2] qemu: Fix migration with custom XML

Jiri Denemark (2): qemu: Fix migration with custom XML NEWS: Mention migration bug with custom XML NEWS.rst | 8 ++++++++ src/qemu/qemu_domain.c | 38 +++++++++++++++++++++----------------- src/qemu/qemu_process.c | 14 ++------------ 3 files changed, 31 insertions(+), 29 deletions(-) -- 2.44.0

Ages ago origCPU in domain private data was introduced to provide backward compatibility when migrating to an old libvirt, which did not support fetching updated CPU definition from QEMU. Thus origCPU will contain the original CPU definition before such update. But only if the update actually changed anything. Let's always fill origCPU with the original definition when starting a domain so that we can rely on it being always set, even if it matches the updated definition. This fixes migration or save operations with custom domain XML after commit v10.1.0-88-g14d3517410, which expected origCPU to be always set to the CPU definition from inactive XML to check features explicitly requested by a user. https://issues.redhat.com/browse/RHEL-30622 Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/qemu/qemu_domain.c | 38 +++++++++++++++++++++----------------- src/qemu/qemu_process.c | 14 ++------------ 2 files changed, 23 insertions(+), 29 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 6b869797a8..ca50d435d2 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -10995,12 +10995,13 @@ virSaveCookieCallbacks virQEMUDriverDomainSaveCookie = { * @vm: domain which is being started * @cpu: CPU updated when the domain was running previously (before migration, * snapshot, or save) - * @origCPU: where to store the original CPU from vm->def in case @cpu was - * used instead + * @origCPU: where to store the original CPU from vm->def * - * Replace the CPU definition with the updated one when QEMU is new enough to - * allow us to check extra features it is about to enable or disable when - * starting a domain. The original CPU is stored in @origCPU. + * Save the original CPU definition from inactive XML in @origCPU so that we + * can safely update it and still be able to check what exactly a user asked + * for. The domain definition will either contain a copy of the original CPU + * definition or a copy of @cpu in case the domain was already running and + * we're just restoring a saved state or preparing for incoming migration. * * Returns 0 on success, -1 on error. */ @@ -11013,15 +11014,25 @@ qemuDomainUpdateCPU(virDomainObj *vm, *origCPU = NULL; - if (!cpu || !vm->def->cpu || - !virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_QUERY_CPU_MODEL_EXPANSION) || - virCPUDefIsEqual(vm->def->cpu, cpu, false)) + if (!vm->def->cpu) return 0; - if (!(cpu = virCPUDefCopy(cpu))) + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_QUERY_CPU_MODEL_EXPANSION)) + return 0; + + /* nothing to do if only topology part of CPU def is used */ + if (vm->def->cpu->mode == VIR_CPU_MODE_CUSTOM && !vm->def->cpu->model) + return 0; + + if (cpu) + cpu = virCPUDefCopy(cpu); + else + cpu = virCPUDefCopy(vm->def->cpu); + + if (!cpu) return -1; - VIR_DEBUG("Replacing CPU def with the updated one"); + VIR_DEBUG("Replacing CPU definition"); *origCPU = vm->def->cpu; vm->def->cpu = cpu; @@ -11064,13 +11075,6 @@ qemuDomainFixupCPUs(virDomainObj *vm, !vm->def->cpu->model) return 0; - /* Missing origCPU means QEMU created exactly the same virtual CPU which - * we asked for or libvirt was too old to mess up the translation from - * host-model. - */ - if (!*origCPU) - return 0; - if (virCPUDefFindFeature(vm->def->cpu, "cmt")) { g_autoptr(virCPUDef) fixedCPU = virCPUDefCopyWithoutModel(vm->def->cpu); diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index e4bcb628cf..8165c28dbc 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -4437,8 +4437,6 @@ qemuProcessUpdateLiveGuestCPU(virDomainObj *vm, virCPUData *disabled) { virDomainDef *def = vm->def; - qemuDomainObjPrivate *priv = vm->privateData; - g_autoptr(virCPUDef) orig = NULL; int rc; if (!enabled) @@ -4449,19 +4447,11 @@ qemuProcessUpdateLiveGuestCPU(virDomainObj *vm, !def->cpu->model)) return 0; - orig = virCPUDefCopy(def->cpu); - - if ((rc = virCPUUpdateLive(def->os.arch, def->cpu, enabled, disabled)) < 0) { + if ((rc = virCPUUpdateLive(def->os.arch, def->cpu, enabled, disabled)) < 0) return -1; - } else if (rc == 0) { - /* Store the original CPU in priv if QEMU changed it and we didn't - * get the original CPU via migration, restore, or snapshot revert. - */ - if (!priv->origCPU && !virCPUDefIsEqual(def->cpu, orig, false)) - priv->origCPU = g_steal_pointer(&orig); + if (rc == 0) def->cpu->check = VIR_CPU_CHECK_FULL; - } return 0; } -- 2.44.0

On Wed, Apr 17, 2024 at 1:54 AM Jiri Denemark <jdenemar@redhat.com> wrote:
Ages ago origCPU in domain private data was introduced to provide backward compatibility when migrating to an old libvirt, which did not support fetching updated CPU definition from QEMU. Thus origCPU will contain the original CPU definition before such update. But only if the update actually changed anything. Let's always fill origCPU with the original definition when starting a domain so that we can rely on it being always set, even if it matches the updated definition.
This fixes migration or save operations with custom domain XML after commit v10.1.0-88-g14d3517410, which expected origCPU to be always set to the CPU definition from inactive XML to check features explicitly requested by a user.
https://issues.redhat.com/browse/RHEL-30622
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/qemu/qemu_domain.c | 38 +++++++++++++++++++++----------------- src/qemu/qemu_process.c | 14 ++------------ 2 files changed, 23 insertions(+), 29 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 6b869797a8..ca50d435d2 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -10995,12 +10995,13 @@ virSaveCookieCallbacks virQEMUDriverDomainSaveCookie = { * @vm: domain which is being started * @cpu: CPU updated when the domain was running previously (before migration, * snapshot, or save) - * @origCPU: where to store the original CPU from vm->def in case @cpu was - * used instead + * @origCPU: where to store the original CPU from vm->def * - * Replace the CPU definition with the updated one when QEMU is new enough to - * allow us to check extra features it is about to enable or disable when - * starting a domain. The original CPU is stored in @origCPU. + * Save the original CPU definition from inactive XML in @origCPU so that we + * can safely update it and still be able to check what exactly a user asked + * for. The domain definition will either contain a copy of the original CPU + * definition or a copy of @cpu in case the domain was already running and + * we're just restoring a saved state or preparing for incoming migration. * * Returns 0 on success, -1 on error. */ @@ -11013,15 +11014,25 @@ qemuDomainUpdateCPU(virDomainObj *vm,
*origCPU = NULL;
- if (!cpu || !vm->def->cpu || - !virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_QUERY_CPU_MODEL_EXPANSION) || - virCPUDefIsEqual(vm->def->cpu, cpu, false)) + if (!vm->def->cpu) return 0;
- if (!(cpu = virCPUDefCopy(cpu))) + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_QUERY_CPU_MODEL_EXPANSION)) + return 0; + + /* nothing to do if only topology part of CPU def is used */ + if (vm->def->cpu->mode == VIR_CPU_MODE_CUSTOM && !vm->def->cpu->model) + return 0; + + if (cpu) + cpu = virCPUDefCopy(cpu); + else + cpu = virCPUDefCopy(vm->def->cpu); + + if (!cpu) return -1;
- VIR_DEBUG("Replacing CPU def with the updated one"); + VIR_DEBUG("Replacing CPU definition");
*origCPU = vm->def->cpu; vm->def->cpu = cpu; @@ -11064,13 +11075,6 @@ qemuDomainFixupCPUs(virDomainObj *vm, !vm->def->cpu->model) return 0;
- /* Missing origCPU means QEMU created exactly the same virtual CPU which - * we asked for or libvirt was too old to mess up the translation from - * host-model. - */ - if (!*origCPU) - return 0; - if (virCPUDefFindFeature(vm->def->cpu, "cmt")) { g_autoptr(virCPUDef) fixedCPU = virCPUDefCopyWithoutModel(vm->def->cpu);
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index e4bcb628cf..8165c28dbc 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -4437,8 +4437,6 @@ qemuProcessUpdateLiveGuestCPU(virDomainObj *vm, virCPUData *disabled) { virDomainDef *def = vm->def; - qemuDomainObjPrivate *priv = vm->privateData; - g_autoptr(virCPUDef) orig = NULL; int rc;
if (!enabled) @@ -4449,19 +4447,11 @@ qemuProcessUpdateLiveGuestCPU(virDomainObj *vm, !def->cpu->model)) return 0;
- orig = virCPUDefCopy(def->cpu); - - if ((rc = virCPUUpdateLive(def->os.arch, def->cpu, enabled, disabled)) < 0) { + if ((rc = virCPUUpdateLive(def->os.arch, def->cpu, enabled, disabled)) < 0) return -1; - } else if (rc == 0) { - /* Store the original CPU in priv if QEMU changed it and we didn't - * get the original CPU via migration, restore, or snapshot revert. - */ - if (!priv->origCPU && !virCPUDefIsEqual(def->cpu, orig, false)) - priv->origCPU = g_steal_pointer(&orig);
+ if (rc == 0) def->cpu->check = VIR_CPU_CHECK_FULL; - }
return 0; } -- 2.44.0 _______________________________________________ Devel mailing list -- devel@lists.libvirt.org To unsubscribe send an email to devel-leave@lists.libvirt.org
Test pass with this patched to v10.2.0-96-gded74b3369 -- Tested-by: Han Han <hhan@redhat.com>

On Tue, Apr 16, 2024 at 19:53:25 +0200, Jiri Denemark wrote:
Ages ago origCPU in domain private data was introduced to provide backward compatibility when migrating to an old libvirt, which did not support fetching updated CPU definition from QEMU. Thus origCPU will contain the original CPU definition before such update. But only if the update actually changed anything. Let's always fill origCPU with the original definition when starting a domain so that we can rely on it being always set, even if it matches the updated definition.
This fixes migration or save operations with custom domain XML after commit v10.1.0-88-g14d3517410, which expected origCPU to be always set to the CPU definition from inactive XML to check features explicitly requested by a user.
https://issues.redhat.com/browse/RHEL-30622
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/qemu/qemu_domain.c | 38 +++++++++++++++++++++----------------- src/qemu/qemu_process.c | 14 ++------------ 2 files changed, 23 insertions(+), 29 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 6b869797a8..ca50d435d2 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -10995,12 +10995,13 @@ virSaveCookieCallbacks virQEMUDriverDomainSaveCookie = { * @vm: domain which is being started * @cpu: CPU updated when the domain was running previously (before migration, * snapshot, or save) - * @origCPU: where to store the original CPU from vm->def in case @cpu was - * used instead + * @origCPU: where to store the original CPU from vm->def * - * Replace the CPU definition with the updated one when QEMU is new enough to - * allow us to check extra features it is about to enable or disable when - * starting a domain. The original CPU is stored in @origCPU. + * Save the original CPU definition from inactive XML in @origCPU so that we + * can safely update it and still be able to check what exactly a user asked + * for. The domain definition will either contain a copy of the original CPU + * definition or a copy of @cpu in case the domain was already running and + * we're just restoring a saved state or preparing for incoming migration. * * Returns 0 on success, -1 on error. */ @@ -11013,15 +11014,25 @@ qemuDomainUpdateCPU(virDomainObj *vm,
*origCPU = NULL;
- if (!cpu || !vm->def->cpu || - !virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_QUERY_CPU_MODEL_EXPANSION) || - virCPUDefIsEqual(vm->def->cpu, cpu, false)) + if (!vm->def->cpu) return 0;
- if (!(cpu = virCPUDefCopy(cpu))) + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_QUERY_CPU_MODEL_EXPANSION)) + return 0; + + /* nothing to do if only topology part of CPU def is used */ + if (vm->def->cpu->mode == VIR_CPU_MODE_CUSTOM && !vm->def->cpu->model) + return 0; + + if (cpu) + cpu = virCPUDefCopy(cpu); + else + cpu = virCPUDefCopy(vm->def->cpu); + + if (!cpu) return -1;
First two reads were very confusing as this is overwriting the argument ....
- VIR_DEBUG("Replacing CPU def with the updated one"); + VIR_DEBUG("Replacing CPU definition");
*origCPU = vm->def->cpu; vm->def->cpu = cpu;
... just to store it here. At this point it's no longer necessary. You can first move the old def into 'origCPU' and then directly assign into the variable to avoid the overwrite for readability.
@@ -11064,13 +11075,6 @@ qemuDomainFixupCPUs(virDomainObj *vm, !vm->def->cpu->model) return 0;
- /* Missing origCPU means QEMU created exactly the same virtual CPU which - * we asked for or libvirt was too old to mess up the translation from - * host-model. - */ - if (!*origCPU) - return 0;
This function is called from two places: src/qemu/qemu_process.c=qemuProcessStartWithMemoryState(virConnectPtr conn, src/qemu/qemu_process.c: qemuDomainFixupCPUs(vm, &cookie->cpu) < 0) src/qemu/qemu_process.c=qemuProcessRefreshCPU(virQEMUDriver *driver, src/qemu/qemu_process.c: if (qemuDomainFixupCPUs(vm, &priv->origCPU) < 0 'priv->origCPU' which is used in the latter is AFAIU filled only in 'qemuDomainUpdateCPU' called when starting up a qemu process and in 'qemuDomainObjPrivateXMLParse' when loading a status XML. In both cases there are code paths which may leave 'priv->origCPU' to be NULL. The following code will dereference it unconditionally: if (virCPUDefFindFeature(*origCPU, "cmt")) {
- if (virCPUDefFindFeature(vm->def->cpu, "cmt")) { g_autoptr(virCPUDef) fixedCPU = virCPUDefCopyWithoutModel(vm->def->cpu);
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index e4bcb628cf..8165c28dbc 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -4437,8 +4437,6 @@ qemuProcessUpdateLiveGuestCPU(virDomainObj *vm, virCPUData *disabled) { virDomainDef *def = vm->def; - qemuDomainObjPrivate *priv = vm->privateData; - g_autoptr(virCPUDef) orig = NULL; int rc;
if (!enabled) @@ -4449,19 +4447,11 @@ qemuProcessUpdateLiveGuestCPU(virDomainObj *vm, !def->cpu->model)) return 0;
- orig = virCPUDefCopy(def->cpu); - - if ((rc = virCPUUpdateLive(def->os.arch, def->cpu, enabled, disabled)) < 0) { + if ((rc = virCPUUpdateLive(def->os.arch, def->cpu, enabled, disabled)) < 0) return -1; - } else if (rc == 0) { - /* Store the original CPU in priv if QEMU changed it and we didn't - * get the original CPU via migration, restore, or snapshot revert. - */ - if (!priv->origCPU && !virCPUDefIsEqual(def->cpu, orig, false)) - priv->origCPU = g_steal_pointer(&orig);
+ if (rc == 0) def->cpu->check = VIR_CPU_CHECK_FULL; - }
This code path can also theoretically be called on a status XML created by libvirt which didn't unconditionally fill origCPU IIUC.

On Wed, Apr 17, 2024 at 09:15:39 +0200, Peter Krempa wrote:
On Tue, Apr 16, 2024 at 19:53:25 +0200, Jiri Denemark wrote:
Ages ago origCPU in domain private data was introduced to provide backward compatibility when migrating to an old libvirt, which did not support fetching updated CPU definition from QEMU. Thus origCPU will contain the original CPU definition before such update. But only if the update actually changed anything. Let's always fill origCPU with the original definition when starting a domain so that we can rely on it being always set, even if it matches the updated definition.
This fixes migration or save operations with custom domain XML after commit v10.1.0-88-g14d3517410, which expected origCPU to be always set to the CPU definition from inactive XML to check features explicitly requested by a user.
https://issues.redhat.com/browse/RHEL-30622
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- src/qemu/qemu_domain.c | 38 +++++++++++++++++++++----------------- src/qemu/qemu_process.c | 14 ++------------ 2 files changed, 23 insertions(+), 29 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 6b869797a8..ca50d435d2 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -11013,15 +11014,25 @@ qemuDomainUpdateCPU(virDomainObj *vm,
*origCPU = NULL;
- if (!cpu || !vm->def->cpu || - !virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_QUERY_CPU_MODEL_EXPANSION) || - virCPUDefIsEqual(vm->def->cpu, cpu, false)) + if (!vm->def->cpu) return 0;
- if (!(cpu = virCPUDefCopy(cpu))) + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_QUERY_CPU_MODEL_EXPANSION)) + return 0; + + /* nothing to do if only topology part of CPU def is used */ + if (vm->def->cpu->mode == VIR_CPU_MODE_CUSTOM && !vm->def->cpu->model) + return 0; + + if (cpu) + cpu = virCPUDefCopy(cpu); + else + cpu = virCPUDefCopy(vm->def->cpu); + + if (!cpu) return -1;
First two reads were very confusing as this is overwriting the argument ....
- VIR_DEBUG("Replacing CPU def with the updated one"); + VIR_DEBUG("Replacing CPU definition");
*origCPU = vm->def->cpu; vm->def->cpu = cpu;
... just to store it here. At this point it's no longer necessary. You can first move the old def into 'origCPU' and then directly assign into the variable to avoid the overwrite for readability.
Oh right, virCPUDefCopy can never return NULL.
@@ -11064,13 +11075,6 @@ qemuDomainFixupCPUs(virDomainObj *vm, !vm->def->cpu->model) return 0;
- /* Missing origCPU means QEMU created exactly the same virtual CPU which - * we asked for or libvirt was too old to mess up the translation from - * host-model. - */ - if (!*origCPU) - return 0;
This function is called from two places:
src/qemu/qemu_process.c=qemuProcessStartWithMemoryState(virConnectPtr conn, src/qemu/qemu_process.c: qemuDomainFixupCPUs(vm, &cookie->cpu) < 0)
Oops, nice catch. Can I pretend I added this hunk to check if reviewers pay enough attention? :-) The call is guarded by cookie != NULL, but no check for cookie->cpu is done. I'll drop this hunk.
src/qemu/qemu_process.c=qemuProcessRefreshCPU(virQEMUDriver *driver, src/qemu/qemu_process.c: if (qemuDomainFixupCPUs(vm, &priv->origCPU) < 0
'priv->origCPU' which is used in the latter is AFAIU filled only in 'qemuDomainUpdateCPU' called when starting up a qemu process and in 'qemuDomainObjPrivateXMLParse' when loading a status XML.
In both cases there are code paths which may leave 'priv->origCPU' to be NULL.
The following code will dereference it unconditionally:
if (virCPUDefFindFeature(*origCPU, "cmt")) {
This specific issue would not exist if I didn't forget to update one more place... see below.
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index e4bcb628cf..8165c28dbc 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -4449,19 +4447,11 @@ qemuProcessUpdateLiveGuestCPU(virDomainObj *vm, !def->cpu->model)) return 0;
- orig = virCPUDefCopy(def->cpu); - - if ((rc = virCPUUpdateLive(def->os.arch, def->cpu, enabled, disabled)) < 0) { + if ((rc = virCPUUpdateLive(def->os.arch, def->cpu, enabled, disabled)) < 0) return -1; - } else if (rc == 0) { - /* Store the original CPU in priv if QEMU changed it and we didn't - * get the original CPU via migration, restore, or snapshot revert. - */ - if (!priv->origCPU && !virCPUDefIsEqual(def->cpu, orig, false)) - priv->origCPU = g_steal_pointer(&orig);
+ if (rc == 0) def->cpu->check = VIR_CPU_CHECK_FULL; - }
This code path can also theoretically be called on a status XML created by libvirt which didn't unconditionally fill origCPU IIUC.
I forgot to make sure the original CPU is also saved when reconnecting to a domain started by older libvirt which did not set it. Jirka

Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- NEWS.rst | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/NEWS.rst b/NEWS.rst index c108e66f46..852dadf532 100644 --- a/NEWS.rst +++ b/NEWS.rst @@ -21,6 +21,14 @@ v10.3.0 (unreleased) * **Bug fixes** + * qemu: Fix migration with custom XML + + Libvirt 10.2.0 would sometimes complain about incompatible CPU definition + when trying to migrate or save a domain and passing a custom XML even + though such XML was properly generated as migratable. Hitting this bug + depends on the guest CPU definition and the host on which a particular + domain was running. + v10.2.0 (2024-04-02) ==================== -- 2.44.0

On Tue, Apr 16, 2024 at 19:53:26 +0200, Jiri Denemark wrote:
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- NEWS.rst | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/NEWS.rst b/NEWS.rst index c108e66f46..852dadf532 100644 --- a/NEWS.rst +++ b/NEWS.rst @@ -21,6 +21,14 @@ v10.3.0 (unreleased)
* **Bug fixes**
+ * qemu: Fix migration with custom XML
custom migration XML perhaps ? or custom definition for migration? Each VM xml is custom ;)
+ + Libvirt 10.2.0 would sometimes complain about incompatible CPU definition + when trying to migrate or save a domain and passing a custom XML even + though such XML was properly generated as migratable. Hitting this bug + depends on the guest CPU definition and the host on which a particular + domain was running.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>

On Wed, Apr 17, 2024 at 09:17:31 +0200, Peter Krempa wrote:
On Tue, Apr 16, 2024 at 19:53:26 +0200, Jiri Denemark wrote:
Signed-off-by: Jiri Denemark <jdenemar@redhat.com> --- NEWS.rst | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/NEWS.rst b/NEWS.rst index c108e66f46..852dadf532 100644 --- a/NEWS.rst +++ b/NEWS.rst @@ -21,6 +21,14 @@ v10.3.0 (unreleased)
* **Bug fixes**
+ * qemu: Fix migration with custom XML
custom migration XML perhaps ? or custom definition for migration? Each VM xml is custom ;)
Yeah, I didn't like too many "migration" words on a single line :-) Jirka
participants (3)
-
Han Han
-
Jiri Denemark
-
Peter Krempa