On Wed, Jul 31, 2024 at 01:02:26PM GMT, Peter Krempa wrote:
The s390(x) machines never supported ACPI. That didn't stop
users
enabling ACPI in their config. As of libvirt-9.2 (98c4e3d073) with new
enough qemu we reject configs which require ACPI, but qemu can't satisfy
it.
This breaks migration of existing VMs with the old wrong configs to new
libvirt installations.
To address this introduce a post-parse fixup removing the ACPI flag
specifically for s390 machines which do enable it in the definition.
The advantage of doing it in post-parse, rather than simply relaxing the
ABI stability check to allow users providing an fixed XML when migrating
(allowing change of the ACPI flag for s390 in ABI stability check, as it
doesn't impact ABI), is that only the destination installation needs to
be patched in order to preserve migration.
The disadvantage is that users will not be notified that they are using
a broken configuration as it will be silently fixed.
More on this below.
+/**
+ * qemuDomainDefACPIPostParse:
+ * @def: domain definition
+ * @qemuCaps: qemu capabilities object
+ *
+ * Fixup the use of ACPI flag on certain architectures that never supported it
+ * and users for some reason used it, which would break migration to newer
+ * libvirt versions which check whether given machine type supports ACPI.
+ */
+static void
+qemuDomainDefACPIPostParse(virDomainDef *def,
+ virQEMUCaps *qemuCaps)
+{
+ bool stripACPI = false;
+
+ /* Only cases when ACPI is enabled need to be fixed up */
+ if (def->features[VIR_DOMAIN_FEATURE_ACPI] != VIR_TRISTATE_SWITCH_ON)
+ return;
+
+ switch (def->os.arch) {
+ /* x86(_64) and AARCH64 always supported ACPI */
+ case VIR_ARCH_I686:
+ case VIR_ARCH_X86_64:
+ case VIR_ARCH_AARCH64:
+ stripACPI = false;
+ break;
+
+ /* S390 never supported ACPI, but some users enabled it regardless in their
+ * configs for some reason. In order to fix incoming migration we'll strip
+ * ACPI from those definitions, so that user configs are fixed by updating
+ * only the destination libvirt installation */
+ case VIR_ARCH_S390:
+ case VIR_ARCH_S390X:
+ stripACPI = true;
+ break;
+
+ /* Any other historic architectures could undergo the above treatment but
+ * since there were no user reports of invalid usage we avoid stripping ACPI */
+ case VIR_ARCH_ARMV6L:
+ case VIR_ARCH_ARMV7L:
+ case VIR_ARCH_ARMV7B:
+ case VIR_ARCH_PPC64:
+ case VIR_ARCH_PPC64LE:
+ case VIR_ARCH_ALPHA:
+ case VIR_ARCH_PPC:
+ case VIR_ARCH_PPCEMB:
+ case VIR_ARCH_SH4:
+ case VIR_ARCH_SH4EB:
+ case VIR_ARCH_RISCV32:
+ case VIR_ARCH_SPARC64:
+ case VIR_ARCH_MIPS:
+ case VIR_ARCH_MIPSEL:
+ case VIR_ARCH_MIPS64:
+ case VIR_ARCH_MIPS64EL:
+ case VIR_ARCH_CRIS:
+ case VIR_ARCH_ITANIUM:
+ case VIR_ARCH_LM32:
+ case VIR_ARCH_M68K:
+ case VIR_ARCH_MICROBLAZE:
+ case VIR_ARCH_MICROBLAZEEL:
+ case VIR_ARCH_OR32:
+ case VIR_ARCH_PARISC:
+ case VIR_ARCH_PARISC64:
+ case VIR_ARCH_PPCLE:
+ case VIR_ARCH_SPARC:
+ case VIR_ARCH_UNICORE32:
+ case VIR_ARCH_XTENSA:
+ case VIR_ARCH_XTENSAEB:
+ stripACPI = false;
+ break;
+
+ /* Any modern architecture must obey what qemu reports */
+ case VIR_ARCH_LOONGARCH64:
+ case VIR_ARCH_RISCV64:
+ stripACPI = false;
+ break;
+
+ case VIR_ARCH_NONE:
+ case VIR_ARCH_LAST:
+ default:
+ break;
+ }
This looks unnecessarily verbose. I kinda get the idea, but
considering that we're unlikely to go back and do anything about most
of the architectures that we're currently leaving alone, does it
really make sense to have all this code instead of just
bool stripACPI = false;
/* S390 never supported ACPI, but some users enabled it regardless in their
* configs for some reason. In order to fix incoming migration we'll strip
* ACPI from those definitions, so that user configs are fixed by updating
* only the destination libvirt installation */
if (def->os.arch == VIR_ARCH_S390 || def->os.arch == VIR_ARCH_S390X)
stripACPI = true;
? Maybe there's some motivation that I'm not seeing.
+ /* To be sure, we only strip ACPI if given machine type
doesn't support it
+ * or if the support state is unknown (which effectively means that it's
+ * unsupported, as we don't fixup x86(_64) and AARCH64) */
+ if (stripACPI &&
+ virQEMUCapsMachineSupportsACPI(qemuCaps, def->virtType, def->os.machine)
!= VIR_TRISTATE_BOOL_YES)
+ def->features[VIR_DOMAIN_FEATURE_ACPI] = VIR_TRISTATE_SWITCH_ABSENT;
Clever safety measure.
Regarding the disadvantage you mention in the commit message, how
about we implement a middle-ground solution instead? If we only
actually strip the feature when
!(parseFlags & VIR_DOMAIN_DEF_PARSE_ABI_UPDATE)
then we still silently fix migration, but we also prevent new s390x
domains with ACPI (supposedly) enabled from being defined, and
generally limit the scope of the workaround as much as possible.
--
Andrea Bolognani / Red Hat / Virtualization