On Thu, 21 Oct 2021, Laine Stump wrote:
Starting with commit 58ba0f6a3d7342fba29edbbf2bb9cb5497c870e5,
support
for overriding the default hotplug method for a guest using the
<acpi-bridge-hotplug> subelement of <features>/<pci> was added to
libvirt. This uses the QEMU global commandline switch:
ICH9-LPC.acpi-pci-hotplug-with-bridge-support=<off|on>
that was added to QEMU in qemu-6.1 (along with QEMU switching the
default hotplug type for new Q35-based machinetypes from "native PCIe"
to "ACPI").
Unfortunately, soon after we pushed the <acpi-bridge-hotplug> patches
to libvirt (2 days after, to be exact), during a regular weekly
meeting I attend with some of the QEMU developers, along with bugs
that had been found in ACPI hotplug since it was made the default,
they were also discussing issues they'd found with the implementation
of the QEMU commandline switch to change back to native PCIe hotplug.
Just to be clear, this mess is for q35 side of that QEMU flag
(acpi-pci-hotplug-with-bridge-support) only, not i440fx
side. My libvirt patch implemented support in libvirt for both i440fx
side as well as the q35 side. The i440fx flag has existed for a very long
time in QEMU and hence it seems it is stable as there has been no known
issues around it reported (there was one issue I found in QEMU last year
which I since fixed). That being said, up until now there has been no
motivation from the upstream community to implement an equivalent switch in
libvirt for only i440fx. Clearly few people really care about that QEMU flag
on i440fx side and those who do can use the libvirt bypass mechanism to
pass the QEMU commandline directly. The main motiovation for my libvirt
work also was to allow users control/flip ACPI based hotplug on the q35 side
for pcie-root-ports in a libvirt friendly manner. i440fx side came along
the way for a free ride.
Apparently the current method used by
acpi-pci-hotplug-with-bridge-support is causing "confusion" in guests,
so they were talking about the possibility of changing what the switch
does (or replacing it), and suggested that libvirt should "hold off"
on supporting it for now. (oops) (they didn't know that we had just
pushed Ani's patches that used it)
Since the current QEMU option is doomed to either changed behavior
(which would result in a guest-visible ABI change) or full deprecation
and replacement, it would be better for libvirt to not use it at all,
and re-implement based on whatever replacement QEMU comes up
with.
Once a new XML element has appeared in a libvirt upstream
release, we are committed to keeping it there "forever". However,
since there has not yet been an upstream release since
<acpi-bridge-hotplug> was added, there is still time to revert and
avoid the endless support/maintenance burden.
Not much time though! And that is the purpose of this patch
series. They revert, in reverse order, Ani's 4 original patches adding
the feature, along with Peter's 6 followup patches that correct a
logic error and streamline/beef up the unit tests.
(Note that it is still possible that QEMU would figure out a way out
of this without modifying their current flag in any significant way,
and in that case we could always "re-apply" the original
patches. However, if we don't revert before the next release (Andrea
has pointed out the freeze for RC1 is next Tuesday, and it would be
best to have it done before then), we will no longer have the option
to remove it.)
There was some conflict resolution necessary after the "git revert"
commands, due to other unrelated patches that changed test case data,
and due to a couple of Peter's patches also touching up the i440fx
pci-root-hotplug disabling test cases (which are *not* being reverted
- they work as advertised!).
Note that this series involves *re-adding* some things, just to remove
them again in a later revert - this is because Peter's patches
effectively reverted the addition of a QEMU capability flag that had
been added in Ani's original patches. If anyone would prefer, I can
squash those patches together so that the extra dance is eliminated
from the diffs; it just seemed more mechanical to do it this way
though.
A more detailed explanation of the issue:
Current Behavior of existing QEMU option
========================================
As far as I understand (and keep in mind that I have misunderstood and
misinterpreted this *at least* 4 separate times since it was first
explained to me!), the effect of this QEMU setting:
-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=on
is to:
1) enable ACPI hotplug (i.e. expose it to the guest)
2) disable native PCIe hotplug (don't expose it to the guest)
By having only one of the two types of hotplug enabled, the intent is
to force the guest to use the still-enabled type of hotplug.
Unfortunately, after QEMU 6.1 was released with acpi-pci-hotplug=on as
the default for new machinetypes, problems were encountered with ACPI
hotplug, which caused more attention to be called to the QEMU switch,
and the people looking into that found that enabling ACPI and
disabling native PCIe hotplug doesn't necessarily have the desired
effect of causing the guest to use ACPI for hotplug (or maybe it was
the opposite direction). Instead, it "gets confused" (a very technical
term, I know. You can ask Julia or Michael for a definition :-)).
One possible fix that they talked about was changing the behavior of
ICH9-LPC.acpi-pci-hotplug-with-bridge-support:
Potential Change to Behavior of QEMU option
===========================================
I know it's more complex than this (again, Julia or Michael can
explain), but my basic understanding of the way that they're currently
thinking of modifying the acpi-pci-hotplug-with-bridge-support option
is to have everything the same, *except* that when acpi-hotplug=off,
ACPI hotplug will *still be enabled* along with native PCIe hotplug;
but because guest OSes prefer native hotplug over ACPI, native PCIe
hotplug will be chosen in that case. (Presumably this change prevents
the "confusion" that is seen with the existing behavior of the
option).
So essentially, the choice of whether to use ACPI is controlled by
enabling/disabling native PCIe hotplug (which *kind of* goes against
the libvirt philosophy of "the XML describes the virtual machine that
is presented to the guest"; instead it becomes "the XML describes how
the guest will behave").
Another possibility would be to completely scrap (well, deprecate and
later remove) the current QEMU commandline switch in favor of one or
more switches that behave differently but result in the desired
behavior.
In either case, I think the best course of action for libvirt is to
revert the current <acpi-bridge-hotplug>, wait until QEMU settles down
with a new workable set of switches, and then re-do libvirt support
based on that.
Laine Stump (10):
Revert "qemu: capabilities: Remove
QEMU_CAPS_PIIX4_ACPI_HOTPLUG_BRIDGE"
Revert "qemuxml2xmltest: Convert all acpi-hotplug control related
tests to DO_TEST_CAPS_LATEST"
Revert "qemuxml2argvtest: Add '-enable' variants for ACPI-hotplug
related cases"
Revert "qemuxml2argvtest: Use real-caps testing for
'acpi-hotplug-bridge-disable'"
Revert "qemuValidateDomainDefPCIFeature: Fix validation logic"
Revert "qemuValidateDomainDefPCIFeature: un-break error messages"
Revert "NEWS: document new acpi pci hotplug config option"
Revert "qemu: command: add support for acpi-bridge-hotplug feature"
Revert "conf: introduce support for acpi-bridge-hotplug feature"
Revert "qemu: capablities: detect
acpi-pci-hotplug-with-bridge-support"
NEWS.rst | 8 --
docs/formatdomain.rst | 29 ------
docs/schemas/domaincommon.rng | 15 ----
src/conf/domain_conf.c | 89 +------------------
src/conf/domain_conf.h | 9 --
src/qemu/qemu_capabilities.c | 4 +-
src/qemu/qemu_capabilities.h | 3 +-
src/qemu/qemu_command.c | 19 ----
src/qemu/qemu_validate.c | 41 ---------
.../caps_6.1.0.x86_64.xml | 1 -
.../caps_6.2.0.x86_64.xml | 1 -
...-hotplug-bridge-disable.aarch64-latest.err | 1 -
.../aarch64-acpi-hotplug-bridge-disable.xml | 13 ---
...-hotplug-bridge-disable.x86_64-latest.args | 34 -------
.../pc-i440fx-acpi-hotplug-bridge-disable.xml | 33 -------
...i-hotplug-bridge-enable.x86_64-latest.args | 34 -------
.../pc-i440fx-acpi-hotplug-bridge-enable.xml | 33 -------
...pi-hotplug-bridge-disable.x86_64-6.0.0.err | 1 -
...-hotplug-bridge-disable.x86_64-latest.args | 37 --------
.../q35-acpi-hotplug-bridge-disable.xml | 47 ----------
...cpi-hotplug-bridge-enable.x86_64-6.0.0.err | 1 -
...i-hotplug-bridge-enable.x86_64-latest.args | 37 --------
.../q35-acpi-hotplug-bridge-enable.xml | 47 ----------
tests/qemuxml2argvtest.c | 10 ---
...i-hotplug-bridge-disable.x86_64-latest.xml | 36 --------
...pi-hotplug-bridge-enable.x86_64-latest.xml | 36 --------
...cpi-root-hotplug-disable.x86_64-latest.xml | 33 -------
.../pc-i440fx-acpi-root-hotplug-disable.xml | 1 +
...acpi-root-hotplug-enable.x86_64-latest.xml | 33 -------
.../pc-i440fx-acpi-root-hotplug-enable.xml | 1 +
...i-hotplug-bridge-disable.x86_64-latest.xml | 53 -----------
...pi-hotplug-bridge-enable.x86_64-latest.xml | 53 -----------
tests/qemuxml2xmltest.c | 10 +--
33 files changed, 9 insertions(+), 794 deletions(-)
delete mode 100644
tests/qemuxml2argvdata/aarch64-acpi-hotplug-bridge-disable.aarch64-latest.err
delete mode 100644 tests/qemuxml2argvdata/aarch64-acpi-hotplug-bridge-disable.xml
delete mode 100644
tests/qemuxml2argvdata/pc-i440fx-acpi-hotplug-bridge-disable.x86_64-latest.args
delete mode 100644 tests/qemuxml2argvdata/pc-i440fx-acpi-hotplug-bridge-disable.xml
delete mode 100644
tests/qemuxml2argvdata/pc-i440fx-acpi-hotplug-bridge-enable.x86_64-latest.args
delete mode 100644 tests/qemuxml2argvdata/pc-i440fx-acpi-hotplug-bridge-enable.xml
delete mode 100644
tests/qemuxml2argvdata/q35-acpi-hotplug-bridge-disable.x86_64-6.0.0.err
delete mode 100644
tests/qemuxml2argvdata/q35-acpi-hotplug-bridge-disable.x86_64-latest.args
delete mode 100644 tests/qemuxml2argvdata/q35-acpi-hotplug-bridge-disable.xml
delete mode 100644
tests/qemuxml2argvdata/q35-acpi-hotplug-bridge-enable.x86_64-6.0.0.err
delete mode 100644
tests/qemuxml2argvdata/q35-acpi-hotplug-bridge-enable.x86_64-latest.args
delete mode 100644 tests/qemuxml2argvdata/q35-acpi-hotplug-bridge-enable.xml
delete mode 100644
tests/qemuxml2xmloutdata/pc-i440fx-acpi-hotplug-bridge-disable.x86_64-latest.xml
delete mode 100644
tests/qemuxml2xmloutdata/pc-i440fx-acpi-hotplug-bridge-enable.x86_64-latest.xml
delete mode 100644
tests/qemuxml2xmloutdata/pc-i440fx-acpi-root-hotplug-disable.x86_64-latest.xml
create mode 120000 tests/qemuxml2xmloutdata/pc-i440fx-acpi-root-hotplug-disable.xml
delete mode 100644
tests/qemuxml2xmloutdata/pc-i440fx-acpi-root-hotplug-enable.x86_64-latest.xml
create mode 120000 tests/qemuxml2xmloutdata/pc-i440fx-acpi-root-hotplug-enable.xml
delete mode 100644
tests/qemuxml2xmloutdata/q35-acpi-hotplug-bridge-disable.x86_64-latest.xml
delete mode 100644
tests/qemuxml2xmloutdata/q35-acpi-hotplug-bridge-enable.x86_64-latest.xml
--
2.31.1