
+Igor +Michael
On Thu, 23 Sep 2021, Laine Stump wrote:
On 9/11/21 11:26 PM, Ani Sinha wrote:
Hi all:
This patchset introduces libvirt xml support for the following two pm conf options:
  <pm>
    <acpi-hotplug-bridge enabled='no'/>
    <acpi-root-hotplug enabled='yes'/>
  </pm>
(before I get into a more radical discussion about different options - since we aren't exactly duplicating the QEMU option name anyway, what if we made these names more consistent, e.g. "acpi-hotplug-bridge" and "acpi-hotplug-root"?)
yes this is fine. I can swap the two words.
I've thought quite a bit about whether to put these attributes here, or somewhere else, and I'm still undecided.
My initial reaction to this was "PM == Power Management, and power management is all about suspend mode support. Hotplug isn't power management." But then you look at the name of the QEMU option and PM is right there in the name, and I guess it's *kind of related* (effectively suspending/resuming a single device), so maybe I'm thinking too narrowly.
So are there alternate places that might fit the purpose of these new options better, rather than directly mimicking the QEMU option placement (for better or worse)? A couple alternative possibilities:
1) ****
One possibility would be to include these new flags within the existing <acpi> subelement of <features>, which is already used to control whether ACPI is exposed to the guest *at all* (via adding "-no-acpi" to the QEMU commandline when <acpi> is missing - NB: this feature flag is currently supported only on x86 and aarch64 QEMU platforms, and ignored for all other hypervisors).
Possibly the new flags could be put in something like this:
  <features>
    <acpi>
      <hotplug-bridge enabled='no'/>
      <hotplug-root enabled='yes'/>
    </acpi>
    ...
  </features>
But:
* currently there are no subelements to <acpi>. So this isn't "extending according to an existing pattern".
* the <features> element uses presence of a subelement to indicate "enabled" and absence of the subelement to indicate "disabled". In the case of these new acpi bridge options, however, we would need to explicitly have the "enabled='yes/no'" rather than just using presence of the option to mean "enabled" and absence to mean "disabled", because the default for "root-hotplug" up until now has been *enabled*, and the default for hotplug-bridge is different depending on machinetype. We need to continue working properly (and identically) with old/existing XML, but if we didn't have an "enabled" attribute for these new flags, there would be no way to tell the difference between "not specified" and "disabled", and so no way to disable the feature for a QEMU where the default was "enabled". (Why does this matter? Because I don't like the inconsistency that would arise from some feature flags using absence to mean "disabled" and some using it to mean "use the default".)
* Having something in <features> in the domain XML kind of implies that the associated capability flags should be represented in the <features> section of the domain capabilities. For example, <acpi/> is listed under <features> in the output of virsh capabilities, separately from the flag indicating presence of the -no-acpi option. I'm not sure if we would need to add something there for these options if we moved them into <features> (seems a bit redundant to me to have it in both places, but I'm sure there are $reasons).
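To illustrate the "not specified" vs. "disabled" distinction from the second bullet, here is a minimal Python sketch. It is not code from the patches; it assumes the hypothetical option 1 layout shown above, and the helper name read_tristate is made up for the example. An absent element maps to None ("use the QEMU default"), while enabled='yes'/'no' maps to True/False.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_tristate(parent: Optional[ET.Element], name: str) -> Optional[bool]:
    """Hypothetical helper: map an <name enabled='yes|no'/> child to a tri-state.

    Returns None when the element is absent (keep the QEMU default),
    True for enabled='yes', and False for enabled='no'.
    """
    if parent is None:
        return None
    elem = parent.find(name)
    if elem is None:
        return None  # not specified: old/existing XML keeps working unchanged
    value = elem.get("enabled")
    if value not in ("yes", "no"):
        raise ValueError(f"<{name}> needs enabled='yes' or enabled='no'")
    return value == "yes"


# Assumed example document using the option 1 layout; only one flag is set.
domain = ET.fromstring(
    "<domain><features><acpi>"
    "<hotplug-bridge enabled='no'/>"
    "</acpi></features></domain>"
)
acpi = domain.find("features/acpi")
bridge = read_tristate(acpi, "hotplug-bridge")  # explicitly disabled
root = read_tristate(acpi, "hotplug-root")      # not specified
```

With presence/absence alone, the second case could not be told apart from "disabled", which is the inconsistency described in the bullet.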
2) *****
Alternately, there is an <acpi> subelement of <os>, which is currently used to add a SLIC table (some sort of software license table, which I'd never heard of before) using QEMU's -acpitable commandline option. It is also used somehow by the Xen driver.
  <os>
    <acpi>
      <table type='slic'>/path/to/slic.dat</table>
      <hotplug-bridge enabled='no'/>
      <hotplug-root enabled='yes'/>
    </acpi>
    ...
  </os>
My problem with adding these new PCI controller acpi options to os/acpi is simply that it's in the <os> subelement, which is claimed elsewhere to be intended for OS boot options, and is used for things like specifying the path to a kernel / initrd to boot from.
3) ****
A third option, suggested somewhere by Ani, would be to make a completely new top-level element, called something like <acpiHotplug> that would have separate attributes for the two flags, e.g.:
<acpiHotplug bridge='yes' root='yes'/>
I dislike new toplevel options because they just seem so adhoc, as if the XML namespace is a cluttered, disorganized room. That reminds me too much of my own workspace, which is just... depressing.
****
Since I always seem to spend *way too much time* worrying about naming, only to have it come out wrong in the end anyway, I'm looking for some other opinions. Counting the version that is in Ani's patch currently as option "0", which option do you all think is the best? Or is it completely unimportant?
My preference is obviously options #0 and #3. However, community opinion/perspective is certainly required here.
The above two options are available only for the qemu driver, and only for x86 guests. Both of them are global options.
``acpi-hotplug-bridge`` option enables or disables ACPI hotplug support for cold plugged bridges. Examples of cold plugged bridges include PCI-PCI bridge (pci-bridge controller) for pc machines and pcie-root-port controller for q35 machines. The corresponding commandline options to qemu for x86 guests are:
The "cold plugged bridges" term here throws me for a loop - it implies that hotplugging bridges is something that's supported, and I think it still isn't. Of course this is just the cover letter, so it won't go into git anywhere, but I think it should be enough to say "enables ACPI hotplug into non-root bus PCI bridges/ports".
(pc machines): -global PIIX4_PM.acpi-pci-hotplug-with-bridge-support=<off/on>
(q35 machines): -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=<off/on>
So I'm curious - if the QEMU commandline also included "-no-acpi" along with these, what would happen? Would it be silently ignored? Generate an error? Or does -no-acpi only control the suspend support, and acpi hotplug is still available?
-no-acpi disables acpi completely on i386 machines. Please see acpi_setup(), where we bail out if x86_machine_is_acpi_enabled() is false. So no support for any acpi based hotplug will be available. Those other options will be ignored.
Being global options, no other bridge specific options for pci-bridge controller or pcie-root-port controllers are required. For pc machine type in x86, this option has been available in qemu for a long time, since version 2.1. Please see the changes in qemu.git:
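For reference, the mapping from the bridge option to the two -global properties quoted above could be sketched roughly as follows. This is an illustration in Python, not the libvirt implementation: the function name, the dictionary, and the simplified machine type keys "pc" and "q35" are assumptions made for the example. Only the device and property names come from the cover letter.

```python
from typing import List, Optional

# Device that owns the property on each machine type (from the cover letter).
# The short keys "pc" and "q35" are a simplification for this sketch.
BRIDGE_HOTPLUG_DEVICE = {"pc": "PIIX4_PM", "q35": "ICH9-LPC"}


def bridge_hotplug_args(machine: str, enabled: Optional[bool]) -> List[str]:
    """Hypothetical helper: build the QEMU arguments for acpi-hotplug-bridge.

    None means the option was not specified, so nothing is added and
    QEMU's own default for the machine type applies.
    """
    if enabled is None:
        return []
    device = BRIDGE_HOTPLUG_DEVICE[machine]  # KeyError for unsupported machines
    state = "on" if enabled else "off"
    return ["-global", f"{device}.acpi-pci-hotplug-with-bridge-support={state}"]


q35_args = bridge_hotplug_args("q35", False)
pc_args = bridge_hotplug_args("pc", True)
```

Because the property is global, a single argument pair covers every bridge or root port in the guest; nothing is added per controller.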
9e047b982452c6 ("piix4: add acpi pci hotplug support")
Interesting. So how was hotplug handled before this? With SHPC? I know there must be *some* kind of hotplug support in older QEMU, because RHEL6 QEMU supported hotplug, and it was based on qemu 0.12 or something ancient like that...
Good question. I do not know. Maybe imammedo and mst (cc'd) can help here.
133a2da488062e ("pc: acpi: generate AML only for PCI0 devices if PCI bridge hotplug is disabled")
For q35 machine type, this was introduced in qemu 6.1 with the following changes in qemu.git:
(a) c0e427d6eb5fef ("hw/acpi/ich9: Enable ACPI PCI hot-plug")
(b) 17858a16950860 ("hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35")
The reasons for enabling ACPI based hotplug for PCIe (q35) based machines (as opposed to native hotplug) for bridges are outlined in (b). It is possible that some users might still want to use native hotplug on PCIe [1]. Therefore, this conf option enables users to choose either ACPI based hotplug or native hotplug for cold plugged bridges (for example for pcie root port controller in q35 machines).
``acpi-root-hotplug`` option enables or disables ACPI based hotplug for PCI root bus (pci-root controller). This option is only available for pc machine type. The corresponding commandline option to qemu for x86 guests is:
-global PIIX4_PM.acpi-root-pci-hotplug=<off/on>
This additional option enables users to disable hotplug for all devices in the system without adding an additional PCI-PCI bridge, putting the devices behind the bridge and using the existing ``acpi-hotplug-bridge`` option to disable hotplug on that bridge. This feature was introduced in qemu version 5.2 with the following change in qemu.git:
3d7e78aa7777f ("Introduce a new flag for i440fx to disable PCI hotplug on the root bus")
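The availability stated in this cover letter (which option exists on which machine type, and since which qemu version) can be summarized in a small Python lookup. This is only a summary sketch: the table and function names are made up, the machine type keys are simplified to "pc" and "q35", and the patches themselves add qemu capability checks (patches 1 and 2) rather than comparing version numbers like this.

```python
from typing import Tuple

# (option, machine type) -> first qemu version that has it, per the cover letter.
MIN_QEMU_VERSION = {
    ("acpi-hotplug-bridge", "pc"): (2, 1),
    ("acpi-hotplug-bridge", "q35"): (6, 1),
    ("acpi-root-hotplug", "pc"): (5, 2),  # pc machine type only
}


def option_supported(option: str, machine: str, qemu_version: Tuple[int, int]) -> bool:
    """Hypothetical helper: True if the option exists for this machine and version."""
    minimum = MIN_QEMU_VERSION.get((option, machine))
    return minimum is not None and qemu_version >= minimum


root_on_q35 = option_supported("acpi-root-hotplug", "q35", (6, 1))
bridge_on_old_q35 = option_supported("acpi-hotplug-bridge", "q35", (6, 0))
bridge_on_pc = option_supported("acpi-hotplug-bridge", "pc", (2, 1))
```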
The above qemu commit describes some compelling reasons why users might want to disable hotplug on PCI root buses [2].
A brief summary of the patches:
[PATCH v3 1/5] qemu: capablities: detect presence of
[PATCH v3 2/5] qemu: capablities: detect presence of
Patches 1 and 2 implement support for qemu capability checks for the above config options.
[PATCH v3 3/5] conf: introduce acpi-hotplug-bridge and
Patch 3 actually adds the config option to the schema and adds related unit tests.
[PATCH v3 4/5] qemu: command: add support for qemu options that
Patch 4 adds the backend qemu commandline support for the options. It also adds relevant unit tests for the same.
[PATCH v3 5/5] NEWS: add new acpi pci hotplug options in the release
Patch 5 adds the release notes for the next libvirt release.
Changelog:
v1: initial implementation. Had some bugs and missed some unit tests.
v2: fixed bugs and added additional missing unit tests.
v3: reorganized the patches as per Laine's suggestion. Added more details in commit messages. Added conf description in formatdomain.rst. Added changelog for next release.
Notes:
[1] One concrete example of why one might still want to use native hotplug with pcie-root-port controller is the fact that we are still discovering issues with acpi hotplug on PCIE.
Yes, sigh. I recall someone saying something like "if we switch to ACPI hotplug then all these bugs just go away and everything works" or something like that. Reality never matches the ideal picture we put in our brains.
At least ACPI hotplug is only the default on new machinetypes (doesn't help much for management platforms that always just use "q35" every time they start a guest). And it can also cause problems with distro-specific machinetypes in downstream distros when they are rebased: https://bugzilla.redhat.com/2006409
Oh wow, what a tangled web! Yes, during the transition we might see some more issues until things get stable.
One such issue is: https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg02146.html
Another reason is that users have been using native hotplug on pcie root ports up until now. They have built and tested their systems based on native hotplug. They may not want to suddenly move to acpi based hotplug just because it is now the default in qemu. Supporting the option to choose one or the other through libvirt makes things simpler for end users.
[2] The use case scenario described by Laine in https://listman.redhat.com/archives/libvir-list/2020-February/msg00110.html intentionally does not discuss i440fx and focusses solely on q35. I do realize that redhat has moved on from i440fx and currently efforts for new features are concentrated on q35 machines only. We have had some hard debates on this on the qemu mailing list before. The fact of the matter is that i440fx is not at 1-1 parity with q35. There are many users who are currently using i440fx and are simply not ready to move to q35 without sacrificing some existing features they support today. For example https://wiki.qemu.org/images/4/4e/Q35.pdf lists some of q35 limitations.
To be fair, aside from "support for Win2000/WinXP", none of the items on the "limitations" page of that slide deck is something that's impossible to do with a Q35 machinetype; it's just that accomplishing some things may be more complicated. But I understand your point. Mainly I brought it up because I wanted to be sure that we're adding these to fulfill an actual need, rather than just adding bulk for the sake of completeness, or to satisfy curiosity.
Makes sense.
https://www.linux-kvm.org/images/0/06/2012-forum-Q35.pdf provides more information on the differences. Hence we need to solve the issue Laine has described in the above email for i440fx without adding additional bridges.
Further, in Daniel Berrange's words from : https://lists.gnu.org/archive/html/qemu-devel/2020-04/msg03012.html
"From the upstream POV, there's been no decision / agreement to phase out PIIX, this is purely a RHEL downstream decision & plan. If other distros / users have a different POV, and find the feature useful, we should accept the patch if it meets the normal QEMU patch requirements. "
Also to be noted: I have already experimented with this qemu commandline option using the libvirt passthrough feature, as documented in http://blog.vmsplice.net/2011/04/how-to-pass-qemu-command-line-options.html. This was only meant to be a short term solution until libvirt started supporting this natively. Supporting this option through libvirt would simplify their use case as well as add capability validations and graceful failure scenarios in case qemu did not support the option.
[3] Finally, I implemented support for ``acpi-root-hotplug`` option in Qemu. Since adding the support for this option, I have not run away :-) I am still around, fixing other issues in the same subsystem in qemu and also now I have added myself as a reviewer of patches in this area. I will also be trying to support/maintain this new xml conf option in libvirt to the extent I can in future with the help of other experienced maintainers. Obviously this is all freelance work at this moment and is highly dependent on available free time.
Since I don't follow qemu-devel closely, I didn't have prior knowledge of exactly what the options did, and it was unclear in the earlier versions of your patches that what <acpi-hotplug-bridge enabled='no'/> did was to disable ACPI hotplug for the entire guest (which on Q35 means that native PCIe hotplug will be found/used, and on 440fx means that hotplug won't be possible (unless SHPC hotplug is enabled)). Your explanation and documentation in this spin of the patches make that all clear though, so I'm beyond the "what does this do and do we need it?" stage to the "are there any problems with the code?" stage, and that's what I'll try to address in my review of the patches.
Sounds good.