[libvirt] [PATCH 0/2] Avoid issues due to qemu dropping osxsave and ospke

Hi, this series tries to address a drop of commandline options by qemu in regard to osxsave [1] and ospke [2]. This was already discussed in [3] late last year but got forgotten afterwards. The Ubuntu bug is at [4] and an older Fedora bug is at [5]. TL;DR: - osxsave/ospke features were never really configurable - KVM never returned the bits on GET_SUPPORTED_CPUID - very rare to be seen in the wild - avoid issues with newer qemu and old/odd XMLs to be sure Details: I checked various use cases from virt-install to openstack and some in between. The only cases I found that would define osxsave/ospke is virt-install pior to version 2.0 and even there only when used with --cpu=host-model or --cpu=host-copy. If you ever really enabled the feature you'd have got: error: the CPU is incompatible with host CPU: Host CPU does not provide required features: ospke The problem lies in domain XMLs that explicitly disable it. That would be <feature policy='disable' name='osxsave'/> But due to almost (or actually none) no host exposing this the following also triggers: <feature policy='optional' name='ospke'/> This will make libvirt add it to the qemu commandline like: -cpu ...,osxsave=off,ospke=off And that will crash when qemu starts with: error: internal error: process exited while connecting to monitor: 2019-04-25T12:12:01.698646Z qemu-system-x86_64: can't apply global core2duo-x86_64-cpu.osxsave=off: Property '.osxsave' not found There are much more long term discussions about demoting and dropping qemu features and I'd like to avoid those discussions being mixed. The reason to drop it more or less without notice was that it never did anything to begin with. Due to that our solution might in a similar fashion be more trivial - just stop defining those two features to qemu commandline. [1]: https://git.qemu.org/?p=qemu.git;a=commit;h=f1a23522b03a569f13aad49294bb4c4b... [2]: https://git.qemu.org/?p=qemu.git;a=commit;h=9ccb9784b57804f5c74434ad6ccb6665... [3]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg561877.html [4]: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1825195 [5]: https://bugzilla.redhat.com/show_bug.cgi?id=1644848 Christian Ehrhardt (2): qemu: do not define known no-op features qemuxml2argvtest: add test for remove cpu features src/qemu/qemu_command.c | 23 +++++++++++++++ .../qemuxml2argvdata/cpu-host-model-cmt.args | 2 +- .../cpu-no-removed-features.args | 29 +++++++++++++++++++ .../cpu-no-removed-features.xml | 23 +++++++++++++++ tests/qemuxml2argvdata/cpu-tsc-frequency.args | 4 +-- tests/qemuxml2argvtest.c | 1 + 6 files changed, 79 insertions(+), 3 deletions(-) create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.args create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.xml -- 2.17.1

Qemu dropped cpu features for osxsave and ospke [1][2]. The reason for the instant removal is that those features were never configurable as discussed in [3]. Fortunately the use cases adding those flags in the past are rare, but they exist. One that I identified are e.g. older virt-install when used with --cpu=host-model and there always could be the case of a user adding it to the guest xml. This triggers an issue like: qemu-system-x86_64: can't apply global Broadwell-noTSX-x86_64- cpu.osxsave=on: Property '.osxsave' not found Ensure that this does no more break spawning newer qemu versions by not rendering those features into the qemu command line. Fixes: https://bugs.launchpad.net/fedora/+source/qemu/+bug/1825195 Resolves: https://bugzilla.redhat.com/1644848 [1]: https://git.qemu.org/?p=qemu.git;a=commit;h=f1a2352 [2]: https://git.qemu.org/?p=qemu.git;a=commit;h=9ccb978 [3]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg561877.html Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com> --- src/qemu/qemu_command.c | 23 +++++++++++++++++++ .../qemuxml2argvdata/cpu-host-model-cmt.args | 2 +- tests/qemuxml2argvdata/cpu-tsc-frequency.args | 4 ++-- 3 files changed, 26 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 1425d97b1e..e0c8ae50a1 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -7076,6 +7076,27 @@ qemuBuildGlobalControllerCommandLine(virCommandPtr cmd, return 0; } +/** + * qemuFeatureNoEffect: + * @feature: CPU Feature + * + * Returns true, if the feature is known to have (never had) an effect on QEMU. + * Those features might be dropped in qemu without a longer deprecation cycle + * and must therefore be known e.g. to no more define them on command line. + */ +static bool +qemuFeatureNoEffect(virCPUFeatureDefPtr feature) +{ + if (!feature->name) + return false; + + if (STREQ(feature->name, "osxsave")) + return true; + if (STREQ(feature->name, "ospke")) + return true; + + return false; +} static int qemuBuildCpuModelArgStr(virQEMUDriverPtr driver, @@ -7144,6 +7165,8 @@ qemuBuildCpuModelArgStr(virQEMUDriverPtr driver, virBufferAsprintf(buf, ",vendor=%s", cpu->vendor_id); for (i = 0; i < cpu->nfeatures; i++) { + if (qemuFeatureNoEffect(&(cpu->features[i]))) + continue; switch ((virCPUFeaturePolicy) cpu->features[i].policy) { case VIR_CPU_FEATURE_FORCE: case VIR_CPU_FEATURE_REQUIRE: diff --git a/tests/qemuxml2argvdata/cpu-host-model-cmt.args b/tests/qemuxml2argvdata/cpu-host-model-cmt.args index 36f706b836..42f969fd62 100644 --- a/tests/qemuxml2argvdata/cpu-host-model-cmt.args +++ b/tests/qemuxml2argvdata/cpu-host-model-cmt.args @@ -12,7 +12,7 @@ QEMU_AUDIO_DRV=none \ -S \ -machine pc,accel=tcg,usb=off,dump-guest-core=off \ -cpu Haswell,+vme,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+vmx,\ -+smx,+est,+tm2,+xtpr,+pdcm,+osxsave,+f16c,+rdrand,+pdpe1gb,+abm,+lahf_lm \ ++smx,+est,+tm2,+xtpr,+pdcm,+f16c,+rdrand,+pdpe1gb,+abm,+lahf_lm \ -m 214 \ -realtime mlock=off \ -smp 6,sockets=6,cores=1,threads=1 \ diff --git a/tests/qemuxml2argvdata/cpu-tsc-frequency.args b/tests/qemuxml2argvdata/cpu-tsc-frequency.args index 55bcf89617..55b72b4404 100644 --- a/tests/qemuxml2argvdata/cpu-tsc-frequency.args +++ b/tests/qemuxml2argvdata/cpu-tsc-frequency.args @@ -12,8 +12,8 @@ QEMU_AUDIO_DRV=none \ -S \ -machine pc,accel=kvm,usb=off,dump-guest-core=off \ -cpu Haswell,+vme,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+vmx,\ -+smx,+est,+tm2,+xtpr,+pdcm,+osxsave,+f16c,+rdrand,+pdpe1gb,+abm,+lahf_lm,\ -+invtsc,tsc-frequency=3504000000 \ ++smx,+est,+tm2,+xtpr,+pdcm,+f16c,+rdrand,+pdpe1gb,+abm,+lahf_lm,+invtsc,\ +tsc-frequency=3504000000 \ -m 214 \ -realtime mlock=off \ -smp 1,sockets=1,cores=1,threads=1 \ -- 2.17.1

On 4/25/19 9:50 AM, Christian Ehrhardt wrote:
Qemu dropped cpu features for osxsave and ospke [1][2]. The reason for the instant removal is that those features were never configurable as discussed in [3].
Fortunately the use cases adding those flags in the past are rare, but they exist. One that I identified are e.g. older virt-install when used with --cpu=host-model and there always could be the case of a user adding it to the guest xml.
This triggers an issue like: qemu-system-x86_64: can't apply global Broadwell-noTSX-x86_64- cpu.osxsave=on: Property '.osxsave' not found
Ensure that this does no more break spawning newer qemu versions by not rendering those features into the qemu command line.
Fixes: https://bugs.launchpad.net/fedora/+source/qemu/+bug/1825195 Resolves: https://bugzilla.redhat.com/1644848
I am not sure if I understood why this is the solution you opted for. I'll start a discussion in the cover-letter, which seems more suitable. As far as this patch goes: Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
[1]: https://git.qemu.org/?p=qemu.git;a=commit;h=f1a2352 [2]: https://git.qemu.org/?p=qemu.git;a=commit;h=9ccb978 [3]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg561877.html
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com> --- src/qemu/qemu_command.c | 23 +++++++++++++++++++ .../qemuxml2argvdata/cpu-host-model-cmt.args | 2 +- tests/qemuxml2argvdata/cpu-tsc-frequency.args | 4 ++-- 3 files changed, 26 insertions(+), 3 deletions(-)
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 1425d97b1e..e0c8ae50a1 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -7076,6 +7076,27 @@ qemuBuildGlobalControllerCommandLine(virCommandPtr cmd, return 0; }
+/** + * qemuFeatureNoEffect: + * @feature: CPU Feature + * + * Returns true, if the feature is known to have (never had) an effect on QEMU. + * Those features might be dropped in qemu without a longer deprecation cycle + * and must therefore be known e.g. to no more define them on command line. + */ +static bool +qemuFeatureNoEffect(virCPUFeatureDefPtr feature) +{ + if (!feature->name) + return false; + + if (STREQ(feature->name, "osxsave")) + return true; + if (STREQ(feature->name, "ospke")) + return true; + + return false; +}
static int qemuBuildCpuModelArgStr(virQEMUDriverPtr driver, @@ -7144,6 +7165,8 @@ qemuBuildCpuModelArgStr(virQEMUDriverPtr driver, virBufferAsprintf(buf, ",vendor=%s", cpu->vendor_id);
for (i = 0; i < cpu->nfeatures; i++) { + if (qemuFeatureNoEffect(&(cpu->features[i]))) + continue; switch ((virCPUFeaturePolicy) cpu->features[i].policy) { case VIR_CPU_FEATURE_FORCE: case VIR_CPU_FEATURE_REQUIRE: diff --git a/tests/qemuxml2argvdata/cpu-host-model-cmt.args b/tests/qemuxml2argvdata/cpu-host-model-cmt.args index 36f706b836..42f969fd62 100644 --- a/tests/qemuxml2argvdata/cpu-host-model-cmt.args +++ b/tests/qemuxml2argvdata/cpu-host-model-cmt.args @@ -12,7 +12,7 @@ QEMU_AUDIO_DRV=none \ -S \ -machine pc,accel=tcg,usb=off,dump-guest-core=off \ -cpu Haswell,+vme,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+vmx,\ -+smx,+est,+tm2,+xtpr,+pdcm,+osxsave,+f16c,+rdrand,+pdpe1gb,+abm,+lahf_lm \ ++smx,+est,+tm2,+xtpr,+pdcm,+f16c,+rdrand,+pdpe1gb,+abm,+lahf_lm \ -m 214 \ -realtime mlock=off \ -smp 6,sockets=6,cores=1,threads=1 \ diff --git a/tests/qemuxml2argvdata/cpu-tsc-frequency.args b/tests/qemuxml2argvdata/cpu-tsc-frequency.args index 55bcf89617..55b72b4404 100644 --- a/tests/qemuxml2argvdata/cpu-tsc-frequency.args +++ b/tests/qemuxml2argvdata/cpu-tsc-frequency.args @@ -12,8 +12,8 @@ QEMU_AUDIO_DRV=none \ -S \ -machine pc,accel=kvm,usb=off,dump-guest-core=off \ -cpu Haswell,+vme,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+vmx,\ -+smx,+est,+tm2,+xtpr,+pdcm,+osxsave,+f16c,+rdrand,+pdpe1gb,+abm,+lahf_lm,\ -+invtsc,tsc-frequency=3504000000 \ ++smx,+est,+tm2,+xtpr,+pdcm,+f16c,+rdrand,+pdpe1gb,+abm,+lahf_lm,+invtsc,\ +tsc-frequency=3504000000 \ -m 214 \ -realtime mlock=off \ -smp 1,sockets=1,cores=1,threads=1 \

CPU features that always were a no-op in qemu got removed there. We no more specify them as that would trigger errors and fail to start qemu. This test ensures that those features really are not rendered into qemu command line. Without the related fix this test will trigger and fail like: In 'tests/qemuxml2argvdata/cpu-no-removed-features.args': Offset 371 Expect [ ] Actual [,-osxsave,-ospke ] Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com> --- .../cpu-no-removed-features.args | 29 +++++++++++++++++++ .../cpu-no-removed-features.xml | 23 +++++++++++++++ tests/qemuxml2argvtest.c | 1 + 3 files changed, 53 insertions(+) create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.args create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.xml diff --git a/tests/qemuxml2argvdata/cpu-no-removed-features.args b/tests/qemuxml2argvdata/cpu-no-removed-features.args new file mode 100644 index 0000000000..1e4af63dc3 --- /dev/null +++ b/tests/qemuxml2argvdata/cpu-no-removed-features.args @@ -0,0 +1,29 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/tmp/lib/domain--1-QEMUGuest1 \ +USER=test \ +LOGNAME=test \ +XDG_DATA_HOME=/tmp/lib/domain--1-QEMUGuest1/.local/share \ +XDG_CACHE_HOME=/tmp/lib/domain--1-QEMUGuest1/.cache \ +XDG_CONFIG_HOME=/tmp/lib/domain--1-QEMUGuest1/.config \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-x86_64 \ +-name QEMUGuest1 \ +-S \ +-machine pc,accel=kvm,usb=off,dump-guest-core=off \ +-cpu core2duo \ +-m 214 \ +-realtime mlock=off \ +-smp 6,sockets=6,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,path=/tmp/lib/domain--1-QEMUGuest1/monitor.sock,\ +server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-no-acpi \ +-usb \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 diff --git a/tests/qemuxml2argvdata/cpu-no-removed-features.xml b/tests/qemuxml2argvdata/cpu-no-removed-features.xml new file mode 100644 index 0000000000..dc5a681901 --- /dev/null +++ b/tests/qemuxml2argvdata/cpu-no-removed-features.xml @@ -0,0 +1,23 @@ +<domain type='kvm'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219100</memory> + <currentMemory unit='KiB'>219100</currentMemory> + <vcpu placement='static'>6</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='network'/> + </os> + <cpu match='exact'> + <model>core2duo</model> + <feature name='osxsave' policy='optional'/> + <feature name='ospke' policy='optional'/> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-x86_64</emulator> + </devices> +</domain> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index d5c6dc4c0c..edc19ace6f 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1649,6 +1649,7 @@ mymain(void) DO_TEST("cpu-fallback", QEMU_CAPS_KVM); DO_TEST_FAILURE("cpu-nofallback", QEMU_CAPS_KVM); DO_TEST("cpu-strict1", QEMU_CAPS_KVM); + DO_TEST("cpu-no-removed-features", QEMU_CAPS_KVM); DO_TEST("cpu-numa1", NONE); DO_TEST("cpu-numa2", NONE); DO_TEST("cpu-numa-no-memory-element", NONE); -- 2.17.1

On 4/25/19 9:50 AM, Christian Ehrhardt wrote:
CPU features that always were a no-op in qemu got removed there. We no more specify them as that would trigger errors and fail to start qemu. This test ensures that those features really are not rendered into qemu command line.
Without the related fix this test will trigger and fail like: In 'tests/qemuxml2argvdata/cpu-no-removed-features.args': Offset 371 Expect [ ] Actual [,-osxsave,-ospke ]
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com> ---
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
.../cpu-no-removed-features.args | 29 +++++++++++++++++++ .../cpu-no-removed-features.xml | 23 +++++++++++++++ tests/qemuxml2argvtest.c | 1 + 3 files changed, 53 insertions(+) create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.args create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.xml
diff --git a/tests/qemuxml2argvdata/cpu-no-removed-features.args b/tests/qemuxml2argvdata/cpu-no-removed-features.args new file mode 100644 index 0000000000..1e4af63dc3 --- /dev/null +++ b/tests/qemuxml2argvdata/cpu-no-removed-features.args @@ -0,0 +1,29 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/tmp/lib/domain--1-QEMUGuest1 \ +USER=test \ +LOGNAME=test \ +XDG_DATA_HOME=/tmp/lib/domain--1-QEMUGuest1/.local/share \ +XDG_CACHE_HOME=/tmp/lib/domain--1-QEMUGuest1/.cache \ +XDG_CONFIG_HOME=/tmp/lib/domain--1-QEMUGuest1/.config \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-x86_64 \ +-name QEMUGuest1 \ +-S \ +-machine pc,accel=kvm,usb=off,dump-guest-core=off \ +-cpu core2duo \ +-m 214 \ +-realtime mlock=off \ +-smp 6,sockets=6,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,path=/tmp/lib/domain--1-QEMUGuest1/monitor.sock,\ +server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-no-acpi \ +-usb \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 diff --git a/tests/qemuxml2argvdata/cpu-no-removed-features.xml b/tests/qemuxml2argvdata/cpu-no-removed-features.xml new file mode 100644 index 0000000000..dc5a681901 --- /dev/null +++ b/tests/qemuxml2argvdata/cpu-no-removed-features.xml @@ -0,0 +1,23 @@ +<domain type='kvm'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219100</memory> + <currentMemory unit='KiB'>219100</currentMemory> + <vcpu placement='static'>6</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='network'/> + </os> + <cpu match='exact'> + <model>core2duo</model> + <feature name='osxsave' policy='optional'/> + <feature name='ospke' policy='optional'/> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-x86_64</emulator> + </devices> +</domain> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index d5c6dc4c0c..edc19ace6f 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1649,6 +1649,7 @@ mymain(void) DO_TEST("cpu-fallback", QEMU_CAPS_KVM); DO_TEST_FAILURE("cpu-nofallback", QEMU_CAPS_KVM); DO_TEST("cpu-strict1", QEMU_CAPS_KVM); + DO_TEST("cpu-no-removed-features", QEMU_CAPS_KVM); DO_TEST("cpu-numa1", NONE); DO_TEST("cpu-numa2", NONE); DO_TEST("cpu-numa-no-memory-element", NONE);

On Thu, Apr 25, 2019 at 2:50 PM Christian Ehrhardt <christian.ehrhardt@canonical.com> wrote:
Hi, this series tries to address a drop of commandline options by qemu in regard to osxsave [1] and ospke [2]. This was already discussed in [3] late last year but got forgotten afterwards. The Ubuntu bug is at [4] and an older Fedora bug is at [5].
TL;DR: - osxsave/ospke features were never really configurable - KVM never returned the bits on GET_SUPPORTED_CPUID - very rare to be seen in the wild - avoid issues with newer qemu and old/odd XMLs to be sure
Hi, I know we are in a freeze, but we if one has time to review that would be great to be ready when 5.4 opens. But since 8 days passed I thought I could ping here to be seen (again).

On 4/25/19 9:50 AM, Christian Ehrhardt wrote:
Hi, this series tries to address a drop of commandline options by qemu in regard to osxsave [1] and ospke [2]. This was already discussed in [3] late last year but got forgotten afterwards. The Ubuntu bug is at [4] and an older Fedora bug is at [5].
TL;DR: - osxsave/ospke features were never really configurable - KVM never returned the bits on GET_SUPPORTED_CPUID - very rare to be seen in the wild - avoid issues with newer qemu and old/odd XMLs to be sure
Details:
I checked various use cases from virt-install to openstack and some in between. The only cases I found that would define osxsave/ospke is virt-install pior to version 2.0 and even there only when used with --cpu=host-model or --cpu=host-copy. If you ever really enabled the feature you'd have got: error: the CPU is incompatible with host CPU: Host CPU does not provide required features: ospke
The problem lies in domain XMLs that explicitly disable it. That would be <feature policy='disable' name='osxsave'/> But due to almost (or actually none) no host exposing this the following also triggers: <feature policy='optional' name='ospke'/>
So, it happened that this notebook (ThinkPad T480) has the feature you're handling in this patch series: [danielhb@rekt libvirt]$ sudo ./run tools/virsh capabilities [sudo] password for danielhb: <capabilities> <host> <uuid>985fc2cc-23a6-11b2-a85c-a1357d626e56</uuid> <cpu> <arch>x86_64</arch> <model>Skylake-Client-IBRS</model> <vendor>Intel</vendor> <microcode version='150'/> <topology sockets='1' cores='4' threads='2'/> (...) <feature name='pdcm'/> <feature name='osxsave'/> (...) With your approach, what happens is that a feature that is declared to be effective in the capabilities is, in fact, ignored. It is an upgrade of what would happen without it, of course. But I am wondering here, fully aware that you mentioned that you wanted this discussion to be avoided, that we should simply exterminate both osxsave and ospke from Libvirt capabilities. I mean, what's the point of even reporting them if they will end up being ignored? You have both QEMU commits [1] and [2] mentioning that these flags were never configurable in the first place, so it's not like we need to keep them for backward compatibility either. Note that this does not discard this patch series - I think we'll need a solution like this to deal with older VMs that define these features in their XMLs. Thanks, DHB
This will make libvirt add it to the qemu commandline like: -cpu ...,osxsave=off,ospke=off
And that will crash when qemu starts with: error: internal error: process exited while connecting to monitor: 2019-04-25T12:12:01.698646Z qemu-system-x86_64: can't apply global core2duo-x86_64-cpu.osxsave=off: Property '.osxsave' not found
There are much more long term discussions about demoting and dropping qemu features and I'd like to avoid those discussions being mixed. The reason to drop it more or less without notice was that it never did anything to begin with. Due to that our solution might in a similar fashion be more trivial - just stop defining those two features to qemu commandline.
[1]: https://git.qemu.org/?p=qemu.git;a=commit;h=f1a23522b03a569f13aad49294bb4c4b... [2]: https://git.qemu.org/?p=qemu.git;a=commit;h=9ccb9784b57804f5c74434ad6ccb6665... [3]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg561877.html [4]: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1825195 [5]: https://bugzilla.redhat.com/show_bug.cgi?id=1644848
Christian Ehrhardt (2): qemu: do not define known no-op features qemuxml2argvtest: add test for remove cpu features
src/qemu/qemu_command.c | 23 +++++++++++++++ .../qemuxml2argvdata/cpu-host-model-cmt.args | 2 +- .../cpu-no-removed-features.args | 29 +++++++++++++++++++ .../cpu-no-removed-features.xml | 23 +++++++++++++++ tests/qemuxml2argvdata/cpu-tsc-frequency.args | 4 +-- tests/qemuxml2argvtest.c | 1 + 6 files changed, 79 insertions(+), 3 deletions(-) create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.args create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.xml

On Tue, May 7, 2019 at 11:03 PM Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
On 4/25/19 9:50 AM, Christian Ehrhardt wrote:
Hi, this series tries to address a drop of commandline options by qemu in regard to osxsave [1] and ospke [2]. This was already discussed in [3] late last year but got forgotten afterwards. The Ubuntu bug is at [4] and an older Fedora bug is at [5].
TL;DR: - osxsave/ospke features were never really configurable - KVM never returned the bits on GET_SUPPORTED_CPUID - very rare to be seen in the wild - avoid issues with newer qemu and old/odd XMLs to be sure
Details:
I checked various use cases from virt-install to openstack and some in between. The only cases I found that would define osxsave/ospke is virt-install pior to version 2.0 and even there only when used with --cpu=host-model or --cpu=host-copy. If you ever really enabled the feature you'd have got: error: the CPU is incompatible with host CPU: Host CPU does not provide required features: ospke
The problem lies in domain XMLs that explicitly disable it. That would be <feature policy='disable' name='osxsave'/> But due to almost (or actually none) no host exposing this the following also triggers: <feature policy='optional' name='ospke'/>
So, it happened that this notebook (ThinkPad T480) has the feature you're handling in this patch series:
Mine as well actually, but obviously only on the host capabilities. [...]
With your approach, what happens is that a feature that is declared to be effective in the capabilities is, in fact, ignored. It is an upgrade of what would happen without it, of course.
Some bikeshedding on "declared to be effective in the capabilities" needed to bring me up to date on this. See below...
But I am wondering here, fully aware that you mentioned that you wanted this discussion to be avoided,
Sorry my comment was misleading then - I'm absolutely fine having that part of the discussion in this context. I only wanted to avoid the discussion about removing features that "actually did something" which naturally is a way more complex topic.
that we should simply exterminate both osxsave and ospke from Libvirt capabilities. I mean, what's the point of even reporting them if they will end up being ignored? You have both QEMU commits [1] and [2] mentioning that these flags were never configurable in the first place, so it's not like we need to keep them for backward compatibility either.
Maybe it is my current lack of understanding how "exterminating both" would be exactly done and I see potential issues where there are none. I wonder would that just be dropping the mappings in src/cpu_map/x86_features.xml? If not then I'd appreciate a hint where exactly you'd suggest we would do the mentioned further extermination of ospke/osxsave. I was afraid of side effects or when no more being detected. And even more so it seems the definition of "capabilities" is not clear enough to me (I never thought too much about it so far): - "The capabilities of the hypervisor" (man page) - "The host CPU architecture and features." (web page) For the former definition yes, we might want to drop such non passable features then. But for the latter definition it feels right to continue reporting that the host has it. It is valid to report that the host has the feature - even thou it can't pass it to a guest. After all it is in the host and not the guest (hypervisor dependent) section of capabilities right? Also we might (later) use the mapping for other things down the road where being an tunable feature (or not) is not important either. Or users compare hosts with same hardware but different libvirt versions and wonder why they lost a cpu feature.
Note that this does not discard this patch series - I think we'll need a solution like this to deal with older VMs that define these features in their XMLs.
Thanks,
DHB
This will make libvirt add it to the qemu commandline like: -cpu ...,osxsave=off,ospke=off
And that will crash when qemu starts with: error: internal error: process exited while connecting to monitor: 2019-04-25T12:12:01.698646Z qemu-system-x86_64: can't apply global core2duo-x86_64-cpu.osxsave=off: Property '.osxsave' not found
There are much more long term discussions about demoting and dropping qemu features and I'd like to avoid those discussions being mixed. The reason to drop it more or less without notice was that it never did anything to begin with. Due to that our solution might in a similar fashion be more trivial - just stop defining those two features to qemu commandline.
[1]: https://git.qemu.org/?p=qemu.git;a=commit;h=f1a23522b03a569f13aad49294bb4c4b... [2]: https://git.qemu.org/?p=qemu.git;a=commit;h=9ccb9784b57804f5c74434ad6ccb6665... [3]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg561877.html [4]: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1825195 [5]: https://bugzilla.redhat.com/show_bug.cgi?id=1644848
Christian Ehrhardt (2): qemu: do not define known no-op features qemuxml2argvtest: add test for remove cpu features
src/qemu/qemu_command.c | 23 +++++++++++++++ .../qemuxml2argvdata/cpu-host-model-cmt.args | 2 +- .../cpu-no-removed-features.args | 29 +++++++++++++++++++ .../cpu-no-removed-features.xml | 23 +++++++++++++++ tests/qemuxml2argvdata/cpu-tsc-frequency.args | 4 +-- tests/qemuxml2argvtest.c | 1 + 6 files changed, 79 insertions(+), 3 deletions(-) create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.args create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.xml
-- Christian Ehrhardt Software Engineer, Ubuntu Server Canonical Ltd

On 5/9/19 6:31 AM, Christian Ehrhardt wrote:
On Tue, May 7, 2019 at 11:03 PM Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
On 4/25/19 9:50 AM, Christian Ehrhardt wrote:
Hi, this series tries to address a drop of commandline options by qemu in regard to osxsave [1] and ospke [2]. This was already discussed in [3] late last year but got forgotten afterwards. The Ubuntu bug is at [4] and an older Fedora bug is at [5].
TL;DR: - osxsave/ospke features were never really configurable - KVM never returned the bits on GET_SUPPORTED_CPUID - very rare to be seen in the wild - avoid issues with newer qemu and old/odd XMLs to be sure
Details:
I checked various use cases from virt-install to openstack and some in between. The only cases I found that would define osxsave/ospke is virt-install pior to version 2.0 and even there only when used with --cpu=host-model or --cpu=host-copy. If you ever really enabled the feature you'd have got: error: the CPU is incompatible with host CPU: Host CPU does not provide required features: ospke
The problem lies in domain XMLs that explicitly disable it. That would be <feature policy='disable' name='osxsave'/> But due to almost (or actually none) no host exposing this the following also triggers: <feature policy='optional' name='ospke'/> So, it happened that this notebook (ThinkPad T480) has the feature you're handling in this patch series:
Mine as well actually, but obviously only on the host capabilities.
[...]
And that changes everything I said in my previous email ... not only I messed up checking for <host> instead of <guest> capabilities, but also I could've checked "domcapabilities". In the docs: "While the Driver Capabilities provides the host capabilities (...), the Domain Capabilities provides the hypervisor specific capabilities for Management Applications to query and make decisions regarding what to utilize." Thus, it doesn't matter if osxsave is being reported in the host capabilities. What matters if whether osxsave is declared in the domcapabilities (or the <guest> element in capabilities) - which would mean that guests can utilize it. I apologize for this confusion.
With your approach, what happens is that a feature that is declared to be effective in the capabilities is, in fact, ignored. It is an upgrade of what would happen without it, of course. Some bikeshedding on "declared to be effective in the capabilities" needed to bring me up to date on this. See below...
But I am wondering here, fully aware that you mentioned that you wanted this discussion to be avoided, Sorry my comment was misleading then - I'm absolutely fine having that part of the discussion in this context. I only wanted to avoid the discussion about removing features that "actually did something" which naturally is a way more complex topic.
that we should simply exterminate both osxsave and ospke from Libvirt capabilities. I mean, what's the point of even reporting them if they will end up being ignored? You have both QEMU commits [1] and [2] mentioning that these flags were never configurable in the first place, so it's not like we need to keep them for backward compatibility either. Maybe it is my current lack of understanding how "exterminating both" would be exactly done and I see potential issues where there are none. I wonder would that just be dropping the mappings in src/cpu_map/x86_features.xml? If not then I'd appreciate a hint where exactly you'd suggest we would do the mentioned further extermination of ospke/osxsave.
I was afraid of side effects or when no more being detected. And even more so it seems the definition of "capabilities" is not clear enough to me (I never thought too much about it so far): - "The capabilities of the hypervisor" (man page) - "The host CPU architecture and features." (web page)
For the former definition yes, we might want to drop such non passable features then.
But for the latter definition it feels right to continue reporting that the host has it. It is valid to report that the host has the feature - even thou it can't pass it to a guest. After all it is in the host and not the guest (hypervisor dependent) section of capabilities right?
Yes, I agree.
Also we might (later) use the mapping for other things down the road where being an tunable feature (or not) is not important either. Or users compare hosts with same hardware but different libvirt versions and wonder why they lost a cpu feature.
Note that this does not discard this patch series - I think we'll need a solution like this to deal with older VMs that define these features in their XMLs.
Thanks,
DHB
This will make libvirt add it to the qemu commandline like: -cpu ...,osxsave=off,ospke=off
And that will crash when qemu starts with: error: internal error: process exited while connecting to monitor: 2019-04-25T12:12:01.698646Z qemu-system-x86_64: can't apply global core2duo-x86_64-cpu.osxsave=off: Property '.osxsave' not found
There are much more long term discussions about demoting and dropping qemu features and I'd like to avoid those discussions being mixed. The reason to drop it more or less without notice was that it never did anything to begin with. Due to that our solution might in a similar fashion be more trivial - just stop defining those two features to qemu commandline.
[1]: https://git.qemu.org/?p=qemu.git;a=commit;h=f1a23522b03a569f13aad49294bb4c4b... [2]: https://git.qemu.org/?p=qemu.git;a=commit;h=9ccb9784b57804f5c74434ad6ccb6665... [3]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg561877.html [4]: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1825195 [5]: https://bugzilla.redhat.com/show_bug.cgi?id=1644848
Christian Ehrhardt (2): qemu: do not define known no-op features qemuxml2argvtest: add test for remove cpu features
src/qemu/qemu_command.c | 23 +++++++++++++++ .../qemuxml2argvdata/cpu-host-model-cmt.args | 2 +- .../cpu-no-removed-features.args | 29 +++++++++++++++++++ .../cpu-no-removed-features.xml | 23 +++++++++++++++ tests/qemuxml2argvdata/cpu-tsc-frequency.args | 4 +-- tests/qemuxml2argvtest.c | 1 + 6 files changed, 79 insertions(+), 3 deletions(-) create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.args create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.xml

This is a VM with 'osxsave' capability declared, when there is no 'osxsave' guest support, using this series on top of current master: $ sudo ./run tools/virsh start ub1810-osxsave error: Failed to start domain ub1810-osxsave error: internal error: process exited while connecting to monitor: 2019-05-09T17:10:41.077568Z qemu-system-x86_64: can't apply global Skylake-Client-IBRS-x86_64-cpu.osxsave=off: Property '.osxsave' not found Applying this series makes the VM boot up, as intended. I believe there is room for discussion whether the VM should boot up or fail to boot with a "feature not found" error like the case below, in a sense of "you can't disable a feature that does not exist": <feature policy='disable' name='not_a_cap'/> $ sudo ./run tools/virsh start ub1810-unknown-cap error: Failed to start domain ub1810-unknown-cap error: unsupported configuration: unknown CPU feature: not_a_cap But the alternative presented in this series does the trick and it's less annoying for existing VMs. For all the patch series: Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com> On 4/25/19 9:50 AM, Christian Ehrhardt wrote:
Hi, this series tries to address a drop of commandline options by qemu in regard to osxsave [1] and ospke [2]. This was already discussed in [3] late last year but got forgotten afterwards. The Ubuntu bug is at [4] and an older Fedora bug is at [5].
TL;DR: - osxsave/ospke features were never really configurable - KVM never returned the bits on GET_SUPPORTED_CPUID - very rare to be seen in the wild - avoid issues with newer qemu and old/odd XMLs to be sure
Details:
I checked various use cases from virt-install to openstack and some in between. The only cases I found that would define osxsave/ospke is virt-install pior to version 2.0 and even there only when used with --cpu=host-model or --cpu=host-copy. If you ever really enabled the feature you'd have got: error: the CPU is incompatible with host CPU: Host CPU does not provide required features: ospke
The problem lies in domain XMLs that explicitly disable it. That would be <feature policy='disable' name='osxsave'/> But due to almost (or actually none) no host exposing this the following also triggers: <feature policy='optional' name='ospke'/>
This will make libvirt add it to the qemu commandline like: -cpu ...,osxsave=off,ospke=off
And that will crash when qemu starts with: error: internal error: process exited while connecting to monitor: 2019-04-25T12:12:01.698646Z qemu-system-x86_64: can't apply global core2duo-x86_64-cpu.osxsave=off: Property '.osxsave' not found
There are much more long term discussions about demoting and dropping qemu features and I'd like to avoid those discussions being mixed. The reason to drop it more or less without notice was that it never did anything to begin with. Due to that our solution might in a similar fashion be more trivial - just stop defining those two features to qemu commandline.
[1]: https://git.qemu.org/?p=qemu.git;a=commit;h=f1a23522b03a569f13aad49294bb4c4b... [2]: https://git.qemu.org/?p=qemu.git;a=commit;h=9ccb9784b57804f5c74434ad6ccb6665... [3]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg561877.html [4]: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1825195 [5]: https://bugzilla.redhat.com/show_bug.cgi?id=1644848
Christian Ehrhardt (2): qemu: do not define known no-op features qemuxml2argvtest: add test for remove cpu features
src/qemu/qemu_command.c | 23 +++++++++++++++ .../qemuxml2argvdata/cpu-host-model-cmt.args | 2 +- .../cpu-no-removed-features.args | 29 +++++++++++++++++++ .../cpu-no-removed-features.xml | 23 +++++++++++++++ tests/qemuxml2argvdata/cpu-tsc-frequency.args | 4 +-- tests/qemuxml2argvtest.c | 1 + 6 files changed, 79 insertions(+), 3 deletions(-) create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.args create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.xml

On Thu, May 9, 2019 at 7:43 PM Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
This is a VM with 'osxsave' capability declared, when there is no 'osxsave' guest support, using this series on top of current master:
$ sudo ./run tools/virsh start ub1810-osxsave error: Failed to start domain ub1810-osxsave error: internal error: process exited while connecting to monitor: 2019-05-09T17:10:41.077568Z qemu-system-x86_64: can't apply global Skylake-Client-IBRS-x86_64-cpu.osxsave=off: Property '.osxsave' not found
Applying this series makes the VM boot up, as intended.
I believe there is room for discussion whether the VM should boot up or fail to boot with a "feature not found" error like the case below, in a sense of "you can't disable a feature that does not exist":
<feature policy='disable' name='not_a_cap'/>
$ sudo ./run tools/virsh start ub1810-unknown-cap error: Failed to start domain ub1810-unknown-cap error: unsupported configuration: unknown CPU feature: not_a_cap
But the alternative presented in this series does the trick and it's less annoying for existing VMs.
For all the patch series:
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Thank you so much Daniel! Anyone else to chime in? Otherwise, having review and a test not done by me, I feel complete and would I'd like to push this later this week.
On 4/25/19 9:50 AM, Christian Ehrhardt wrote:
Hi, this series tries to address a drop of commandline options by qemu in regard to osxsave [1] and ospke [2]. This was already discussed in [3] late last year but got forgotten afterwards. The Ubuntu bug is at [4] and an older Fedora bug is at [5].
TL;DR: - osxsave/ospke features were never really configurable - KVM never returned the bits on GET_SUPPORTED_CPUID - very rare to be seen in the wild - avoid issues with newer qemu and old/odd XMLs to be sure
Details:
I checked various use cases from virt-install to openstack and some in between. The only cases I found that would define osxsave/ospke is virt-install pior to version 2.0 and even there only when used with --cpu=host-model or --cpu=host-copy. If you ever really enabled the feature you'd have got: error: the CPU is incompatible with host CPU: Host CPU does not provide required features: ospke
The problem lies in domain XMLs that explicitly disable it. That would be <feature policy='disable' name='osxsave'/> But due to almost (or actually none) no host exposing this the following also triggers: <feature policy='optional' name='ospke'/>
This will make libvirt add it to the qemu commandline like: -cpu ...,osxsave=off,ospke=off
And that will crash when qemu starts with: error: internal error: process exited while connecting to monitor: 2019-04-25T12:12:01.698646Z qemu-system-x86_64: can't apply global core2duo-x86_64-cpu.osxsave=off: Property '.osxsave' not found
There are much more long term discussions about demoting and dropping qemu features and I'd like to avoid those discussions being mixed. The reason to drop it more or less without notice was that it never did anything to begin with. Due to that our solution might in a similar fashion be more trivial - just stop defining those two features to qemu commandline.
[1]: https://git.qemu.org/?p=qemu.git;a=commit;h=f1a23522b03a569f13aad49294bb4c4b... [2]: https://git.qemu.org/?p=qemu.git;a=commit;h=9ccb9784b57804f5c74434ad6ccb6665... [3]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg561877.html [4]: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1825195 [5]: https://bugzilla.redhat.com/show_bug.cgi?id=1644848
Christian Ehrhardt (2): qemu: do not define known no-op features qemuxml2argvtest: add test for remove cpu features
src/qemu/qemu_command.c | 23 +++++++++++++++ .../qemuxml2argvdata/cpu-host-model-cmt.args | 2 +- .../cpu-no-removed-features.args | 29 +++++++++++++++++++ .../cpu-no-removed-features.xml | 23 +++++++++++++++ tests/qemuxml2argvdata/cpu-tsc-frequency.args | 4 +-- tests/qemuxml2argvtest.c | 1 + 6 files changed, 79 insertions(+), 3 deletions(-) create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.args create mode 100644 tests/qemuxml2argvdata/cpu-no-removed-features.xml
-- Christian Ehrhardt Software Engineer, Ubuntu Server Canonical Ltd

On Mon, May 13, 2019 at 12:23 PM Christian Ehrhardt <christian.ehrhardt@canonical.com> wrote:
On Thu, May 9, 2019 at 7:43 PM Daniel Henrique Barboza <danielhb413@gmail.com> wrote:
This is a VM with 'osxsave' capability declared, when there is no 'osxsave' guest support, using this series on top of current master:
$ sudo ./run tools/virsh start ub1810-osxsave error: Failed to start domain ub1810-osxsave error: internal error: process exited while connecting to monitor: 2019-05-09T17:10:41.077568Z qemu-system-x86_64: can't apply global Skylake-Client-IBRS-x86_64-cpu.osxsave=off: Property '.osxsave' not found
Applying this series makes the VM boot up, as intended.
I believe there is room for discussion whether the VM should boot up or fail to boot with a "feature not found" error like the case below, in a sense of "you can't disable a feature that does not exist":
<feature policy='disable' name='not_a_cap'/>
$ sudo ./run tools/virsh start ub1810-unknown-cap error: Failed to start domain ub1810-unknown-cap error: unsupported configuration: unknown CPU feature: not_a_cap
But the alternative presented in this series does the trick and it's less annoying for existing VMs.
For all the patch series:
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Thank you so much Daniel!
Anyone else to chime in?
Nothing else came up, I have rebased, rebuilt and retested to be sure and then pushed to master.
participants (2)
-
Christian Ehrhardt
-
Daniel Henrique Barboza