[PATCH v1 0/4] Prevent ppc64 guest launch with broken features

Hi, This work was intended to fix a bug with APIC-EOI setting only (patch 1). I decided to take a closer look and ended up handling more cases. I consider the first 3 patches to be straightforward: those are conditions that QEMU will complain about and I'm simply refusing to launch while giving a better error message than the one QEMU provides. Patch 4 is more debatable. <hyperv><hyperv/> declaration in the XML is benign, but any of the 14 features being declared will cause QEMU errors. Instead of putting more code (i.e. a switch with 14 features in the middle of the code) to prevent any of those features to be enabled, I made a call to simply refuse to launch with any <hyperv> setting in the XML. This is not because I have a beef with Windows or MS or anything. This is about making an educated guess that it's highly unlikable that anyone is running ppc64 guests with <hyperv> declared in the XML, and it's not worth all the code to support it. But I'm all ears to hear otherwise. After this series, these are the features that aren't supported by ppc64 but can be declared in a ppc64 guest without resulting in a failed launch: <features> <acpi/> <apic/> (without 'eoi' setting) <pae/> <hap state='on|off'/> <viridian/> </features> Daniel Henrique Barboza (4): qemu_domain.c: do not launch ppc64 guests with APIC-EOI setting qemu_domain.c: do not launch ppc64 guests with 'pvspinlock' setting qemu_domain.c: do not launch ppc64 guests with 'pmu' setting qemu_domain.c: do not launch ppc64 guests with Hyperv settings src/qemu/qemu_domain.c | 49 ++++++++++++++++++++++++++++++++++++++---- 1 file changed, 45 insertions(+), 4 deletions(-) -- 2.25.1

The "<apic/>" feature, although it's not available for pseries, can be declared in the domain XML of ppc64 guests without errors. But setting its 'eoi' attribute will break QEMU. For "<apic eoi='on'/>": qemu-kvm: Expected key=value format, found +kvm_pv_eoi A similar error happens with eoi='off'. One can argue that it's better to simply forbid launching ppc64 guests with "<apic/>" declared in the ppc64 XML - it is a feature that the architecture doesn't support and this would make it clearer about it. This is sensible, but there are ppc64 guests that are running with "<apic/>" declared in the domain (and A LOT of guests running with "<acpi/>" for that matter, probably reminiscent of x86 templates that were reused for ppc64) that will stop working if we go this route. A more subtle approach is to detect if the 'eoi' element is being set for ppc64 guests, and warn the user about it with a better error message than the one QEMU provides. This is the new error message when any value is set for the 'eoi' element: error: unsupported configuration: The 'eoi' attribute of the 'apic' feature is not supported for architecture 'ppc64' or machine type 'pseries'. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/qemu/qemu_domain.c | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index edc8ba2ddb..381b62b96a 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -5262,8 +5262,26 @@ qemuDomainDefValidateFeatures(const virDomainDef *def, } break; - case VIR_DOMAIN_FEATURE_ACPI: case VIR_DOMAIN_FEATURE_APIC: + /* the <apic/> declaration is harmless for ppc64, but + * its 'eoi' attribute isn't. To detect if 'eoi' was + * present in the XML we can check the tristate switch + * of def->apic_eoi */ + if (def->features[i] != VIR_TRISTATE_SWITCH_ABSENT && + def->apic_eoi != VIR_TRISTATE_SWITCH_ABSENT && + ARCH_IS_PPC64(def->os.arch)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("The 'eoi' attribute of the '%s' feature " + "is not supported for architecture '%s' or " + "machine type '%s'"), + featureName, + virArchToString(def->os.arch), + def->os.machine); + return -1; + } + break; + + case VIR_DOMAIN_FEATURE_ACPI: case VIR_DOMAIN_FEATURE_PAE: case VIR_DOMAIN_FEATURE_HAP: case VIR_DOMAIN_FEATURE_VIRIDIAN: -- 2.25.1

On Thu, 2020-03-19 at 18:44 -0300, Daniel Henrique Barboza wrote:
qemu_domain.c: do not launch ppc64 guests with APIC-EOI setting
s/qemu_domain.c/qemu/ Same for the other patches in the series.
The "<apic/>" feature, although it's not available for pseries, can be declared in the domain XML of ppc64 guests without errors. But setting its 'eoi' attribute will break QEMU. For "<apic eoi='on'/>":
qemu-kvm: Expected key=value format, found +kvm_pv_eoi
A similar error happens with eoi='off'.
This is https://bugzilla.redhat.com/show_bug.cgi?id=1236440 Please include the Bugzilla URL for other patches in the series as well, if applicable. [...]
case VIR_DOMAIN_FEATURE_APIC: + /* the <apic/> declaration is harmless for ppc64, but + * its 'eoi' attribute isn't. To detect if 'eoi' was + * present in the XML we can check the tristate switch + * of def->apic_eoi */ + if (def->features[i] != VIR_TRISTATE_SWITCH_ABSENT && + def->apic_eoi != VIR_TRISTATE_SWITCH_ABSENT && + ARCH_IS_PPC64(def->os.arch)) {
Since the underlying kvm_pv_eoi feature is x86-only, you should change this last part to !ARCH_IS_X86(def->os.arch) and benefit ARM, RISC-V and s390x users at the same time. When you do so, please reduce the comment to something like /* The kvm_pv_eoi feature is x86-only */ -- Andrea Bolognani / Red Hat / Virtualization

On 3/23/20 2:01 PM, Andrea Bolognani wrote:
On Thu, 2020-03-19 at 18:44 -0300, Daniel Henrique Barboza wrote:
qemu_domain.c: do not launch ppc64 guests with APIC-EOI setting
s/qemu_domain.c/qemu/
Same for the other patches in the series.
The "<apic/>" feature, although it's not available for pseries, can be declared in the domain XML of ppc64 guests without errors. But setting its 'eoi' attribute will break QEMU. For "<apic eoi='on'/>":
qemu-kvm: Expected key=value format, found +kvm_pv_eoi
A similar error happens with eoi='off'.
This is
https://bugzilla.redhat.com/show_bug.cgi?id=1236440
Please include the Bugzilla URL for other patches in the series as well, if applicable.
Sure. I didn't include the link because it can't be opened unless you have a RH bugzilla account and I wasn't sure I could add it here.
[...]
case VIR_DOMAIN_FEATURE_APIC: + /* the <apic/> declaration is harmless for ppc64, but + * its 'eoi' attribute isn't. To detect if 'eoi' was + * present in the XML we can check the tristate switch + * of def->apic_eoi */ + if (def->features[i] != VIR_TRISTATE_SWITCH_ABSENT && + def->apic_eoi != VIR_TRISTATE_SWITCH_ABSENT && + ARCH_IS_PPC64(def->os.arch)) {
Since the underlying kvm_pv_eoi feature is x86-only, you should change this last part to
!ARCH_IS_X86(def->os.arch)
and benefit ARM, RISC-V and s390x users at the same time. When you do so, please reduce the comment to something like
I got a bit confused when doing these patches with what aarch64 supports or not (specially kvm_pv_unhalt, the one from patch 2) then I decided to play it safer. I'll gate this one (and the one from patch 2) to be x86 only. And while we're at it, something that just occurred to me, I'll also gate the ppc64 only capabilities as well in a new patch.
/* The kvm_pv_eoi feature is x86-only */
I'll reduce the comments as well. Thanks, DHB

On Mon, 2020-03-23 at 14:21 -0300, Daniel Henrique Barboza wrote:
On 3/23/20 2:01 PM, Andrea Bolognani wrote:
This is
https://bugzilla.redhat.com/show_bug.cgi?id=1236440
Please include the Bugzilla URL for other patches in the series as well, if applicable.
Sure. I didn't include the link because it can't be opened unless you have a RH bugzilla account and I wasn't sure I could add it here.
Oh, I did not realize that bug was private. That's very silly. I'll ask for it to be made public.
And while we're at it, something that just occurred to me, I'll also gate the ppc64 only capabilities as well in a new patch.
Yeah, that makes sense too. We're probably never going to get to a point where these checks are completely accurate, but any improvement can only be a welcome one :) -- Andrea Bolognani / Red Hat / Virtualization

On 3/23/20 2:45 PM, Andrea Bolognani wrote:
On Mon, 2020-03-23 at 14:21 -0300, Daniel Henrique Barboza wrote:
On 3/23/20 2:01 PM, Andrea Bolognani wrote:
This is
https://bugzilla.redhat.com/show_bug.cgi?id=1236440
Please include the Bugzilla URL for other patches in the series as well, if applicable.
Sure. I didn't include the link because it can't be opened unless you have a RH bugzilla account and I wasn't sure I could add it here.
Oh, I did not realize that bug was private. That's very silly. I'll ask for it to be made public.
And while we're at it, something that just occurred to me, I'll also gate the ppc64 only capabilities as well in a new patch.
Yeah, that makes sense too. We're probably never going to get to a point where these checks are completely accurate, but any improvement can only be a welcome one :)
Just realized that we're already doing that here: case VIR_DOMAIN_FEATURE_HPT: case VIR_DOMAIN_FEATURE_HTM: case VIR_DOMAIN_FEATURE_NESTED_HV: case VIR_DOMAIN_FEATURE_CCF_ASSIST: if (qemuDomainDefValidatePSeriesFeature(def, qemuCaps, i) < 0) return -1; break; qemuDomainDefValidatePseriesFeature() filters if the arch is ppc64: static int qemuDomainDefValidatePSeriesFeature(const virDomainDef *def, virQEMUCapsPtr qemuCaps, int feature) { const char *str; if (def->features[feature] != VIR_TRISTATE_SWITCH_ABSENT && !qemuDomainIsPSeries(def)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("The '%s' feature is not supported for " "architecture '%s' or machine type '%s'"), virDomainFeatureTypeToString(feature), virArchToString(def->os.arch), def->os.machine); return -1; } (...) There is no need for an extra patch to handle it. Thanks, DHB

On Tue, 2020-03-24 at 10:22 -0300, Daniel Henrique Barboza wrote:
On 3/23/20 2:45 PM, Andrea Bolognani wrote:
On Mon, 2020-03-23 at 14:21 -0300, Daniel Henrique Barboza wrote:
And while we're at it, something that just occurred to me, I'll also gate the ppc64 only capabilities as well in a new patch.
Yeah, that makes sense too. We're probably never going to get to a point where these checks are completely accurate, but any improvement can only be a welcome one :)
Just realized that we're already doing that here:
case VIR_DOMAIN_FEATURE_HPT: case VIR_DOMAIN_FEATURE_HTM: case VIR_DOMAIN_FEATURE_NESTED_HV: case VIR_DOMAIN_FEATURE_CCF_ASSIST: if (qemuDomainDefValidatePSeriesFeature(def, qemuCaps, i) < 0) return -1; break;
Right! I had forgotten about writing that validation logic O:-)
There is no need for an extra patch to handle it.
Nope, we're good :) -- Andrea Bolognani / Red Hat / Virtualization

PowerPC does not support the 'pvspinlock' feature. The "<pvspinlock/>" declaration will always have a value 'on' or 'off', and both will break QEMU when launching. This is the error message for "<pvspinlock state='on'/>": qemu-kvm: Expected key=value format, found +kvm_pv_unhalt A similar error message is thrown for "<pvspinlock state='off'/>". This patch prevents the pseries guest from launching with any pvspinlock setting with a more informative error message: error: unsupported configuration: The 'pvspinlock' feature is not supported for architecture 'ppc64' or machine type 'pseries' Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/qemu/qemu_domain.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 381b62b96a..c67abec778 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -5281,13 +5281,25 @@ qemuDomainDefValidateFeatures(const virDomainDef *def, } break; + case VIR_DOMAIN_FEATURE_PVSPINLOCK: + if (def->features[i] != VIR_TRISTATE_SWITCH_ABSENT && + ARCH_IS_PPC64(def->os.arch)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("The '%s' feature is not supported for " + "architecture '%s' or machine type '%s'"), + featureName, + virArchToString(def->os.arch), + def->os.machine); + return -1; + } + break; + case VIR_DOMAIN_FEATURE_ACPI: case VIR_DOMAIN_FEATURE_PAE: case VIR_DOMAIN_FEATURE_HAP: case VIR_DOMAIN_FEATURE_VIRIDIAN: case VIR_DOMAIN_FEATURE_PRIVNET: case VIR_DOMAIN_FEATURE_HYPERV: - case VIR_DOMAIN_FEATURE_PVSPINLOCK: case VIR_DOMAIN_FEATURE_CAPABILITIES: case VIR_DOMAIN_FEATURE_PMU: case VIR_DOMAIN_FEATURE_MSRS: -- 2.25.1

On Thu, 2020-03-19 at 18:44 -0300, Daniel Henrique Barboza wrote: [...]
+ case VIR_DOMAIN_FEATURE_PVSPINLOCK: + if (def->features[i] != VIR_TRISTATE_SWITCH_ABSENT && + ARCH_IS_PPC64(def->os.arch)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("The '%s' feature is not supported for " + "architecture '%s' or machine type '%s'"), + featureName, + virArchToString(def->os.arch), + def->os.machine); + return -1;
Same comment as the previous patch: the feature is x86-only, so there's no point in only preventing its use on ppc64. -- Andrea Bolognani / Red Hat / Virtualization

The Perfomance Monitoring Unit (PMU) feature is not available for the Power architecture. The "<pmu/>" feature will always have a value 'on' or 'off' after saving the domain XML, and both will be rejected by QEMU when launching. This is the error message for "<pmu state='on'/>": qemu-kvm: can't apply global host-powerpc64-cpu.pmu=on: Property '.pmu' not found A similar error message is thrown for "<pmu state='off'/>". This patch prevents the pseries guest from launching with any pmu setting with a more informative error message: error: unsupported configuration: The 'pmu' feature is not supported for architecture 'ppc64' or machine type 'pseries' Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/qemu/qemu_domain.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index c67abec778..2e5f987a04 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -5282,6 +5282,7 @@ qemuDomainDefValidateFeatures(const virDomainDef *def, break; case VIR_DOMAIN_FEATURE_PVSPINLOCK: + case VIR_DOMAIN_FEATURE_PMU: if (def->features[i] != VIR_TRISTATE_SWITCH_ABSENT && ARCH_IS_PPC64(def->os.arch)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -5301,7 +5302,6 @@ qemuDomainDefValidateFeatures(const virDomainDef *def, case VIR_DOMAIN_FEATURE_PRIVNET: case VIR_DOMAIN_FEATURE_HYPERV: case VIR_DOMAIN_FEATURE_CAPABILITIES: - case VIR_DOMAIN_FEATURE_PMU: case VIR_DOMAIN_FEATURE_MSRS: case VIR_DOMAIN_FEATURE_LAST: break; -- 2.25.1

On Thu, 2020-03-19 at 18:44 -0300, Daniel Henrique Barboza wrote:
The Perfomance Monitoring Unit (PMU) feature is not available for the Power architecture. The "<pmu/>" feature will always have a value 'on' or 'off' after saving the domain XML, and both will be rejected by QEMU when launching. This is the error message for "<pmu state='on'/>":
qemu-kvm: can't apply global host-powerpc64-cpu.pmu=on: Property '.pmu' not found
A similar error message is thrown for "<pmu state='off'/>".
This patch prevents the pseries guest from launching with any pmu setting with a more informative error message:
error: unsupported configuration: The 'pmu' feature is not supported for architecture 'ppc64' or machine type 'pseries'
I don't think this is right. While you are correct that PMU can't be configured for pSeries guests, I think that's because of the opposite reason: it's always on, and can't be turned off. For comparison's sake: in an x86 guest with <pmu state='on'/> I get $ perf list | grep -E 'Hardware.*event' branch-instructions OR branches [Hardware event] branch-misses [Hardware event] bus-cycles [Hardware event] cache-misses [Hardware event] cache-references [Hardware event] cpu-cycles OR cycles [Hardware event] instructions [Hardware event] ref-cycles [Hardware event] L1-dcache-load-misses [Hardware cache event] L1-dcache-loads [Hardware cache event] L1-dcache-stores [Hardware cache event] L1-icache-load-misses [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] dTLB-load-misses [Hardware cache event] dTLB-loads [Hardware cache event] dTLB-store-misses [Hardware cache event] dTLB-stores [Hardware cache event] iTLB-load-misses [Hardware cache event] iTLB-loads [Hardware cache event] whereas when I turn off PMU all of those are gone. In a pSeries guest running on POWER8, without any configuration, I get: $ perf list | grep -E 'Hardware.*event' branch-instructions OR branches [Hardware event] branch-misses [Hardware event] cache-misses [Hardware event] cache-references [Hardware event] cpu-cycles OR cycles [Hardware event] instructions [Hardware event] stalled-cycles-backend OR idle-cycles-backend [Hardware event] stalled-cycles-frontend OR idle-cycles-frontend [Hardware event] L1-dcache-load-misses [Hardware cache event] L1-dcache-loads [Hardware cache event] L1-dcache-prefetches [Hardware cache event] L1-dcache-store-misses [Hardware cache event] L1-icache-load-misses [Hardware cache event] L1-icache-loads [Hardware cache event] L1-icache-prefetches [Hardware cache event] LLC-load-misses [Hardware cache event] LLC-loads [Hardware cache event] LLC-prefetches [Hardware cache event] LLC-store-misses [Hardware cache event] LLC-stores [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] dTLB-load-misses [Hardware cache event] iTLB-load-misses [Hardware cache event] So it seems to me that, if anything, the PMU feature should be treated like the <panic/> device, that is, automatically added to pSeries guests if it's not present already. David, what's your opinion on the matter? -- Andrea Bolognani / Red Hat / Virtualization

On 3/23/20 2:28 PM, Andrea Bolognani wrote:
On Thu, 2020-03-19 at 18:44 -0300, Daniel Henrique Barboza wrote:
The Perfomance Monitoring Unit (PMU) feature is not available for the Power architecture. The "<pmu/>" feature will always have a value 'on' or 'off' after saving the domain XML, and both will be rejected by QEMU when launching. This is the error message for "<pmu state='on'/>":
qemu-kvm: can't apply global host-powerpc64-cpu.pmu=on: Property '.pmu' not found
A similar error message is thrown for "<pmu state='off'/>".
This patch prevents the pseries guest from launching with any pmu setting with a more informative error message:
error: unsupported configuration: The 'pmu' feature is not supported for architecture 'ppc64' or machine type 'pseries'
I don't think this is right. While you are correct that PMU can't be configured for pSeries guests, I think that's because of the opposite reason: it's always on, and can't be turned off.
For comparison's sake: in an x86 guest with <pmu state='on'/> I get
$ perf list | grep -E 'Hardware.*event' branch-instructions OR branches [Hardware event] branch-misses [Hardware event] bus-cycles [Hardware event] cache-misses [Hardware event] cache-references [Hardware event] cpu-cycles OR cycles [Hardware event] instructions [Hardware event] ref-cycles [Hardware event] L1-dcache-load-misses [Hardware cache event] L1-dcache-loads [Hardware cache event] L1-dcache-stores [Hardware cache event] L1-icache-load-misses [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] dTLB-load-misses [Hardware cache event] dTLB-loads [Hardware cache event] dTLB-store-misses [Hardware cache event] dTLB-stores [Hardware cache event] iTLB-load-misses [Hardware cache event] iTLB-loads [Hardware cache event]
whereas when I turn off PMU all of those are gone.
In a pSeries guest running on POWER8, without any configuration, I get:
$ perf list | grep -E 'Hardware.*event' branch-instructions OR branches [Hardware event] branch-misses [Hardware event] cache-misses [Hardware event] cache-references [Hardware event] cpu-cycles OR cycles [Hardware event] instructions [Hardware event] stalled-cycles-backend OR idle-cycles-backend [Hardware event] stalled-cycles-frontend OR idle-cycles-frontend [Hardware event] L1-dcache-load-misses [Hardware cache event] L1-dcache-loads [Hardware cache event] L1-dcache-prefetches [Hardware cache event] L1-dcache-store-misses [Hardware cache event] L1-icache-load-misses [Hardware cache event] L1-icache-loads [Hardware cache event] L1-icache-prefetches [Hardware cache event] LLC-load-misses [Hardware cache event] LLC-loads [Hardware cache event] LLC-prefetches [Hardware cache event] LLC-store-misses [Hardware cache event] LLC-stores [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] dTLB-load-misses [Hardware cache event] iTLB-load-misses [Hardware cache event]
So it seems to me that, if anything, the PMU feature should be treated like the <panic/> device, that is, automatically added to pSeries guests if it's not present already.
I'll check it out how <panic/> is implemented. As long as we don't end up putting "Property '.pmu' not found" in the QEMU command line (which will cause an error) then I think it's ok to have it in the XML.
David, what's your opinion on the matter?

On Mon, Mar 23, 2020 at 06:28:34PM +0100, Andrea Bolognani wrote:
On Thu, 2020-03-19 at 18:44 -0300, Daniel Henrique Barboza wrote:
The Perfomance Monitoring Unit (PMU) feature is not available for the Power architecture. The "<pmu/>" feature will always have a value 'on' or 'off' after saving the domain XML, and both will be rejected by QEMU when launching. This is the error message for "<pmu state='on'/>":
qemu-kvm: can't apply global host-powerpc64-cpu.pmu=on: Property '.pmu' not found
A similar error message is thrown for "<pmu state='off'/>".
This patch prevents the pseries guest from launching with any pmu setting with a more informative error message:
error: unsupported configuration: The 'pmu' feature is not supported for architecture 'ppc64' or machine type 'pseries'
I don't think this is right. While you are correct that PMU can't be configured for pSeries guests, I think that's because of the opposite reason: it's always on, and can't be turned off.
For comparison's sake: in an x86 guest with <pmu state='on'/> I get
$ perf list | grep -E 'Hardware.*event' branch-instructions OR branches [Hardware event] branch-misses [Hardware event] bus-cycles [Hardware event] cache-misses [Hardware event] cache-references [Hardware event] cpu-cycles OR cycles [Hardware event] instructions [Hardware event] ref-cycles [Hardware event] L1-dcache-load-misses [Hardware cache event] L1-dcache-loads [Hardware cache event] L1-dcache-stores [Hardware cache event] L1-icache-load-misses [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] dTLB-load-misses [Hardware cache event] dTLB-loads [Hardware cache event] dTLB-store-misses [Hardware cache event] dTLB-stores [Hardware cache event] iTLB-load-misses [Hardware cache event] iTLB-loads [Hardware cache event]
whereas when I turn off PMU all of those are gone.
In a pSeries guest running on POWER8, without any configuration, I get:
$ perf list | grep -E 'Hardware.*event' branch-instructions OR branches [Hardware event] branch-misses [Hardware event] cache-misses [Hardware event] cache-references [Hardware event] cpu-cycles OR cycles [Hardware event] instructions [Hardware event] stalled-cycles-backend OR idle-cycles-backend [Hardware event] stalled-cycles-frontend OR idle-cycles-frontend [Hardware event] L1-dcache-load-misses [Hardware cache event] L1-dcache-loads [Hardware cache event] L1-dcache-prefetches [Hardware cache event] L1-dcache-store-misses [Hardware cache event] L1-icache-load-misses [Hardware cache event] L1-icache-loads [Hardware cache event] L1-icache-prefetches [Hardware cache event] LLC-load-misses [Hardware cache event] LLC-loads [Hardware cache event] LLC-prefetches [Hardware cache event] LLC-store-misses [Hardware cache event] LLC-stores [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] dTLB-load-misses [Hardware cache event] iTLB-load-misses [Hardware cache event]
So it seems to me that, if anything, the PMU feature should be treated like the <panic/> device, that is, automatically added to pSeries guests if it's not present already.
David, what's your opinion on the matter?
You're correct. The difference originates at the hardware level. On x86 the host always "owns" the PMU, and it requires explicit work in KVM to make it available to the guest. On POWER, the guest owns the PMU. I'm not sure if it's possible to disable guest access to the PMU at all. Even if it is, it must be some obscure bit in the HFCR or somewhere which I don't believe we've wired up at all. So for now, certainly, pmu is effectively always on. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson

On 3/23/20 9:34 PM, David Gibson wrote:
On Mon, Mar 23, 2020 at 06:28:34PM +0100, Andrea Bolognani wrote:
On Thu, 2020-03-19 at 18:44 -0300, Daniel Henrique Barboza wrote:
The Perfomance Monitoring Unit (PMU) feature is not available for the Power architecture. The "<pmu/>" feature will always have a value 'on' or 'off' after saving the domain XML, and both will be rejected by QEMU when launching. This is the error message for "<pmu state='on'/>":
[...]
So it seems to me that, if anything, the PMU feature should be treated like the <panic/> device, that is, automatically added to pSeries guests if it's not present already.
David, what's your opinion on the matter?
You're correct. The difference originates at the hardware level. On x86 the host always "owns" the PMU, and it requires explicit work in KVM to make it available to the guest. On POWER, the guest owns the PMU. I'm not sure if it's possible to disable guest access to the PMU at all. Even if it is, it must be some obscure bit in the HFCR or somewhere which I don't believe we've wired up at all.
So for now, certainly, pmu is effectively always on.
Thanks for the info. Andrea, I see that the behavior of the 'panic' device in the ppc64 XML is to always add a: <panic model='pseries'/> Even if I remove it Libvirt will automatically added it back again. How do you suggest we handle this PMU? We can copy what 'panic' is doing, adding a <pmu model='peries'/> In the XML just like 'panic' does. Or we can add the 'pmu' feature: <feature> <pmu/> </feature> PMU can't be left without a state ATM, thus this would translate to: <feature> <pmu state='on'/> </feature> This would require to add the whole <feature> element if none is present as well. Both requires some code tweaking here and there. I am not sure which one is better, although the first option would require us to handle the <pmu> XML feature anyway since we can't have both, so there's that. For the sake of completeness, I'll also mention that we can simply allow <pmu/> to be declared in the XML, handling the <pmu state='on'/> inside the QEMU driver to not add the bogus '.pmu' parameter for QEMU ppc64, forbid <pmu state='off'/> to be declared, and nothing else. No auto-generation of XML indicating that the guest will support a PMU. Thanks, DHB

On Tue, 2020-03-24 at 09:27 -0300, Daniel Henrique Barboza wrote:
Andrea, I see that the behavior of the 'panic' device in the ppc64 XML is to always add a:
<panic model='pseries'/>
Even if I remove it Libvirt will automatically added it back again. How do you suggest we handle this PMU?
We can copy what 'panic' is doing, adding a
<pmu model='peries'/>
This is definitely not the right approach - we should use the existing interface rather than try to come up with a new one.
In the XML just like 'panic' does. Or we can add the 'pmu' feature:
<feature> <pmu/> </feature>
PMU can't be left without a state ATM, thus this would translate to:
<feature> <pmu state='on'/> </feature>
This is what we should do.
This would require to add the whole <feature> element if none is present as well. Both requires some code tweaking here and there. I am not sure which one is better, although the first option would require us to handle the <pmu> XML feature anyway since we can't have both, so there's that.
For the sake of completeness, I'll also mention that we can simply allow <pmu/> to be declared in the XML, handling the <pmu state='on'/> inside the QEMU driver to not add the bogus '.pmu' parameter for QEMU ppc64, forbid <pmu state='off'/> to be declared, and nothing else. No auto-generation of XML indicating that the guest will support a PMU.
Looking again at how other architectures, specifically x86 and ARM, handle this, the PMU is generally enabled by default without this fact being reflected in the domain XML; the user can then go ahead and specifically ask for it to be turned on or off, at which point libvirt will add the relevant bits to the QEMU command line. This is basically the second behavior you're describing above, and I think it would be perfectly fine if that's the one we would adopt for ppc64. -- Andrea Bolognani / Red Hat / Virtualization

On 3/24/20 11:00 AM, Andrea Bolognani wrote:
On Tue, 2020-03-24 at 09:27 -0300, Daniel Henrique Barboza wrote:
Andrea, I see that the behavior of the 'panic' device in the ppc64 XML is to always add a:
[...]
For the sake of completeness, I'll also mention that we can simply allow <pmu/> to be declared in the XML, handling the <pmu state='on'/> inside the QEMU driver to not add the bogus '.pmu' parameter for QEMU ppc64, forbid <pmu state='off'/> to be declared, and nothing else. No auto-generation of XML indicating that the guest will support a PMU.
Looking again at how other architectures, specifically x86 and ARM, handle this, the PMU is generally enabled by default without this fact being reflected in the domain XML; the user can then go ahead and specifically ask for it to be turned on or off, at which point libvirt will add the relevant bits to the QEMU command line.
This is basically the second behavior you're describing above, and I think it would be perfectly fine if that's the one we would adopt for ppc64.
I guess I'll roll with this one then. I will allow <pmu state='on'/> to be declared in the XML without breaking QEMU. For <pmu state='off'/> I'll throw an CONFIG_UNSUPPORTED error mentioning that PMU can't be turned off for ppc64. Thanks, DHB

On Tue, 2020-03-24 at 11:05 -0300, Daniel Henrique Barboza wrote:
On 3/24/20 11:00 AM, Andrea Bolognani wrote:
On Tue, 2020-03-24 at 09:27 -0300, Daniel Henrique Barboza wrote:
For the sake of completeness, I'll also mention that we can simply allow <pmu/> to be declared in the XML, handling the <pmu state='on'/> inside the QEMU driver to not add the bogus '.pmu' parameter for QEMU ppc64, forbid <pmu state='off'/> to be declared, and nothing else. No auto-generation of XML indicating that the guest will support a PMU.
Looking again at how other architectures, specifically x86 and ARM, handle this, the PMU is generally enabled by default without this fact being reflected in the domain XML; the user can then go ahead and specifically ask for it to be turned on or off, at which point libvirt will add the relevant bits to the QEMU command line.
This is basically the second behavior you're describing above, and I think it would be perfectly fine if that's the one we would adopt for ppc64.
I guess I'll roll with this one then. I will allow <pmu state='on'/> to be declared in the XML without breaking QEMU. For <pmu state='off'/> I'll throw an CONFIG_UNSUPPORTED error mentioning that PMU can't be turned off for ppc64.
Sounds good. -- Andrea Bolognani / Red Hat / Virtualization

Hyperv features aren't supported in QEMU for ppc64. The <hyperv/> declaration in the XML by itself is benign, but any of its 14 current features will break QEMU with an error like this: qemu-kvm: Expected key=value format, found hv_relaxed This is a more extreme case than the one for apic eoi because we would need an extra 'switch' statement, with all current Hyperv features in the body of qemuDomainDefValidateFeatures(), to check if the user attempted to activate any of them. It's easier to simply fail to launch with any 'hyperv' declaration in the XML for ppc64 guests. A fair disclaimer about Windows and PowerPC: the last Windows version that ran in the architecture is the hall of famer Windows NT 4.0, launched in 1996 and with end of extended support for the Server version in 2004 [1]. I am acknowledging that there might be Windows NT 4.0 users running in PowerPC, but not enough people running it under KVM/QEMU to justify Libvirt allowing 'hyperv' to exist in the domain XML of ppc64 domains. [1] https://en.wikipedia.org/wiki/Windows_NT_4.0 Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- src/qemu/qemu_domain.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 2e5f987a04..ba458bfde9 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -5295,12 +5295,23 @@ qemuDomainDefValidateFeatures(const virDomainDef *def, } break; + case VIR_DOMAIN_FEATURE_HYPERV: + if (def->features[i] != VIR_TRISTATE_SWITCH_ABSENT && + ARCH_IS_PPC64(def->os.arch)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Hyperv features are not supported for " + "architecture '%s' or machine type '%s'"), + virArchToString(def->os.arch), + def->os.machine); + return -1; + } + break; + case VIR_DOMAIN_FEATURE_ACPI: case VIR_DOMAIN_FEATURE_PAE: case VIR_DOMAIN_FEATURE_HAP: case VIR_DOMAIN_FEATURE_VIRIDIAN: case VIR_DOMAIN_FEATURE_PRIVNET: - case VIR_DOMAIN_FEATURE_HYPERV: case VIR_DOMAIN_FEATURE_CAPABILITIES: case VIR_DOMAIN_FEATURE_MSRS: case VIR_DOMAIN_FEATURE_LAST: -- 2.25.1

On Thu, 2020-03-19 at 18:44 -0300, Daniel Henrique Barboza wrote: [...]
+ case VIR_DOMAIN_FEATURE_HYPERV: + if (def->features[i] != VIR_TRISTATE_SWITCH_ABSENT && + ARCH_IS_PPC64(def->os.arch)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Hyperv features are not supported for " + "architecture '%s' or machine type '%s'"), + virArchToString(def->os.arch), + def->os.machine); + return -1;
Same as the previous patches, except Hyper-V is reasonable not only on x86 but also on aarch64, so the architecture check should be tweaked accordingly. -- Andrea Bolognani / Red Hat / Virtualization
participants (3)
-
Andrea Bolognani
-
Daniel Henrique Barboza
-
David Gibson