On 7/27/20 12:29 PM, Daniel P. Berrangé wrote:
On Fri, Jun 19, 2020 at 06:04:33PM -0300, Daniel Henrique Barboza wrote:
> PPC64 has two KVM modules: kvm_hv and kvm_pr. The official supported
> module was always kvm_hv, while kvm_pr was used for internal testing
> or for very niche cases in Power 8 hosts, always without official
> IBM or distro support.
>
> Problem is, QMP will report KVM support for PPC64 if any of these
> modules is loaded in the host, and kvm_pr is broken in everything
> but Power8 (and will remain broken, since kvm_pr is unmaintained).
> This can lead to situations like [1], where the tooling is misled to
> believe that the host has KVM capabilities when in reality it
> doesn't.
>
> The first reaction would be to simply forsake kvm_pr support entirely
> and move on, but there is no reason for now to be disruptive with any
> Power8 guests in the wild that are using kvm_pr (somehow). A more
> subtle approach is to not claim QEMU_CAPS_KVM support in all cases
> that we know it's completely broken, allowing Power8 users to take
> their shot using kvm_pr in their VMs. We can remove kvm_pr support
> completely when the module is removed from the kernel.
I'm not sure libvirt should be forbidding this use of kvm_pr
on non-Power8. IIUC, it is the only impl that can actually be
used when in a Power9 LPAR. This patch is essentially saying
that TCG is better for Power LPARs than kvm_pr which I think
is not right.
We didn't work on kvm_pr support for Power 9. All the effort was put
into implementing nested virtualization with the kvm_hv module. The
fact that kvm_pr module loads in Power 9 is kind of an "accident".
As you can see in the bug linked, it's non-functional.
Things get messy because, well, kvm_hv does not work in P9 LPAR either.
This use case is then left with TCG as the only (sort of) viable option
for this scenario.
The problem is that modern QEMU ppc machine types don't work
on kvm_pr without a bunch of extra CLI args to disable use
of various missing features in kvm_pr. If you do pass those
CLI args though, things can work to some extent.
IIUC, TCG suffers from the same problem with these missing
features too, though the BZ report below does not seem to
confirm that belief.
If someone is able to successfully use kvm_pr with QEMU without
using libvirt, then it makes libvirt look bad if we refuse to
allow the same setup. So if we're going to declare that kvm_pr
is not supported for QEMU on non-Power8, then I feel like QEMU
should be the thing declaring that. We probe QEMU via QMP to
see if it supports KVM. IOW, if QEMU doesn't want to support
kvm_pr, it should report kvm disabled when asked.
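
For reference, the probe mentioned above amounts to sending a
`query-kvm` command over QMP and looking at the `present` and `enabled`
fields of the reply. The following is a minimal Python sketch of that
check, not libvirt's actual code: the function name `kvm_usable` is made
up for illustration, and the sample replies are hand-written rather than
captured from a real QEMU.

```python
import json


def kvm_usable(qmp_reply: str) -> bool:
    """Return True only if the reply says KVM is both present and enabled.

    `qmp_reply` is the JSON text QEMU sends back for
    {"execute": "query-kvm"}. Hypothetical helper, for illustration only.
    """
    reply = json.loads(qmp_reply)
    # An error reply has no "return" member, so treat it as "no KVM".
    info = reply.get("return", {})
    return bool(info.get("present")) and bool(info.get("enabled"))


# Hand-written sample replies (assumed shape: {"return": {"enabled": ..., "present": ...}})
print(kvm_usable('{"return": {"enabled": true, "present": true}}'))   # True
print(kvm_usable('{"return": {"enabled": false, "present": true}}'))  # False
```

If QEMU itself answered with `enabled: false` for the kvm_pr cases it
does not want to support, a check of this kind would need no
PPC64-specific knowledge on the libvirt side.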
The more I think about this the more I'm convinced that my approach
here is not ideal and a QEMU side fix is better. At the very least
we can center these decisions in the QEMU layer, while with these
patches one would need to change both QEMU and Libvirt if something
changes.
On the libvirt side I think we need to focus more on awareness
of the problem. eg virt-host-validate should ERROR if kvm_pr
is loaded on a host which is capable of kvm_hv. It might want
to WARN if kvm_pr is used on any other host, if that's not
too aggressive ?
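
To make the ERROR/WARN idea above concrete, here is a rough Python
sketch of the decision only. It is not virt-host-validate code (which is
written in C); the function names are hypothetical, and how to decide
whether a host is "capable of kvm_hv" is left as an input because this
thread does not settle it. The only external fact assumed is that the
first column of /proc/modules is the module name.

```python
from typing import Iterable, List, Tuple


def loaded_modules_from_proc(proc_modules_text: str) -> List[str]:
    """Extract module names (first column) from /proc/modules content."""
    return [line.split()[0]
            for line in proc_modules_text.splitlines()
            if line.strip()]


def check_ppc64_kvm(loaded_modules: Iterable[str],
                    host_supports_hv: bool) -> Tuple[str, str]:
    """Return (level, message) following the policy proposed above.

    `host_supports_hv` is an assumption supplied by the caller.
    """
    if "kvm_pr" not in set(loaded_modules):
        return ("PASS", "kvm_pr is not loaded")
    if host_supports_hv:
        return ("ERROR", "kvm_pr is loaded on a host capable of kvm_hv")
    return ("WARN", "kvm_pr is loaded; it is unmaintained and unsupported")


# Made-up /proc/modules excerpt, for illustration only
sample = "kvm_pr 100000 0 - Live 0x0000000000000000\n"
mods = loaded_modules_from_proc(sample)
print(check_ppc64_kvm(mods, host_supports_hv=True)[0])   # ERROR
print(check_ppc64_kvm(mods, host_supports_hv=False)[0])  # WARN
```

Whether the second case should be a WARN or stay silent is exactly the
open question; the sketch just picks WARN.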
Possibly the capabilities XML should have some way to report
which KVM impl is present. Not sure what this would look
like though.
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1843865
That BZ is reported against RHEL downstream, and from a RHEL POV I
think the better answer is to simply not build the kvm_pr kernel
module in the first place, since it was never considered a supported
KVM impl.
This would be ideal. Perhaps we should just rush into removing kvm_pr from
the build and call it a day. People could keep using kvm_pr with QEMU/Libvirt
by building the module themselves, at their own peril.
Thanks,
DHB
Regards,
Daniel