
On Wed, 2024-02-14 at 12:39 +0100, Jiri Denemark wrote:
On Wed, Feb 14, 2024 at 12:07:46 +0100, Michal Prívozník wrote:
On 2/9/24 11:52, Tim Wiederhake wrote:
The mpx feature was removed from the corresponding qemu cpu models. With mpx in the libvirt cpu models, libvirt believes the feature to be implicitly enabled when creating qemu VMs, while in fact it is disabled.
This became an issue when commit 94eacd5a5f introduced new vmx-* features, some of which depend on mpx (see the "feature_dependencies" table in qemu target/i386/cpu.c), e.g. vmx-exit-clear-bndcfgs and vmx-entry-load-bndcfgs. These features cannot be enabled by qemu without mpx also being enabled, leading to the error message
    error: Failed to create domain from testdomain.xml
    error: operation failed: guest CPU doesn't match specification: missing features: mpx,vmx-exit-clear-bndcfgs, vmx-entry-load-bndcfgs
when trying to create a VM with a "host-model" cpu on a host that does support mpx and the mentioned vmx-* features:
    <domain>
      ...
      <cpu mode='host-model' check='full' />
      ...
    </domain>
Resolve the issue by removing mpx from libvirt's cpu models as well.
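The mismatch described above can be illustrated with a rough sketch. This is a simplified, hypothetical model and not libvirt or qemu code: the function names, the tiny feature sets, and the assumption that qemu silently drops a feature whose dependency is not enabled are all illustrative. Only the two vmx-* dependencies on mpx are taken from the description above.

```python
# Hypothetical, heavily simplified model of the problem. All names and
# feature sets are illustrative; none of this is libvirt or qemu API.

# Subset of the dependencies mentioned above: feature -> feature it requires.
FEATURE_DEPENDENCIES = {
    "vmx-exit-clear-bndcfgs": "mpx",
    "vmx-entry-load-bndcfgs": "mpx",
}


def qemu_effective_features(qemu_model, explicitly_enabled):
    """Features the guest ends up with (assumption: a feature whose
    dependency is not enabled gets dropped)."""
    enabled = set(qemu_model) | set(explicitly_enabled)
    changed = True
    while changed:
        changed = False
        for feature, dependency in FEATURE_DEPENDENCIES.items():
            if feature in enabled and dependency not in enabled:
                enabled.discard(feature)
                changed = True
    return enabled


def missing_features(libvirt_model, qemu_model, host_features):
    """Features libvirt expects in the guest but qemu does not provide."""
    # libvirt only requests explicitly what it believes the named model
    # does not already cover.
    explicit = set(host_features) - set(libvirt_model)
    expected = set(libvirt_model) | explicit
    actual = qemu_effective_features(qemu_model, explicit)
    return sorted(expected - actual)


host = {"sse2", "mpx", "vmx-exit-clear-bndcfgs", "vmx-entry-load-bndcfgs"}
qemu_model = {"sse2"}                 # mpx already removed on the qemu side
libvirt_model_old = {"sse2", "mpx"}   # libvirt model still lists mpx
libvirt_model_new = {"sse2"}          # libvirt model with mpx removed

print(missing_features(libvirt_model_old, qemu_model, host))
print(missing_features(libvirt_model_new, qemu_model, host))
```

In this toy model, the old libvirt model never asks for mpx explicitly, so the dependent vmx-* features are dropped and all three show up as missing. With mpx removed from the libvirt model, mpx is requested explicitly and nothing is missing.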
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Hold on. I was trying to think whether this is safe for migrations, but have to act fast now. Could you please explain in the commit message that nothing breaks during migration, no matter what CPU configuration is used in a domain XML, when migrating between all combinations of libvirt with/without this change? If some configurations would be refused, we need to make sure those are impossible to hit in the real world.
Jirka
My knowledge about migration is limited, hence I am hesitant to make factual claims.

That being said, my understanding is that by requesting e.g. a Skylake-Client cpu 'mpx' was never actually enabled in the VM as qemu's version of the same cpu model did not include that feature. Libvirt would also never explicitly enable it, as it believes it to be already covered by the named cpu model. A migration would therefore happen from a domain with "mpx disabled because of a misunderstanding between libvirt and qemu" to a domain with "mpx disabled because that feature is not part of the cpu model".

Can someone with more detailed knowledge of the specifics chime in?

Thanks,
Tim