On 10/31/23 11:43 AM, Daniel P. Berrangé wrote:
On Sat, Oct 28, 2023 at 09:49:32AM -0500, Jonathon Jongsma wrote:
>
> I'm currently looking at getting libvirt working with AMD's SEV-SNP
> encrypted virtualization technology. I have access to a test machine with an
> AMD EPYC 7713 processor which I can use to launch SNP guests with qemu, but
> only when I specify one of the following versioned -cpu values:
> - EPYC-v4
> - EPYC-Milan-v2
> - EPYC-Rome-v3
>
> From what I understand, the unversioned CPU models in qemu are supposed to
> resolve to a specific versioned CPU model depending on the machine type. But
> I'm not exactly sure how machine type influences it.
There are two aspects: what QEMU is supposed to do and what libvirt was
intended to do, and only the QEMU part really got done.
On the QEMU side, when the user specifies an unversioned CPU model,
QEMU should expand that to a versioned model, based on a definition
tied to the machine type.
This ensures that if the machine type doesn't change, the CPU model
expansion will be guest ABI-stable.
Most non-libvirt users of QEMU use a non-versioned machine type
though so they have no ABI stability guarantee for the machine
type or the CPU model.
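
To make the expansion rule above concrete, here is a rough sketch in
Python. It is illustrative only: the two tables are made up for this
example and are not QEMU's actual data structures, and
resolve_cpu_model() is a hypothetical helper, not a QEMU function.

```python
# Hypothetical tables for illustration only; not QEMU's real data.
# An unversioned machine alias points at whatever the newest versioned
# machine type is in a given QEMU release, so it can change over time.
MACHINE_ALIASES = {"q35": "pc-q35-8.1"}

# Default CPU model version per versioned machine type (assumed values).
DEFAULT_CPU_VERSION = {"pc-q35-8.0": 1, "pc-q35-8.1": 1}


def resolve_cpu_model(machine: str, cpu: str) -> str:
    """Expand an unversioned CPU model name using the machine type default."""
    # A name that already ends in "-v<N>" is versioned: nothing to expand.
    _base, sep, suffix = cpu.rpartition("-v")
    if sep and suffix.isdigit():
        return cpu
    # Resolve an unversioned machine alias to its versioned machine type.
    machine = MACHINE_ALIASES.get(machine, machine)
    return f"{cpu}-v{DEFAULT_CPU_VERSION[machine]}"


print(resolve_cpu_model("pc-q35-8.1", "EPYC"))
print(resolve_cpu_model("q35", "EPYC-Milan"))
print(resolve_cpu_model("q35", "EPYC-v4"))
```

With a versioned machine type the result depends only on fixed table
entries, which is where the ABI stability comes from. With the 'q35'
alias the result can change whenever the alias is repointed.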
Initially all CPUs were mapped to v1, and it was thought that newer
machine types might map to newer CPU versions.
Life is not that simple though, because choice of CPU version affects
runability on any given host :-(
IOW, if QEMU added a new machine type that changed the mapping to
a -v2 CPU model, existing users of the unversioned machine type
'q35' might suddenly find themselves unable to run the guest.
E.g. consider -v2 adds feature 'foo' which depends on a microcode
update, and the user does not have the microcode present.
Thus, in practice I think it is unlikely QEMU will ever do much
with the machine <-> CPU version mappings.
At the libvirt level though, we can do better. Since we record
our expansions in the XML, we don't have to rely on the machine
type mapping for ABI stability of the CPU models.
Libvirt could expand a non-versioned CPU model to any version
it desires, as long as it records that expansion in the XML.
This could mean libvirt can dynamically expand the non-versioned
CPU, taking account of what the host microcode supports.
Libvirt should also allow users to request a versioned CPU
model directly of course.
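
A minimal sketch of what such host-aware expansion might look like,
assuming one possible policy (pick the newest version the host can
run). Everything here is hypothetical: "ExampleCPU", the version table,
the feature names 'foo' and 'bar', and expand_for_host() are invented
for illustration and are not libvirt APIs or real CPU data.

```python
# Hypothetical version table: for each version, the features it needs
# beyond the base model. 'foo' stands in for a feature that depends on
# a microcode update, as in the example earlier in this mail.
CPU_VERSIONS = {
    "ExampleCPU": {
        1: set(),
        2: {"foo"},
        3: {"foo", "bar"},
    },
}


def expand_for_host(model, host_features):
    """Pick the newest version of `model` whose extra features the host has."""
    usable = [
        version
        for version, needed in CPU_VERSIONS[model].items()
        if needed <= host_features  # subset test: host provides all of them
    ]
    if not usable:
        raise ValueError(f"no version of {model} is runnable on this host")
    return f"{model}-v{max(usable)}"


# Host without the microcode update: stays on v1 instead of failing.
print(expand_for_host("ExampleCPU", set()))
# Host with 'foo' available: can be expanded to v2.
print(expand_for_host("ExampleCPU", {"foo"}))
```

The returned name is what would be recorded in the XML, so the guest
ABI stays fixed afterwards even if a later expansion on another host
would have picked a different version.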
So, as a first step, I'd like to work on adding the ability to manually
specify a versioned CPU. As far as I understand, this means generating
an xml definition in src/cpu_map/ for each of the versioned CPUs so that
libvirt knows about them and therefore the user can specify them (i.e.
x86_EPYC-v4.xml).
But while doing that, I discovered that the creation of these xml
definitions is largely undocumented. There is a
'src/cpu_map/sync_qemu_models.py' script which was clearly used to
generate them originally. But when I run it against the current qemu
codebase, it modifies quite a few of the CPU xml files. Most of the
modifications are adding features that (I assume) qemu added to the CPU
model after the initial xml files were generated. For instance, when I
regenerate the AMD EPYC CPU, it adds 'npt' and 'nrip-save' features that
were added in qemu commit 9fe8b7be17eaac4cfde4083000cc96747d7cf4f8.
Other CPUs have more features added. But there are also manual
modifications to these files that get overwritten by the script.
So, the question is: are these intended to be kept up-to-date with qemu?
The script name "sync_*" implies such, but I don't see much evidence
that it is happening.
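
For reference, one way to see which features a regeneration adds or
drops is to diff the feature lists before and after running the script.
The following is only a sketch: the XML snippets are heavily
abbreviated, their layout is my assumption of what the cpu_map files
look like, and model_features() / feature_diff() are hypothetical
helpers, not part of libvirt.

```python
import xml.etree.ElementTree as ET


def model_features(xml_text):
    """Return the set of feature names listed in a cpu_map-style model file."""
    root = ET.fromstring(xml_text)
    return {feature.get("name") for feature in root.iter("feature")}


def feature_diff(old_xml, new_xml):
    """Return (added, removed) feature names between two model definitions."""
    old, new = model_features(old_xml), model_features(new_xml)
    return sorted(new - old), sorted(old - new)


# Abbreviated stand-ins for the file before and after regeneration.
OLD = """<cpus><model name='EPYC'>
  <feature name='abm'/>
  <feature name='adx'/>
</model></cpus>"""

NEW = """<cpus><model name='EPYC'>
  <feature name='abm'/>
  <feature name='adx'/>
  <feature name='npt'/>
  <feature name='nrip-save'/>
</model></cpus>"""

added, removed = feature_diff(OLD, NEW)
print("added:", added)
print("removed:", removed)
```

Anything showing up under "removed" would be a candidate for one of the
manual modifications that the script overwrites.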
Jonathon
> I've got some libvirt patches to launch an SEV-SNP guest working now except
> for the CPU model specification. As far as I can tell, I can currently only
> specify the un-versioned model in libvirt. Is there any way to request a
> particular versioned CPU from qemu? I feel like I'm missing something here.
This is another example of where libvirt could do a better job at
expansion. We ought to "do the right thing" and expand to a version
that is compatible with SNP (somehow). While we should of course have
a way for users to request a specific version, we should not expect
users to care about versions - we must "do the right thing" with SNP
(and TDX in future).
With regards,
Daniel