Versioned CPU types in libvirt

I'm currently looking at getting libvirt working with AMD's SEV-SNP encrypted virtualization technology. I have access to a test machine with an AMD EPYC 7713 processor which I can use to launch SNP guests with qemu, but only when I specify one of the following versioned -cpu values:

- EPYC-v4
- EPYC-Milan-v2
- EPYC-Rome-v3

From what I understand, the unversioned CPU models in qemu are supposed to resolve to a specific versioned CPU model depending on the machine type. But I'm not exactly sure how machine type influences it.

I've got some libvirt patches to launch an SEV-SNP guest working now except for the CPU model specification. As far as I can tell, I can currently only specify the un-versioned model in libvirt. Is there any way to request a particular versioned CPU from qemu? I feel like I'm missing something here.

I should perhaps also mention that I'm running a development version of qemu from Cole's copr repo[1], which could still have some related bugs.

[1] https://copr.fedorainfracloud.org/coprs/g/virtmaint-sig/sev-snp-coconut/

Thanks,
Jonathon

On 10/28/23 10:49 AM, Jonathon Jongsma wrote:
> I'm currently looking at getting libvirt working with AMD's SEV-SNP encrypted virtualization technology. I have access to a test machine with an AMD EPYC 7713 processor which I can use to launch SNP guests with qemu, but only when I specify one of the following versioned -cpu values:
> - EPYC-v4
> - EPYC-Milan-v2
> - EPYC-Rome-v3
>
> From what I understand, the unversioned CPU models in qemu are supposed to resolve to a specific versioned CPU model depending on the machine type. But I'm not exactly sure how machine type influences it.

At the qemu level that's what I thought too, and it sounds like that was the eventual plan, but it's not implemented yet. `qemu -cpu FOO` always maps to `-cpu FOO-v1`. There's some info in the qemu docs here; the last sentence makes it explicit that this may change in future:

https://www.qemu.org/docs/master/about/deprecated.html#runnability-guarantee...

Milan-v2 does add some new CPU features, but the one SNP-related bit those models change is `complex_indexing` in the l3_cache cpuid, whatever that is. It doesn't look overridable on the qemu command line, nor does it look like anything libvirt detects from the host.

> I've got some libvirt patches to launch an SEV-SNP guest working now except for the CPU model specification. As far as I can tell, I can currently only specify the un-versioned model in libvirt. Is there any way to request a particular versioned CPU from qemu? I feel like I'm missing something here.

The reason this fails is that SNP hardware explicitly rejects some guest CPUID values that it deems unsafe. That `complex_indexing` bit is one of them, but `-cpu host` triggers more. TDX seems to have a similar mechanism, but it looks like the qemu code is filtering those out for -cpu host. Maybe qemu SNP can do the same; I'll ask the AMD devs.

But yes, in the meantime a libvirt workaround would help.

Thanks,
Cole
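
To make the alias behaviour described above concrete, here is a minimal Python sketch of how an unversioned name resolves today. It only illustrates the "`FOO` always maps to `FOO-v1`" rule; `resolve_cpu_alias` is a hypothetical helper name and is not part of qemu or libvirt.

```python
import re

# Matches names that already carry an explicit version suffix, e.g. "EPYC-v4".
_VERSIONED = re.compile(r"^(?P<base>.+)-v(?P<version>\d+)$")


def resolve_cpu_alias(name, default_version=1):
    """Return the versioned model name an (un)versioned name resolves to.

    Hypothetical helper: it mirrors the rule described in the thread,
    where an unversioned model is currently an alias for its -v1 variant.
    """
    if _VERSIONED.match(name):
        return name  # already versioned, leave it untouched
    return f"{name}-v{default_version}"


print(resolve_cpu_alias("EPYC-Milan"))     # unversioned: falls back to -v1
print(resolve_cpu_alias("EPYC-Milan-v2"))  # versioned: passed through as-is
```

Under this rule a plain `EPYC-Milan` never reaches the `-v2` definition, which is consistent with SNP guests only starting when the versioned name is given explicitly.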

On Sat, Oct 28, 2023 at 09:49:32AM -0500, Jonathon Jongsma wrote:
> I'm currently looking at getting libvirt working with AMD's SEV-SNP encrypted virtualization technology. I have access to a test machine with an AMD EPYC 7713 processor which I can use to launch SNP guests with qemu, but only when I specify one of the following versioned -cpu values:
> - EPYC-v4
> - EPYC-Milan-v2
> - EPYC-Rome-v3
>
> From what I understand, the unversioned CPU models in qemu are supposed to resolve to a specific versioned CPU model depending on the machine type. But I'm not exactly sure how machine type influences it.

There are two aspects: what QEMU is supposed to do and what libvirt was intended to do, and only the QEMU part really got done.

On the QEMU side, when the user specifies an unversioned CPU model, QEMU should expand that to a versioned model, based on a definition tied to the machine type. This ensures that if the machine type doesn't change, the CPU model expansion will be guest ABI-stable. Most non-libvirt users of QEMU use a non-versioned machine type though, so they have no ABI stability guarantee for the machine type or the CPU model.

Initially all CPUs were mapped to v1, and it was thought that newer machine types might map to newer CPU versions. Life is not that simple though, because the choice of CPU version affects runnability on any given host :-(

IOW, if QEMU added a new machine type that changed the mapping to a -v2 CPU model, existing users of the unversioned machine type 'q35' might suddenly find themselves unable to run the guest. E.g. consider -v2 adds feature 'foo' which depends on a microcode update, and the user does not have the microcode present.

Thus, in practice I think it is unlikely QEMU will ever do much with the machine <-> CPU version mappings.

At the libvirt level though, we can do better. Since we record our expansions in the XML, we don't have to rely on the machine type mapping for ABI stability of the CPU models.

Libvirt could expand a non-versioned CPU model to any version it desires, as long as it records that expansion in the XML. This could mean libvirt can dynamically expand the non-versioned CPU, taking account of what the host microcode supports.

Libvirt should also allow users to request a versioned CPU model directly, of course.

> I've got some libvirt patches to launch an SEV-SNP guest working now except for the CPU model specification. As far as I can tell, I can currently only specify the un-versioned model in libvirt. Is there any way to request a particular versioned CPU from qemu? I feel like I'm missing something here.

This is another example of where libvirt could do a better job at expansion. We ought to "do the right thing" and expand to a version that is compatible with SNP (somehow). While we should of course have a way for users to request a specific version, we should not expect users to care about versions - we must "do the right thing" with SNP (and TDX in future).

With regards,
Daniel
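
The host-aware expansion Daniel describes could look roughly like the following Python sketch. It is only an illustration under stated assumptions: `MODEL_VERSIONS`, `CpuVersion` and `expand_model` are hypothetical names, and the feature sets ('base', plus the 'foo' feature from the example above) are placeholders rather than real CPU definitions. The returned dictionary stands in for recording the expansion in the domain XML.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuVersion:
    name: str
    features: frozenset


# Placeholder data: 'base' and 'foo' are stand-ins, not real feature lists.
MODEL_VERSIONS = {
    "EPYC-Milan": [
        CpuVersion("EPYC-Milan-v1", frozenset({"base"})),
        CpuVersion("EPYC-Milan-v2", frozenset({"base", "foo"})),
    ],
}


def expand_model(requested, host_features):
    """Pick the newest version of `requested` that the host can run."""
    versions = MODEL_VERSIONS.get(requested)
    if versions is None:
        # Already versioned (or unknown): honour the request as given.
        return {"requested": requested, "expanded": requested}
    runnable = [v for v in versions if v.features <= set(host_features)]
    if not runnable:
        raise ValueError(f"no version of {requested} is runnable on this host")
    # Versions are listed oldest first, so the last runnable one is the newest.
    return {"requested": requested, "expanded": runnable[-1].name}


# Host without the microcode-dependent 'foo' feature stays on -v1.
print(expand_model("EPYC-Milan", {"base"}))
# Host that has 'foo' can be expanded to -v2.
print(expand_model("EPYC-Milan", {"base", "foo"}))
```

Because the chosen version is stored alongside the request, a later restart or migration can reuse the recorded name instead of re-running the selection, which is what keeps the guest ABI stable in this scheme.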

On 10/31/23 11:43 AM, Daniel P. Berrangé wrote:
> On Sat, Oct 28, 2023 at 09:49:32AM -0500, Jonathon Jongsma wrote:
>> I'm currently looking at getting libvirt working with AMD's SEV-SNP encrypted virtualization technology. I have access to a test machine with an AMD EPYC 7713 processor which I can use to launch SNP guests with qemu, but only when I specify one of the following versioned -cpu values:
>> - EPYC-v4
>> - EPYC-Milan-v2
>> - EPYC-Rome-v3
>>
>> From what I understand, the unversioned CPU models in qemu are supposed to resolve to a specific versioned CPU model depending on the machine type. But I'm not exactly sure how machine type influences it.
>
> There are two aspects: what QEMU is supposed to do and what libvirt was intended to do, and only the QEMU part really got done.
>
> On the QEMU side, when the user specifies an unversioned CPU model, QEMU should expand that to a versioned model, based on a definition tied to the machine type.
>
> This ensures that if the machine type doesn't change, the CPU model expansion will be guest ABI-stable.
>
> Most non-libvirt users of QEMU use a non-versioned machine type though, so they have no ABI stability guarantee for the machine type or the CPU model.
>
> Initially all CPUs were mapped to v1, and it was thought that newer machine types might map to newer CPU versions.
>
> Life is not that simple though, because the choice of CPU version affects runnability on any given host :-(
>
> IOW, if QEMU added a new machine type that changed the mapping to a -v2 CPU model, existing users of the unversioned machine type 'q35' might suddenly find themselves unable to run the guest.
>
> E.g. consider -v2 adds feature 'foo' which depends on a microcode update, and the user does not have the microcode present.
>
> Thus, in practice I think it is unlikely QEMU will ever do much with the machine <-> CPU version mappings.
>
> At the libvirt level though, we can do better. Since we record our expansions in the XML, we don't have to rely on the machine type mapping for ABI stability of the CPU models.
>
> Libvirt could expand a non-versioned CPU model to any version it desires, as long as it records that expansion in the XML. This could mean libvirt can dynamically expand the non-versioned CPU, taking account of what the host microcode supports.
>
> Libvirt should also allow users to request a versioned CPU model directly, of course.

So, as a first step, I'd like to work on adding the ability to manually specify a versioned CPU. As far as I understand, this means generating an xml definition in src/cpu_map/ for each of the versioned CPUs so that libvirt knows about them and therefore the user can specify them (e.g. x86_EPYC-v4.xml).

But while doing that, I discovered that the creation of these xml definitions is largely undocumented. There is a 'src/cpu_map/sync_qemu_models.py' script which was clearly used to generate them originally. But when I run it against the current qemu codebase, it modifies quite a few of the CPU xml files. Most of the modifications are adding features that (I assume) qemu added to the CPU model after the initial xml files were generated. For instance, when I regenerate the AMD EPYC CPU, it adds 'npt' and 'nrip-save' features that were added in qemu commit 9fe8b7be17eaac4cfde4083000cc96747d7cf4f8. Other CPUs have more features added. But there are also manual modifications to these files that get overwritten by the script.

So, the question is: are these intended to be kept up-to-date with qemu? The script name "sync_*" implies such, but I don't see much evidence that it is happening.

Jonathon

On Tue, Oct 31, 2023 at 18:13:41 -0500, Jonathon Jongsma wrote:
> But while doing that, I discovered that the creation of these xml definitions is largely undocumented. There is a 'src/cpu_map/sync_qemu_models.py' script which was clearly used to generate them originally.

Not really; most of the models were manually translated from the QEMU source code, and the script was created later to help with identifying differences, new features, and new CPU models.

> But when I run it against the current qemu codebase, it modifies quite a few of the CPU xml files. Most of the modifications are adding features that (I assume) qemu added to the CPU model after the initial xml files were generated.

Right, sometimes QEMU added new features to existing CPU models and sometimes existing features were removed. I believe they don't do so anymore and introduce a new CPU model version instead.

> So, the question is: are these intended to be kept up-to-date with qemu? The script name "sync_*" implies such, but I don't see much evidence that it is happening.

No, the CPU models in libvirt are not supposed to be synchronized with QEMU. The script does synchronize them, but the result needs careful review and manual processing so that we don't submit changes to existing CPU models, which would break migration between different versions of libvirt. You can use the script to quickly see what changed in QEMU on existing models though.

Jirka
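
The review step Jirka mentions can be approximated without overwriting anything. The Python sketch below compares the feature names in an existing model definition with those in a regenerated one and reports what was added or removed. It assumes the `<model>`/`<feature name='...'/>` layout of the files in src/cpu_map/; the two inline documents are trimmed, illustrative samples (only 'npt' and 'nrip-save' come from the thread), and `diff_features` is a hypothetical helper, not part of sync_qemu_models.py.

```python
import xml.etree.ElementTree as ET


def model_features(xml_text):
    """Collect the feature names listed in a CPU model definition."""
    root = ET.fromstring(xml_text)
    return {feature.get("name") for feature in root.iter("feature")}


def diff_features(current_xml, regenerated_xml):
    """Return (added, removed) feature names, each sorted alphabetically."""
    current = model_features(current_xml)
    regenerated = model_features(regenerated_xml)
    return sorted(regenerated - current), sorted(current - regenerated)


# Trimmed, illustrative samples; real definitions list many more features.
CURRENT = """
<cpus>
  <model name='EPYC'>
    <feature name='abm'/>
    <feature name='svm'/>
  </model>
</cpus>
"""

REGENERATED = """
<cpus>
  <model name='EPYC'>
    <feature name='abm'/>
    <feature name='svm'/>
    <feature name='npt'/>
    <feature name='nrip-save'/>
  </model>
</cpus>
"""

added, removed = diff_features(CURRENT, REGENERATED)
print("added:", added)
print("removed:", removed)
```

A non-empty result for an existing model is a hint that the change belongs in a new model version rather than in the existing file, given the migration concern above.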

Hi Jonathon,

I too on occasion poke at SEV-SNP support in libvirt. I've now pushed the dusty, hacky branch to my public fork

https://gitlab.com/jfehlig/libvirt/-/tree/sev-snp?ref_type=heads

Looking at the git log, it seems I fiddle with it every 2 months or so. It's been that long since I last worked on the task, so I'm due to give it some cycles next week :-).

On 10/28/23 08:49, Jonathon Jongsma wrote:
> I'm currently looking at getting libvirt working with AMD's SEV-SNP encrypted virtualization technology. I have access to a test machine with an AMD EPYC 7713 processor which I can use to launch SNP guests with qemu, but only when I specify one of the following versioned -cpu values:
> - EPYC-v4
> - EPYC-Milan-v2
> - EPYC-Rome-v3
>
> From what I understand, the unversioned CPU models in qemu are supposed to resolve to a specific versioned CPU model depending on the machine type. But I'm not exactly sure how machine type influences it.
>
> I've got some libvirt patches to launch an SEV-SNP guest working now except for the CPU model specification. As far as I can tell, I can currently only specify the un-versioned model in libvirt. Is there any way to request a particular versioned CPU from qemu? I feel like I'm missing something here.
>
> I should perhaps also mention that I'm running a development version of qemu from Cole's copr repo[1], which could still have some related bugs
>
> [1] https://copr.fedorainfracloud.org/coprs/g/virtmaint-sig/sev-snp-coconut/

Do you know the corresponding git branches the host packages are built from? One of the biggest challenges I face is keeping track of the latest repos/branches to use for all the snp and tdx work. I _think_ this is still the oracle for the snp patches

https://github.com/AMDESE/AMDSEV/blob/snp-latest/stable-commits

The last time I tried starting an SNP guest using libvirt with my hacks, qemu crashed when libvirt probed sev_get_capabilities. Do you have any libvirt patches for early review/testing? I also have access to an SNP-capable test machine.

Regards,
Jim

On 11/3/23 15:19, Jim Fehlig wrote:
> Hi Jonathon,
>
> I too on occasion poke at SEV-SNP support in libvirt. I've now pushed the dusty, hacky branch to my public fork
>
> https://gitlab.com/jfehlig/libvirt/-/tree/sev-snp?ref_type=heads
>
> Looking at the git log, it seems I fiddle with it every 2 months or so. It's been that long since I last worked on the task, so I'm due to give it some cycles next week :-).
>
> On 10/28/23 08:49, Jonathon Jongsma wrote:
>> I'm currently looking at getting libvirt working with AMD's SEV-SNP encrypted virtualization technology. I have access to a test machine with an AMD EPYC 7713 processor which I can use to launch SNP guests with qemu, but only when I specify one of the following versioned -cpu values:
>> - EPYC-v4
>> - EPYC-Milan-v2
>> - EPYC-Rome-v3
>>
>> From what I understand, the unversioned CPU models in qemu are supposed to resolve to a specific versioned CPU model depending on the machine type. But I'm not exactly sure how machine type influences it.
>>
>> I've got some libvirt patches to launch an SEV-SNP guest working now except for the CPU model specification. As far as I can tell, I can currently only specify the un-versioned model in libvirt. Is there any way to request a particular versioned CPU from qemu? I feel like I'm missing something here.
>>
>> I should perhaps also mention that I'm running a development version of qemu from Cole's copr repo[1], which could still have some related bugs
>>
>> [1] https://copr.fedorainfracloud.org/coprs/g/virtmaint-sig/sev-snp-coconut/
>
> Do you know the corresponding git branches the host packages are built from? One of the biggest challenges I face is keeping track of the latest repos/branches to use for all the snp and tdx work. I _think_ this is still the oracle for the snp patches
>
> https://github.com/AMDESE/AMDSEV/blob/snp-latest/stable-commits

Yes, seems to be the case. I noticed the head commit in the qemu branch is

commit bbc1bfb6bfb3cde4c22755cedd5b71e651ca35e8
Author: Cole Robinson <crobinso@redhat.com>
Date: Fri Oct 13 13:53:26 2023 -0400

*sev: fix query-sev-capabilities as called by libvirt

> The last time I tried starting a SNP guest using libvirt with my hacks, qemu crashed when libvirt probed sev_get_capabilities.

Which fixed this issue :-). With the latest kernel, qemu, and ovmf referenced in stable-commits, qemu fails when attempting to start an SNP guest via libvirt with the latest patches in my git branch

# virsh start snp-test
error: Failed to start domain 'snp-test'
error: internal error: QEMU unexpectedly closed the monitor (vm='snp-test'): 2023-11-07T00:01:06.888232Z qemu-system-x86_64: warning: creating ROM device with private memory.
2023-11-07T00:01:06.888516Z qemu-system-x86_64: warning: kvm_create_gmemfd: created memfd: 36, size: 37c000, flags: 0
2023-11-07T00:01:06.891968Z qemu-system-x86_64: warning: kvm_create_gmemfd: created memfd: 38, size: 20000, flags: 0
2023-11-07T00:01:06.892215Z qemu-system-x86_64: warning: creating ROM device with private memory.
2023-11-07T00:01:06.892256Z qemu-system-x86_64: warning: kvm_create_gmemfd: created memfd: 40, size: 84000, flags: 0
2023-11-07T00:01:06.892949Z qemu-system-x86_64: warning: kvm_create_gmemfd: created memfd: 42, size: 20000, flags: 0
2023-11-07T00:01:06.928511Z qemu-system-x86_64: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION2 failed, slot=65536, start=0x0, size=0x80000000, flags=0x0, gmem_fd=0, gmem_offset=0x0: Invalid argument (22)
kvm_set_phys_mem: error registering slot: Invalid argument

The memory backend is not being configured properly. From the qemu log file

-machine pc-q35-7.1,usb=off,dump-guest-core=off,memory-backend=pc.ram,kvm-type=protected,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format,hpet=off,acpi=on \
-object '{"qom-type":"memory-backend-file","id":"pc.ram","mem-path":"/var/lib/libvirt/qemu/ram/1-tw-snp-test/pc.ram","share":false,"x-use-canonical-path-for-ramblock-id":false,"size":8589934592}' \

I'll continue poking on Wednesday. I've dedicated Tuesday to working on hardware in our lab.

Regards,
Jim
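
For anyone reading the log excerpt above, a small Python sketch can pull the relevant fields out of the `-object` argument and convert the sizes to GiB. The JSON string and the slot size are copied from the log; `summarize_backend` is a hypothetical helper, and the sketch does not claim to identify which setting the SNP patches actually require.

```python
import json

# The -object argument exactly as it appears in the qemu log above.
OBJECT_ARG = (
    '{"qom-type":"memory-backend-file","id":"pc.ram",'
    '"mem-path":"/var/lib/libvirt/qemu/ram/1-tw-snp-test/pc.ram",'
    '"share":false,"x-use-canonical-path-for-ramblock-id":false,'
    '"size":8589934592}'
)

# Size of the slot that KVM_SET_USER_MEMORY_REGION2 rejected, from the log.
FAILED_SLOT_SIZE = 0x80000000


def summarize_backend(arg):
    """Extract the backend type, id, sharing mode and size in GiB."""
    obj = json.loads(arg)
    return {
        "type": obj["qom-type"],
        "id": obj["id"],
        "share": obj["share"],
        "size_gib": obj["size"] / 2**30,
    }


summary = summarize_backend(OBJECT_ARG)
print(summary)
print("failed slot size in GiB:", FAILED_SLOT_SIZE / 2**30)
```

This shows libvirt configured an unshared, file-backed 8 GiB backend named pc.ram, while the failing registration was for a 2 GiB slot starting at address 0.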

On 11/6/23 6:25 PM, Jim Fehlig wrote:
> On 11/3/23 15:19, Jim Fehlig wrote:
>> Hi Jonathon,
>>
>> I too on occasion poke at SEV-SNP support in libvirt. I've now pushed the dusty, hacky branch to my public fork
>>
>> https://gitlab.com/jfehlig/libvirt/-/tree/sev-snp?ref_type=heads
>>
>> Looking at the git log, it seems I fiddle with it every 2 months or so. It's been that long since I last worked on the task, so I'm due to give it some cycles next week :-).
>>
>> On 10/28/23 08:49, Jonathon Jongsma wrote:
>>> I'm currently looking at getting libvirt working with AMD's SEV-SNP encrypted virtualization technology. I have access to a test machine with an AMD EPYC 7713 processor which I can use to launch SNP guests with qemu, but only when I specify one of the following versioned -cpu values:
>>> - EPYC-v4
>>> - EPYC-Milan-v2
>>> - EPYC-Rome-v3
>>>
>>> From what I understand, the unversioned CPU models in qemu are supposed to resolve to a specific versioned CPU model depending on the machine type. But I'm not exactly sure how machine type influences it.
>>>
>>> I've got some libvirt patches to launch an SEV-SNP guest working now except for the CPU model specification. As far as I can tell, I can currently only specify the un-versioned model in libvirt. Is there any way to request a particular versioned CPU from qemu? I feel like I'm missing something here.
>>>
>>> I should perhaps also mention that I'm running a development version of qemu from Cole's copr repo[1], which could still have some related bugs
>>>
>>> [1] https://copr.fedorainfracloud.org/coprs/g/virtmaint-sig/sev-snp-coconut/
>>
>> Do you know the corresponding git branches the host packages are built from? One of the biggest challenges I face is keeping track of the latest repos/branches to use for all the snp and tdx work. I _think_ this is still the oracle for the snp patches
>>
>> https://github.com/AMDESE/AMDSEV/blob/snp-latest/stable-commits
>
> Yes, seems to be the case. I noticed the head commit in the qemu branch is
>
> commit bbc1bfb6bfb3cde4c22755cedd5b71e651ca35e8
> Author: Cole Robinson <crobinso@redhat.com>
> Date: Fri Oct 13 13:53:26 2023 -0400
>
> *sev: fix query-sev-capabilities as called by libvirt

Cole would know for sure, but it's possible that the packages from this repository include a couple of different cherry-picked patch series.

>> The last time I tried starting a SNP guest using libvirt with my hacks, qemu crashed when libvirt probed sev_get_capabilities.
>
> Which fixed this issue :-). With the latest kernel, qemu, and ovmf referenced in stable-commits, qemu fails when attempting to start a SNP guest via libvirt with latest patches in my git branch
>
> # virsh start snp-test
> error: Failed to start domain 'snp-test'
> error: internal error: QEMU unexpectedly closed the monitor (vm='snp-test'): 2023-11-07T00:01:06.888232Z qemu-system-x86_64: warning: creating ROM device with private memory.
> 2023-11-07T00:01:06.888516Z qemu-system-x86_64: warning: kvm_create_gmemfd: created memfd: 36, size: 37c000, flags: 0
> 2023-11-07T00:01:06.891968Z qemu-system-x86_64: warning: kvm_create_gmemfd: created memfd: 38, size: 20000, flags: 0
> 2023-11-07T00:01:06.892215Z qemu-system-x86_64: warning: creating ROM device with private memory.
> 2023-11-07T00:01:06.892256Z qemu-system-x86_64: warning: kvm_create_gmemfd: created memfd: 40, size: 84000, flags: 0
> 2023-11-07T00:01:06.892949Z qemu-system-x86_64: warning: kvm_create_gmemfd: created memfd: 42, size: 20000, flags: 0
> 2023-11-07T00:01:06.928511Z qemu-system-x86_64: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION2 failed, slot=65536, start=0x0, size=0x80000000, flags=0x0, gmem_fd=0, gmem_offset=0x0: Invalid argument (22)
> kvm_set_phys_mem: error registering slot: Invalid argument
>
> The memory backend is not being configured properly. From the qemu log file
>
> -machine pc-q35-7.1,usb=off,dump-guest-core=off,memory-backend=pc.ram,kvm-type=protected,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format,hpet=off,acpi=on \
> -object '{"qom-type":"memory-backend-file","id":"pc.ram","mem-path":"/var/lib/libvirt/qemu/ram/1-tw-snp-test/pc.ram","share":false,"x-use-canonical-path-for-ramblock-id":false,"size":8589934592}' \
>
> I'll continue poking on Wednesday. I've dedicated Tuesday to working on hardware in our lab.
>
> Regards,
> Jim

In case it helps, I've just pushed my rough libvirt branch to my gitlab account as well. Be warned that some of it is still pretty rough and it may be rewritten frequently. In fact, I just rebased it on top of my cpu version patch series before I pushed, so I can't even guarantee that it is currently in a working state. But for what it's worth, feel free to look around and ping me if you have questions:

https://gitlab.com/jjongsma/libvirt/-/tree/sev-snp?ref_type=heads

Cheers,
Jonathon

Participants (5):
- Cole Robinson
- Daniel P. Berrangé
- Jim Fehlig
- Jiri Denemark
- Jonathon Jongsma