Re: [PATCH 00/32] Add support for versioned CPU models

Wednesday, 20 November 2024

On Tue, Nov 19, 2024 at 07:49:36PM +0100, Jiri Denemark wrote:
...
 When parsing a domain XML which uses a non-versioned CPU model we want
 to replace it with the appropriate version variant similarly to what we
 do with machine types. Theoretically QEMU supports per machine type
 specification of a version with which a non-versioned CPU model is
 replaced, but this is always 1 for all machine types and the
 query-machines QMP command does not even report the value.

 Luckily after talking to Igor, having a single number per machine type
 does not really allow for setting it to anything but 1 as CPU models
 have different number of versions. Each machine type would need to
 define a specific version for each CPU model, which would be a
 maintenance nightmare. For this reason there's no desire to ever resolve
 non-versioned CPU models to anything but v1 in QEMU and the per machine
 type setting will most likely even be removed completely. Thus it is
 safe for us to always use v1 as the canonical CPU model.

 Some non-versioned CPU models, however, are actually aliases to specific
 versions of a base model rather than being base models themselves. These
 are the old CPU model variants before model versions were introduced,
 e.g., -noTSX, -IBRS, etc. The mapping of these names to versions is
 hardcoded and will never change. We do not translate such CPU models to
 the corresponding versioned names. This allows us to introduce the
 corresponding -v* variants that match the QEMU models rather than the
 existing definitions in our CPU map. The guest CPU will be the same
 either way, but the way libvirt checks the CPU model compatibility with
 the host will be different. The old "partial" check done by libvirt
 using the definition from CPU map will still be used for the old names
 (we can't change this for compatibility reasons), but the corresponding
 versioned variants (as well as all other versions that do not have a
 non-versioned alias) will benefit from the recently introduced new
 "partial" check which uses only the information we get from QEMU to
 check whether a specific CPU definition is usable on the host.

 Other options I considered were:
 - replace -noTSX, -IBRS, ... models with their versioned variants
     - we'd need to translate them back for migration (just what we do
       for -v1) for backward compatibility
     - I found the benefit of new partial checking when explicitly using
       the versioned variants quite appealing and dropped the relevant
       changes in progress

 - do not translate anything, i.e., not even base models to -v1
     - the idea behind translating was to make sure QEMU suddenly doesn't
       start translating the base CPU model to a different version (this
       does not happen with -noTSX etc. as they are hardcoded aliases);
       Igor said they will never do that so is this still valid?
     - not translating would bring the same benefit of explicitly using
       -v1 vs non-versioned name

 I guess the current mix does not look very consistent (i.e., it's not
 either all or nothing), but it makes sense to me. The question is
 whether it also makes sense to others :-)

Yeah, the inconsistency pokes at my brain.

As a slight diversion first, let me point to domcapabilities output

$ virsh domcapabilities --xpath '//model' | grep Skylake-Client
<model usable="no" vendor="Intel">Skylake-Client</model>
<model usable="no" vendor="Intel">Skylake-Client-IBRS</model>
<model usable="no" vendor="Intel">Skylake-Client-noTSX-IBRS</model>
<model usable="no" vendor="Intel">Skylake-Client-v1</model>
<model usable="no" vendor="Intel">Skylake-Client-v2</model>
<model usable="no" vendor="Intel">Skylake-Client-v3</model>
<model usable="no" vendor="Intel">Skylake-Client-v4</model>

I'm not a fan of duplicating the CPU models here.

By comparison, for machine types we avoid the duplication by
explicitly telling the mgmt app what the aliases are:

  <machine canonical="pc-q35-8.2" maxCpus="1024">q35</machine>
  <machine maxCpus="1024">pc-q35-8.1</machine>
  <machine maxCpus="1024">pc-q35-8.2</machine>
  <machine maxCpus="255">pc-q35-2.4</machine>
  <machine maxCpus="255">pc-q35-2.5</machine>

I think we should be exposing to mgmt apps that some CPU model names are
merely an alias of another model.
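
To illustrate how a mgmt app can consume that information, here is a
minimal Python sketch that builds an alias map from <machine> elements
like the ones above. The <arch> wrapper element and the function name
machine_aliases() are assumptions made for the example, not anything
taken from libvirt itself.

```python
import xml.etree.ElementTree as ET

# Sample modelled on the <machine> elements shown above, wrapped in an
# assumed <arch> root so that the snippet parses as a single document.
SAMPLE = """
<arch>
  <machine canonical="pc-q35-8.2" maxCpus="1024">q35</machine>
  <machine maxCpus="1024">pc-q35-8.1</machine>
  <machine maxCpus="1024">pc-q35-8.2</machine>
  <machine maxCpus="255">pc-q35-2.4</machine>
  <machine maxCpus="255">pc-q35-2.5</machine>
</arch>
"""


def machine_aliases(xml_text):
    """Return {alias: canonical} for each <machine> carrying canonical=."""
    root = ET.fromstring(xml_text)
    aliases = {}
    for machine in root.iter("machine"):
        canonical = machine.get("canonical")
        if canonical is not None:
            # The element text is the alias; the attribute names the target.
            aliases[machine.text.strip()] = canonical
    return aliases


if __name__ == "__main__":
    print(machine_aliases(SAMPLE))
```

Only "q35" carries a canonical attribute in this sample, so it is the
only entry in the resulting map; every other machine type is its own
canonical name.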

This brings up the question of what we call the "canonical" name. Is
"Skylake-Client" canonical, or is "Skylake-Client-v1" canonical ?

ie do we report

$ virsh domcapabilities --xpath '//model' | grep Skylake-Client
<model usable="no" vendor="Intel" canonical="Skylake-Client-v1">Skylake-Client</model>
<model usable="no" vendor="Intel" canonical="Skylake-Client-v2">Skylake-Client-IBRS</model>
<model usable="no" vendor="Intel" canonical="Skylake-Client-v3">Skylake-Client-noTSX-IBRS</model>
<model usable="no" vendor="Intel">Skylake-Client-v4</model>

or

$ virsh domcapabilities --xpath '//model' | grep Skylake-Client
<model usable="no" vendor="Intel" canonical="Skylake-Client">Skylake-Client-v1</model>
<model usable="no" vendor="Intel" canonical="Skylake-Client-IBRS">Skylake-Client-v2</model>
<model usable="no" vendor="Intel" canonical="Skylake-Client-noTSX-IBRS">Skylake-Client-v3</model>
<model usable="no" vendor="Intel">Skylake-Client-v4</model>

In the case of machine types, libvirt doesn't decide - we honour
whatever QEMU tells us is the "canonical" name. Does QEMU tell us
this for CPU models ?

Anyway back to your question of translation consistency.

I think domain XML should never contain an aliased name, it should always
get expanded to the canonical name, as described by the domcapabilities
XML.
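
A rough Python sketch of that expansion, purely to make the idea
concrete. It assumes the proposed canonical attribute on <model> exists
(it does not today) and picks the first of the two layouts above, where
the versioned name is canonical; which layout is right is still the
open question. The function name expand_cpu_model() and both XML
fragments are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical domcapabilities fragment: canonical= on <model> is only a
# proposal, shown here in the "versioned name is canonical" layout.
DOMCAPS = """
<mode name="custom" supported="yes">
  <model usable="no" vendor="Intel" canonical="Skylake-Client-v1">Skylake-Client</model>
  <model usable="no" vendor="Intel" canonical="Skylake-Client-v2">Skylake-Client-IBRS</model>
  <model usable="no" vendor="Intel" canonical="Skylake-Client-v3">Skylake-Client-noTSX-IBRS</model>
  <model usable="no" vendor="Intel">Skylake-Client-v4</model>
</mode>
"""

# Illustrative domain XML fragment using an aliased CPU model name.
DOMAIN = """
<domain type="kvm">
  <cpu mode="custom" match="exact">
    <model fallback="forbid">Skylake-Client-IBRS</model>
  </cpu>
</domain>
"""


def expand_cpu_model(domain_xml, domcaps_xml):
    """Replace an aliased CPU model in domain XML with its canonical name."""
    aliases = {
        m.text.strip(): m.get("canonical")
        for m in ET.fromstring(domcaps_xml).iter("model")
        if m.get("canonical") is not None
    }
    domain = ET.fromstring(domain_xml)
    model = domain.find("./cpu/model")
    if model is not None and model.text:
        name = model.text.strip()
        # Names without an alias entry are already canonical; keep them.
        model.text = aliases.get(name, name)
    return ET.tostring(domain, encoding="unicode")


if __name__ == "__main__":
    print(expand_cpu_model(DOMAIN, DOMCAPS))
```

With the other layout the map would simply point the opposite way; the
expansion step itself stays the same.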

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
