Re: [PATCH v3 01/12] cpu_map: update script to handle versioned CPUs

Wednesday, 21 February 2024

On 2/20/24 6:08 PM, Jim Fehlig wrote:
...
 On 12/15/23 15:11, Jonathon Jongsma wrote:
> Previously, the script only generated the parent CPU and any versions
> that had a defined alias. The script now generates all CPU versions. Any
> version that had a defined alias will continue to use that alias, but
> those without aliases will use the generated name $BASECPUNAME-vN.
>
> The reason for this change is two-fold. First, we need to add new models
> that support new features (such as SEV-SNP). To deal with this, the
> script now generates model definitions for all versions.
>
> But we also need to ensure that our CPU definitions are migration-safe.
> To deal with this issue we need to make sure we're always using the
> canonical versioned names for CPUs.

 Related to migration safety, do we need to be concerned with the 
 expansion of 'host-model' CPU? E.g. is it possible 'host-model' expands 
 to EPYC before introducing the new models, and EPYC-v4 afterwards? If 
 so, what are the ramifications of that? 
Indeed, that behavior is possible. In fact, you can see it happening in 
the test results for e.g. patch #5 ("Add versioned EPYC CPUs"). But I 
think that in general we should be fine, because we don't just expand to 
a plain model name, we expand to a model name plus a list of enabled or 
disabled feature flags.

For example, consider one test output file 
(tests/domaincapsdata/qemu_8.1.0-q35.x86_64.xml) that exhibits this 
behavior change. Before my patches the host CPU expanded to a model of 
'EPYC-ROME', with a number of feature flags which included (among others):

       <feature policy='require' name='amd-ssbd'/>
       <feature policy='disable' name='xsaves'/>

After my patches, the host model was expanded to 'EPYC-Rome-v4', with a 
slightly different set of feature flags. The main differences being that 
the 'amd-ssbd' and 'xsaves' features were no longer mentioned because 
they are now included in the -v4 definition. But it also adds a new 
feature flag to disable the 'ibrs' feature since this feature is 
included in the -v4 definition but not in the -v1 definition and is not 
provided by the host CPU:

       <feature policy='disable' name='ibrs'/>

So although the model name is changed, the definition of the host CPU as 
a whole should expand to the same set of CPU features.

I think that this sort of thing can happen every time a new CPU model is 
added (e.g. the -noTSX-IBRS variants), so I think it should work. But if 
anyone knows of any specific migration issues, please let me know.

...
> Qemu documentation states that unversioned names for CPU models
(e.g.
> 'EPYC') are actually aliases to a specific versioned CPU model (e.g.
> 'EPYC-v1'). The documentation further states that the specific version
> targeted by the alias may change based on the machine type of the
> domain. Management software such as libvirt is directed to translate
> these aliases to a concrete version in order to make sure that the CPU
> definition is safe when migrating between different qemu versions that
> may make different choices for which underlying versioned model
> represents the alias.

 Do you have a link to the qemu documentation? Can it be added to the 
 commit message? 
Yes, I could certainly add a link to the commit message. It is mentioned 
here: 
https://www.qemu.org/docs/master/about/deprecated.html#runnability-guaran...

...

> In practice, at the time of writing qemu always maps the unversioned
> aliases to the -v1 model. And libvirt's CPU model definitions also
> assume that this is the case. For example, the 'x86_EPYC.xml' file
> contains the features that are defined for the EPYC-v1 inside of qemu.
> But if qemu ever changes their alias mapping, libvirt's idea of what an
> 'EPYC' CPU means and qemu's idea of what an 'EPYC' CPU means will
no
> longer match. So when choosing a CPU model for a domain, we should
> always pass the canonical versioned name to libvirt rather than the
> unversioned alias. To enable this, the script will generate a new
> 'canonical_name' field to the CPU model xml definition.
>
> Signed-off-by: Jonathon Jongsma <jjongsma(a)redhat.com&gt;
> ---
>   src/cpu_map/sync_qemu_models_i386.py | 42 ++++++++++++++++++++++------
>   1 file changed, 34 insertions(+), 8 deletions(-)

 The logic changes to the script LGTM.

 Reviewed-by: Jim Fehlig <jfehlig(a)suse.com&gt;

 Regards,
 Jim

>
> diff --git a/src/cpu_map/sync_qemu_models_i386.py 
> b/src/cpu_map/sync_qemu_models_i386.py
> index 1c6a2d4d27..7fd62eba4a 100755
> --- a/src/cpu_map/sync_qemu_models_i386.py
> +++ b/src/cpu_map/sync_qemu_models_i386.py
> @@ -322,31 +322,55 @@ def expand_model(model):
>       different fields and may have differing versions into several 
> libvirt-
>       friendly cpu models."""
> -    result = {
> -        "name": model.pop(".name"),
> +    basename = model.pop(".name")
> +    parent = {
> +        "name": basename,
>           "vendor": translate_vendor(model.pop(".vendor")),
>           "features": set(),
>           "extra": dict()}
>       if ".family" in model and ".model" in model:
> -        result["family"] = model.pop(".family")
> -        result["model"] = model.pop(".model")
> +        parent["family"] = model.pop(".family")
> +        parent["model"] = model.pop(".model")
>       for k in [k for k in model if k.startswith(".features")]:
>           v = model.pop(k)
>           for feature in v.split():
>               translated = translate_feature(feature)
>               if translated:
> -                result["features"].add(translated)
> +                parent["features"].add(translated)
>       versions = model.pop(".versions", [])
>       for k, v in model.items():
> -        result["extra"]["model" + k] = v
> -    yield result
> +        parent["extra"]["model" + k] = v
> +
> +    if not versions:
> +        yield parent
> +        return
> +
> +    result = parent
>       for version in versions:
> +        # each version builds on the previous one
>           result = copy.deepcopy(result)
> -        result["name"] = version.pop(".alias",
result["name"])
> +        vnum = int(version.pop(".version"))
> +        vname = "{}-v{}".format(basename, vnum)
> +        result["canonical_name"] = vname
> +        if vnum == 1:
> +            # the first version should always be an alias for the 
> parent and
> +            # should therefore have no extra properties
> +            if version.items():
> +                raise RuntimeError("Unexpected properties in version 1")
> +            yield result
> +            continue
> +
> +        # prefer the 'alias' over the generated the name if it exists 
> since we
> +        # have already been using these aliases
> +        alias = version.pop(".alias", None)
> +        if alias:
> +            result["name"] = alias
> +        else:
> +            result["name"] = vname
>           props = version.pop(".props", dict())
>           for k, v in props:
> @@ -377,6 +401,8 @@ def output_model(f, model):
>       f.write("<cpus>\n")
>       f.write("  <model
name='{}'>\n".format(model["name"]))
> +    if "canonical_name" in model and model["name"] != 
> model["canonical_name"]:
> +        f.write("    
>
<canonical_name>{}</canonical_name>\n".format(model["canonical_name"]))
>       f.write("    <decode host='on'
guest='on'/>\n")
>       f.write("    <signature family='{}'
model='{}'/>\n".format(
>           model["family"], model["model"]))

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

Re: [PATCH v3 01/12] cpu_map: update script to handle versioned CPUs