On Fri, Mar 01, 2024 at 10:36:12AM -0600, Jonathon Jongsma wrote:
On 3/1/24 10:13 AM, Daniel P. Berrangé wrote:
> On Tue, Feb 20, 2024 at 05:08:02PM -0700, Jim Fehlig wrote:
> > On 12/15/23 15:11, Jonathon Jongsma wrote:
> > > Previously, the script only generated the parent CPU and any versions
> > > that had a defined alias. The script now generates all CPU versions. Any
> > > version that had a defined alias will continue to use that alias, but
> > > those without aliases will use the generated name $BASECPUNAME-vN.
> > >
> > > The reason for this change is two-fold. First, we need to add new models
> > > that support new features (such as SEV-SNP). To deal with this, the
> > > script now generates model definitions for all versions.
> > >
> > > But we also need to ensure that our CPU definitions are migration-safe.
> > > To deal with this issue we need to make sure we're always using the
> > > canonical versioned names for CPUs.
> >
> > Related to migration safety, do we need to be concerned with the expansion
> > of 'host-model' CPU? E.g. is it possible 'host-model' expands to EPYC before
> > introducing the new models, and EPYC-v4 afterwards? If so, what are the
> > ramifications of that?
>
> Yes, I see that happening on my laptop in domcapabilities:
>
> Currently libvirt reports:
>
> <mode name='host-model' supported='yes'>
> <model fallback='forbid'>Snowridge</model>
> <vendor>Intel</vendor>
> <maxphysaddr mode='passthrough' limit='46'/>
> <feature policy='require' name='ss'/>
> <feature policy='require' name='vmx'/>
> ...snip...
>
>
> and after this series it reports:
>
> <mode name='host-model' supported='yes'>
> <model fallback='forbid'>Snowridge-v4</model>
> <vendor>Intel</vendor>
> <maxphysaddr mode='passthrough' limit='46'/>
> <feature policy='require' name='ss'/>
> <feature policy='require' name='vmx'/>
> ...snip...
>
>
> That's not wrong per se, because Snowridge-v4 has a smaller
> delta against my host CPU.
>
> The problem is that libvirt updates the *live* XML for the
> guest with this expansion. IIUC, if we now attempt to
> live migrate to a compatible machine running older libvirt
> the migrate will fail as old libvirt doesn't know the -v4
> CPU.
>
> I'm not sure how to address this?

But don't we have this issue any time we add a new CPU model to libvirt?
Anytime there's a new model, it has the potential to be a closer match to
the host CPU than an existing model definition was. As I mentioned in my
previous reply, when e.g. the -noTSX CPU variants were added, didn't the
same sort of thing (potentially) happen? Or am I doing something
meaningfully different in this patch set than what happens in those
scenarios?

I think it probably /did/ happen, but that doesn't make it acceptable.
The noTSX stuff was the cause of massive amounts of compatibility pain
for mgmt apps, so the incompatibility in libvirt might have been glossed
over. We're adding a lot of new versions here, so we are possibly increasing
the visibility/impact of this libvirt change.
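
To make the failure mode concrete, here is a rough sketch (not libvirt
code) of the check that effectively fails on the destination. It pulls
the host-model CPU name out of domcapabilities-style XML and tests it
against the set of model names an older libvirt would know. The helper
names and the OLD_LIBVIRT_MODELS set are assumptions made up for
illustration, and the XML is trimmed to the elements quoted above.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Set

# Hypothetical: model names known to an older libvirt on the destination.
OLD_LIBVIRT_MODELS: Set[str] = {"Snowridge"}

# Trimmed domcapabilities-style fragments, based on the output quoted above.
OLD_DOMCAPS = """<domainCapabilities><cpu>
<mode name='host-model' supported='yes'>
<model fallback='forbid'>Snowridge</model>
<vendor>Intel</vendor>
</mode>
</cpu></domainCapabilities>"""

NEW_DOMCAPS = OLD_DOMCAPS.replace("Snowridge<", "Snowridge-v4<")


def host_model_name(domcaps_xml: str) -> Optional[str]:
    """Return the CPU model name that 'host-model' expands to, if any."""
    root = ET.fromstring(domcaps_xml)
    for mode in root.iter("mode"):
        if mode.get("name") == "host-model":
            model = mode.find("model")
            return model.text if model is not None else None
    return None


def destination_knows_model(domcaps_xml: str, known: Set[str]) -> bool:
    """True if the expanded host-model name is in the destination's set."""
    name = host_model_name(domcaps_xml)
    return name is not None and name in known


if __name__ == "__main__":
    for label, xml in (("before", OLD_DOMCAPS), ("after", NEW_DOMCAPS)):
        print(label, host_model_name(xml),
              destination_knows_model(xml, OLD_LIBVIRT_MODELS))
```

A guest started before the series carries a name the old destination
knows. One started afterwards carries the -v4 name, and the lookup fails
even though the underlying hardware is compatible.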
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|