On Mon, Mar 04, 2024 at 10:35:25AM -0700, Jim Fehlig wrote:
On 3/1/24 10:13, Daniel P. Berrangé wrote:
> On Fri, Mar 01, 2024 at 10:36:12AM -0600, Jonathon Jongsma wrote:
> > On 3/1/24 10:13 AM, Daniel P. Berrangé wrote:
> > > On Tue, Feb 20, 2024 at 05:08:02PM -0700, Jim Fehlig wrote:
> > > > On 12/15/23 15:11, Jonathon Jongsma wrote:
> > > > > Previously, the script only generated the parent CPU and any versions
> > > > > that had a defined alias. The script now generates all CPU versions. Any
> > > > > version that had a defined alias will continue to use that alias, but
> > > > > those without aliases will use the generated name $BASECPUNAME-vN.
> > > > >
> > > > > The reason for this change is two-fold. First, we need to add new models
> > > > > that support new features (such as SEV-SNP). To deal with this, the
> > > > > script now generates model definitions for all versions.
> > > > >
> > > > > But we also need to ensure that our CPU definitions are migration-safe.
> > > > > To deal with this issue we need to make sure we're always using the
> > > > > canonical versioned names for CPUs.
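To make the naming rule above concrete, here is a minimal sketch in Python. It is not the script from the patch: the function name, the base name "ExampleCPU" and the alias are all hypothetical, and it only illustrates "use the alias if one is defined, otherwise $BASECPUNAME-vN".

```python
from typing import Dict, List, Optional


def versioned_name(base: str, version: int, alias: Optional[str] = None) -> str:
    """Return the name for one CPU model version.

    A version with a defined alias keeps that alias; any other
    version gets the generated name "<base>-v<version>".
    """
    if alias:
        return alias
    return f"{base}-v{version}"


def all_version_names(base: str, aliases: Dict[int, Optional[str]]) -> List[str]:
    """Name every version, in ascending version order."""
    return [versioned_name(base, n, aliases[n]) for n in sorted(aliases)]


# Hypothetical version table: version number -> alias (None if no alias).
EXAMPLE_ALIASES = {1: None, 2: "ExampleCPU-IBPB", 3: None, 4: None}
EXAMPLE_NAMES = all_version_names("ExampleCPU", EXAMPLE_ALIASES)
```

With this made-up table, only version 2 keeps its alias; versions 1, 3 and 4 get generated -vN names.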
> > > >
> > > > Related to migration safety, do we need to be concerned with the expansion
> > > > of 'host-model' CPU? E.g. is it possible 'host-model' expands to EPYC before
> > > > introducing the new models, and EPYC-v4 afterwards? If so, what are the
> > > > ramifications of that?
> > >
> > > Yes, I see that happening on my laptop in domcapabilities:
> > >
> > > Currently libvirt reports:
> > >
> > > <mode name='host-model' supported='yes'>
> > > <model fallback='forbid'>Snowridge</model>
> > > <vendor>Intel</vendor>
> > > <maxphysaddr mode='passthrough' limit='46'/>
> > > <feature policy='require' name='ss'/>
> > > <feature policy='require' name='vmx'/>
> > > ...snip...
> > >
> > >
> > > and after this series it reports:
> > >
> > > <mode name='host-model' supported='yes'>
> > > <model fallback='forbid'>Snowridge-v4</model>
> > > <vendor>Intel</vendor>
> > > <maxphysaddr mode='passthrough' limit='46'/>
> > > <feature policy='require' name='ss'/>
> > > <feature policy='require' name='vmx'/>
> > > ...snip...
> > >
> > >
> > > That's not wrong per se, because Snowridge-v4 has a smaller
> > > delta against my host CPU.
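As a rough illustration of the "smaller delta" point, the sketch below picks the candidate model whose feature set differs least from the host's. This is an assumption-laden simplification, not libvirt's actual selection logic: the function name, model names and feature names are hypothetical placeholders.

```python
from typing import Dict, Set


def closest_model(host_features: Set[str], candidates: Dict[str, Set[str]]) -> str:
    """Return the candidate whose feature set has the smallest delta
    (symmetric difference) against the host; ties go to the name
    that sorts first, so the result is deterministic."""
    def delta(name: str) -> int:
        return len(host_features ^ candidates[name])

    return min(sorted(candidates), key=delta)


# Hypothetical feature sets, for illustration only.
HOST = {"feat-a", "feat-b", "feat-c", "feat-d"}
BEFORE_SERIES = {"Base": {"feat-a", "feat-b"}}
AFTER_SERIES = {
    "Base": {"feat-a", "feat-b"},
    "Base-v4": {"feat-a", "feat-b", "feat-c"},
}
```

Once the closer "Base-v4" definition exists, it wins over "Base" for the same host, which is the change in expansion described above.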
> > >
> > > The problem is that libvirt updates the *live* XML for the
> > > guest with this expansion. IIUC, if we now attempt to
> > > live migrate to a compatible machine running older libvirt
> > > the migrate will fail as old libvirt doesn't know the -v4
> > > CPU.
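The failure mode described above can be sketched as a simple membership check: read the expanded model name out of the host-model XML and see whether the destination knows it. This is a minimal sketch, not libvirt code. The helper names and the set of models "known" to the older libvirt are assumptions, and the XML is a trimmed copy of the fragment quoted earlier with a closing tag added.

```python
import xml.etree.ElementTree as ET
from typing import Set


def host_model_name(mode_xml: str) -> str:
    """Extract the CPU model name from a <mode name='host-model'> element."""
    model = ET.fromstring(mode_xml).find("model")
    if model is None or not model.text:
        raise ValueError("no <model> element in host-model XML")
    return model.text.strip()


def destination_knows_model(mode_xml: str, known_models: Set[str]) -> bool:
    """True if the destination's CPU model list contains the expanded model."""
    return host_model_name(mode_xml) in known_models


HOST_MODEL_XML = """
<mode name='host-model' supported='yes'>
  <model fallback='forbid'>{name}</model>
  <vendor>Intel</vendor>
  <maxphysaddr mode='passthrough' limit='46'/>
</mode>
"""

# Assumption: the older libvirt only has the unversioned model.
OLD_LIBVIRT_MODELS = {"Snowridge"}

BEFORE = HOST_MODEL_XML.format(name="Snowridge")
AFTER = HOST_MODEL_XML.format(name="Snowridge-v4")
```

Under these assumptions the check passes for the "before" expansion and fails for "Snowridge-v4", matching the new -> old migration concern.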

Downstream, we (SUSE) don't really support migrating from new -> old. Is
this something we aim to support upstream?

Kind of, sort of, yes and no :)

The VIR_DOMAIN_XML_MIGRATABLE flag is a bit of an attempt to make
it possible to format XML in a way that's (hopefully) mostly acceptable
to older libvirt.

The devil is in the detail though, and there's never really been
any formal testing to prove correctness, so new -> old is one of
those things that may work, please report bugs if we missed
something.

> > > I'm not sure how to address this ?
> >
> > But don't we have this issue any time we add a new CPU model to libvirt?
> > Anytime there's a new model, it has the potential to be a closer match to
> > the host CPU than an existing model definition was. As I mentioned in my
> > previous reply, when e.g. the -noTSX CPU variants were added, didn't the
> > same sort of thing (potentially) happen? Or am I doing something
> > meaningfully different in this patch set than what happens in those
> > scenarios?
>
> I think it probably /did/ happen, but that doesn't make it acceptable.
> The noTSX stuff was the cause of massive amounts of compatibility pain
> for mgmt apps, so the incompatibility in libvirt might have been glossed
> over. We're adding a lot of new versions here, so possibly increasing
> the visibility/impact of this libvirt change.

It can happen when we introduce an entirely new CPU model too. E.g. on a
Genoa machine, prior to commit bfe53e9145c, host model expanded to

Yeah, true, so that's a general problem with 'host-model' when
introducing new CPU generations, if that post-dates a user
deploying on said CPU generation..

With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|