On Fri, Jun 29, 2018 at 11:14:17 +0100, Daniel P. Berrangé wrote:
On Thu, Jun 28, 2018 at 04:52:27PM -0300, Eduardo Habkost wrote:
> On Thu, Jun 28, 2018 at 04:45:02PM +0100, Daniel P. Berrangé wrote:
> [...]
> > What if we can borrow the concept of versioning from machine types and apply
> > it to CPU models directly. For example, considering the history of "Haswell"
> > in QEMU, if we had versioned things, we would by now have:
> >
> > Haswell-1.3.0 - first version (37507094f350b75c62dc059f998e7185de3ab60a)
> > Haswell-2.2.0 - added 'rdrand' (78a611f1936b3eac8ed78a2be2146a742a85212c)
> > Haswell-2.3.0 - removed 'hle' & 'rtm' (a356850b80b3d13b2ef737dad2acb05e6da03753)
> > Haswell-2.5.0 - added 'abm' (becb66673ec30cb604926d247ab9449a60ad8b11)
> > Haswell-2.12.0 - added 'spec-ctrl' (ac96c41354b7e4c70b756342d9b686e31ab87458)
> > Haswell-3.0.0 - added 'ssbd' (never done)
> >
> > If we followed the machine type approach, then a bare "Haswell" would
> > statically resolve at build time to the most recent Haswell-X.X.X version
> > associated with the QEMU release. This is unhelpful as we have a direct
> > dependency on the host hardware features. Better would be for a bare
> > "Haswell" to be dynamically resolved at runtime, picking the most recent
> > version that is capable of launching given the current hardware, KVM/TCG impl
> > and QEMU version.
> >
> > ie -cpu Haswell
> >
> > should use Haswell-2.5.0 if on silicon with the TSX errata applied,
> > but use Haswell-2.12.0 if the Spectre errata is applied in microcode,
> > and use Haswell-3.0.0 once Intel finally releases SSBD microcode errata.
>
> Doing this unconditionally would make
> "-machine pc-q35-3.1 -cpu Haswell" unsafe for live migration, and
> break existing usage. But this behavior could be enabled
> explicitly somehow.

True, for full back compat with existing libvirt we would probably
want to opt-in to it.
eg -cpu Haswell could pick a fixed Haswell-XXX version according
to the machine type. -cpu Haswell,best=on could pick best version
for the host with the caveat about migration between heterogeneous
hosts.

I was thinking we could even separate the CPU model version from the
name itself:

    -cpu Haswell (the old, compatible way)
    -cpu Haswell,version=best
    -cpu Haswell,version=2.12.0

It would be slightly more work for the upper management layers, but IMHO
it would make more sense.
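
To make the three forms concrete, here is a rough sketch of how they
could be resolved. It is Python only for brevity and is not QEMU code:
the version history is the one from the list quoted above, while the
base feature set is abbreviated and the names (HASWELL_VERSIONS,
features_of, resolve) are made up for illustration.

```python
# Illustrative only: version history as quoted above; the 1.3.0 feature
# set is abbreviated and every name here is hypothetical.
HASWELL_VERSIONS = [
    # (version, features added, features removed), oldest first
    ((1, 3, 0), {"avx2", "hle", "rtm"}, set()),
    ((2, 2, 0), {"rdrand"}, set()),
    ((2, 3, 0), set(), {"hle", "rtm"}),
    ((2, 5, 0), {"abm"}, set()),
    ((2, 12, 0), {"spec-ctrl"}, set()),
    ((3, 0, 0), {"ssbd"}, set()),
]


def features_of(version):
    """Accumulate the feature set of a version by replaying the history."""
    features = set()
    for ver, added, removed in HASWELL_VERSIONS:
        if ver > version:
            break
        features |= added
        features -= removed
    return features


def resolve(version, host_features, machine_default):
    """Map a "-cpu Haswell[,version=...]" request to a version tuple.

    version is None (the old, compatible way), "best", or "X.Y.Z".
    """
    known = [ver for ver, _, _ in HASWELL_VERSIONS]
    if version is None:
        # Fixed by the machine type, so the guest ABI does not change.
        return machine_default
    if version == "best":
        # Newest version whose features the host can all provide.
        for ver in reversed(known):
            if features_of(ver) <= host_features:
                return ver
        raise ValueError("no Haswell version is runnable on this host")
    wanted = tuple(int(part) for part in version.split("."))
    if wanted not in known:
        raise ValueError("unknown Haswell version: " + version)
    return wanted
```

With a host that lacks hle/rtm and has no spec-ctrl microcode,
resolve("best", {"avx2", "rdrand", "abm"}, (2, 3, 0)) gives (2, 5, 0),
matching the example quoted above. An explicit version is returned
as-is; whether the host can actually run it would still have to be
checked separately.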

In any case, we have to think about keeping the guest ABI stable.

I hope the automatic version selection would not cause any problems for
subsequent cold starts (such as Windows activation issues). It should be
very similar to updating CPU microcode, which the guest OS is already
supposed to deal with on real hardware. However, in the past QEMU
changed the CPU signature (family, model, stepping) for new machine types
and this is likely to happen with separately versioned CPU models too. I
believe CPU microcode updates do not touch these values. On the other
hand, it's similar to host-model, and the user can always specify an exact
version to avoid this slight change should it be a problem.

Once the domain starts, we need to keep the ABI stable across migrations,
save/restores, and snapshots. Libvirt already does so by talking to QEMU
before starting vCPUs and checking which features were enabled or
disabled. Then we store this information in the active domain XML to make
sure we can enforce the same CPU later. This concept would need to be
extended to include the CPU model version, which QEMU would need to be
able to report.
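
Roughly, the extended check could look like the sketch below. This is
not libvirt code: the record layout and the names record_cpu and
abi_differences are hypothetical, and the version is assumed to be
reported by QEMU as a plain string.

```python
def record_cpu(model, version, features):
    """What would be stored for the active domain (hypothetical layout)."""
    return {
        "model": model,
        "version": version,
        "features": frozenset(features),
    }


def abi_differences(recorded, model, version, features):
    """List how the CPU offered on migration/restore differs from the
    one recorded when the domain was started. Empty list means same ABI."""
    offered = set(features)
    problems = []
    if model != recorded["model"]:
        problems.append("model: %s != %s" % (model, recorded["model"]))
    if version != recorded["version"]:
        problems.append("version: %s != %s" % (version, recorded["version"]))
    for name in sorted(recorded["features"] - offered):
        problems.append("missing feature: " + name)
    for name in sorted(offered - recorded["features"]):
        problems.append("unexpected feature: " + name)
    return problems
```

The point is only that the version becomes one more recorded value
compared alongside the feature list, so a domain started with
version=best keeps whatever version was picked at startup.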

Significantly more fun would result from letting libvirt use the
versioned CPU model stuff by default without an explicit knob in the
XML. But I guess you don't want to go in that direction, do you?

Jirka