Re: [libvirt] [Qemu-devel] CPU model versioning separate from machine type versioning ?

29 Jun 2018

On Fri, Jun 29, 2018 at 09:53:53AM +0100, Dr. David Alan Gilbert wrote:
...
* Eduardo Habkost (ehabkost@redhat.com) wrote:
...
On Thu, Jun 28, 2018 at 04:45:02PM +0100, Daniel P. Berrangé wrote:
[...]
...
What if we could borrow the concept of versioning from machine types and apply
it to CPU models directly? For example, considering the history of "Haswell"
in QEMU, if we had versioned things, we would by now have:
     Haswell-1.3.0 - first version (37507094f350b75c62dc059f998e7185de3ab60a)
     Haswell-2.2.0 - added 'rdrand' (78a611f1936b3eac8ed78a2be2146a742a85212c)
     Haswell-2.3.0 - removed 'hle' & 'rtm' (a356850b80b3d13b2ef737dad2acb05e6da03753)
     Haswell-2.5.0 - added 'abm' (becb66673ec30cb604926d247ab9449a60ad8b11)
     Haswell-2.12.0 - added 'spec-ctrl' (ac96c41354b7e4c70b756342d9b686e31ab87458)
     Haswell-3.0.0 - added 'ssbd' (never done)
If we followed the machine type approach, then a bare "Haswell" would
statically resolve at build time to the most recent Haswell-X.X.X version
associated with the QEMU release. This is unhelpful as we have a direct
dependency on the host hardware features. Better would be for a bare
"Haswell" to be dynamically resolved at runtime, picking the most recent
version that is capable of launching given the current hardware, KVM/TCG impl
and QEMU version.
i.e. -cpu Haswell
should use Haswell-2.5.0  if on silicon with the TSX errata applied,
but use Haswell-2.12.0 if the Spectre errata is applied in microcode,
and use Haswell-3.0.0 once Intel finally releases SSBD microcode errata.
Doing this unconditionally would make
"-machine pc-q35-3.1 -cpu Haswell" unsafe for live migration, and
break existing usage.  But this behavior could be enabled
explicitly somehow.
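
To make the runtime resolution above concrete, here is a minimal sketch in Python. It is not QEMU code. The feature deltas come from the Haswell history listed earlier in this thread; the baseline feature set of Haswell-1.3.0 is not reproduced (only 'hle' and 'rtm' are known to be in it, because a later version removes them), and the names HASWELL_DELTAS, build_versions and resolve are hypothetical.

```python
# Illustrative only: feature deltas copied from the Haswell history in this
# thread. The full baseline feature list is NOT modelled here.
HASWELL_DELTAS = [
    # (version, features added, features removed)
    ("1.3.0", {"hle", "rtm"}, set()),
    ("2.2.0", {"rdrand"}, set()),
    ("2.3.0", set(), {"hle", "rtm"}),
    ("2.5.0", {"abm"}, set()),
    ("2.12.0", {"spec-ctrl"}, set()),
    ("3.0.0", {"ssbd"}, set()),
]


def version_key(version):
    """Compare versions numerically, so that 2.12.0 sorts after 2.5.0."""
    return tuple(int(part) for part in version.split("."))


def build_versions(deltas):
    """Accumulate the deltas into a {version: feature set} table."""
    features = set()
    table = {}
    for version, added, removed in sorted(deltas, key=lambda d: version_key(d[0])):
        features = (features | added) - removed
        table[version] = frozenset(features)
    return table


def resolve(base, table, host_features):
    """Return the newest versioned name whose features the host provides,
    or None if no version is runnable."""
    runnable = [v for v, feats in table.items() if feats <= set(host_features)]
    if not runnable:
        return None
    return f"{base}-{max(runnable, key=version_key)}"


if __name__ == "__main__":
    table = build_versions(HASWELL_DELTAS)
    # Host with the TSX erratum applied and no Spectre microcode.
    print(resolve("Haswell", table, {"rdrand", "abm"}))
    # Same host after a microcode update exposing spec-ctrl.
    print(resolve("Haswell", table, {"rdrand", "abm", "spec-ctrl"}))
```

Note that the comparison has to be numeric per component; a plain string sort would rank "2.5.0" above "2.12.0".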
...
Versioning of CPU models as opposed to using arbitrary string suffixes
(-noTSX, -IBRS) has a number of usability improvements that we would
gain with versioned machine types, while avoiding exploding the machine
type matrix. With versioned CPU models we can
- Automatically tailor the best model based on hardware support
- Users always get the best model if they use the bare CPU name
- It is obvious to users which is the "best" / "newest" CPU model
- Avoid combinatorial expansion of machines since same CPU model
   version can be added to all releases without adding machine types.
- Users can still force a specific downgraded model by using the
   fully versioned name.
Such versioning of CPU models would largely "just work" with existing
libvirt versions, but libvirt would really want to expand the bare
CPU name to a versioned CPU name when recording new guest XML, so the
ABI is preserved long term.
An application like virt-manager which wants a simple UI can forever be
happy simply giving users a list of bare CPU model names, and allowing
libvirt / QEMU to automatically expand to the best versioned model for
their host.
An application like oVirt/OpenStack which wants direct control can allow
the admin to choose between a bare name and an explicitly versioned name
if they need to cope with the possibility of outdated hosts.
The proposal makes sense, and I think most of it can be already
implemented on top of existing query-cpu-model-* commands.
query-cpu-model-expansion type=static can expand to a versioned
CPU model.
We will probably need to make query-cpu-model-expansion accept a
machine-type name as input, and/or add a new flag meaning "please
give me the best CPU version you have, not the one defined by the
current machine-type".
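
For reference, a small sketch of the request that already exists today, built in Python. The command name and the "type" and "model" arguments are the existing QMP ones; the helper name expansion_request is hypothetical. The machine-type input and the "best version" flag discussed above do not exist, so they are deliberately left out, and what a versioned reply would look like is not defined by this thread.

```python
import json


def expansion_request(model_name, expansion_type="static"):
    """Build a QMP query-cpu-model-expansion request as a plain dict.

    Only the arguments that exist today are used. The proposed
    machine-type input / "best version" flag is intentionally absent.
    """
    return {
        "execute": "query-cpu-model-expansion",
        "arguments": {
            "type": expansion_type,
            "model": {"name": model_name},
        },
    }


if __name__ == "__main__":
    # The JSON line a management application would send over the QMP socket.
    print(json.dumps(expansion_request("Haswell")))
```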
I'm not sure what would be the best way to encode two types of
information, though:
Both of those are solved with the numbering scheme
...
* Fallback/alternatives info, e.g.: "It makes sense to use
  Haswell-{3.0,2.12,2.5,...} if Haswell-3.1 is not runnable and the
  user asked for Haswell".
Use the highest that works.
...
* Ordering/preference info, e.g.: "Haswell-3.1 is better than
  Haswell-3.0, prefer the former"
Higher is better.
The only thing that worries me about a numbering scheme is that
it's now more difficult for a user to know whether they've got
the type with a fix for a particular vulnerability.
True, but if more vulns arrive we have the same problem with named
suffixes too. e.g. if we added -SSBD variants, users would ask whether
-SSBD includes the -IBRS fix or vice versa, as a year down the line
they're not going to remember which of SSBD/IBRS came out first.
...
We're going to have to say something like:
  'For the new XYZ vulnerability make sure you're using
  Haswell-3.2 or later, SkyLake-2.6 or later, Westmere-4.8 or later
  .....'
which all gets a bit confusing.
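
An advisory of that form is at least easy to check mechanically. A minimal sketch in Python, assuming the "Model-X.Y[.Z]" naming discussed in this thread; the minimum versions are the made-up ones from the example above, and ADVISORY_MINIMUMS, parse_model and meets_advisory are hypothetical names.

```python
# Made-up minimums, taken from the example advisory wording in this thread.
ADVISORY_MINIMUMS = {"Haswell": "3.2", "SkyLake": "2.6", "Westmere": "4.8"}


def parse_model(name):
    """Split 'Haswell-3.2' into ('Haswell', (3, 2)).

    Raises ValueError for bare names ('Haswell') and for string suffixes
    ('Haswell-noTSX'), since neither carries a comparable version.
    """
    base, _, version = name.rpartition("-")
    parts = version.split(".")
    if not base or not all(part.isdigit() for part in parts):
        raise ValueError(f"not a versioned CPU model name: {name!r}")
    return base, tuple(int(part) for part in parts)


def meets_advisory(name, minimums=ADVISORY_MINIMUMS):
    """True/False if the advisory covers this model, None if it does not."""
    base, version = parse_model(name)
    minimum = minimums.get(base)
    if minimum is None:
        return None
    return version >= tuple(int(part) for part in minimum.split("."))


if __name__ == "__main__":
    for model in ("Haswell-3.1", "Haswell-3.10", "SkyLake-2.6"):
        print(model, meets_advisory(model))
```

A bare name cannot be checked this way, which is one more reason for libvirt to record the expanded, versioned name.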
The kernel has a /sys/devices/system/cpu/vulnerabilities dir
that lists status of various flaws.

I have been thinking about whether libvirt should create a
'virt-guest-validate' command that looks at guest XML and
reports whether any of the config settings are vulnerable
or otherwise diverging from best practice in some way.

QEMU itself would perhaps have a 'query-vulnerabilities'
monitor command to report whether the current config is
satisfactory or not.

Ultimately though, getting a fixed guest involves host
kernel, microcode, qemu, and guest kernel. So to get a
true picture of your safety people should really look
straight to the guest kernel's /sys/devices/system/cpu/vulnerabilities
directory. They only need to look at host/microcode/qemu if the
guest is reporting something is wrong.
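
As a rough sketch of that check from inside the guest, in Python: each file in the directory is named after a flaw and holds a one-line status. The helper names read_vulnerabilities and flagged are hypothetical, and treating any status that starts with "Vulnerable" as a problem is an assumption about how the result should be read, not a libvirt or QEMU interface.

```python
from pathlib import Path

SYSFS_VULN_DIR = "/sys/devices/system/cpu/vulnerabilities"


def read_vulnerabilities(directory=SYSFS_VULN_DIR):
    """Return {flaw name: status line} for every file in the directory.

    Returns an empty dict if the directory does not exist (for example on
    a kernel that predates it, or on a non-Linux system).
    """
    root = Path(directory)
    if not root.is_dir():
        return {}
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(root.iterdir())
        if entry.is_file()
    }


def flagged(report):
    """Keep only the entries whose status starts with 'Vulnerable'."""
    return {
        name: status
        for name, status in report.items()
        if status.startswith("Vulnerable")
    }


if __name__ == "__main__":
    report = read_vulnerabilities()
    for name, status in report.items():
        print(f"{name}: {status}")
    if flagged(report):
        print("something is wrong; now look at host kernel, microcode and QEMU")
```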

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|