
On Fri, Jun 29, 2018 at 09:53:53AM +0100, Dr. David Alan Gilbert wrote:
* Eduardo Habkost (ehabkost@redhat.com) wrote:
On Thu, Jun 28, 2018 at 04:45:02PM +0100, Daniel P. Berrangé wrote: [...]
What if we borrow the concept of versioning from machine types and apply it to CPU models directly? For example, considering the history of "Haswell" in QEMU, if we had versioned things, we would by now have:
  Haswell-1.3.0  - first version (37507094f350b75c62dc059f998e7185de3ab60a)
  Haswell-2.2.0  - added 'rdrand' (78a611f1936b3eac8ed78a2be2146a742a85212c)
  Haswell-2.3.0  - removed 'hle' & 'rtm' (a356850b80b3d13b2ef737dad2acb05e6da03753)
  Haswell-2.5.0  - added 'abm' (becb66673ec30cb604926d247ab9449a60ad8b11)
  Haswell-2.12.0 - added 'spec-ctrl' (ac96c41354b7e4c70b756342d9b686e31ab87458)
  Haswell-3.0.0  - added 'ssbd' (never done)
If we followed the machine type approach, then a bare "Haswell" would statically resolve at build time to the most recent Haswell-X.X.X version associated with the QEMU release. This is unhelpful as we have a direct dependency on the host hardware features. Better would be for a bare "Haswell" to be dynamically resolved at runtime, picking the most recent version that is capable of launching given the current hardware, KVM/TCG impl and QEMU version.
ie -cpu Haswell
should use Haswell-2.5.0 if on silicon with the TSX errata applied, but use Haswell-2.12.0 if the Spectre errata is applied in microcode, and use Haswell-3.0.0 once Intel finally releases SSBD microcode errata.
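To make the runtime resolution concrete, here is a minimal sketch in Python. It is not how QEMU implements anything: the version table is just the Haswell history listed earlier, BASE_FEATURES is an illustrative placeholder for the real (much longer) feature list, and build_versions() and resolve() are hypothetical helper names.

```python
# Illustrative placeholder for the original Haswell-1.3.0 feature set; the
# real model defines many more flags than these three.
BASE_FEATURES = frozenset({"avx2", "hle", "rtm"})

# (version, features added, features removed), oldest first, following the
# Haswell history listed in this mail.
HASWELL_HISTORY = [
    ("1.3.0", set(), set()),
    ("2.2.0", {"rdrand"}, set()),
    ("2.3.0", set(), {"hle", "rtm"}),
    ("2.5.0", {"abm"}, set()),
    ("2.12.0", {"spec-ctrl"}, set()),
    ("3.0.0", {"ssbd"}, set()),
]


def build_versions(base, history):
    """Turn the per-version deltas into full feature sets, oldest first."""
    versions = []
    current = set(base)
    for version, added, removed in history:
        current = (current | added) - removed
        versions.append((f"Haswell-{version}", frozenset(current)))
    return versions


def resolve(host_features, versions):
    """Return the newest versioned name whose features the host provides."""
    for name, required in reversed(versions):
        if required <= host_features:
            return name
    return None  # no version of the model is runnable on this host


VERSIONS = build_versions(BASE_FEATURES, HASWELL_HISTORY)
```

Under these assumptions, a host without TSX and without the Spectre microcode, resolve({"avx2", "rdrand", "abm"}, VERSIONS), picks Haswell-2.5.0, and adding "spec-ctrl" to the host set moves the result to Haswell-2.12.0.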
Doing this unconditionally would make "-machine pc-q35-3.1 -cpu Haswell" unsafe for live migration, and break existing usage. But this behavior could be enabled explicitly somehow.
Versioning CPU models, as opposed to using arbitrary string suffixes (-noTSX, -IBRS), brings a number of the usability improvements that we gain with versioned machine types, while avoiding exploding the machine type matrix. With versioned CPU models we can:
- Automatically tailor the best model based on hardware support
- Users always get the best model if they use the bare CPU name
- It is obvious to users which is the "best" / "newest" CPU model
- Avoid combinatorial expansion of machines since same CPU model version can be added to all releases without adding machine types.
- Users can still force a specific downgraded model by using the fully versioned name.
Such versioning of CPU models would largely "just work" with existing libvirt versions, but libvirt would really want to expand the bare CPU name to a versioned CPU name when recording new guest XML, so the ABI is preserved long term.
An application like virt-manager which wants a simple UI can forever be happy simply giving users a list of bare CPU model names, and allowing libvirt / QEMU to automatically expand to the best versioned model for their host.
An application like oVirt/OpenStack which wants direct control can allow the admin to choose between a bare name and an explicitly versioned name, if they need to cope with the possibility of outdated hosts.
The proposal makes sense, and I think most of it can already be implemented on top of the existing query-cpu-model-* commands. query-cpu-model-expansion type=static can expand to a versioned CPU model.
We will probably need to make query-cpu-model-expansion accept a machine-type name as input, and/or add a new flag meaning "please give me the best CPU version you have, not the one defined by the current machine-type".
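For illustration, a small Python sketch that builds the QMP message for this. The "execute"/"arguments" framing and the "type" and "model" arguments are the existing query-cpu-model-expansion interface; the "machine-type" argument is purely hypothetical, standing in for the extension suggested above, and QEMU does not accept it today. expansion_command() is likewise just a name made up for this example.

```python
import json


def expansion_command(model, expansion_type="static", machine_type=None):
    """Build the JSON text of a query-cpu-model-expansion QMP command."""
    arguments = {"type": expansion_type, "model": {"name": model}}
    if machine_type is not None:
        # HYPOTHETICAL argument: this only illustrates the proposed
        # "accept a machine-type name as input" extension, it is not
        # part of the current QMP schema.
        arguments["machine-type"] = machine_type
    return json.dumps(
        {"execute": "query-cpu-model-expansion", "arguments": arguments}
    )


# Existing interface: ask for the static expansion of a bare model name.
print(expansion_command("Haswell"))
```

Whether the extra input ends up being a machine-type name, a "give me the best version" flag, or both is exactly the open design question here; the sketch only shows where such an input would sit in the message.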
I'm not sure what would be the best way to encode two types of information, though:
Both of those are solved with the numbering scheme
* Fallback/alternatives info, e.g.: "It makes sense to use Haswell-{3.0,2.12,2.5,...} if Haswell-3.1 is not runnable and the user asked for Haswell".
Use the highest that works.
* Ordering/preference info, e.g.: "Haswell-3.1 is better than Haswell-3.0, prefer the former"
Higher is better.
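One detail worth spelling out: "higher" has to mean a numeric comparison of the version components, not a string comparison, otherwise Haswell-2.5 would sort above Haswell-2.12. A minimal sketch, assuming names of the form MODEL-X.Y[.Z]; version_key() and best_runnable() are hypothetical helpers, and the "runnable" set stands in for whatever the host/QEMU check reports.

```python
def version_key(name):
    """Split 'Haswell-2.12' into ('Haswell', (2, 12)) for numeric ordering."""
    model, _, version = name.rpartition("-")
    return model, tuple(int(part) for part in version.split("."))


def best_runnable(candidates, runnable):
    """Use the highest that works: newest candidate that is also runnable."""
    usable = [name for name in candidates if name in runnable]
    return max(usable, key=version_key, default=None)


candidates = ["Haswell-2.5", "Haswell-2.12", "Haswell-3.0", "Haswell-3.1"]
runnable = {"Haswell-2.5", "Haswell-2.12"}
print(best_runnable(candidates, runnable))
```

With Haswell-3.1 and Haswell-3.0 not runnable, this falls back to Haswell-2.12, whereas a plain string max() over the same two usable names would wrongly pick Haswell-2.5.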
The only thing that worries me about a numbering scheme is that it's now more difficult for a user to know whether they've got the type with a fix for a particular vulnerability.
True, but if more vulns arrive we have the same problem with named suffixes too. eg if we added -SSBD variants, users would ask whether -SSBD includes the -IBRS fix or vice versa, as a year down the line they're not going to remember which of SSBD/IBRS came out first.
We're going to have to say something like: 'For the new XYZ vulnerability make sure you're using Haswell-3.2 or later, SkyLake-2.6 or later, Westmere-4.8 or later .....'
which all gets a bit confusing.
The kernel has a /sys/devices/system/cpu/vulnerabilities dir that lists the status of various flaws. I have been thinking about whether libvirt should create a 'virt-guest-validate' command that looks at guest XML and reports whether any of the config settings are vulnerable or otherwise diverging from best practice in some way. QEMU itself would perhaps have a 'query-vulnerabilities' monitor command to report whether the current config is satisfactory or not.

Ultimately though, getting a fixed guest involves host kernel, microcode, qemu, and guest kernel. So to get a true picture of their safety, people should really look straight at the guest kernel's /sys/devices/system/cpu/vulnerabilities directory. They only need to look at host/microcode/qemu if the guest is reporting something is wrong.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
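As a rough illustration of the guest-side check, here is a small Python sketch that reads the kernel's /sys/devices/system/cpu/vulnerabilities directory. The directory and its one-file-per-flaw layout are the kernel's; read_vulnerabilities() and still_vulnerable() are hypothetical helper names, and treating any status that starts with "Vulnerable" as a problem is an assumption of this sketch, not a rule defined anywhere.

```python
from pathlib import Path

SYSFS_DIR = "/sys/devices/system/cpu/vulnerabilities"


def read_vulnerabilities(root=SYSFS_DIR):
    """Map each flaw name to the status string the kernel reports for it."""
    base = Path(root)
    if not base.is_dir():
        return {}  # older kernel or non-Linux system: nothing to report
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(base.iterdir())
        if entry.is_file()
    }


def still_vulnerable(status):
    """Keep only the flaws whose status line starts with 'Vulnerable'."""
    return {
        name: text for name, text in status.items()
        if text.startswith("Vulnerable")
    }


if __name__ == "__main__":
    for name, text in read_vulnerabilities().items():
        print(f"{name}: {text}")
```

Run inside the guest, an empty result from still_vulnerable(read_vulnerabilities()) suggests there is no need to dig into host kernel, microcode or QEMU configuration.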