On Tue, May 07, 2024 at 18:24:20 -0400, Collin Walling wrote:
QEMU will soon support reporting an optional array of deprecated
features for an expanded CPU model via the query-cpu-model-expansion
command. The intended use of this data is to make it easier for a
user to define a CPU model with features flagged as deprecated set to
disabled, thus rendering the guest migratable to future hardware that
will outright drop support for said features.
So is the list of deprecated features a static list common to all CPU
models or does it depend on the CPU model? If it's static, getting the
list via another QMP command (in addition to query-cpu-model-expansion)
would make sense. I would expect the query-cpu-model-expansion to limit
deprecated features to only those actually used by the expanded CPU
model, which would make constructing a complete list quite hard.
On the other hand, if the deprecated features depend on a CPU model,
we'd need a way to list all CPU models and their deprecated features to
be able to report such info in domain capabilities XML. Ideally without
having to call query-cpu-model-expansion on every single CPU model.
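
For illustration, here is a rough Python sketch of the per-model approach
I would like to avoid: one query-cpu-model-expansion call per CPU model,
collecting whatever deprecated list each reply carries. The reply key
"deprecated-props", the model name "example-model" and the sample replies
are assumptions made up for this example, not real QEMU output.

```python
def expansion_command(model_name, expansion_type="static"):
    """Build the QMP request that expands a single CPU model."""
    return {
        "execute": "query-cpu-model-expansion",
        "arguments": {"type": expansion_type, "model": {"name": model_name}},
    }


def deprecated_features(reply, key="deprecated-props"):
    """Return the sorted deprecated list from one reply, or [] if absent.

    The key name is an assumption; the array is optional in the reply.
    """
    return sorted(reply.get("return", {}).get(key, []))


def collect(replies):
    """Map each CPU model name to the deprecated features of its reply."""
    return {name: deprecated_features(reply) for name, reply in replies.items()}


# Hand-written sample replies (hypothetical), one per expanded CPU model.
sample_replies = {
    "z14.2": {
        "return": {
            "model": {"name": "z14.2-base", "props": {}},
            "deprecated-props": ["bpb", "csske", "cte", "te"],
        }
    },
    # A reply without the optional array.
    "example-model": {"return": {"model": {"name": "example-model", "props": {}}}},
}

per_model = collect(sample_replies)
```

The cost is one QMP round trip per CPU model, which is why a single
interface reporting all models at once would be preferable.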
Notes
=====
- In my example below, I am running on a z14.2 machine.
- The features that are flagged as deprecated for this model are: bpb, csske, cte, te.
OK, so the deprecated features seem to depend on a specific CPU model.
Does query-cpu-model-expansion list them even if they are not specified
in the CPU model we want to expand? If not, we need another interface to
be able to report this info.
Ideas
=====
New Host CPU Model
------------------
Create a new CPU model that is a mirror of the host CPU model with deprecated features
turned off. Let's call this model "host-recommended". A user may define
this model in the guest XML as they would any other CPU model:
<cpu mode='host-recommended' check='partial'/>
Just as with host-model, anything nested in the <cpu> tag will be
ignored.
This model could potentially be listed in the domcapabilities output after the
host-model:
<cpu>
<mode name='host-passthrough' supported='yes'>
...
</mode>
...
<mode name='host-model' supported='yes'>
...
</mode>
<mode name='host-recommended' supported='yes'>
...
<feature policy='disable' name='cte'/>
<feature policy='require' name='ais'/>
<feature policy='disable' name='bpb'/>
<feature policy='require' name='ctop'/>
<feature policy='require' name='gs'/>
<feature policy='require' name='ppa15'/>
<feature policy='require' name='zpci'/>
<feature policy='require' name='sea_esop2'/>
<feature policy='disable' name='te'/>
<feature policy='require' name='cmm'/>
<feature policy='disable' name='csske'/>
</mode>
</cpu>
New Nested Element Under <cpu>
------------------------------
Create a new optional XML element nested under the <cpu> tag that may be used to
disable deprecated features. This approach is more explicit compared to creating a new
CPU model, and allows the user to disable these features when defining a specific model
other than host-model. Here is an example of what the guest's defined XML for the CPU
could look like:
<cpu mode='host-model' check='partial'>
<deprecated_features>off</deprecated_features>
</cpu>
However, a conflict arises with this approach: parameter priority. It would need to be
discussed what the expected behavior should be if a user defines a guest with both a mode
to disable deprecated features and any deprecated features listed with the
'require' policy, e.g.:
<cpu mode='custom' match='exact' check='partial'>
<model fallback='allow'>z13.2-base</model>
<!-- which one takes priority? -->
<deprecated_features>off</deprecated_features>
<feature policy='require' name='csske'/>
</cpu>
Another conflict is that setting this option to "on" would have no effect on the CPU
model (I can't think of a reason why someone would want to explicitly enable these
features). This may not communicate well to the user.
I think having a separate configuration is better as it does not limit the
user to just a single CPU model. But a nested element with a text node
looks strange. An optional attribute for the <cpu> element would be
better.
For host-model the expected behavior would be to either keep or
drop deprecated features from the CPU definition. When combined with
custom CPU mode, where such features may already be present, I can
imagine three different options: keep deprecated features (that is, do
nothing, just like we do now), drop such features silently, or fail to
start a domain in case the definition uses a deprecated feature.
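
To make the three options concrete, here is a small Python sketch. The
function name, the policy names "keep", "drop" and "fail", and the dict
representation of the <feature> elements are all made up for this
example. Whether "drop" should remove the element or turn it into
policy='disable' is left open; the sketch simply removes it.

```python
def apply_deprecated_policy(features, deprecated, policy="keep"):
    """Apply one of three hypothetical policies to a custom CPU definition.

    features:   dict mapping feature name -> libvirt policy string
    deprecated: set of feature names flagged as deprecated
    """
    # Deprecated features the definition explicitly asks for.
    used = sorted(
        name for name, pol in features.items()
        if pol == "require" and name in deprecated
    )
    if policy == "keep":
        # Do nothing, which matches the current behavior.
        return dict(features)
    if policy == "drop":
        # Silently remove the required deprecated features.
        return {n: p for n, p in features.items() if n not in used}
    if policy == "fail":
        # Refuse to start the domain if a deprecated feature is required.
        if used:
            raise ValueError("deprecated features required: " + ", ".join(used))
        return dict(features)
    raise ValueError("unknown policy: " + policy)


# The conflicting definition quoted above: csske is explicitly required.
features = {"csske": "require", "ais": "require"}
deprecated = {"bpb", "csske", "cte", "te"}

kept = apply_deprecated_policy(features, deprecated, "keep")
dropped = apply_deprecated_policy(features, deprecated, "drop")
```

With "fail", the same definition would raise an error instead of
starting the domain.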
We could even create the attribute, but limit it only to host-model,
which would be mostly equivalent to your "host-recommended" mode, but
extensible to other modes in the future.
To report these features, a <deprecatedProperties> tag could be
added to the domcapabilities output using the same format I use in my proposed patch for
the qemu capabilities file:
<cpu>
<mode name='host-passthrough' supported='yes'>
...
</mode>
...
<mode name='host-model' supported='yes'>
...
</mode>
<deprecatedProperties>
<property name='bpb'/>
<property name='te'/>
<property name='cte'/>
<property name='csske'/>
</deprecatedProperties>
</cpu>
We should stick with "features" rather than calling them "properties"
here to avoid confusion. Also this schema would mean the list of
deprecated features is indeed the same for all CPU models, which does
not seem to be the case here.
And one more interesting point: what to do with the baseline CPU
computation which expects to see host-model definitions from all hosts.
We'd need a way to provide the info about deprecated features to the
input data for baseline computation. In other words, to the host-model
definition itself. We could, for example, add a deprecated='yes|no'
attribute to each feature in the host-model definition. But this won't
really work when the list of deprecated CPU features depends on a CPU
model as the baseline may choose a different CPU model than what is used
by a host-model. So perhaps we may just rely on the host computing the
baseline CPU to handle deprecated features on the result and document
that this computation should be done on the most up-to-date host so as
to avoid missing features that were deprecated in a later update.
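
A deliberately simplified Python sketch of that last idea follows. It
treats each host-model definition as a plain set of enabled features and
ignores that the real baseline computation also picks a CPU model. The
host feature sets and function names are hypothetical.

```python
def baseline_features(host_models):
    """Intersect the enabled features of all host-model definitions."""
    sets = [set(h) for h in host_models]
    return set.intersection(*sets) if sets else set()


def baseline_without_deprecated(host_models, local_deprecated):
    """Drop features the computing host knows to be deprecated."""
    return sorted(baseline_features(host_models) - set(local_deprecated))


# Hypothetical host-model feature sets from two hosts.
host_a = {"ais", "gs", "te", "csske"}
host_b = {"ais", "gs", "te", "csske", "zpci"}

# An up-to-date host knows the full deprecated list ...
up_to_date = baseline_without_deprecated(
    [host_a, host_b], {"bpb", "csske", "cte", "te"}
)
# ... while an older host with no such knowledge keeps them in the result.
outdated = baseline_without_deprecated([host_a, host_b], set())
```

The difference between the two results is the reason the computation
would have to be documented as belonging on the most up-to-date host.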
I feel like there are more questions than answers in my reply. Sorry about
that and the delay.
Jirka