
On Tue, Jun 04, 2024 at 11:42:25 -0400, Collin Walling wrote:
The QEMU portion is designed for s390x such that there is a static list of hardcoded feature bits that are flagged for deprecation. This list can be updated in follow-up patches as more features need to be flagged.
Good, a single static list should definitely be easier to handle.
The query-cpu-model-expansion *response* has been modified to now include an /optional/ "deprecated props" string array (no additional parameter is required in the query JSON). This array will be present if and only if there is data to be reported. If an architecture does not support reporting deprecated features, or if there are no deprecated features to report, then array is simply omitted from the response.
The deprecated features are filtered against the full-expansion features for the particular CPU model. In the example for a z14, we would expect all four currently deprecated features reported: bpb, csske, cte, and te, since those features are part of the z14 full expansion. On an older model, say a z900, you'd only see bpb because the others are not present in the full-expansion.
In that case I think it would be nice if QEMU provided a way of getting the list itself without any filtering. Mainly to make sure we're providing a completely list for users which may decide to avoid using such features in their non-host-model CPU definitions. And I can imagine management layers might want to get a complete list too.
Another conflict is setting this option to "on" would have no effect on the CPU model (I can't think of a reason why someone would want to explicitly enable these features). This may not communicate well to the user.
For host-model the expected behavior would be to either keep or drop deprecated features from the CPU definition. When combined with custom CPU mode where such feature may be already present I can imagine
I think have a separate configuration is better as it does not limit the used to just a single CPU model. But a nested element with a text node looks strange. An optional attribute for the <cpu> element would be better. three different options. Either keep deprecated features (that is do nothing, just like we do now), drop such features silently, or fail to start a domain in case the definition uses a deprecated feature.
I fear the last option may break some things?
The idea is the behavior would be controllable via a special value in that attribute, which a used would have to explicitly specify to get such behavior. In other words, a user would provide a custom CPU definition and ask for deprecated features to be removed no matter what they are currently. But I don't think we need to care about it now because...
We could even create the attribute, but limit it only to host-model, which would be mostly equivalent to your "host-recomended" mode, but extensible to other modes in the future.
I think this is the better approach.
Agreed. Just do the minimal required thing while making it extensible.
In tandem to your suggestion to add a new attribute to the cpu element, a quick mock-up of it could look like this:
<cpu mode='host-model' deprecated_features='off'>
Right. Although you may have fun coming with an attribute name, because we are not very consistent with naming multi-word attributes. Some people don't like underscores while others don't like camelCase :-) Anyway, I think you should at least be consistent and use a single name for the attribute in domain XML and the element in domcapabilities XML.
To report these features, a <deprecatedProperties> tag could be added to the domcapabilities output using the same format I use in my proposed patch for the qemu capabilities file:
<cpu> <mode name='host-passthrough' supported='yes'> ... </mode> ... <mode name='host-model' supported='yes'> ... </mode> <deprecatedProperties> <property name='bpb'/> <property name='te'/> <property name='cte'/> <property name='csske'/> </deprecatedProperties> </cpu>
We should stick with "features" rather than calling them "properties" here to avoid confusion. Also this schema would mean the list of deprecated features is indeed the same for all CPU models, which does not seem to be the case here.
Noted to use "features" instead of "properties".
And you are correct that this makes the assumption that these features are available for *all* CPU models -- I did not think of this perspective. This is populated from a query-cpu-model-expansion based on "host-model".
What would be a better way to report these features in the domcapabilities response? Perhaps they should only be nested under the host-model?
Well, the list is static and common to all CPU models so keeping it as another element under <cpu> as you suggested looks fine.
...another thought would be to implement a new feature policy "deprecated" that would allow these XML features to be applied to any CPU model. This would flag the feature to either be disabled if it's available for that model, or ignore it otherwise. The validity of the feature would be checked against the list of deprecated features (to avoid typos or having a user define an imaginary feature). Additionally, the feature policy would evaluate to "disabled" for a live guest XML to alleviate migration issues on machines that do not know about this new policy.
Consider the following for domcapabilities output:
<cpu> <mode name='host-passthrough' supported='yes'> ... </mode> ... <mode name='host-model' supported='yes'> ... </mode> <deprecatedFeatures> <feature policy='deprecated' name='bpb'/> <feature policy='deprecated' name='te'/> <feature policy='deprecated' name='cte'/> <feature policy='deprecated' name='csske'/> </deprecatedProperties> </cpu>
I think this is a little bit too complicated.
The benefit here is that this would let a user easily copy+paste these features into their guest XML and not have to consider whether or not these features are available for the CPU model they wish to use.
They should always be able to mark all deprecated features as disabled if they want. I mean a feature not used by a specific CPU model is disabled and disabling it explicitly in the XML does not change anything. Or is s390x different and using a specific CPU model would also limit what extra features you can list even if you just disable them?
Alternatively, the <mode name='host-model' ...> could include the aforementioned deprecated_features='off' attribute and the nested features would include the analogous features with policy='disabled'. This provides the benefit of having a near ready-to-use CPU model XML with deprecated features that can be provided to the baseline input.
Introducing a new value for an attribute would cause parse failures when the XML is passed to an older libvirt. I think we should just take host-model CPUs from all hosts, compute the baseline and drop deprecated features from the result. This would of course need a new flag for the baseline API. This way we could also take host-model from a single host and let deprecated features be removed from it. Internally either the baseline QMP command would also provide a list of deprecated features or we could just take the result and let QEMU expand it to get the list.
The way I imagined this working was to either use the "host-recommended" CPU model by default (but I believe we are nix'ing that approach), OR just use the host-model as-is today and rely on the input XML containing the appropriate list of deprecated features (with policy=disabled). The input XML containing this CPU model could be extracted from a live guest that has deprecated features disabled.
Or we could perhaps add a flag for domcapabilities API to show the host-model with deprecated features already disabled. This way we could avoid introducing a new policy while letting the originating host limit the disabled features itself. Or we could do both... limit the input and also removed the deprecated features from the result :-)
To echo part of the discussion from Daniel's feedback, he seems in favor of providing an explicit indication on the cpu definition that deprecated features are on|off. I think this works in tandem with a deprecated_features attribute in the <cpu> element?
Yeah, I think that's exactly what the attribute can do. Jirka