On 6/13/24 7:22 AM, Jiri Denemark wrote:
On Tue, Jun 04, 2024 at 11:42:25 -0400, Collin Walling wrote:
> The QEMU portion is designed for s390x such that there is a static list
> of hardcoded feature bits that are flagged for deprecation. This list
> can be updated in follow-up patches as more features need to be flagged.
Good, a single static list should definitely be easier to handle.
> The query-cpu-model-expansion *response* has been modified to now
> include an /optional/ "deprecated props" string array (no additional
> parameter is required in the query JSON). This array will be present if
> and only if there is data to be reported. If an architecture does not
> support reporting deprecated features, or if there are no deprecated
> features to report, then array is simply omitted from the response.
>
> The deprecated features are filtered against the full-expansion features
> for the particular CPU model. In the example for a z14, we would expect
> all four currently deprecated features reported: bpb, csske, cte, and
> te, since those features are part of the z14 full expansion. On an
> older model, say a z900, you'd only see bpb because the others are not
> present in the full-expansion.
In that case I think it would be nice if QEMU provided a way of getting
the list itself without any filtering. Mainly to make sure we're
providing a completely list for users which may decide to avoid using
such features in their non-host-model CPU definitions. And I can imagine
management layers might want to get a complete list too.
Okay. I will need to bring this up on the QEMU devel list to figure out
the best approach to report this list. I do not believe this hinders
development for designing a way for a user to specify host-model with
deprecated features disabled, but I understand that this would be
valuable to users who want to define a custom model with deprecated
features off and management applications may want this list.
I'll mention the action items for this request at the bottom of this email.
>>> Another conflict is setting this option to "on"
would have no
>>> effect on the CPU model (I can't think of a reason why someone
>>> would want to explicitly enable these features). This may not
>>> communicate well to the user.
>>
>> I think have a separate configuration is better as it does not limit the
>> used to just a single CPU model. But a nested element with a text node
>> looks strange. An optional attribute for the <cpu> element would be
>> better.
>>> For host-model the expected behavior would be to either keep or
>> drop deprecated features from the CPU definition. When combined with
>> custom CPU mode where such feature may be already present I can imagine
>> three different options. Either keep deprecated features (that is do
>> nothing, just like we do now), drop such features silently, or fail to
>> start a domain in case the definition uses a deprecated feature.
>
> I fear the last option may break some things?
The idea is the behavior would be controllable via a special value in
that attribute, which a used would have to explicitly specify to get
such behavior. In other words, a user would provide a custom CPU
definition and ask for deprecated features to be removed no matter what
they are currently. But I don't think we need to care about it now
because...
>> We could even create the attribute, but limit it only to host-model,
>> which would be mostly equivalent to your "host-recomended" mode, but
>> extensible to other modes in the future.
>
> I think this is the better approach.
Agreed. Just do the minimal required thing while making it extensible.
Sounds good!
> In tandem to your suggestion to add a new attribute to the cpu
element,
> a quick mock-up of it could look like this:
>
> <cpu mode='host-model' deprecated_features='off'>
Right. Although you may have fun coming with an attribute name, because
we are not very consistent with naming multi-word attributes. Some
people don't like underscores while others don't like camelCase :-)
Anyway, I think you should at least be consistent and use a single name
for the attribute in domain XML and the element in domcapabilities XML.
Sounds fun! I'll try to come up with a reasonable name on the
first run and discuss afterwards.
>>> To report these features, a <deprecatedProperties>
tag could be
>>> added to the domcapabilities output using the same format I use in
>>> my proposed patch for the qemu capabilities file:
>>>
>>> <cpu>
>>> <mode name='host-passthrough' supported='yes'>
>>> ...
>>> </mode>
>>> ...
>>> <mode name='host-model' supported='yes'>
>>> ...
>>> </mode>
>>> <deprecatedProperties>
>>> <property name='bpb'/>
>>> <property name='te'/>
>>> <property name='cte'/>
>>> <property name='csske'/>
>>> </deprecatedProperties>
>>> </cpu>
>>
>> We should stick with "features" rather than calling them
"properties"
>> here to avoid confusion. Also this schema would mean the list of
>> deprecated features is indeed the same for all CPU models, which does
>> not seem to be the case here.
>
> Noted to use "features" instead of "properties".
>
> And you are correct that this makes the assumption that these features
> are available for *all* CPU models -- I did not think of this
> perspective. This is populated from a query-cpu-model-expansion based
> on "host-model".
>
> What would be a better way to report these features in the
> domcapabilities response? Perhaps they should only be nested under the
> host-model?
Well, the list is static and common to all CPU models so keeping it as
another element under <cpu> as you suggested looks fine.
Okay. I will retain this output (with a better element name). If this
no longer makes sense as development progresses, I can easily drop it :)
> ...another thought would be to implement a new feature policy
> "deprecated" that would allow these XML features to be applied to any
> CPU model. This would flag the feature to either be disabled if it's
> available for that model, or ignore it otherwise. The validity of the
> feature would be checked against the list of deprecated features (to
> avoid typos or having a user define an imaginary feature).
> Additionally, the feature policy would evaluate to "disabled" for a live
> guest XML to alleviate migration issues on machines that do not know
> about this new policy.
>
> Consider the following for domcapabilities output:
>
> <cpu>
> <mode name='host-passthrough' supported='yes'>
> ...
> </mode>
> ...
> <mode name='host-model' supported='yes'>
> ...
> </mode>
> <deprecatedFeatures>
> <feature policy='deprecated' name='bpb'/>
> <feature policy='deprecated' name='te'/>
> <feature policy='deprecated' name='cte'/>
> <feature policy='deprecated' name='csske'/>
> </deprecatedProperties>
> </cpu>
I think this is a little bit too complicated.
Noted.
> The benefit here is that this would let a user easily copy+paste
these
> features into their guest XML and not have to consider whether or not
> these features are available for the CPU model they wish to use.
They should always be able to mark all deprecated features as disabled
if they want. I mean a feature not used by a specific CPU model is
disabled and disabling it explicitly in the XML does not change
anything. Or is s390x different and using a specific CPU model would
also limit what extra features you can list even if you just disable
them?
s390x is not any different. If you specify, say, a z14 and
provide+disable a CPU feature that was introduced in a future model,
(e.g. msa9, a gen15a feature), the guest will operate without any issues.
To be concise, here is the output of z14 with msa9=on|off:
# qemu-system-s390x ... -cpu z14,msa9=on
qemu-system-s390x: can't apply global z14-s390x-cpu.msa9=on: Group
'msa9' is not available for CPU model 'z14', it was introduced with
later models.
# qemu-system-s390x ... -cpu z14,msa9=off
/guest starts up successfully/
A bit of insight: future CPU models that outright drop the deprecated
features should have similar behavior: feature=on should say something
like "no longer supported" (outside the scope of what we're aiming to do
here), feature=off should just start the guest successfully --
definitely within scope :)
> Alternatively, the <mode name='host-model' ...>
could include the
> aforementioned deprecated_features='off' attribute and the nested
> features would include the analogous features with policy='disabled'.
> This provides the benefit of having a near ready-to-use CPU model XML
> with deprecated features that can be provided to the baseline input.
Introducing a new value for an attribute would cause parse failures when
the XML is passed to an older libvirt. I think we should just take
host-model CPUs from all hosts, compute the baseline and drop deprecated
features from the result. This would of course need a new flag for the
baseline API. This way we could also take host-model from a single host
and let deprecated features be removed from it. Internally either the
baseline QMP command would also provide a list of deprecated features or
we could just take the result and let QEMU expand it to get the list.
I like this idea. If it's acceptable to add a new flag to this command,
that'd be really nice. s390x utilizes the hypervisor-cpu-baseline
command, but would this flag need to be added to the analogous
cpu-baseline command as well?
> The way I imagined this working was to either use the
"host-recommended"
> CPU model by default (but I believe we are nix'ing that approach), OR
> just use the host-model as-is today and rely on the input XML containing
> the appropriate list of deprecated features (with policy=disabled). The
> input XML containing this CPU model could be extracted from a live guest
> that has deprecated features disabled.
Or we could perhaps add a flag for domcapabilities API to show the
host-model with deprecated features already disabled. This way we could
avoid introducing a new policy while letting the originating host limit
the disabled features itself. Or we could do both... limit the input and
also removed the deprecated features from the result :-)
So if I understand this correctly, you're saying there could be two ways
of defining a guest with deprecated features turned off:
1. define the guest with host-model and the aforementioned
"deprecated_features=off" attribute
--- OR ---
2. define the guest by copy+pasting the resulting host-model from a
virsh domcapabiliites --deprecated-features-off (pardon the messy
prototype flag name :P )
> To echo part of the discussion from Daniel's feedback, he
seems in favor
> of providing an explicit indication on the cpu definition that
> deprecated features are on|off. I think this works in tandem with a
> deprecated_features attribute in the <cpu> element?
Yeah, I think that's exactly what the attribute can do.
Jirka
Awesome! I think my only lingering question is the one above to make
sure I'm aligning my understanding with your feedback.
This information should be sufficient for me to continue development and
propose a reasonable v1 for this. Here's the plan:
- implement the "deprecated_features" attribute for host-model *only*
(with a better name)
- add a new flag for baseline that will ensure the result turns off
deprecated features for the resulting model (should be easy since I
can piggyback off the features flag that already invokes a model
expansion)
- add a new flag for domcapabilities that will present the host-model
with deprecated features paired with the disabled policy
. . .
Another item to implement is a way to get the full list of deprecated
features independent of the CPU model. I can start this discussion on
the QEMU devel list while working on the items above. Here is the plan:
- discuss over on the qemu-devel list what would be the best way to
reliably report all deprecated features, independent of the CPU model
- implement QEMU changes and implement updates to the appropriate ABI
in libvirt
Thank you for your time and feedback on this!
--
Regards,
Collin