On Fri, Mar 18, 2022 at 01:23:03PM -0400, Collin Walling wrote:
On 3/15/22 15:08, David Hildenbrand wrote:
> On 15.03.22 18:40, Boris Fiuczynski wrote:
>> On 3/15/22 4:58 PM, David Hildenbrand wrote:
>>> On 11.03.22 13:44, Christian Borntraeger wrote:
>>>>
>>>>
>>>> Am 11.03.22 um 10:30 schrieb David Hildenbrand:
>>>>> On 11.03.22 05:17, Collin Walling wrote:
>>>>>> The s390x architecture has a growing list of features that will no
>>>>>> longer be supported on future hardware releases. This introduces an
>>>>>> issue with migration such that guests, running on models with these
>>>>>> features enabled, will be rejected outright by machines that do not
>>>>>> support these features.
>>>>>>
>>>>>> A current example is the CSSKE feature that has been deprecated for
>>>>>> some time. It has been publicly announced that gen15 will be the last
>>>>>> release to support this feature, however we have postponed this to
>>>>>> gen16a. A possible solution to remedy this would be to create a new
>>>>>> QEMU QMP Response that allows users to query for
>>>>>> deprecated/unsupported features.
>>>>>>
>>>>>> This presents two parts of the puzzle: how to report deprecated
>>>>>> features to a user (libvirt) and how libvirt should handle this
>>>>>> information.
>>>>>>
>>>>>> First, let's discuss the latter. The patch presented alongside this
>>>>>> cover letter attempts to solve the migration issue by hard-coding the
>>>>>> CSSKE feature to be disabled for all s390x CPU models. This is done by
>>>>>> simply appending the CSSKE feature with the disabled policy to the
>>>>>> host-model.
>>>>>>
>>>>>> libvirt pseudo:
>>>>>>
>>>>>> if arch is s390x
>>>>>>     set CSSKE to disabled for host-model
>>>>>
>>>>> That violates host-model semantics and possibly the user's intent.
>>>>> There would have to be some toggle to manually specify this, for
>>>>> example, a new model type or some magical flag.
>>>>
>>>> What we actually want to do is to disable csske completely from QEMU
>>>> and thus from the host-model. Then it would not violate the spec.
>>>> But this has all kinds of issues (you cannot migrate from older
>>>> versions of software and machines) although the hardware still can
>>>> provide the feature.
>>>>
>>>> The hardware guys promised me to deprecate things two generations
>>>> earlier and we usually deprecate things that are not used or where
>>>> software has a runtime switch.
>>>>
>>>> What I hear from you is that you do not want to modify the host-model
>>>> semantics to something more useful but would rather define a new thing
>>>> (e.g. "host-sane")?
>>>
>>> My take would be, to keep the host model consistent, meaning, the
>>> semantics in QEMU exactly match the semantics in Libvirt. It defines the
>>> maximum CPU model that's runnable under KVM. If a feature is not
>>> included (e.g., csske) that feature cannot be enabled in any way.
>>>
>>> The "host model" has the semantics of resembling the actual host CPU.
>>> This is only partially true, because we support some features the host
>>> might not support (e.g., zPCI IIRC) and obviously don't support all
>>> host features in QEMU.
>>>
>>> So instead of playing games on the libvirt side with the host model, I
>>> see the following alternatives:
>>>
>>> 1. Remove the problematic features from the host model in QEMU, like
>>> "we just don't support this feature". Consequently, any migration of a
>>> VM with csske=on to a new QEMU version will fail, similar to having an
>>> older QEMU version without support for a certain feature.
>>>
>>> "host-passthrough" would change between QEMU versions ... which I see
>>> as problematic.
>>>
>>> 2. Introduce a new CPU model that has these new semantics: "host
>>> model" - deprecated features. Migration of older VMs with csske=on to a
>>> new QEMU version will work. Make libvirt use/expand that new CPU model.
>>>
>>> It doesn't necessarily have to be an actual new CPU model. We can use a
>>> feature group, like "-cpu host,deprecated-features=false". What's
>>> inside "deprecated-features" will actually change between QEMU
>>> versions, but we don't really care, as the expanded CPU model won't
>>> change.
>>>
>>> "host-passthrough" won't change between QEMU versions ...
>>>
>>> 3. As Daniel suggested, don't use the host model, but a CPU model
>>> indicated as "suggested".
>>>
>>> The real issue is that in reality, we don't simply always use a model
>>> like "gen15a", but usually want optional features, if they are around.
>>> Prime examples are "sie" and friends.
>>>
>>>
>>>
>>> I tend to prefer 2. With 3. I see issues with optional features like
>>> "sie" and friends. Often, you really want "give me all you got, but
>>> disable deprecated features that might cause problems in the future".
>>>
>>
>> David,
>> if I understand your proposal 2 correctly, it sounds a lot like
>> Christian's idea of leaving the CPU mode "host-model" as is and
>> introducing a new CPU mode "host-recommended" for the new semantics, in
>> which query-cpu-model-expansion would be called with the additional
>> "deprecated-features" property.
>> That way libvirt would not have to fiddle around with the deprecation
>> itself and users would have the option of choosing which semantics they
>> want to use.
>> Is that correct?
>
> Yes, exactly.
>
>
From what I understand:

QEMU
- add a "deprecated-features" feature group (more-or-less David's code)

libvirt
- recognize a new model name "host-recommended"
- query QEMU for host-model + deprecated-features and cache it in the caps
  file (something like <hostRecCPU>)
- when a guest is defined with "host-recommended", pull <hostRecCPU> from
  the caps when the guest is started (similar to how host-model works today)

If this is sufficient, then I can get to work on this.

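To make the libvirt-side query in the list above concrete, here is a minimal sketch of the QMP command that might be sent. `query-cpu-model-expansion` and its `type`/`model` arguments are existing QMP, but the "deprecated-features" property is only the feature group proposed in this thread, and the helper name `build_expansion_query` and the choice of a "static" expansion are assumptions made for illustration.

```python
import json


def build_expansion_query(model_name, exclude_deprecated):
    """Build a query-cpu-model-expansion QMP command as a plain dict."""
    props = {}
    if exclude_deprecated:
        # ASSUMPTION: "deprecated-features" is the feature group proposed
        # in this thread, not a property QEMU is known to provide.
        props["deprecated-features"] = False
    return {
        "execute": "query-cpu-model-expansion",
        "arguments": {
            "type": "static",  # assumption: static expansion is wanted
            "model": {"name": model_name, "props": props},
        },
    }


# What libvirt does today for host-model (no extra properties) ...
host_model = build_expansion_query("host", exclude_deprecated=False)
# ... and what a "host-recommended" mode could ask for instead.
host_recommended = build_expansion_query("host", exclude_deprecated=True)

print(json.dumps(host_recommended, indent=2))
```

The two results would be cached side by side, so a guest defined with "host-recommended" never needs libvirt to know which features are deprecated.
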
My question is what would be the best way to include the deprecated
features when calculating a baseline or comparison. Both work with the
host-model and may no longer present an accurate result. Say, for
example, we baseline a z15 with a gen17 (which will outright not support
CSSKE). With today's implementation, this might result in a ridiculously
old CPU model which also does not support CSSKE. The ideal response
would be a z15 - deprecated features (i.e. host-recommended on a z15),
but we'd need a way to flag to QEMU that we want to exclude the
deprecated features. Or am I totally wrong about this?
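
The baseline concern can be illustrated with a toy model. This is not QEMU's actual baseline algorithm: the model table, the feature sets, and the `ignore_deprecated` flag are all made up for the example, and "gen17" stands for a hypothetical future machine without CSSKE.

```python
# ASSUMPTION: toy data only; real model definitions live in QEMU.
DEPRECATED = {"csske"}

# Ordered oldest to newest: (model name, features the model requires).
MODELS = [
    ("old-model", {"sie"}),
    ("z15", {"sie", "vx", "csske"}),
]


def baseline(host_a, host_b, ignore_deprecated=False):
    """Return the newest model whose required features both hosts provide."""
    common = host_a & host_b
    best = None
    for name, base in MODELS:
        # Hypothetical flag: drop deprecated features from the requirement.
        required = base - DEPRECATED if ignore_deprecated else base
        if required <= common:
            best = (name, sorted(required))
    return best


z15_host = {"sie", "vx", "csske"}
gen17_host = {"sie", "vx"}  # hypothetical machine without CSSKE

print(baseline(z15_host, gen17_host))
print(baseline(z15_host, gen17_host, ignore_deprecated=True))
```

Without the flag, the toy baseline falls back to "old-model" because z15 requires csske; with it, the result is z15 minus the deprecated features, which is the outcome described above.
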
QEMU has a concept of versioned CPU models, so you could define a
z15-v2 version without CSSKE.

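
A rough sketch of how such a versioned model could be looked up. The table contents and the `resolve` helper are hypothetical; no "z15-v2" definition is known to exist in QEMU, and mapping a bare model name to version 1 is an assumption made to keep the example small.

```python
# ASSUMPTION: illustrative feature sets, not real QEMU model definitions.
VERSIONED_MODELS = {
    "z15": {
        1: {"sie", "vx", "csske"},
        2: {"sie", "vx"},  # hypothetical v2: same model without CSSKE
    },
}


def resolve(name):
    """Map 'z15' or 'z15-v2' to a feature set; a bare name means v1 here."""
    base, sep, version = name.rpartition("-v")
    if sep and version.isdigit():
        return VERSIONED_MODELS[base][int(version)]
    return VERSIONED_MODELS[name][1]


print(sorted(resolve("z15")))
print(sorted(resolve("z15-v2")))
```

Existing guests that name "z15" keep their feature set, while new guests can opt in to "z15-v2" explicitly.
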
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|