On 2024/08/18 16:03, Michael S. Tsirkin wrote:
On Sun, Aug 18, 2024 at 02:04:29PM +0900, Akihiko Odaki wrote:
> On 2024/08/09 21:50, Fabiano Rosas wrote:
>> Peter Xu <peterx(a)redhat.com> writes:
>>
>>> On Thu, Aug 08, 2024 at 10:47:28AM -0400, Michael S. Tsirkin wrote:
>>>> On Thu, Aug 08, 2024 at 10:15:36AM -0400, Peter Xu wrote:
>>>>> On Thu, Aug 08, 2024 at 07:12:14AM -0400, Michael S. Tsirkin wrote:
>>>>>> This is too big of a hammer. People already use what you call "cross
>>>>>> migrate" and have for years. We are not going to stop developing
>>>>>> features just because someone suddenly became aware of some such bit.
>>>>>> If you care, you will have to work to solve the problem properly -
>>>>>> nacking half baked hacks is the only tool maintainers have to make
>>>>>> people work on hard problems.
>>>>>
>>>>> IMHO this is a totally different thing. It's not about proposing a new
>>>>> feature yet so far, it's about how we should fix a breakage first.
>>>>>
>>>>> And that's why I think we should fix it even in the simple way first, then
>>>>> we consider anything more beneficial from perf side without breaking
>>>>> anything, which should be on top of that.
>>>>>
>>>>> Thanks,
>>>>
>>>> As I said, once the quick hack is merged people stop caring.
>>>
>>> IMHO it's not a hack. It's a proper fix to me to disable it by default for
>>> now.
>>>
>>> OTOH, having it ON always even knowing it can break migration is a hack to
>>> me, when we don't have anything else to guard the migration.
>>>
>>>> Mixing different kernel versions in migration is esoteric enough for
>>>> this not to matter to most people. There's no rush I think, address
>>>> it properly.
>>>
>>> Exactly, mixing kernel versions will be tricky for users to identify, but
>>> that's, AFAICT, exactly what is happening everywhere. We can't urge users to
>>> always use the exact same kernels when we're talking about a VM cluster.
>>> That's why I think allowing migration to work across those kernels matters.
>>
>> I also worry a bit about the scenario where the cluster changes slightly
>> and now all VMs are already restricted by some option that requires the
>> exact same kernel. Specifically, kernel changes in a cloud environment
>> also happen due to factors completely unrelated to migration. I'm not
>> sure the people managing the infra (who care about migration) will be
>> gating kernel changes just because QEMU has been configured in a
>> specific manner.
>
> I have written a bit about the expectation on the platform earlier[1], but let
> me summarize it here.
>
> 1. I expect the user will not downgrade the platform of hosts after setting
> up a VM. This is essential to enable any platform feature.
>
> 2. The user is allowed to upgrade the platform of hosts gradually. This
> results in a situation with mixed platforms. The oldest platform is still
> not older than the platform the VM is set up for. This enables the gradual
> deployment strategy.
>
> 3. The user is allowed to downgrade the platform of hosts to the version
> used when setting up the VM. This enables rollbacks in case of regression.
>
> With these expectations, we can ensure migratability by a) enabling platform
> features available on all hosts when setting up the VM and b) saving the
> enabled features. This is covered with my
> -dump-platform/-merge-platform/-use-platform proposal[2].

I really like [2]. Do you plan to work on it? Does anyone else?

No, but I want to move "[PATCH v3 0/5] virtio-net: Convert feature
properties to OnOffAuto" forward:
https://patchew.org/QEMU/20240714-auto-v3-0-e27401aabab3@daynix.com/
This will clarify the existence of the "auto" semantics, which is to
enable a platform feature based on availability. [2] will be regarded as
a feature to improve the handling of the "auto" semantics once this
change lands.
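
To illustrate the "auto" semantics described above, here is a minimal
sketch in Python. It is not QEMU code: the names resolve_feature and
resolve_for_cluster are hypothetical, and rejecting "on" when the feature
is unavailable is an assumption made for the example, not something
confirmed by the patch series.

```python
from enum import Enum


class OnOffAuto(Enum):
    """Tri-state property value, named after the on/off/auto idea."""
    ON = "on"
    OFF = "off"
    AUTO = "auto"


def resolve_feature(setting, available):
    """Return True if the feature ends up enabled on one host.

    "auto" follows availability. "on" with an unavailable feature is
    treated as an error here (an assumption for this sketch).
    """
    if setting is OnOffAuto.AUTO:
        return available
    if setting is OnOffAuto.ON and not available:
        raise ValueError("feature requested but not available on this host")
    return setting is OnOffAuto.ON


def resolve_for_cluster(setting, host_availability):
    """Resolve against several hosts: a feature counts as available only
    if every host provides it, so the result stays migratable among them."""
    return resolve_feature(setting, all(host_availability))


if __name__ == "__main__":
    # One host lacks the feature, so "auto" resolves to disabled.
    print(resolve_for_cluster(OnOffAuto.AUTO, [True, True, False]))
```

The cluster helper mirrors point a) in the quoted summary: only features
available on all hosts are enabled when the VM is set up.
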
Regards,
Akihiko Odaki
>
>> Regards,
>> Akihiko Odaki
>>
>> [1] https://lore.kernel.org/r/2b62780c-a6cb-4262-beb5-81d54c14f545@daynix.com
>> [2] https://lore.kernel.org/all/2da4ebcd-2058-49c3-a4ec-8e60536e5cbb@daynix.com/
>