On 2024/08/09 21:50, Fabiano Rosas wrote:
Peter Xu <peterx(a)redhat.com> writes:
> On Thu, Aug 08, 2024 at 10:47:28AM -0400, Michael S. Tsirkin wrote:
>> On Thu, Aug 08, 2024 at 10:15:36AM -0400, Peter Xu wrote:
>>> On Thu, Aug 08, 2024 at 07:12:14AM -0400, Michael S. Tsirkin wrote:
>>>> This is too big of a hammer. People already use what you call "cross
>>>> migrate" and have for years. We are not going to stop developing
>>>> features just because someone suddenly became aware of some such bit.
>>>> If you care, you will have to work to solve the problem properly -
>>>> nacking half baked hacks is the only tool maintainers have to make
>>>> people work on hard problems.
>>>
>>> IMHO this is totally different thing. It's not about proposing a new
>>> feature yet so far, it's about how we should fix a breakage first.
>>>
>>> And that's why I think we should fix it even in the simple way first, then
>>> we consider anything more beneficial from perf side without breaking
>>> anything, which should be on top of that.
>>>
>>> Thanks,
>>
>> As I said, once the quick hack is merged people stop caring.
>
> IMHO it's not a hack. It's a proper fix to me to disable it by default for
> now.
>
> OTOH, having it ON always even knowing it can break migration is a hack to
> me, when we don't have anything else to guard the migration.
>
>> Mixing different kernel versions in migration is esoteric enough for
>> this not to matter to most people. There's no rush I think, address
>> it properly.
>
> Exactly mixing kernel versions will be tricky to users to identify, but
> that's, AFAICT, exactly happening everywhere. We can't urge user to always
> use the exact same kernels when we're talking about a VM cluster. That's
> why I think allowing migration to work across those kernels matters.

I also worry a bit about the scenario where the cluster changes slightly
and now all VMs are already restricted by some option that requires the
exact same kernel. Specifically, kernel changes in a cloud environment
also happen due to factors completely unrelated to migration. I'm not
sure the people managing the infra (who care about migration) will be
gating kernel changes just because QEMU has been configured in a
specific manner.

I wrote about my expectations of the platform earlier[1], but let me
summarize them here.

1. I expect that the user will not downgrade the platform of hosts after
   setting up a VM. This is essential to enable any platform feature.

2. The user is allowed to upgrade the platform of hosts gradually. This
   results in a situation with mixed platforms, but the oldest platform
   is still not older than the platform the VM was set up for. This
   enables a gradual deployment strategy.

3. The user is allowed to downgrade the platform of hosts to the version
   used when setting up the VM. This enables rollbacks in case of
   regression.

With these expectations, we can ensure migratability by a) enabling
platform features available on all hosts when setting up the VM and b)
saving the enabled features. This is covered by my
-dump-platform/-merge-platform/-use-platform proposal[2].
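
As a rough illustration of steps a) and b), the idea can be sketched as a
set intersection followed by a subset check. This is only a sketch under
assumptions: the function names and the feature names ("feat_a" and so
on) are hypothetical, and the code does not reflect how the proposed
options are actually implemented or what format they would use.

```python
def merge_platforms(host_feature_sets):
    """Step a): keep only the features that every host provides.

    host_feature_sets is a list of sets of feature names, one set per
    host. The names are hypothetical placeholders, not real QEMU or
    kernel feature identifiers.
    """
    sets = [frozenset(s) for s in host_feature_sets]
    if not sets:
        raise ValueError("at least one host is required")
    return frozenset.intersection(*sets)


def can_run_vm(host_features, saved_vm_features):
    """Check a migration target against the features saved in step b).

    A host is acceptable if it offers at least every feature that was
    enabled when the VM was set up.
    """
    return frozenset(saved_vm_features) <= frozenset(host_features)


# Two hosts at setup time; one of them already runs a newer platform.
old_host = {"feat_a", "feat_b"}
new_host = {"feat_a", "feat_b", "feat_c"}

# Step b): this merged set is what would be saved with the VM.
saved = merge_platforms([old_host, new_host])

# Expectations 2 and 3: upgraded hosts and hosts at the setup version
# both remain valid targets.
upgraded_ok = can_run_vm(new_host, saved)
rollback_ok = can_run_vm(old_host, saved)

# Expectation 1 violated: a host downgraded below the setup version.
downgraded_ok = can_run_vm({"feat_a"}, saved)
```

Under this model, a host stays a valid migration target for as long as
its platform is not older than the one the VM was set up for, which is
what the three expectations above guarantee.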
Regards,
Akihiko Odaki

[1] https://lore.kernel.org/r/2b62780c-a6cb-4262-beb5-81d54c14f545@daynix.com
[2] https://lore.kernel.org/all/2da4ebcd-2058-49c3-a4ec-8e60536e5cbb@daynix.com/