Peter Xu <peterx(a)redhat.com> writes:
On Wed, Jul 10, 2024 at 06:38:26PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx(a)redhat.com> writes:
>
> > On Wed, Jul 10, 2024 at 04:48:23PM -0300, Fabiano Rosas wrote:
> >> Peter Xu <peterx(a)redhat.com> writes:
> >>
> >> > On Wed, Jul 10, 2024 at 01:21:51PM -0300, Fabiano Rosas wrote:
> >> >> It's not about trust, we simply don't support migrations other than
> >> >> n->n+1 and (maybe) n->n-1. So QEMU from 2016 is certainly not included.
> >> >
> >> > Where does it come from? I thought we support that..
> >>
> >> I'm taking that from:
> >>
> >> docs/devel/migration/main.rst:
> >> "In general QEMU tries to maintain forward migration compatibility
> >> (i.e. migrating from QEMU n->n+1) and there are users who benefit from
> >> backward compatibility as well."
> >>
> >> But of course it doesn't say whether that comes with a transitive rule
> >> allowing n->n+2 migrations.
> >
> > I'd say that "i.e." implies n->n+1 is not the only forward migration we
> > would support.
> >
> > I _think_ we should support all forward migration as long as the machine
> > type matches.
> >
> >>
> >> >
> >> > The same question would be: are we requesting an OpenStack cluster to
> >> > always upgrade QEMU with +1 versions, otherwise migration will fail?
> >>
> >> Will an OpenStack cluster be using upstream QEMU? If not, then that's a
> >
> > It's an example to show what I meant! :) Nothing else. Definitely not
> > saying that everyone should use an upstream released QEMU (but in reality,
> > it's not a problem, I think, and I do feel like people use them, perhaps
> > more with the stable releases).
> >
> >> question for the distro. In a very practical sense, we're not requesting
> >> anything. We barely test n->n+1/n->n-1, even if we had a strong support
> >> statement I wouldn't be confident saying migration from QEMU 2.7 -> QEMU
> >> 9.1 should succeed.
> >
> > No matter what we test in CI, I don't think we should break that for >1
> > versions. I hope 2.7->9.1 keeps working; otherwise I think it's legitimate
> > for anyone to file a bug.
> >
> > For example, I randomly fetched a bug report:
> >
> >   https://gitlab.com/qemu-project/qemu/-/issues/1937
> >
> > QEMU version: 6.2 and 7.2.5
> >
> > And I believe that's the common case even for upstream. If we don't get
> > that right upstream, it can become an impossible task for downstream and
> > for all of us to maintain.
>
> But do we do that right currently? I have no idea. Have we ever done
> it? And we're here discussing a hypothetical 2.7->9.1 ...
>
> So we cannot reuse the UNUSED field because QEMU from 2016 might send
> their data and QEMU from 2024 would interpret it wrong.
>
> How do we proceed? Add a subsection. And make the code survive when
> receiving 0.
>
> @Peter is that it? What about backwards-compat? We'll need a property as
> well it seems.
Compat property is definitely one way to go, but I think it's you who more
or less persuaded me that reusing it seems possible! At least I can't yet
think of anything bad if it's ancient unused buffers.
Since we're allowing any old QEMU version to migrate to the most recent
one, we need to think of the data that was there before the introduction
of the UNUSED field. If that QEMU version is used, then it's not going
to be zeroes on the stream, but whatever data was there before. The new
QEMU will be expecting the vendor_data introduced in this patch.
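To make the concern above concrete, here is a toy model in Python (not QEMU code). The region size, the byte layout, and the function names are all invented for illustration; only vendor_data and the "UNUSED goes out as zeroes" behaviour are taken from this thread. It shows that a new receiver can survive zeroes, but cannot tell stale pre-UNUSED data apart from a genuine vendor_data.

```python
import struct

REGION_SIZE = 4  # hypothetical size of the reused region, in bytes


def send_ancient(legacy_value: int) -> bytes:
    """Sender from before the region became UNUSED: it carries real data."""
    return struct.pack(">I", legacy_value)


def send_unused() -> bytes:
    """Sender where the region is UNUSED: the padding goes out as zeroes."""
    return bytes(REGION_SIZE)


def send_new(vendor_data: int) -> bytes:
    """Sender that reuses the region for the new vendor_data field."""
    return struct.pack(">I", vendor_data)


def receive_new(stream: bytes):
    """New receiver: 0 means 'not provided', anything else is vendor_data."""
    (value,) = struct.unpack(">I", stream[:REGION_SIZE])
    return None if value == 0 else value
```

With this model, receive_new(send_unused()) is handled gracefully, while receive_new(send_ancient(0xDEAD)) hands the legacy value back as if it were vendor_data, which is the misinterpretation being discussed.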
And that's why I was asking about a sane way to describe the "magic
year".. And I was very serious when I said "6 years" to follow the
deprecation of machine types, because it'll be something we can follow
to say when an unused buffer can be obsolete and it could make some
sense: if we will start to deprecate machine types, then it may not
make sense to keep any UNUSED compatible forever, after all.
Is there an easy way to look at a field and tell in which machine type's
timeframe it was introduced? If the machine type of that era has been
removed, then the field is free to go as well. I'd prefer if we had a
hard link instead of just counting years. Maybe we should do that
mapping at the machine deprecation time? As in, "look at the unused
fields introduced in that timeframe and mark them free".
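One way to picture that "hard link" is a small registry consulted at machine type removal time. The sketch below is Python, not anything that exists in QEMU: the field names and release numbers are made up, and the rule it encodes (a field is free once the oldest remaining machine type is at least as new as the release where the field became UNUSED) is only one reading of the proposal above.

```python
# Hypothetical registry: field name -> first QEMU release in which the
# field was UNUSED (the release numbers are made up for the example).
UNUSED_SINCE = {
    "dev_a.old_buf": (2, 8),
    "dev_b.old_reg": (6, 0),
}


def reusable_fields(oldest_machine_type):
    """Return the fields whose pre-UNUSED senders can no longer migrate in.

    A QEMU older than `unused_since` can only run machine types older than
    `unused_since`; once all of those are removed, it has no matching
    machine type left on the destination.
    """
    return sorted(
        name
        for name, unused_since in UNUSED_SINCE.items()
        if oldest_machine_type >= unused_since
    )
```

For example, with the oldest remaining machine type at (3, 0) only dev_a.old_buf would be marked free, since senders that still used dev_b.old_reg could still find a matching machine type.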
> And we need that ruler to be as accurate as "always 6 years to follow
> machine type deprecation procedure", in case someone else tomorrow asks us
> something that was only UNUSED since 2018. We need a rule of thumb if we
> want to reuse it, and if all of you agree we can start with this one,
> perhaps with a comment above the field (before we think all through and
> make it a rule on deprecating UNUSED)?