Hi!
So it looks like we face a problem with cross-version
migration when using vhost. It's not new but became more
acute with the advent of vhost-user.
For users to be able to migrate between different versions
of the hypervisor, the interface exposed to guests
by the hypervisor must stay unchanged.
The problem is that a qemu device is connected
to a backend in another process, so the interface
exposed to guests depends on the capabilities of that
process.
Specifically, for the vhost-user interface, which is based on virtio,
this includes the "host features" bitmap that defines the interface, as
well as other host values such as the max ring size. Adding new features
to this interface, or changing its values, is required to make progress,
but on the other hand we need the ability to get the old host features
in order to stay compatible.
To solve this problem within qemu, qemu has a versioning system based on
the machine type concept, which fundamentally is a version string. By
specifying that string one can get hardware compatible with a previous
qemu version. QEMU also reports the latest version and the list of
versions supported, so libvirt records the version at VM creation and
then is careful to use this machine version whenever it migrates a VM.
One might wonder how this is solved with a kernel vhost backend. The
answer is that it mostly isn't. Instead, an assumption is made that
qemu versions are deployed together with the kernel, which is generally
true for downstreams. Thus whenever qemu gains a new feature, it is
already supported by the kernel as well. However, if one attempts
migration with a new qemu from a system with a new kernel to a system
with an old kernel, one would get a failure.
In a world where we have multiple userspace backends, with some of
these supplied by ISVs, this assumption seems unrealistic.
IMO we need to support vhost backend versioning, ideally
in a way that will also work for vhost kernel backends.
So I'd like to get some input from both backend and management
developers on what a good solution would look like.
If we want to emulate the qemu solution, this involves adding the
concept of interface versions to dpdk. For example, dpdk could supply a
file (or a utility printing it?) with a list of versions: the latest one
and all versions supported. libvirt could read that and
- store latest version at vm creation
- pass it around with the vm
- pass it to qemu
From here, qemu could pass this over the vhost-user channel,
thus making sure it's initialized with the correct
compatible interface.
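
To make the flow above concrete, here is a rough sketch of the
bookkeeping management would do. This is only an illustration under my
own assumptions: the BackendVersions structure, the function names and
the second version string "org.dpdk.v4.6.0" are all hypothetical, and
nothing like this exists in dpdk or libvirt today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendVersions:
    """Hypothetical contents of the version list a backend would publish."""
    latest: str
    supported: tuple


def pick_version_at_creation(backend: BackendVersions) -> str:
    # Management records the latest version when the VM is created
    # and then passes it around with the VM.
    return backend.latest


def can_migrate(vm_version: str, destination: BackendVersions) -> bool:
    # Migration is only safe if the destination backend still offers
    # the interface version the VM was created with.
    return vm_version in destination.supported


# Hypothetical hosts: the newer backend keeps supporting the old version.
old_host = BackendVersions(latest="org.dpdk.v4.5.6",
                           supported=("org.dpdk.v4.5.6",))
new_host = BackendVersions(latest="org.dpdk.v4.6.0",
                           supported=("org.dpdk.v4.5.6", "org.dpdk.v4.6.0"))

vm_version = pick_version_at_creation(old_host)
print(can_migrate(vm_version, new_host))
print(can_migrate(pick_version_at_creation(new_host), old_host))
```

A VM created on the old host can move to the new one because the new
backend still lists the old version; a VM created on the new host with
its latest version cannot move back.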
As the version here is an opaque string for libvirt and qemu,
anything can be used - but I suggest either a list
of values defining the interface, e.g.
any_layout=on,max_ring=256
or a version including the name and vendor of the backend,
e.g. "org.dpdk.v4.5.6".
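
If the first form is used, only the backend needs to interpret the
string, since libvirt and qemu just pass it along. A minimal sketch of
what the backend side might do with it follows; the function name is
hypothetical and the key=value,key=value syntax is assumed from the
example above, not from any existing spec.

```python
def parse_interface_version(version: str) -> dict:
    """Split a 'key=value,key=value' version string into a dict.

    Values are kept as strings; interpreting them (e.g. max_ring as an
    integer) is left to the backend.
    """
    settings = {}
    for item in version.split(","):
        key, sep, value = item.partition("=")
        if not sep or not key.strip():
            # Reject entries with no '=' or an empty key.
            raise ValueError(f"malformed entry: {item!r}")
        settings[key.strip()] = value.strip()
    return settings


settings = parse_interface_version("any_layout=on,max_ring=256")
print(settings)
```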
Note that typically the list of supported versions can only be
extended, not shrunk. Also, if the host/guest interface
does not change, don't change the current version as
this just creates work for everyone.
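
The "extended, not shrunk" rule is easy to check mechanically, e.g. as
part of a backend's release process. A small sketch, with a
hypothetical function name and a made-up newer version string:

```python
def only_extended(old_supported, new_supported) -> bool:
    # Every version offered by the previous release must still be
    # offered by the new one; new entries are allowed.
    return set(old_supported) <= set(new_supported)


print(only_extended(["org.dpdk.v4.5.6"],
                    ["org.dpdk.v4.5.6", "org.dpdk.v4.6.0"]))
print(only_extended(["org.dpdk.v4.5.6", "org.dpdk.v4.6.0"],
                    ["org.dpdk.v4.6.0"]))
```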
Thoughts? Would this work well for management? dpdk? vpp?
Thanks!
--
MST