
On Fri, Apr 17, 2020 at 07:24:57PM +0800, Cornelia Huck wrote:
On Fri, 17 Apr 2020 05:52:02 -0400 Yan Zhao <yan.y.zhao@intel.com> wrote:
On Fri, Apr 17, 2020 at 04:44:50PM +0800, Cornelia Huck wrote:
On Mon, 13 Apr 2020 01:52:01 -0400 Yan Zhao <yan.y.zhao@intel.com> wrote:
This patchset introduces a migration_version attribute under sysfs of VFIO Mediated devices.
This migration_version attribute is used to check migration compatibility between two mdev devices.
Currently, it has two locations:
(1) under the mdev_type node, which can be used even before device creation, but only for mdev devices of the same mdev type.
(2) under the mdev device node, which can only be used after the mdev devices are created, but the src and target mdev devices are not necessarily of the same mdev type.
(The second location is newly added in v5, in order to stay consistent with the migration_version node for migratable pass-through devices.)
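As a rough illustration of the two locations, here is a minimal Python sketch that builds both paths. It assumes the usual mdev sysfs layout (`mdev_supported_types/<type>` under the parent device, and `/sys/bus/mdev/devices/<uuid>` for a created device). The helper names, the parent device address, and the UUID are hypothetical and only serve as examples.

```python
from pathlib import Path

# Assumption: sysfs is mounted at /sys (overridable, e.g. for testing).
SYSFS_ROOT = Path("/sys")


def type_attr(parent_dev: Path, mdev_type: str) -> Path:
    """Location (1): migration_version under the mdev_type node.

    Available before any mdev device of that type has been created.
    """
    return parent_dev / "mdev_supported_types" / mdev_type / "migration_version"


def device_attr(uuid: str, root: Path = SYSFS_ROOT) -> Path:
    """Location (2): migration_version under the created mdev device node."""
    return root / "bus" / "mdev" / "devices" / uuid / "migration_version"


if __name__ == "__main__":
    # Hypothetical parent device, type name, and UUID, for illustration only.
    parent = SYSFS_ROOT / "bus" / "pci" / "devices" / "0000:00:02.0"
    print(type_attr(parent, "i915-GVTg_V5_4"))
    print(device_attr("882cc4da-dede-11e7-9180-078a62063ab1"))
```

Keeping the parent device as an explicit path avoids assuming that the parent sits on any particular bus.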
What is the relationship between those two attributes?
(1) is for mdev devices specifically, and (2) is provided to keep the same sysfs interface as in non-mdev cases. So (2) is for both mdev devices and non-mdev devices.
In the future, if we enable vfio-pci vendor ops (i.e. a non-mdev device is bound to vfio-pci, but is able to register a migration region and do migration transactions from a vendor-provided affiliate driver), the vendor driver would export (2) directly, under the device node. It is not able to provide (1) as there are no mdev devices involved.
Ok, creating an alternate attribute for non-mdev devices makes sense. However, wouldn't that rather be a case (3)? The change here only refers to mdev devices.
As you pointed out below, (3) and (2) serve the same purpose, and I think a possible usage is to migrate between a non-mdev device and an mdev device. So I think it's better for them both to use (2) rather than creating (3).
Is existence (and compatibility) of (1) a pre-req for possible existence (and compatibility) of (2)?
No. (2) does not rely on (1).
Hm. Non-existence of (1) seems to imply "this type does not support migration". If an mdev created for such a type suddenly does support migration, it feels a bit odd.
Yes. But I think if that condition happens, it should be reported as a bug against the vendor driver. Should I add a line in the doc like "vendor driver should ensure that the migration compatibility from migration_version under mdev_type is consistent with that from migration_version under the device node"?
(It obviously cannot be a prereq for what I called (3) above.)
Does userspace need to check (1) or can it completely rely on (2), if it so chooses?
I think it can completely rely on (2) if a compatibility check before mdev creation is not required.
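To make that choice concrete, here is a minimal userspace sketch in Python. It assumes the read-then-write convention for migration_version (read the string from the source attribute, write it to the target attribute, and treat a failed write as "incompatible"); that convention is an assumption here and is not spelled out in this mail. The function names are hypothetical.

```python
import os
from pathlib import Path
from typing import Optional


def pick_attr(type_attr: Path, dev_attr: Optional[Path]) -> Path:
    """Prefer (2), the device node attribute, once the device exists.

    Fall back to (1), the mdev_type attribute, before device creation.
    """
    if dev_attr is not None and dev_attr.exists():
        return dev_attr
    return type_attr


def is_compatible(src_attr: Path, dst_attr: Path) -> bool:
    """Assumed convention: write the source's version string to the target.

    Any OSError (missing attribute, or the write being rejected) is
    treated as "not compatible".
    """
    try:
        version = src_attr.read_bytes()
        # No O_CREAT: a missing attribute must fail, not be created.
        fd = os.open(dst_attr, os.O_WRONLY)
    except OSError:
        return False
    try:
        os.write(fd, version)
    except OSError:
        return False
    finally:
        os.close(fd)
    return True
```

A management tool that does not need a check before mdev creation could call `is_compatible()` on the two device node attributes only and never look at (1).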
If devices with a different mdev type are indeed compatible, it seems userspace can only find out after the devices have actually been created, as (1) does not apply?
Yes, I think so.
How useful would it be for userspace to even look at (1) in that case? It only knows if things have a chance of working if it actually goes ahead and creates devices.
Hmm, is it useful for userspace to test the migration_version under the mdev type before it knows which mdev device to create? For example, when userspace wants to migrate an mdev device in the source VM, but has not yet created the target VM and the target mdev device.
One of my worries is that the existence of an attribute with the same name in two similar locations might lead to confusion. But maybe it isn't a problem.
Yes, I have the same feeling. But as (2) is for sysfs interface consistency, to make it transparent to userspace tools like libvirt, I guess the same name is necessary?
What do we actually need here, I wonder? (1) and (2) seem to serve slightly different purposes, while (2) and what I called (3) have the same purpose. Is it important to userspace that (1) and (2) have the same name?
So change (1) to migration_type_version and (2) to migration_instance_version? But as they are under different locations, could the location itself imply enough information?
Thanks
Yan