On Fri, 20 Jul 2018 10:19:24 +0800
Zhenyu Wang <zhenyuw(a)linux.intel.com> wrote:
Current mdev device create interface depends on fixed mdev type,
which get uuid
from user to create instance of mdev device. If user wants to use customized
number of resource for mdev device, then only can create new mdev type for that
which may not be flexible. This requirement comes not only from to be able to
allocate flexible resources for KVMGT, but also from Intel scalable IO
virtualization which would use vfio/mdev to be able to allocate arbitrary
resources on mdev instance. More info on [1] [2] [3].
To allow to create user defined resources for mdev, it trys to extend mdev
create interface by adding new "instances=xxx" parameter following uuid, for
target mdev type if aggregation is supported, it can create new mdev device
which contains resources combined by number of instances, e.g
echo "<uuid>,instances=10" > create
VM manager e.g libvirt can check mdev type with "aggregation" attribute which
can support this setting. If no "aggregation" attribute found for mdev type,
previous behavior is still kept for one instance allocation. And new sysfs
attribute "instances" is created for each mdev device to show allocated
number.
This trys to create new KVMGT type with minimal vGPU resources which can be
combined with "instances=x" setting to allocate for user wanted resources.
"instances" makes me think this is arg helps to create multiple mdev
instances rather than consuming multiple instances for a single mdev.
You're already exposing the "aggregation" attribute, so doesn't
"aggregate" perhaps make more sense as the create option? We're asking
the driver to aggregate $NUM instances into a single mdev. The mdev
attribute should then perhaps also be "aggregated_instances".
The next user question for the interface might be what aspect of the
device gets multiplied by this aggregation? In i915 I see you're
multiplying the memory sizes by the instance, but clearly the
resolution doesn't change. I assume this is sort of like mdev types
themselves, ie. some degree of what a type means is buried in the
implementation and some degree of what some number of those types
aggregated together means is impossible to describe generically.
We're also going to need to add aggregation to the checklist for device
compatibility for migration, for example 1) is it the same mdev_type,
1a) are the aggregation counts the same (new), 2) is the host driver
compatible (TBD).
The count handling in create_store(), particularly MDEV_CREATE_OPT_LEN
looks a little suspicious. I think we should just be validating that
the string before the ',' or the entire string if there is no comma is
UUID length. Pass only that to uuid_le_to_bin(). We can then strncmp
as you have for "instances=" (or "aggregate=") but then let's be
sure
to end the string we pass to kstrtouint(), ie. assume there could be
further args.
Documentation/ABI/testing/sysfs-bus-vfio-mdev also needs to be updated
with this series.
I'm curious what libvirt folks and Kirti think of this, it looks like
it has a nice degree of backwards compatibility, both in the sysfs
interface and the vendor driver interface. Thanks,
Alex