On 9/2/2016 10:18 AM, Michal Privoznik wrote:
On 01.09.2016 18:59, Alex Williamson wrote:
> On Thu, 1 Sep 2016 18:47:06 +0200
> Michal Privoznik <mprivozn(a)redhat.com> wrote:
>
>> On 31.08.2016 08:12, Tian, Kevin wrote:
>>>> From: Alex Williamson [mailto:alex.williamson@redhat.com]
>>>> Sent: Wednesday, August 31, 2016 12:17 AM
>>>>
>>>> Hi folks,
>>>>
>>>> At KVM Forum we had a BoF session primarily around the mediated device
>>>> sysfs interface. I'd like to share what I think we agreed on and the
>>>> "problem areas" that still need some work so we can get the thoughts
>>>> and ideas from those who weren't able to attend.
>>>>
>>>> DanPB expressed some concern about the mdev_supported_types sysfs
>>>> interface, which exposes a flat csv file with fields like "type",
>>>> "number of instance", "vendor string", and then a bunch of type
>>>> specific fields like "framebuffer size", "resolution", "frame rate
>>>> limit", etc. This is not entirely machine parsing friendly and sort of
>>>> abuses the sysfs concept of one value per file. Example output taken
>>>> from Neo's libvirt RFC:
>>>>
>>>> cat /sys/bus/pci/devices/0000:86:00.0/mdev_supported_types
>>>> # vgpu_type_id, vgpu_type, max_instance, num_heads, frl_config, framebuffer,
>>>> max_resolution
>>>> 11 ,"GRID M60-0B", 16, 2, 45, 512M, 2560x1600
>>>> 12 ,"GRID M60-0Q", 16, 2, 60, 512M, 2560x1600
>>>> 13 ,"GRID M60-1B", 8, 2, 45, 1024M, 2560x1600
>>>> 14 ,"GRID M60-1Q", 8, 2, 60, 1024M, 2560x1600
>>>> 15 ,"GRID M60-2B", 4, 2, 45, 2048M, 2560x1600
>>>> 16 ,"GRID M60-2Q", 4, 4, 60, 2048M, 2560x1600
>>>> 17 ,"GRID M60-4Q", 2, 4, 60, 4096M, 3840x2160
>>>> 18 ,"GRID M60-8Q", 1, 4, 60, 8192M, 3840x2160
>>>>
>>>> The create/destroy then looks like this:
>>>>
>>>> echo "$mdev_UUID:vendor_specific_argument_list" >
>>>> /sys/bus/pci/devices/.../mdev_create
>>>>
>>>> echo "$mdev_UUID:vendor_specific_argument_list" >
>>>> /sys/bus/pci/devices/.../mdev_destroy
>>>>
>>>> "vendor_specific_argument_list" is nebulous.
>>>>
>>>> So the idea to fix this is to explode this into a directory structure,
>>>> something like:
>>>>
>>>> ├── mdev_destroy
>>>> └── mdev_supported_types
>>>>     ├── 11
>>>>     │   ├── create
>>>>     │   ├── description
>>>>     │   └── max_instances
>>>>     ├── 12
>>>>     │   ├── create
>>>>     │   ├── description
>>>>     │   └── max_instances
>>>>     └── 13
>>>>         ├── create
>>>>         ├── description
>>>>         └── max_instances
>>>>
>>>> Note that I'm only exposing the minimal attributes here for simplicity,
>>>> the other attributes would be included in separate files and we would
>>>> require vendors to create standard attributes for common device classes.
>>>
>>> I like this idea. All standard attributes are reflected into this hierarchy.
>>> In the meantime, can we still allow optional vendor string in create
>>> interface? libvirt doesn't need to know the meaning, but allows upper
>>> layer to do some vendor specific tweak if necessary.
>>
>> This is not the best idea IMO. Libvirt is there to shadow differences
>> between hypervisors. While doing that, we often hide differences between
>> various types of HW too. Therefore in order to provide good abstraction
>> we should make vendor specific string as small as possible (ideally an
>> empty string). I mean I see it as bad idea to expose "vgpu_type_id" from
>> example above in domain XML. What I think the better idea is if we let
>> users choose resolution and frame buffer size, e.g.: <video
>> resolution="1024x768" framebuffer="16"/> (just the first idea that came
>> to my mind while writing this e-mail). The point is, XML part is
>> completely free of any vendor-specific knobs.
>
> That's not really what you want though, a user actually cares whether
> they get an Intel or NVIDIA vGPU, we can't specify it as just a
> resolution and framebuffer size. The user also doesn't want the model
> changing each time the VM is started, so not only do you *need* to know
> the vendor, you need to know the vendor model. This is the only way to
> provide a consistent VM. So as we discussed at the BoF, the libvirt
> xml will likely reference the vendor string, which will be a unique
> identifier that encompasses all the additional attributes we expose.
> Really the goal of the attributes is simply so you don't need a per
> vendor magic decoder ring to figure out the basic features of a given
> vendor string. Thanks,

Okay, maybe I'm misunderstanding something. I just thought that users
will consult libvirt's nodedev driver (e.g. virsh nodedev-list && virsh
nodedev-dumpxml $id) to fetch vGPU capabilities and then use that info
to construct domain XML.

I'm not familiar with libvirt code, curious how libvirt's nodedev driver
enumerates devices in the system?

Also, I guess libvirt will need some sort of understanding of vGPUs in
sense that if there are two vGPUs in the system

I think you meant two physical GPUs in the system, right?

(say both INTEL and
NVIDIA) libvirt must create mdev on the right one. I guess we can't rely
solely on vgpu_type_id uniqueness here, can we.

When two GPUs are present in the system, both INTEL and NVIDIA, these
devices have unique domain:bus:device:function. The 'mdev_create' sysfs
file for mdev would be present for each device in its device directory
(as per the v7 version of the patch, the path of 'mdev_create' is below):

/sys/bus/pci/devices/<domain:bus:device:function>/mdev_create

So libvirt needs to know on which physical device the mdev device needs
to be created.
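
To illustrate the point about per-device 'mdev_create' files, here is a
rough Python sketch of how a management tool might discover which physical
devices can host an mdev. It assumes only the layout described above, i.e.
an 'mdev_create' file inside each capable device's directory under
/sys/bus/pci/devices. The function names and the pci_root parameter are
made up for this example and are not part of libvirt or the kernel, and the
sysfs interface itself is still under discussion in this thread.

```python
from pathlib import Path

# Assumption: the sysfs location mentioned in this thread (v7 patch).
DEFAULT_PCI_ROOT = "/sys/bus/pci/devices"


def find_mdev_capable_devices(pci_root=DEFAULT_PCI_ROOT):
    """Return the PCI addresses (domain:bus:device.function) whose device
    directory contains an 'mdev_create' file, sorted by address."""
    root = Path(pci_root)
    if not root.is_dir():
        return []
    return sorted(
        dev.name for dev in root.iterdir() if (dev / "mdev_create").exists()
    )


def mdev_create_path(pci_address, pci_root=DEFAULT_PCI_ROOT):
    """Build the per-device 'mdev_create' path for one physical device."""
    return str(Path(pci_root) / pci_address / "mdev_create")
```

With one Intel and one NVIDIA GPU present, find_mdev_capable_devices()
would return two addresses, and the caller still has to pick the address
of the physical device it wants before writing to that device's
'mdev_create' file.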

Thanks,
Kirti

Michal