On 9/3/2016 6:37 PM, Paolo Bonzini wrote:
On 03/09/2016 13:56, John Ferlan wrote:
> On 09/02/2016 05:48 PM, Paolo Bonzini wrote:
>> On 02/09/2016 20:33, Kirti Wankhede wrote:
>>> <Alex> We could even do:
>>>>>
>>>>> echo $UUID1:$GROUPA > create
>>>>>
>>>>> where $GROUPA is the group ID of a previously created mdev device
into
>>>>> which $UUID1 is to be created and added to the same group.
>>> </Alex>
>>
>> >From the point of view of libvirt, I think I prefer Alex's idea.
>> <group> could be an additional element in the nodedev-create XML:
>>
>> <device>
>> <name>my-vgpu</name>
>> <parent>pci_0000_86_00_0</parent>
>> <capability type='mdev'>
>> <type id='11'/>
>> <uuid>0695d332-7831-493f-9e71-1c85c8911a08</uuid>
>> <group>group1</group>
>> </capability>
>> </device>
>>
>> (should group also be a UUID?)
>
I replied to earlier mail too, group number doesn't need to be UUID. It
should be a unique number. I think in the discussion in bof someone
mentioned about using domain's unique number that libvirt generates.
That should also work.
> As long as create_group handles all the work and all libvirt does
is
> call it, get the return status/error, and handle deleting the vGPU on
> error, then I guess it's doable.
>
Yes that is the idea. Libvirt doesn't have to care about the groups.
With Alex's proposal, as you mentioned above, libvirt have to provide
group number to mdev_create, check return status and handle error case.
echo $UUID1:$GROUP1 > mdev_create
echo $UUID2:$GROUP1 > mdev_create
would create two mdev devices assigned to same domain.
> Alternatively having multiple <type id='#'> in the
XML and performing a
> single *mdev/create_group is an option.
I don't really like the idea of a single nodedev-create creating
multiple devices, but that would work too.
> That is, what is the "output" from create_group that gets added to the
> domain XML? How is that found?
A new sysfs path is created, whose name depends on the UUID. The UUID
is used in a <hostdev> element in the domain XML and the sysfs path
appears in the QEMU command line. Kirti and Neo had examples in their
presentation at KVM Forum.
If you create multiple devices in the same group, they are added to the
same IOMMU group so they must be used by the same VM. However they
don't have to be available from the beginning; they could be
hotplugged/hot-unplugged later, since from the point of view of the VM
those are just another PCI device.
> Also, once the domain is running can a
> vGPU be added to the group? Removed? What allows/prevents?
Kirti?... :)
Yes, vGPU could be hot-plugged or hot-unplugged. This also depends on
does vendor driver want to support that. For example, domain is running
with two vGPUs $UUID1 and $UUID2 and user tried to hot-unplug vGPU
$UUID2, vendor driver knows that domain is running and vGPU is being
used in guest, so vendor driver can fail offline/close() call if they
don't support hot-unplug. Similarly for hot-plug vendor driver can fail
create call to not to support hot-plug.
In principle I don't think anything should block vGPUs from different
groups being added to the same VM, but I have to defer to Alex and Kirti
again on this.
No, there should be one group per VM.
>> Since John brought up the topic of minimal XML, in this case
it will be
>> like this:
>>
>> <device>
>> <name>my-vgpu</name>
>> <parent>pci_0000_86_00_0</parent>
>> <capability type='mdev'>
>> <type id='11'/>
>> </capability>
>> </device>
>>
>> The uuid will be autogenerated by libvirt and if there's no <group>
(as
>> is common for VMs with only 1 vGPU) it will be a single-device group.
>
> The <name> could be ignored as it seems existing libvirt code wants to
> generate a name via udevGenerateDeviceName for other devices. I haven't
> studied it long enough, but I believe that's how those pci_####* names
> created.
Yeah that makes sense. So we get down to a minimal XML that has just
parent, and capability with type in it; additional elements could be
name (ignored anyway), and within capability uuid and group.
Yes, this seems good.
I would like to have one more capability here. Pulling here some
suggestion from my previous mail:
In the directory structure, a 'params' can take optional parameters.
Libvirt then can set 'params' and then create mdev device. For example,
param say 'disable_console_vnc=1' is set for type 11, then devices
created of type 11 will have that param set unless it is cleared.
└── mdev_supported_types
├── 11
│ ├── create
│ ├── description
│ └── max_instances
│ └── params
├── 12
│ ├── create
│ ├── description
│ └── max_instances
│ └── params
└── 13
├── create
├── description
└── max_instances
└── params
So with that XML format would be:
<device>
<name>my-vgpu</name>
<parent>pci_0000_86_00_0</parent>
<capability type='mdev'>
<type id='11'/>
<group>group1</group>
<params>disable_console_vnc=1</params>
</capability>
</device>
and 'params' field should be just a string to libvirt and its optional
also. If user want to provide extra parameter while creating vGPU device
they should provide it in XML file as above to nodedev-create.
Very initial proposal was to have this extra paramter list as a string
to mdev_create itself as:
echo $UUID1:$PARAMS > mdev_create
I would like to know others opinions on whether it should be part of
mdev_create input or a separate write to 'params' file in sysfs as in
above directory structure.
Kirti.
Thanks,
Paolo