On Wed, Sep 07, 2016 at 10:44:56AM -0600, Alex Williamson wrote:
On Wed, 7 Sep 2016 21:45:31 +0530
Kirti Wankhede <kwankhede(a)nvidia.com> wrote:
> To hot-plug an mdev device into a domain that already has an mdev
> device assigned, the new mdev device should be created with the same
> group number as the existing devices and then hot-plugged. If there is
> no mdev device in that domain, the group number should be a unique number.
>
> This simplifies the mdev grouping and also provides flexibility for
> vendor driver implementations.
The 'start' operation for NVIDIA mdev devices allocates peer-to-peer
resources between mdev devices. Does this not represent some degree of
an isolation hole between those devices? Will peer-to-peer DMA between
devices honor the guest IOVA when mdev devices are placed into separate
address spaces, such as is possible with vIOMMU?
Hi Alex,
In reality, the p2p operation will only work under the same translation domain.
As we are discussing the multiple-mdev-per-VM use cases, I think we probably
should not limit the discussion to just the p2p operation.
So, in general, the NVIDIA vGPU device model's requirement is to know/register
all mdevs per VM before opening any of those mdev devices.
I don't particularly like the iommu group solution either, which is why
in my latest proposal I've given the vendor driver a way to indicate
this grouping is required so more flexible mdev devices aren't
restricted by this. But the limited knowledge I have of the hardware
configuration which imposes this restriction on NVIDIA devices seems to
suggest that iommu grouping of these sets is appropriate. The vfio-core
infrastructure is almost entirely built for managing vfio groups, which
are just a direct mapping of iommu groups. So the complexity of iommu
groups is already handled. Adding a new layer of grouping into mdev
seems like it's increasing the complexity further, not decreasing it.
I really appreciate your thoughts on this issue, and your consideration of how
the NVIDIA vGPU device model works, but so far I still feel we are borrowing a
very meaningful concept, "iommu group", to solve a device model issue, which I
actually hope can be worked around by a more independent piece of logic, and
that is why Kirti is proposing the "mdev group".
Let's see if we can address your concerns / questions in Kirti's reply.
Thanks,
Neo
Thanks,
Alex