On Tue, 18 Apr 2017 09:52:38 +0100
"Daniel P. Berrange" <berrange(a)redhat.com> wrote:
On Tue, Apr 18, 2017 at 10:49:40AM +0200, Martin Polednik wrote:
> On 12/04/17 16:19 -0600, Alex Williamson wrote:
> > On Wed, 5 Apr 2017 09:21:17 +0100
> > "Daniel P. Berrange" <berrange(a)redhat.com> wrote:
> >
> > > On Wed, Apr 05, 2017 at 08:12:31AM +0200, Erik Skultety wrote:
> > > > On Tue, Apr 04, 2017 at 04:23:18PM +0100, Daniel P. Berrange wrote:
> > > > > On Wed, Mar 29, 2017 at 02:51:10PM +0200, Erik Skultety wrote:
> > > > > > This series enables the node device driver to report
information about the
> > > > > > existing mediated devices on the host. There is no device
creation involved
> > > > > > yet. The information reported by the node device driver is
split into two
> > > > > > parts, one that is reported within the physical
parent's capabilities
> > > > > > (the generic stuff that comes from the mdev types'
sysfs attributes, note the
> > > > > > 'description' attribute which is verbatim -
raw,unstructured string) and the
> > > > > > other that is reported within the mdev child device and
merely contains the
> > > > > > mdev type id, which the device was instantiated from, and
the iommu group
> > > > > > number.
> > > > > >
> > > > > > Basically, the format of the XML I went for is as follows:
> > > > > >
> > > > > > PCI parent:
> > > > > > <device>
> > > > > > <name>pci_0000_06_00_0</name>
> > > > > > <path>/sys/devices/.../0000:06:00.0</path>
> > > > > > <parent>pci_0000_05_08_0</parent>
> > > > > > ...
> > > > > > <capability type='pci'>
> > > > > > ...
> > > > > > <capability type='mdev'>
> > > > > > <type id='nvidia-11'>
> > > > > > <name>GRID M60-0B</name>
> > > > > > <description>num_heads=2, frl_config=45,
framebuffer=512M, max_resolution=2560x1600, max_instance=16</description>
> > > > >
> > > > > This 'description' field is pretty horrific.
> > > > >
> > > > > We were quite clear that we were not going to expose arbitrary
attributes
> > > > > in the XML without modelling them explicitly as XML elements.
Using the
> > > > > description field in this way is just doing arbitrary attribute
passthrough
> > > > > via the backdoor - it is clear that applications are doing to
end up parsing
> > > > > this 'description' data and will them complain if we
later change it.
> > > > >
> > > >
> > > > I remember us stating that, but this is the case with NVIDIA (who at
least
> > > > reports some useful information in it) and Intel - what if other
vendor comes
> > > > and creates a valid, human readable description of a device without
specifying
> > > > any attributes like in the case above? Just because some vendors
"abused" the
> > > > attribute doesn't mean we should stop reporting it completely.
IMHO the name
> > > > "description" should mean that it's something raw,
possibly unstructured, and
> > > > thus the applications should treat it that way. Now, this is my bad
and I need
> > > > to revisit the last patch with documentation and add a paragraph for
the
> > > > <description> element as an optional element with raw data.
> > > >
> > > > Until all the attributes are properly unified/standardized under
sysfs ABI and
> > > > respected by vendors, I think we should report everything we're
able to gather
> > > > about a device, description included. If properly documented, nobody
can
> > > > complain about us breaking anything, since how do you guarantee
format-less raw
> > > > string anyway? After all, applications will end up parsing it anyway,
be it from
> > > > our XML or not.
> > >
> > > I understand your point, but I'm still not in favour of exposing this
info
> > > because it is a clear cut attempt to do arbitrary attribute passthrough
to
> > > libvirt.
> > >
> > > All the attributes show there can be determined by a simple lookup based
on
> > > the name field "GRID M60-0B". So if apps want to know the number
of heads,
> > > framebuffer size, etc, etc I think they should just maintain a lookup map
> > > based on name in their own code, until we explicitly model this stuff in
> > > the XML.
> > >
> > > Once we model the attributes in the XML, this description adds no useful
> > > info that we wouldn't already be reported, and before we model it in
the
> > > XML, this is just clearly an abuse of our design statement that we are
not
> > > doing generic attribute passthrough.
> >
> > I told Erik I'd jump in here, but I think I might agree with Dan. On
> > the kernel side, the description attribute was an attempt to pull
> > ourselves out of a rat hole of trying to figure out how to classify
> > devices and then figure out what attributes for each class might be
> > common across vendors. Clearly it'd be great to somehow know that a
> > device is a GPU and has a resolution and graphics memory attribute, but
> > getting there is hard. A free-form description field is sort of a
> > catch-all where the vendor can provide a stop-gap and it's useful for
> > human consumption. It would be inadvisable to parse it with machine
> > code though. Including it in the XML sort of invites that possibility.
> >
> > A given mdev type is supposed to be stable. It should always point to
> > a software equivalent device, including capabilities and resources. It
> > should therefore be valid for upper level software to use the type to
> > lookup from a human generated table the properties for that device. Of
>
> From an upper level software's perspective, this is super inefficient
> unless database like pci IDs exist. Each management software
> maintaining their possibly outdated lookup table may lead to
> inconsistence in the user experience and data.
>
> If such database existed, there is no reason for libvirt not to use it
> and report the data in some sane format.
IMHO it would be a waste of time for libvirt to do this. Effort would be
better spent implementing the solution we requested right at the start,
which is for the kernel to export the information in a stable format via
dedicated sysfs attributes. Any userspace DB in applications is just a
short term lack to workaround this missing kernel feature.
...and a pony too! Seriously, "requesting" this feature isn't
necessarily going to make it happen. How do we classify devices? How
do we define attributes per class that multiple vendors can agree to
and are extensible to future vendors? This seems exactly like the sort
of thing that needs to be flushed out in userspace before committing to
poorly defined ABIs in the kernel. Thanks,
Alex