On Wed, 5 Apr 2017 09:21:17 +0100
"Daniel P. Berrange" <berrange(a)redhat.com> wrote:
On Wed, Apr 05, 2017 at 08:12:31AM +0200, Erik Skultety wrote:
> On Tue, Apr 04, 2017 at 04:23:18PM +0100, Daniel P. Berrange wrote:
> > On Wed, Mar 29, 2017 at 02:51:10PM +0200, Erik Skultety wrote:
> > > This series enables the node device driver to report information about
the
> > > existing mediated devices on the host. There is no device creation
involved
> > > yet. The information reported by the node device driver is split into two
> > > parts, one that is reported within the physical parent's capabilities
> > > (the generic stuff that comes from the mdev types' sysfs attributes,
note the
> > > 'description' attribute which is verbatim - raw,unstructured
string) and the
> > > other that is reported within the mdev child device and merely contains
the
> > > mdev type id, which the device was instantiated from, and the iommu group
> > > number.
> > >
> > > Basically, the format of the XML I went for is as follows:
> > >
> > > PCI parent:
> > > <device>
> > > <name>pci_0000_06_00_0</name>
> > > <path>/sys/devices/.../0000:06:00.0</path>
> > > <parent>pci_0000_05_08_0</parent>
> > > ...
> > > <capability type='pci'>
> > > ...
> > > <capability type='mdev'>
> > > <type id='nvidia-11'>
> > > <name>GRID M60-0B</name>
> > > <description>num_heads=2, frl_config=45, framebuffer=512M,
max_resolution=2560x1600, max_instance=16</description>
> >
> > This 'description' field is pretty horrific.
> >
> > We were quite clear that we were not going to expose arbitrary attributes
> > in the XML without modelling them explicitly as XML elements. Using the
> > description field in this way is just doing arbitrary attribute passthrough
> > via the backdoor - it is clear that applications are doing to end up parsing
> > this 'description' data and will them complain if we later change it.
> >
>
> I remember us stating that, but this is the case with NVIDIA (who at least
> reports some useful information in it) and Intel - what if other vendor comes
> and creates a valid, human readable description of a device without specifying
> any attributes like in the case above? Just because some vendors "abused"
the
> attribute doesn't mean we should stop reporting it completely. IMHO the name
> "description" should mean that it's something raw, possibly
unstructured, and
> thus the applications should treat it that way. Now, this is my bad and I need
> to revisit the last patch with documentation and add a paragraph for the
> <description> element as an optional element with raw data.
>
> Until all the attributes are properly unified/standardized under sysfs ABI and
> respected by vendors, I think we should report everything we're able to gather
> about a device, description included. If properly documented, nobody can
> complain about us breaking anything, since how do you guarantee format-less raw
> string anyway? After all, applications will end up parsing it anyway, be it from
> our XML or not.
I understand your point, but I'm still not in favour of exposing this info
because it is a clear cut attempt to do arbitrary attribute passthrough to
libvirt.
All the attributes show there can be determined by a simple lookup based on
the name field "GRID M60-0B". So if apps want to know the number of heads,
framebuffer size, etc, etc I think they should just maintain a lookup map
based on name in their own code, until we explicitly model this stuff in
the XML.
Once we model the attributes in the XML, this description adds no useful
info that we wouldn't already be reported, and before we model it in the
XML, this is just clearly an abuse of our design statement that we are not
doing generic attribute passthrough.
I told Erik I'd jump in here, but I think I might agree with Dan. On
the kernel side, the description attribute was an attempt to pull
ourselves out of a rat hole of trying to figure out how to classify
devices and then figure out what attributes for each class might be
common across vendors. Clearly it'd be great to somehow know that a
device is a GPU and has a resolution and graphics memory attribute, but
getting there is hard. A free-form description field is sort of a
catch-all where the vendor can provide a stop-gap and it's useful for
human consumption. It would be inadvisable to parse it with machine
code though. Including it in the XML sort of invites that possibility.
A given mdev type is supposed to be stable. It should always point to
a software equivalent device, including capabilities and resources. It
should therefore be valid for upper level software to use the type to
lookup from a human generated table the properties for that device. Of
course there's about zero chance that a type will be perfectly stable,
but hopefully it's a rare event and they stabilize pretty quickly. It'd
be nice to one day revisit that classification and per class attributes
problem in the kernel, but maybe we can let userspace figure out a
common, useful set of attribute per class first. Thanks,
Alex