Hi Daniel,

Thanks a lot for the quick reply - it's much appreciated.

> IIUC, the high level scenario is as follows

Yes, the high-level description matches the use-case.

In the general case, the SmartNIC DPU cores may not use PCIe at all: they
could use a platform device or a different I/O specification to access the
network controller, while the hypervisor host would rely on PCIe. The end
result is the same though: hypervisor host PCI addresses cannot be relied upon
to identify the so-called "port representors"
(https://lwn.net/Articles/692942/) on the SmartNIC DPU operating system side.

Moreover, in the general case there can be multiple SmartNIC DPUs per
hypervisor, each with its own set of PFs and VFs. In order to determine which
DPU is going to handle representor port programming at the control plane
level, there needs to be a way to identify a DPU based on a VF selected by the
hypervisor (at least in Nova, VF selection is driven from the hypervisor
side). A board serial number can be determined independently from both the
hypervisor and the DPU, so the hypervisor services can provide the board
serial to the network control plane for the discovery of the relevant DPU.
That's where Libvirt comes in: it can help with serial number retrieval.
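
To illustrate the lookup I have in mind, here is a minimal sketch. The
inventory, the agent names and the serial values are all made up for the
example; how the control plane actually stores this mapping is not decided.

```python
from typing import Dict, Optional

# Hypothetical inventory held by the network control plane: each DPU reports
# its own board serial, which is mapped to the agent running on that DPU.
DPU_AGENTS: Dict[str, str] = {
    "SERIAL-A": "dpu-agent-0",
    "SERIAL-B": "dpu-agent-1",
}


def find_dpu_agent(board_serial: str,
                   agents: Dict[str, str] = DPU_AGENTS) -> Optional[str]:
    """Return the agent for the DPU with this board serial, or None."""
    # The hypervisor reads the serial for the selected VF; surrounding
    # whitespace is stripped before the lookup.
    return agents.get(board_serial.strip())
```

The hypervisor side only needs to pass the serial along; it never has to know
anything about the PCI topology seen by the DPU.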

> This seems like a reasonable feature request to me, since there is
> a piece of info that apps using libvirt need, and libvirt does not
> expose this. Requiring the mgmt app like Nova to dig into the host
> PCI config space indicates a clear gap in libvirt functionality.

Ack, thanks for confirming.

> Is scenario (2) going to be at all common ? What would be a reason why the
> info is not exposed via the standardized VPD - is it just a legacy hardware
> issue ?

VPD is an optional capability in the PCI and PCIe specs. While there is hope
that every SmartNIC DPU vendor will implement it seeing the need for it,
there might be some fragmentation because the specs do not mandate its
presence. Scenario (2) is an attempt to have an alternative source for the
same piece of information: if a serial is available via the driver (which may
query NIC firmware instead of reading VPD) it can still be used with the same
end result. The devlink-info API does not mandate that a board serial is
exposed either so there is no guarantee this will be available via devlink.
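
As a rough sketch of the "VPD first, driver as a fallback" idea, assuming the
raw VPD bytes have already been read (e.g. from the sysfs "vpd" file of the
device): the function names are mine, and driver_lookup is a hypothetical
callable standing in for whatever devlink-based query would be used. This is
not proposed Libvirt code, just an illustration of the data layout.

```python
import struct
from typing import Callable, Optional

VPD_R_TAG = 0x90          # large resource tag for the read-only VPD section
END_TAG_NAME = 0x0F       # small resource "end" item name (tag byte 0x78)


def _find_keyword(body: bytes, keyword: bytes) -> Optional[str]:
    """Scan VPD-R fields: 2-byte keyword, 1-byte length, then the value."""
    pos = 0
    while pos + 3 <= len(body):
        name = body[pos:pos + 2]
        length = body[pos + 2]
        if name == keyword:
            value = body[pos + 3:pos + 3 + length]
            return value.decode("ascii", errors="replace").strip()
        pos += 3 + length
    return None


def parse_vpd_serial(data: bytes) -> Optional[str]:
    """Return the SN keyword of the VPD-R section, or None if absent."""
    pos = 0
    while pos < len(data):
        tag = data[pos]
        if tag & 0x80:
            # Large resource: tag byte followed by a 16-bit LE length.
            if pos + 3 > len(data):
                return None
            (length,) = struct.unpack_from("<H", data, pos + 1)
            if tag == VPD_R_TAG:
                return _find_keyword(data[pos + 3:pos + 3 + length], b"SN")
            pos += 3 + length
        else:
            # Small resource: item name in bits 3-6, length in bits 0-2.
            if (tag >> 3) == END_TAG_NAME:
                break
            pos += 1 + (tag & 0x07)
    return None


def board_serial(vpd: bytes,
                 driver_lookup: Optional[Callable[[], Optional[str]]] = None
                 ) -> Optional[str]:
    """Prefer the VPD serial; fall back to a driver-provided one if given."""
    serial = parse_vpd_serial(vpd)
    if serial is None and driver_lookup is not None:
        serial = driver_lookup()
    return serial
```

Either source ends up producing the same piece of information, which is why
both could be stored in the same element.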

It will surely be simpler to just implement scenario (1) and add (2)
later if there is a significant need for it. The generally available hardware
I have seen has VPD exposed, so I can focus on (1) for now and we can decide
later whether to do (2) at all.
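
For scenario (1), a consumer such as a virt driver could then pull the serial
out of the node device XML along these lines. This is only a sketch: it
assumes the nested "vpd" capability with a "serial" element exactly as
proposed in my original message, which is not an existing Libvirt schema, and
the sample document below is invented for the example.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Invented sample using the element layout from the proposal.
SAMPLE_XML = """
<device>
  <capability type='pci'>
    <capability type='vpd'>
      <serial>UNIQUESERIAL</serial>
    </capability>
  </capability>
</device>
"""


def serial_from_nodedev_xml(xml_text: str) -> Optional[str]:
    """Return the text of the proposed vpd/serial element, or None."""
    root = ET.fromstring(xml_text)
    node = root.find(".//capability[@type='vpd']/serial")
    if node is None or node.text is None:
        return None
    return node.text.strip()
```

A device without a VPD capability would simply have no nested element, and
the caller gets None.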

Best Regards,
Dmitrii Shcherbakov
LP: ~dmitriis

On Tue, Jun 1, 2021 at 2:32 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
On Fri, May 28, 2021 at 10:58:05PM +0300, Dmitrii Shcherbakov wrote:
> Hello Libvirt Developers,
>
> I am looking for some feedback on a planned enhancement to Libvirt: the aim
> is to store a portion of PCI(e) Vital Product Data (VPD) for each device
> along with other PCI/PCIe device information already collected.
> Specifically, the SN (Serial Number) read-only field of a VPD data structure
> of a device is of interest which is described in PCI/PCIe specs (PCI local
> bus 2.1+ and PCIe 4.0+).
>
> The context for this is the cross-project work in OpenStack (Nova, Neutron),
> OVS and OVN to support off-path SmartNIC DPUs ([1], [2], [3], [4]). The
> Nova specification [1] provides an overview of the relevant hardware and the
> use-case for board serial numbers. However, VPD is a standard capability in
> the PCI/PCIe specifications that is not tied to this use-case in particular,
> so the suggestion from the Nova core team was to aim at introducing a means
> of collecting this information via Libvirt. It can then be retrieved by the
> respective virt driver in Nova via Libvirt without having to introduce this
> code into Nova itself.

I've talked with Sean Mooney a little about the use case this morning.

IIUC, the high level scenario is as follows

 - The main host machine has a PCI controller topology to which the
   NICs are attached. This is how the host OS and by extension libvirt,
   nova, etc, see the PCI devices.

 - There is a second PCI controller topology to which the NICs are
   attached. This is only visible to the arm cores for the offload
   engine.

 - Nova/Neutron can identify the NICs based on the PCI topology
   seen by the host OS, but need to tell the NIC mgmt software
   which NIC to use in a way that can be understood by the offload
   cores.

IOW, the PCI address is not usable as a unique identifier because
there are two completely independent PCI topologies with no mapping
between them.

The VPD data provides a replacement way to identify a NIC based on
a unique serial number that is independent of PCI topology. Nova needs
this serial number in order to configure the device offload features.

This seems like a reasonable feature request to me, since there is
a piece of info that apps using libvirt need, and libvirt does not
expose this. Requiring the mgmt app like Nova to dig into the host
PCI config space indicates a clear gap in libvirt functionality.

> I would like to suggest the following to be done in Libvirt:
>
> 1) adding the code for extracting a serial number from VPD for PCI/PCIe
>    devices in general and storing it for exposure via the Libvirt API;
>
> More specifically, I propose adding a nested capability called "vpd" under
> VIR_NODE_DEV_CAP_PCI_DEV:
>
>   <capability type='pci'>
>     <capability type='vpd'>
>       <serial>UNIQUESERIAL</serial>
>       <!-- ... other VPD attributes if present -->
>     </capability>
>     <!-- ... -->
>   </capability>

This looks like a reasonable proposal

> 2) (optional) implementing functionality to obtain a board serial number via
>    devlink-info for PCIe devices if they do not expose a VPD capability
>    but the device driver can retrieve it via firmware. The board serial
>    number can be stored in the same element as suggested above.

Is scenario (2) going to be at all common ? What would be a reason why the
info is not exposed via the standardized VPD - is it just a legacy hardware
issue ?

> Not all devices expose the devlink API and even fewer do expose board serial
> via devlink-info:
>
> * devlink was added in 4.10 [11];
> * devlink-info was introduced in 5.1 [12];
> * querying for board.serial_number was added in kernel 5.9 [13] and iproute2
>   5.9.0 [14];
> * Besides the generic devlink infrastructure support above, device drivers
>   also need to support exposing this field.
>
> Therefore, implementing two approaches (sysfs VPD, devlink) is preferable
> for better compatibility.
>
> I would appreciate any feedback on whether this potential addition makes
> sense. If so, I can look into implementing this.

It makes sense to me.

Regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|