Storing PCI(e) VPD Board Serial Numbers

Hello Libvirt Developers,

I am looking for some feedback on a planned enhancement to Libvirt: the aim is to store a portion of PCI(e) Vital Product Data (VPD) for each device along with the other PCI/PCIe device information already collected. Specifically, the field of interest is the read-only SN (Serial Number) field of a device's VPD data structure, which is described in the PCI/PCIe specs (PCI Local Bus 2.1+ and PCIe 4.0+).

The context for this is the cross-project work in OpenStack (Nova, Neutron), OVS and OVN to support off-path SmartNIC DPUs ([1], [2], [3], [4]). The Nova specification [1] provides an overview of the relevant hardware and the use-case for board serial numbers. However, VPD is a standard capability in the PCI/PCIe specifications and is not tied to this use-case in particular, so the suggestion from the Nova core team was to aim at introducing a means of collecting this information via Libvirt. It can then be retrieved by the respective virt driver in Nova via Libvirt without having to introduce this code into Nova itself.

Quoting the PCI(e) specs:

* "Vital Product Data (VPD) is the information that uniquely defines items such as the hardware, software, and microcode elements of a system.";
* "Vital Product Data is made up of Small and Large Resource Data Types.";
* "Large resource type VPD-R Tag: This tag contains the read only VPD keywords for an add-in card.";
* SN read-only field: "The characters are alphanumeric and represent the unique add-in card Serial Number."

The VPD capability is optional per the specification, so it may or may not appear for PCI(e) endpoints. The devices of interest (SmartNIC DPUs), however, generally have it exposed.

The PCI/PCIe specs define a binary format for VPD, and a sysfs entry exposing a binary blob in that format has been available since kernel v2.6.26 [5]. The relevant sections of the specs are:

* "6.4. Vital Product Data" in the PCI Local Bus specification;
* "6.28 Vital Product Data (VPD)" in the PCIe 4.0 Base Specification.
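To make the binary format mentioned above more concrete, here is a minimal Python sketch of pulling the SN keyword out of a VPD blob. It is not the lspci code [6] or the prototype in [7]; the function names are made up for illustration, and it assumes the usual layout of small/large resource tags (VPD-R as large resource 0x90, end tag 0x78) with VPD-R holding a sequence of 2-byte keyword, 1-byte length, data. It does not verify the RV checksum.

```python
from typing import Dict, Iterator, Optional, Tuple

VPD_R_TAG = 0x90  # large resource tag for the read-only keyword section
END_TAG = 0x78    # small resource end tag


def iter_resources(blob: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (tag, data) for each resource until the end tag or a truncation."""
    offset = 0
    while offset < len(blob):
        first = blob[offset]
        if first & 0x80:
            # Large resource: tag byte followed by a 16-bit little-endian length.
            if offset + 3 > len(blob):
                return
            tag = first
            length = int.from_bytes(blob[offset + 1:offset + 3], "little")
            data_start = offset + 3
        else:
            # Small resource: bits 6:3 hold the tag name, bits 2:0 the length.
            tag = first & 0x78
            length = first & 0x07
            data_start = offset + 1
        data = blob[data_start:data_start + length]
        if len(data) < length:
            return  # truncated blob, stop rather than guess
        yield tag, data
        if tag == END_TAG:
            return
        offset = data_start + length


def parse_keywords(section: bytes) -> Dict[str, bytes]:
    """Split a VPD-R section into {keyword: raw value}."""
    fields: Dict[str, bytes] = {}
    offset = 0
    while offset + 3 <= len(section):
        keyword = section[offset:offset + 2].decode("ascii", errors="replace")
        length = section[offset + 2]
        fields[keyword] = section[offset + 3:offset + 3 + length]
        offset += 3 + length
    return fields


def read_serial(blob: bytes) -> Optional[str]:
    """Return the SN keyword from the VPD-R section, or None if absent."""
    for tag, data in iter_resources(blob):
        if tag == VPD_R_TAG:
            serial = parse_keywords(data).get("SN")
            if serial is not None:
                return serial.decode("ascii", errors="replace").strip()
    return None
```

On a Linux host the blob would typically be read from the per-device "vpd" file under /sys/bus/pci/devices/, which usually requires root and is only present when the device exposes the capability.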
Note that the serial number stored in VPD is not identical to the information stored in the Device Serial Number (DSN) capability, which is also present in the specs: the DSN may identify a single component on a board that presents a multi-function device, while the board itself may have multiple components ([9] also makes a distinction between a board serial and a device serial).

As a reference, there is some code to parse and print the VPD in lspci [6], and there is a prototype along those lines in Python [7], a polished version of which I plan to use in Nova until the relevant functionality appears in Libvirt.

Likewise, the devlink kernel infrastructure, which is already used in Libvirt to query additional device capabilities [8] (e.g. the presence of an eswitch and its switchdev mode), has a devlink-info API [9] that exposes a way to query a board serial number if a device driver exposes it (in turn, by querying controller firmware or via PCIe VPD). This allows doing that in a bus-independent manner (e.g. it would work for PCIe, platform devices or other I/O interconnects), but only for devices that implement the devlink API (which are not necessarily network devices [10], but most of them currently are).

I would like to suggest the following to be done in Libvirt:

1) adding the code for extracting a serial number from VPD for PCI/PCIe devices in general and storing it for exposure via the Libvirt API;

More specifically, I propose adding a nested capability called "vpd" under VIR_NODE_DEV_CAP_PCI_DEV:

    <capability type='pci'>
      <capability type='vpd'>
        <serial>UNIQUESERIAL</serial>
        <!-- ... other VPD attributes if present -->
      </capability>
      <!-- ... -->
    </capability>

2) (optional) implementing functionality to obtain a board serial number via devlink-info for PCIe devices if they do not expose a VPD capability but the device driver can retrieve it via firmware. The board serial number can be stored in the same element as suggested above.
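For illustration, this is roughly how a consumer such as a virt driver might read the serial from the nested capability proposed in item 1. It is only a sketch against the XML shape suggested above, which does not exist in Libvirt yet; the sample document, the device name in it and the function name are assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical node device XML following the proposed "vpd" capability.
SAMPLE_XML = """
<device>
  <name>pci_0000_03_00_0</name>
  <capability type='pci'>
    <capability type='vpd'>
      <serial>UNIQUESERIAL</serial>
    </capability>
  </capability>
</device>
"""


def vpd_serial_from_nodedev_xml(xml_text: str) -> Optional[str]:
    """Return the text of <serial> under the proposed vpd capability, if any."""
    root = ET.fromstring(xml_text)
    node = root.find("./capability[@type='pci']/capability[@type='vpd']/serial")
    if node is None or node.text is None:
        return None
    return node.text.strip() or None
```

Devices without a VPD capability would simply lack the nested element, so callers need to handle a missing serial.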
Not all devices expose the devlink API, and even fewer expose a board serial via devlink-info:

* devlink was added in 4.10 [11];
* devlink-info was introduced in 5.1 [12];
* querying for board.serial_number was added in kernel 5.9 [13] and iproute2 5.9.0 [14];
* besides the generic devlink infrastructure support above, device drivers also need to support exposing this field.

Therefore, implementing two approaches (sysfs VPD, devlink) is preferable for better compatibility.

I would appreciate any feedback on whether this potential addition makes sense. If so, I can look into implementing this.

[1] https://review.opendev.org/c/openstack/nova-specs/+/787458
[2] https://review.opendev.org/c/openstack/neutron-specs/+/788821
[3] https://patchwork.ozlabs.org/project/openvswitch/patch/20210323145032.453120...
[4] https://patchwork.ozlabs.org/project/ovn/patch/20210509140305.1910796-1-frod...
[5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
[6] https://github.com/pciutils/pciutils/blob/v3.7.0/ls-vpd.c#L95-L216
[7] https://gist.github.com/dshcherb/40e982989599a757e5b1e25999501019
[8] https://github.com/libvirt/libvirt/blob/v7.3.0/src/util/virnetdev.c#L3167-L3...
[9] https://www.kernel.org/doc/html/latest/networking/devlink/devlink-info.html
[10] https://www.kernel.org/doc/html/latest/networking/devlink/index.html
[11] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
[12] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
[13] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
[14] https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/?id=7332...

Best Regards,
Dmitrii Shcherbakov
LP: ~dmitriis

On Fri, May 28, 2021 at 10:58:05PM +0300, Dmitrii Shcherbakov wrote:
Hello Libvirt Developers,
I am looking for some feedback on a planned enhancement to Libvirt: the aim is to store a portion of PCI(e) Vital Product Data (VPD) for each device along with the other PCI/PCIe device information already collected. Specifically, the field of interest is the read-only SN (Serial Number) field of a device's VPD data structure, which is described in the PCI/PCIe specs (PCI Local Bus 2.1+ and PCIe 4.0+).
The context for this is the cross-project work in OpenStack (Nova, Neutron), OVS and OVN to support off-path SmartNIC DPUs ([1], [2], [3], [4]). The Nova specification [1] provides an overview of the relevant hardware and the use-case for board serial numbers. However, VPD is a standard capability in the PCI/PCIe specifications and is not tied to this use-case in particular, so the suggestion from the Nova core team was to aim at introducing a means of collecting this information via Libvirt. It can then be retrieved by the respective virt driver in Nova via Libvirt without having to introduce this code into Nova itself.
I've talked with Sean Mooney a little about the use case this morning. IIUC, the high level scenario is as follows:

- The main host machine has a PCI controller topology to which the NICs are attached. This is how the host OS and, by extension, libvirt, nova, etc. see the PCI devices.

- There is a second PCI controller topology to which the NICs are attached. This is only visible to the arm cores for the offload engine.

- Nova/Neutron can identify the NICs based on the PCI topology seen by the host OS, but need to tell the NIC mgmt software which NIC to use in a way that can be understood by the offload cores.

IOW, the PCI address is not usable as a unique identifier because there are two completely independent PCI topologies with no mapping between them.

The VPD data provides a replacement way to identify a NIC based on a unique serial number that is independent of PCI topology. Nova needs this serial number in order to configure the device offload features.

This seems like a reasonable feature request to me, since there is a piece of info that apps using libvirt need, and libvirt does not expose this. Requiring the mgmt app like Nova to dig into the host PCI config space indicates a clear gap in libvirt functionality.
I would like to suggest the following to be done in Libvirt:
1) adding the code for extracting a serial number from VPD for PCI/PCIe devices in general and storing it for exposure via the Libvirt API;

More specifically, I propose adding a nested capability called "vpd" under VIR_NODE_DEV_CAP_PCI_DEV:

    <capability type='pci'>
      <capability type='vpd'>
        <serial>UNIQUESERIAL</serial>
        <!-- ... other VPD attributes if present -->
      </capability>
      <!-- ... -->
    </capability>
This looks like a reasonable proposal
2) (optional) implementing functionality to obtain a board serial number via devlink-info for PCIe devices if they do not expose a VPD capability but the device driver can retrieve it via firmware. The board serial number can be stored in the same element as suggested above.
Is scenario (2) going to be at all common? What would be a reason why the info is not exposed via the standardized VPD - is it just a legacy hardware issue?
Not all devices expose the devlink API and even fewer do expose board serial via devlink-info:
* devlink was added in 4.10 [11];
* devlink-info was introduced in 5.1 [12];
* querying for board.serial_number was added in kernel 5.9 [13] and iproute2 5.9.0 [14];
* besides the generic devlink infrastructure support above, device drivers also need to support exposing this field.
Therefore, implementing two approaches (sysfs VPD, devlink) is preferable for better compatibility.
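The two-source idea can be sketched as a simple preference order: use the VPD serial when present, otherwise fall back to whatever devlink-info reports. This is a hypothetical helper, not Libvirt code; it assumes the devlink-info data for one device has already been collected into a flat mapping keyed by field name, with "board.serial_number" being the field mentioned above.

```python
from typing import Mapping, Optional


def pick_board_serial(
    vpd_serial: Optional[str],
    devlink_info: Optional[Mapping[str, str]],
) -> Optional[str]:
    """Prefer the serial from sysfs VPD, then devlink-info, else None."""
    if vpd_serial:
        return vpd_serial
    if devlink_info:
        # Assumed key name; drivers are not required to expose this field.
        serial = devlink_info.get("board.serial_number")
        if serial:
            return serial
    return None
```

Either source may be missing on a given host, so a caller still has to cope with no serial at all.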
I would appreciate any feedback on whether this potential addition makes sense. If so, I can look into implementing this.
It makes sense to me.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Hi Daniel,

Thanks a lot for the quick reply - it's much appreciated.
IIUC, the high level scenario is as follows
Yes, the high-level description matches the use-case.

In the general case, the SmartNIC DPU cores may even rely on something other than PCIe: i.e. they could use a platform device or a different I/O specification to access the network controller while the hypervisor host would rely on PCIe. The end result is the same though - hypervisor host PCI addresses cannot be relied upon to identify the so-called "port representors" (https://lwn.net/Articles/692942/) on the SmartNIC DPU operating system side.

Moreover, in the general case there can be multiple SmartNIC DPUs per hypervisor, each with its own set of PFs and VFs. In order to determine which DPU is going to handle representor port programming at the control plane level, there needs to be a way to identify a DPU based on a VF selected by the hypervisor (at least in Nova, VF selection is driven from the hypervisor side). A board serial number can be determined from both the hypervisor and the DPU independently, so the hypervisor services can provide the board serial to the network control plane for the discovery of the relevant DPU. That's where Libvirt comes in, helping with serial number retrieval.
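As a rough illustration of that discovery step, the lookup amounts to joining two independently built tables on the board serial. Everything here is hypothetical (the function, the table shapes and the identifiers); it only shows why the serial works as a join key where PCI addresses do not.

```python
from typing import Mapping, Optional


def find_dpu_for_vf(
    vf_address: str,
    serial_by_vf: Mapping[str, str],
    dpu_by_serial: Mapping[str, str],
) -> Optional[str]:
    """Map a hypervisor-side VF to the DPU that shares its board serial."""
    # serial_by_vf is built on the hypervisor (e.g. from VPD via Libvirt).
    serial = serial_by_vf.get(vf_address)
    if serial is None:
        return None
    # dpu_by_serial is built from what each DPU reports about itself.
    return dpu_by_serial.get(serial)
```

For example, with two DPUs in one hypervisor, VFs from each card resolve to different DPUs even though the DPU side never sees the hypervisor's PCI addresses.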
This seems like a reasonable feature request to me, since there is a piece of info that apps using libvirt need, and libvirt does not expose this. Requiring the mgmt app like Nova to dig into the host PCI config space indicates a clear gap in libvirt functionality.
Ack, thanks for confirming.
Is scenario (2) going to be at all common? What would be a reason why the info is not exposed via the standardized VPD - is it just a legacy hardware issue?
VPD is an optional capability in the PCI and PCIe specs. While there is hope that every SmartNIC DPU vendor will implement it seeing the need for it, there might be some fragmentation because the specs do not mandate its presence.

Scenario (2) is an attempt to have an alternative source for the same piece of information: if a serial is available via the driver (which may query NIC firmware instead of reading VPD) it can still be used with the same end result. The devlink-info API does not mandate that a board serial is exposed either, so there is no guarantee this will be available via devlink.

It will surely be simpler to just implement scenario (1) and add (2) later if there is a significant need for it. The generally available hardware I have seen has VPD exposed, so I can just focus on (1) while we can decide on whether to do (2) or not.

Best Regards,
Dmitrii Shcherbakov
LP: ~dmitriis

On Tue, Jun 1, 2021 at 2:32 PM Daniel P. Berrangé <berrange@redhat.com> wrote: