The mlx4 (Mellanox) netdev driver implements the sysfs phys_port_id
file for both VFs and PFs, so you can find the VF netdev plugged into
the same physical port as any given PF netdev by comparing the
contents of phys_port_id of the respective netdevs. That's what
libvirt does when attempting to find the PF netdev for a given VF
netdev (or vice versa).
Most other netdevs drivers don't implement phys_port_id, so the file
shows up in sysfs, but attempts to read it result in ENOTSUPP. In
these cases, libvirt is unable to read phys_port_id, so it just
returns the first entry in the PF/VF's list of netdevs.
But we've found that the i40e driver is in between these two
situations - it implements phys_port_id for PF netdevs, but doesn't
implement it for VF netdevs. So libvirt would successfully read the
phys_port_id of the PF netdev, then try to find a VF netdev with
matching phys_port_id, but would fail because it is NULL for all
VFs. This would result in a message like the following:
Could not find network device with phys_port_id '3cfdfe9edc39'
under PCI device at /sys/class/net/ens4f1/device/virtfn0
To solve this problem in a way that won't break functionality for
anyone else, this patch saves the first netdev name we find for the
device, and returns that if we fail to find a netdev with the desired
phys_port_id.
---
src/util/virpci.c | 30 ++++++++++++++++++++++++++----
1 file changed, 26 insertions(+), 4 deletions(-)
diff --git a/src/util/virpci.c b/src/util/virpci.c
index 5ded77087..402602678 100644
--- a/src/util/virpci.c
+++ b/src/util/virpci.c
@@ -2876,6 +2876,7 @@ virPCIGetNetName(const char *device_link_sysfs_path,
int ret = -1;
DIR *dir = NULL;
struct dirent *entry = NULL;
+ char *firstEntryName = NULL;
char *thisPhysPortID = NULL;
size_t i = 0;
@@ -2902,6 +2903,15 @@ virPCIGetNetName(const char *device_link_sysfs_path,
/* if this one doesn't match, keep looking */
if (STRNEQ_NULLABLE(physPortID, thisPhysPortID)) {
VIR_FREE(thisPhysPortID);
+ /* save the first entry we find to use as a failsafe
+ * in case we don't match the phys_port_id. This is
+ * needed because some NIC drivers (e.g. i40e)
+ * implement phys_port_id for PFs, but not for VFs
+ */
+ if (!firstEntryName &&
+ VIR_STRDUP(firstEntryName, entry->d_name) < 0)
+ goto cleanup;
+
continue;
}
} else {
@@ -2918,10 +2928,21 @@ virPCIGetNetName(const char *device_link_sysfs_path,
if (ret < 0) {
if (physPortID) {
- virReportError(VIR_ERR_INTERNAL_ERROR,
- _("Could not find network device with "
- "phys_port_id '%s' under PCI device at
%s"),
- physPortID, device_link_sysfs_path);
+ if (firstEntryName) {
+ /* we didn't match the provided phys_port_id, but this
+ * is probably because phys_port_id isn't implemented
+ * for this NIC driver, so just return the first
+ * (probably only) netname we found.
+ */
+ *netname = firstEntryName;
+ firstEntryName = NULL;
+ ret = 0;
+ } else {
+ virReportError(VIR_ERR_INTERNAL_ERROR,
+ _("Could not find network device with "
+ "phys_port_id '%s' under PCI device at
%s"),
+ physPortID, device_link_sysfs_path);
+ }
} else {
ret = 0; /* no netdev at the given index is *not* an error */
}
@@ -2930,6 +2951,7 @@ virPCIGetNetName(const char *device_link_sysfs_path,
VIR_DIR_CLOSE(dir);
VIR_FREE(pcidev_sysfs_net_path);
VIR_FREE(thisPhysPortID);
+ VIR_FREE(firstEntryName);
return ret;
}
--
2.13.5
Show replies by date
On Fri, 2017-09-15 at 11:49 -0400, Laine Stump wrote:
@@ -2902,6 +2903,15 @@ virPCIGetNetName(const char
*device_link_sysfs_path,
/* if this one doesn't match, keep looking */
if (STRNEQ_NULLABLE(physPortID, thisPhysPortID)) {
VIR_FREE(thisPhysPortID);
+ /* save the first entry we find to use as a failsafe
+ * in case we don't match the phys_port_id. This is
+ * needed because some NIC drivers (e.g. i40e)
+ * implement phys_port_id for PFs, but not for VFs
+ */
+ if (!firstEntryName &&
+ VIR_STRDUP(firstEntryName, entry->d_name) < 0)
+ goto cleanup;
You need curly braces around the body here.
Looks sane otherwise. ACK
--
Andrea Bolognani / Red Hat / Virtualization