On 8/28/20 6:53 AM, Dmytro Linkin wrote:
The current virPCIGetNetName() logic gets the net device name by
checking its phys_port_id, if the caller provides one, or by its index
(i.e., by its position in the sysfs net directory). This approach worked
fine up until Linux kernel version 5.8, where the NVIDIA Mellanox driver
implemented linking of VFs' representors to the PCI device in switchdev
mode. This means that the device's sysfs net directory will hold
multiple net devices. Ex.:
$ ls '/sys/bus/pci/devices/0000:82:00.0/net'
ens1f0 eth0 eth1
Most switch devices support phys_port_name instead of phys_port_id, so
virPCIGetNetName() will try to get the PF name by its index, 0. The
problem here is that the PF netdev entry may not be the first.
To fix that, for switch devices, we introduce new logic to select the
PF uplink netdev according to the content of phys_port_name. Extend
virPCIGetNetName() with a physPortNameRegex variable to get the proper
device by its phys_port_name scheme, e.g., "p[0-9]+$" to get the PF,
"pf[0-9]+vf[0-9]+$" to get a VF, or "p1$" to get an exact net device. So
now the virPCIGetNetName() logic works in the following sequence:
- filter by phys_port_id, if it's provided,
or
- filter by phys_port_name, if its regex is provided,
or
- get the net device by its index (position) in the sysfs net directory.
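The sequence above can be sketched roughly as follows. This is only an
illustration, not the code from the patch: struct netdev_entry and
select_netdev() are hypothetical names, the entries are held in memory
instead of being read from sysfs, POSIX regcomp()/regexec() stand in for
whatever regex helper the patch uses, and the phys_port_name values are
assumed to have had their trailing newline stripped already.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for one entry of a PCI device's
 * sysfs net directory; the real code reads these values from sysfs. */
struct netdev_entry {
    const char *name;           /* e.g. "ens1f0" */
    const char *phys_port_id;   /* NULL if the driver does not expose it */
    const char *phys_port_name; /* NULL if the driver does not expose it */
};

/* Returns the selected interface name, or NULL if nothing matched
 * (or the regex failed to compile). */
static const char *
select_netdev(const struct netdev_entry *devs, size_t ndevs,
              const char *phys_port_id, const char *name_regex,
              size_t idx)
{
    regex_t re;
    const char *found = NULL;
    size_t i;

    /* 1. Filter by phys_port_id, if the caller provided one. */
    if (phys_port_id) {
        for (i = 0; i < ndevs; i++) {
            if (devs[i].phys_port_id &&
                strcmp(devs[i].phys_port_id, phys_port_id) == 0)
                return devs[i].name;
        }
        return NULL;
    }

    /* 2. Otherwise filter by phys_port_name, if a regex was provided. */
    if (name_regex) {
        if (regcomp(&re, name_regex, REG_EXTENDED | REG_NOSUB) != 0)
            return NULL;
        for (i = 0; i < ndevs; i++) {
            if (devs[i].phys_port_name &&
                regexec(&re, devs[i].phys_port_name, 0, NULL, 0) == 0) {
                found = devs[i].name;
                break;
            }
        }
        regfree(&re);
        return found;
    }

    /* 3. Otherwise fall back to the position in the directory. */
    return idx < ndevs ? devs[idx].name : NULL;
}
```

With entries listed in the (assumed) order eth0 = "pf0vf0",
eth1 = "pf0vf1", ens1f0 = "p0", calling select_netdev() with the regex
"(p[0-9]+$)|(p[0-9]+s[0-9]+$)" returns "ens1f0", while the index-only
fallback with idx 0 returns "eth0".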
Also, make getting content of iface sysfs files more generic.
Signed-off-by: Dmytro Linkin <dlinkin(a)nvidia.com>
Reviewed-by: Adrian Chiris <adrianc(a)nvidia.com>
[...]
+/* Represents format of PF's phys_port_name in switchdev mode:
+ * 'p%u' or 'p%us%u'. New line checked since value is read from sysfs file.
+ */
+# define VIR_PF_PHYS_PORT_NAME_REGEX ((char *)"(p[0-9]+$)|(p[0-9]+s[0-9]+$)")
+
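To make the behaviour of the pattern above concrete, here is a small
standalone sketch. It is not taken from the patch: is_pf_port_name() is
a hypothetical helper, and it uses POSIX extended regexes, whose
handling of '$' next to a trailing newline may differ from the regex
engine the patch relies on. The input is assumed to have had the sysfs
newline stripped.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Pattern copied from the patch. */
#define PF_PHYS_PORT_NAME_REGEX "(p[0-9]+$)|(p[0-9]+s[0-9]+$)"

/* Hypothetical helper: true if port_name matches the PF pattern.
 * Assumes the trailing newline read from sysfs was already stripped;
 * POSIX '$' without REG_NEWLINE only matches at the very end. */
static bool
is_pf_port_name(const char *port_name)
{
    regex_t re;
    bool matched;

    if (regcomp(&re, PF_PHYS_PORT_NAME_REGEX, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    matched = regexec(&re, port_name, 0, NULL, 0) == 0;
    regfree(&re);
    return matched;
}
```

Under these assumptions "p0" and "p1s2" match and "pf0vf1" does not.
Because the pattern is not anchored at the start, any string that merely
ends in 'p' followed by digits (for example "xp0") also matches.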
I've come back to look at this patch several times since it was posted
(sorry for the extreme delay in responding), but just can't figure out
what it's doing with this regex and why the regex is necessary. Not
having access to the hardware that it works with is a bit of a problem,
but perhaps I could get a better idea if you gave a full example of
sysfs contents? My concern with using a regex is that it might work just
fine when using one method for net device naming, but break if that was
changed. Also, it seems counterintuitive that it would be necessary to
look for a device with a name matching a specific pattern; why isn't
there simply a single symbolic link somewhere in the sysfs tree for the
net device that just directly points at its physical port? That would be
so much simpler and more reliable (or at least it would give the
*perception* of being more reliable).