[libvirt] [PATCH 0/3 RFC] macvtap: Implement getPhysfn to support sending a port profile message for a VF to its PF

This patch tries to fix getPhysFn in macvtap.c to get the physical function(PF) of the direct attach interface, if the interface is a SR-IOV VF. It derives the PF/VF relationship from sysfs. There is some code to do this in node device driver. But it is local to the node device driver . I did not see a clean way to use some of the node device stuff here. The node device driver looks at PCI capabilities to get the same information. This implementation tries to not get into PCI capability details and just looks at sysfs paths to derive the PF-VF relationship This patch series implements the following 01/3 - Add function to get the network interface name of a pci device 02/3 - Add functions to get sriov PF/VF relationship of a network interface 03/3 - macvtap: Fix getPhysfn to get the PF of a direct attach network interface Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: Christian Benvenuti <benve@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com>

From: Roopa Prabhu <roprabhu@cisco.com> This function looks at sysfs "net" directory to get the network interface name associated with the pci device Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: Christian Benvenuti <benve@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> --- src/util/pci.c | 42 ++++++++++++++++++++++++++++++++++++++++++ src/util/pci.h | 2 ++ 2 files changed, 44 insertions(+), 0 deletions(-) diff --git a/src/util/pci.c b/src/util/pci.c index a79c164..d2deeef 100644 --- a/src/util/pci.c +++ b/src/util/pci.c @@ -1679,3 +1679,45 @@ int pciDeviceIsAssignable(pciDevice *dev, return 1; } + +/* + * This function returns the network device name + * of a pci device + */ +int +pciDeviceGetNetName(char *device_link_sysfs_path, char **netname) +{ + char *pcidev_sysfs_net_path = NULL; + int ret = -1; + DIR *dir = NULL; + struct dirent *entry = NULL; + + if (virBuildPath(&pcidev_sysfs_net_path, device_link_sysfs_path, + "net") == -1) { + virReportOOMError(); + return -1; + } + + dir = opendir(pcidev_sysfs_net_path); + if (dir == NULL) + goto out; + + while ((entry = readdir(dir))) { + if (entry->d_name[0] == '.' || + !strcmp(entry->d_name, "..")) + continue; + + /* Assume a single directory entry */ + *netname = strdup(entry->d_name); + ret = 0; + break; + } + + if (dir) + closedir(dir); + +out: + VIR_FREE(pcidev_sysfs_net_path); + + return ret; +} diff --git a/src/util/pci.h b/src/util/pci.h index a351baf..fa25472 100644 --- a/src/util/pci.h +++ b/src/util/pci.h @@ -74,4 +74,6 @@ int pciDeviceIsAssignable(pciDevice *dev, int strict_acs_check); int pciWaitForDeviceCleanup(pciDevice *dev, const char *matcher); +int pciDeviceGetNetName(char *device_link_sysfs_path, char **netname); + #endif /* __VIR_PCI_H__ */

On 08/01/2011 07:57 PM, Roopa Prabhu wrote:
From: Roopa Prabhu<roprabhu@cisco.com>
This function looks at sysfs "net" directory to get the network interface name associated with the pci device
Signed-off-by: Roopa Prabhu<roprabhu@cisco.com> Signed-off-by: Christian Benvenuti<benve@cisco.com> Signed-off-by: David Wang<dwang2@cisco.com> --- src/util/pci.c | 42 ++++++++++++++++++++++++++++++++++++++++++ src/util/pci.h | 2 ++ 2 files changed, 44 insertions(+), 0 deletions(-)
diff --git a/src/util/pci.c b/src/util/pci.c index a79c164..d2deeef 100644 --- a/src/util/pci.c +++ b/src/util/pci.c @@ -1679,3 +1679,45 @@ int pciDeviceIsAssignable(pciDevice *dev,
return 1; } + +/* + * This function returns the network device name + * of a pci device + */ +int +pciDeviceGetNetName(char *device_link_sysfs_path, char **netname) +{ + char *pcidev_sysfs_net_path = NULL; + int ret = -1; + DIR *dir = NULL; + struct dirent *entry = NULL; + + if (virBuildPath(&pcidev_sysfs_net_path, device_link_sysfs_path, + "net") == -1) { + virReportOOMError(); + return -1; + } + + dir = opendir(pcidev_sysfs_net_path); + if (dir == NULL) + goto out; + + while ((entry = readdir(dir))) { + if (entry->d_name[0] == '.' || The above check makes the following one obsolete. If all entries with first letter '.' can be skipped, then you could just keep the first one, otherwise I'd also use !strcmp(entry->d_name, "."). + !strcmp(entry->d_name, "..")) + continue; + + /* Assume a single directory entry */ + *netname = strdup(entry->d_name); + ret = 0; + break; + } + + if (dir) + closedir(dir); Check for 'dir != NULL) is not necessary due to goto above. + +out: + VIR_FREE(pcidev_sysfs_net_path); + + return ret; +} diff --git a/src/util/pci.h b/src/util/pci.h index a351baf..fa25472 100644 --- a/src/util/pci.h +++ b/src/util/pci.h @@ -74,4 +74,6 @@ int pciDeviceIsAssignable(pciDevice *dev, int strict_acs_check); int pciWaitForDeviceCleanup(pciDevice *dev, const char *matcher);
+int pciDeviceGetNetName(char *device_link_sysfs_path, char **netname); + #endif /* __VIR_PCI_H__ */
Stefan

On 8/1/11 6:00 PM, "Stefan Berger" <stefanb@linux.vnet.ibm.com> wrote:
On 08/01/2011 07:57 PM, Roopa Prabhu wrote:
From: Roopa Prabhu<roprabhu@cisco.com>
This function looks at sysfs "net" directory to get the network interface name associated with the pci device
Signed-off-by: Roopa Prabhu<roprabhu@cisco.com> Signed-off-by: Christian Benvenuti<benve@cisco.com> Signed-off-by: David Wang<dwang2@cisco.com> --- src/util/pci.c | 42 ++++++++++++++++++++++++++++++++++++++++++ src/util/pci.h | 2 ++ 2 files changed, 44 insertions(+), 0 deletions(-)
diff --git a/src/util/pci.c b/src/util/pci.c index a79c164..d2deeef 100644 --- a/src/util/pci.c +++ b/src/util/pci.c @@ -1679,3 +1679,45 @@ int pciDeviceIsAssignable(pciDevice *dev,
return 1; } + +/* + * This function returns the network device name + * of a pci device + */ +int +pciDeviceGetNetName(char *device_link_sysfs_path, char **netname) +{ + char *pcidev_sysfs_net_path = NULL; + int ret = -1; + DIR *dir = NULL; + struct dirent *entry = NULL; + + if (virBuildPath(&pcidev_sysfs_net_path, device_link_sysfs_path, + "net") == -1) { + virReportOOMError(); + return -1; + } + + dir = opendir(pcidev_sysfs_net_path); + if (dir == NULL) + goto out; + + while ((entry = readdir(dir))) { + if (entry->d_name[0] == '.' || The above check makes the following one obsolete. If all entries with first letter '.' can be skipped, then you could just keep the first one, otherwise I'd also use !strcmp(entry->d_name, ".").
Agree, yes I should use strcmp. Did not think of all cases here because the net dir normally contains one entry. I will also revisit this check for cases when there might be more net devices associated with a pci device.
+ !strcmp(entry->d_name, "..")) + continue; + + /* Assume a single directory entry */ + *netname = strdup(entry->d_name); + ret = 0; + break; + } + + if (dir) + closedir(dir); Check for 'dir != NULL) is not necessary due to goto above.
Yes will fix. Thanks stefan.
+ +out: + VIR_FREE(pcidev_sysfs_net_path); + + return ret; +} diff --git a/src/util/pci.h b/src/util/pci.h index a351baf..fa25472 100644 --- a/src/util/pci.h +++ b/src/util/pci.h @@ -74,4 +74,6 @@ int pciDeviceIsAssignable(pciDevice *dev, int strict_acs_check); int pciWaitForDeviceCleanup(pciDevice *dev, const char *matcher);
+int pciDeviceGetNetName(char *device_link_sysfs_path, char **netname); + #endif /* __VIR_PCI_H__ */
Stefan

From: Roopa Prabhu <roprabhu@cisco.com> This patch adds helper functions to derive the PF/VF relationship of an sriov network device Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: Christian Benvenuti <benve@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> --- src/util/interface.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++++++ src/util/interface.h | 8 +++ 2 files changed, 130 insertions(+), 0 deletions(-) diff --git a/src/util/interface.c b/src/util/interface.c index f5eecfb..5ee5703 100644 --- a/src/util/interface.c +++ b/src/util/interface.c @@ -30,6 +30,7 @@ #include <sys/ioctl.h> #include <fcntl.h> #include <netinet/in.h> +#include <dirent.h> #ifdef __linux__ # include <linux/if.h> @@ -45,6 +46,8 @@ #include "virfile.h" #include "memory.h" #include "netlink.h" +#include "pci.h" +#include "logging.h" #define VIR_FROM_THIS VIR_FROM_NET @@ -1197,3 +1200,122 @@ ifaceRestoreMacAddress(const char *linkdev, return rc; } + +int ifaceIsVF(const char *ifname) +{ + char *if_sysfs_device_link = NULL; + int ret; + + if (virAsprintf(&if_sysfs_device_link, NET_SYSFS "%s/device/physfn", + ifname) < 0) { + virReportOOMError(); + return -1; + } + + ret = virFileExists(if_sysfs_device_link); + + VIR_FREE(if_sysfs_device_link); + + return ret; +} + +int ifaceGetVFIndex(const char *pfname, const char *vfname, + int *vf_index) +{ + int ret = -1; + DIR *dir = NULL; + struct dirent *entry = NULL; + char *pf_sysfs_device_link = NULL, *vf_sysfs_device_link = NULL; + char *vf_sysfs_device = NULL; + char errbuf[64]; + + if (virAsprintf(&pf_sysfs_device_link, NET_SYSFS "%s/device", + pfname) < 0) { + virReportOOMError(); + return -1; + } + + if (virAsprintf(&vf_sysfs_device_link, NET_SYSFS "%s/device", + vfname) < 0) { + VIR_FREE(pf_sysfs_device_link); + virReportOOMError(); + return -1; + } + + vf_sysfs_device = canonicalize_file_name(vf_sysfs_device_link); + if (vf_sysfs_device == NULL) { + memset(errbuf, '\0', sizeof(errbuf)); + VIR_ERROR(_("Failed to resolve device link '%s': '%s'"), + vf_sysfs_device_link, + virStrerror(errno, errbuf, sizeof(errbuf))); + VIR_FREE(pf_sysfs_device_link); + VIR_FREE(vf_sysfs_device_link); + return -1; + } + + dir = opendir(pf_sysfs_device_link); + if (dir == NULL) + goto out; + + while ((entry = readdir(dir))) { + if (STRPREFIX(entry->d_name, "virtfn")) { + char *device_link, *device_path; + + if (virBuildPath(&device_link, pf_sysfs_device_link, + entry->d_name) == -1) { + virReportOOMError(); + goto out; + } + + device_path = canonicalize_file_name(device_link); + if (device_path == NULL) { + memset(errbuf, '\0', sizeof(errbuf)); + VIR_ERROR(_("Failed to resolve device link '%s': '%s'"), + device_link, virStrerror(errno, errbuf, + sizeof(errbuf))); + VIR_FREE(device_link); + goto out; + } + + if (!strcmp(vf_sysfs_device, device_path)) { + *vf_index = atoi(entry->d_name + strlen("virtfn")); + ret = 0; + } + + VIR_FREE(device_link); + VIR_FREE(device_path); + + if ( ret == 0 ) + break; + } + } + + if (dir) + closedir(dir); + +out: + + VIR_FREE(pf_sysfs_device_link); + VIR_FREE(vf_sysfs_device_link); + VIR_FREE(vf_sysfs_device); + + return ret; +} + +int ifaceGetPFName(const char *ifname, char **pfname) +{ + char *physfn_sysfs_path = NULL; + int ret = -1; + + if (virAsprintf(&physfn_sysfs_path, NET_SYSFS "%s/device/physfn", + ifname) < 0) { + virReportOOMError(); + return -1; + } + + ret = pciDeviceGetNetName(physfn_sysfs_path, pfname); + + VIR_FREE(physfn_sysfs_path); + + return ret; +} diff --git a/src/util/interface.h b/src/util/interface.h index 9647653..f2c84f8 100644 --- a/src/util/interface.h +++ b/src/util/interface.h @@ -26,6 +26,8 @@ struct nlattr; # include "datatypes.h" # include "network.h" +#define NET_SYSFS "/sys/class/net/" + int ifaceGetFlags(const char *name, short *flags); int ifaceIsUp(const char *name, bool *up); @@ -76,4 +78,10 @@ int ifaceReplaceMacAddress(const unsigned char *macaddress, int ifaceRestoreMacAddress(const char *linkdev, const char *stateDir); +int ifaceIsVF(const char *ifname); + +int ifaceGetVFIndex(const char *pfname, const char *vfname, int *vf_index); + +int ifaceGetPFName(const char *ifname, char **pfname); + #endif /* __VIR_INTERFACE_H__ */

On 08/01/2011 07:57 PM, Roopa Prabhu wrote:
From: Roopa Prabhu<roprabhu@cisco.com>
This patch adds helper functions to derive the PF/VF relationship of an sriov network device
Signed-off-by: Roopa Prabhu<roprabhu@cisco.com> Signed-off-by: Christian Benvenuti<benve@cisco.com> Signed-off-by: David Wang<dwang2@cisco.com> --- src/util/interface.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++++++ src/util/interface.h | 8 +++ 2 files changed, 130 insertions(+), 0 deletions(-)
diff --git a/src/util/interface.c b/src/util/interface.c index f5eecfb..5ee5703 100644 --- a/src/util/interface.c +++ b/src/util/interface.c @@ -30,6 +30,7 @@ #include<sys/ioctl.h> #include<fcntl.h> #include<netinet/in.h> +#include<dirent.h>
#ifdef __linux__ # include<linux/if.h> @@ -45,6 +46,8 @@ #include "virfile.h" #include "memory.h" #include "netlink.h" +#include "pci.h" +#include "logging.h"
#define VIR_FROM_THIS VIR_FROM_NET
@@ -1197,3 +1200,122 @@ ifaceRestoreMacAddress(const char *linkdev,
return rc; } + +int ifaceIsVF(const char *ifname) +{ + char *if_sysfs_device_link = NULL; + int ret; + + if (virAsprintf(&if_sysfs_device_link, NET_SYSFS "%s/device/physfn", + ifname)< 0) { + virReportOOMError(); + return -1; + } + + ret = virFileExists(if_sysfs_device_link); + + VIR_FREE(if_sysfs_device_link); + + return ret; +} + +int ifaceGetVFIndex(const char *pfname, const char *vfname, + int *vf_index) +{ + int ret = -1; + DIR *dir = NULL; + struct dirent *entry = NULL; + char *pf_sysfs_device_link = NULL, *vf_sysfs_device_link = NULL; + char *vf_sysfs_device = NULL; + char errbuf[64]; + + if (virAsprintf(&pf_sysfs_device_link, NET_SYSFS "%s/device", + pfname)< 0) { + virReportOOMError(); + return -1; + } + + if (virAsprintf(&vf_sysfs_device_link, NET_SYSFS "%s/device", + vfname)< 0) { + VIR_FREE(pf_sysfs_device_link); + virReportOOMError(); + return -1; + } + + vf_sysfs_device = canonicalize_file_name(vf_sysfs_device_link); + if (vf_sysfs_device == NULL) { + memset(errbuf, '\0', sizeof(errbuf)); + VIR_ERROR(_("Failed to resolve device link '%s': '%s'"), + vf_sysfs_device_link, + virStrerror(errno, errbuf, sizeof(errbuf))); + VIR_FREE(pf_sysfs_device_link); + VIR_FREE(vf_sysfs_device_link); + return -1; + } + + dir = opendir(pf_sysfs_device_link); + if (dir == NULL) + goto out; + + while ((entry = readdir(dir))) { + if (STRPREFIX(entry->d_name, "virtfn")) { + char *device_link, *device_path; + + if (virBuildPath(&device_link, pf_sysfs_device_link, + entry->d_name) == -1) { + virReportOOMError(); + goto out; + } + + device_path = canonicalize_file_name(device_link); + if (device_path == NULL) { + memset(errbuf, '\0', sizeof(errbuf)); + VIR_ERROR(_("Failed to resolve device link '%s': '%s'"), + device_link, virStrerror(errno, errbuf, + sizeof(errbuf))); + VIR_FREE(device_link); + goto out; + } + + if (!strcmp(vf_sysfs_device, device_path)) { + *vf_index = atoi(entry->d_name + strlen("virtfn")); This looks odd. Can you explain? + ret = 0; + } + + VIR_FREE(device_link); + VIR_FREE(device_path); + + if ( ret == 0 ) + break; + } + } + + if (dir) + closedir(dir);
In this case the above test should be after the 'out:' label not to cause a memory leak with the goto's above.
+ +out: + + VIR_FREE(pf_sysfs_device_link); + VIR_FREE(vf_sysfs_device_link); + VIR_FREE(vf_sysfs_device); + + return ret; +} + +int ifaceGetPFName(const char *ifname, char **pfname) +{ + char *physfn_sysfs_path = NULL; + int ret = -1; + + if (virAsprintf(&physfn_sysfs_path, NET_SYSFS "%s/device/physfn", + ifname)< 0) { + virReportOOMError(); + return -1; + } + + ret = pciDeviceGetNetName(physfn_sysfs_path, pfname); + + VIR_FREE(physfn_sysfs_path); + + return ret; +} diff --git a/src/util/interface.h b/src/util/interface.h index 9647653..f2c84f8 100644 --- a/src/util/interface.h +++ b/src/util/interface.h @@ -26,6 +26,8 @@ struct nlattr; # include "datatypes.h" # include "network.h"
+#define NET_SYSFS "/sys/class/net/" + int ifaceGetFlags(const char *name, short *flags); int ifaceIsUp(const char *name, bool *up);
@@ -76,4 +78,10 @@ int ifaceReplaceMacAddress(const unsigned char *macaddress, int ifaceRestoreMacAddress(const char *linkdev, const char *stateDir);
+int ifaceIsVF(const char *ifname); + +int ifaceGetVFIndex(const char *pfname, const char *vfname, int *vf_index); + +int ifaceGetPFName(const char *ifname, char **pfname); + #endif /* __VIR_INTERFACE_H__ */
Stefan

On 8/1/11 6:10 PM, "Stefan Berger" <stefanb@linux.vnet.ibm.com> wrote:
On 08/01/2011 07:57 PM, Roopa Prabhu wrote:
From: Roopa Prabhu<roprabhu@cisco.com>
This patch adds helper functions to derive the PF/VF relationship of an sriov network device
Signed-off-by: Roopa Prabhu<roprabhu@cisco.com> Signed-off-by: Christian Benvenuti<benve@cisco.com> Signed-off-by: David Wang<dwang2@cisco.com> --- src/util/interface.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++++++ src/util/interface.h | 8 +++ 2 files changed, 130 insertions(+), 0 deletions(-)
diff --git a/src/util/interface.c b/src/util/interface.c index f5eecfb..5ee5703 100644 --- a/src/util/interface.c +++ b/src/util/interface.c @@ -30,6 +30,7 @@ #include<sys/ioctl.h> #include<fcntl.h> #include<netinet/in.h> +#include<dirent.h>
#ifdef __linux__ # include<linux/if.h> @@ -45,6 +46,8 @@ #include "virfile.h" #include "memory.h" #include "netlink.h" +#include "pci.h" +#include "logging.h"
#define VIR_FROM_THIS VIR_FROM_NET
@@ -1197,3 +1200,122 @@ ifaceRestoreMacAddress(const char *linkdev,
return rc; } + +int ifaceIsVF(const char *ifname) +{ + char *if_sysfs_device_link = NULL; + int ret; + + if (virAsprintf(&if_sysfs_device_link, NET_SYSFS "%s/device/physfn", + ifname)< 0) { + virReportOOMError(); + return -1; + } + + ret = virFileExists(if_sysfs_device_link); + + VIR_FREE(if_sysfs_device_link); + + return ret; +} + +int ifaceGetVFIndex(const char *pfname, const char *vfname, + int *vf_index) +{ + int ret = -1; + DIR *dir = NULL; + struct dirent *entry = NULL; + char *pf_sysfs_device_link = NULL, *vf_sysfs_device_link = NULL; + char *vf_sysfs_device = NULL; + char errbuf[64]; + + if (virAsprintf(&pf_sysfs_device_link, NET_SYSFS "%s/device", + pfname)< 0) { + virReportOOMError(); + return -1; + } + + if (virAsprintf(&vf_sysfs_device_link, NET_SYSFS "%s/device", + vfname)< 0) { + VIR_FREE(pf_sysfs_device_link); + virReportOOMError(); + return -1; + } + + vf_sysfs_device = canonicalize_file_name(vf_sysfs_device_link); + if (vf_sysfs_device == NULL) { + memset(errbuf, '\0', sizeof(errbuf)); + VIR_ERROR(_("Failed to resolve device link '%s': '%s'"), + vf_sysfs_device_link, + virStrerror(errno, errbuf, sizeof(errbuf))); + VIR_FREE(pf_sysfs_device_link); + VIR_FREE(vf_sysfs_device_link); + return -1; + } + + dir = opendir(pf_sysfs_device_link); + if (dir == NULL) + goto out; + + while ((entry = readdir(dir))) { + if (STRPREFIX(entry->d_name, "virtfn")) { + char *device_link, *device_path; + + if (virBuildPath(&device_link, pf_sysfs_device_link, + entry->d_name) == -1) { + virReportOOMError(); + goto out; + } + + device_path = canonicalize_file_name(device_link); + if (device_path == NULL) { + memset(errbuf, '\0', sizeof(errbuf)); + VIR_ERROR(_("Failed to resolve device link '%s': '%s'"), + device_link, virStrerror(errno, errbuf, + sizeof(errbuf))); + VIR_FREE(device_link); + goto out; + } + + if (!strcmp(vf_sysfs_device, device_path)) { + *vf_index = atoi(entry->d_name + strlen("virtfn")); This looks odd. Can you explain?
The PF device sysfs directory has links with the name "virtfn<vf_index>" pointing to the VF's pci device. Example: # ls -l virtfn* lrwxrwxrwx. 1 root root 0 Jul 27 11:09 virtfn0 -> ../0000:1e:00.1 lrwxrwxrwx. 1 root root 0 Jul 27 11:09 virtfn1 -> ../0000:1e:00.2 lrwxrwxrwx. 1 root root 0 Jul 27 11:09 virtfn10 -> ../0000:1e:01.3 lrwxrwxrwx. 1 root root 0 Jul 27 11:09 virtfn11 -> ../0000:1e:01.4 lrwxrwxrwx. 1 root root 0 Jul 27 11:09 virtfn12 -> ../0000:1e:01.5 I see if the virt function link is pointing to the pci device of the VF I am looking for. If it is, then I try to get the vf index from the virtual function link name (I just need the number in "virtfn<vf_index>". This is one way to get the vf index. The other way is to get it by reading the pci config space of the PF, which the node device driver uses.
+ ret = 0; + } + + VIR_FREE(device_link); + VIR_FREE(device_path); + + if ( ret == 0 ) + break; + } + } + + if (dir) + closedir(dir);
In this case the above test should be after the 'out:' label not to cause a memory leak with the goto's above.
Thanks for catching it. Exactly similar lines of code exists in the previous patch. There was a bug in the out label location in previous patch. When I fixed the previous patch I accidently made the change in this patch too introducing a leak here. :) Will fix it. Thanks stefan.
+ +out: + + VIR_FREE(pf_sysfs_device_link); + VIR_FREE(vf_sysfs_device_link); + VIR_FREE(vf_sysfs_device); + + return ret; +} + +int ifaceGetPFName(const char *ifname, char **pfname) +{ + char *physfn_sysfs_path = NULL; + int ret = -1; + + if (virAsprintf(&physfn_sysfs_path, NET_SYSFS "%s/device/physfn", + ifname)< 0) { + virReportOOMError(); + return -1; + } + + ret = pciDeviceGetNetName(physfn_sysfs_path, pfname); + + VIR_FREE(physfn_sysfs_path); + + return ret; +} diff --git a/src/util/interface.h b/src/util/interface.h index 9647653..f2c84f8 100644 --- a/src/util/interface.h +++ b/src/util/interface.h @@ -26,6 +26,8 @@ struct nlattr; # include "datatypes.h" # include "network.h"
+#define NET_SYSFS "/sys/class/net/" + int ifaceGetFlags(const char *name, short *flags); int ifaceIsUp(const char *name, bool *up);
@@ -76,4 +78,10 @@ int ifaceReplaceMacAddress(const unsigned char *macaddress, int ifaceRestoreMacAddress(const char *linkdev, const char *stateDir);
+int ifaceIsVF(const char *ifname); + +int ifaceGetVFIndex(const char *pfname, const char *vfname, int *vf_index); + +int ifaceGetPFName(const char *ifname, char **pfname); + #endif /* __VIR_INTERFACE_H__ */
Stefan

On 08/02/2011 08:46 AM, Roopa Prabhu wrote:
On 8/1/11 6:10 PM, "Stefan Berger"<stefanb@linux.vnet.ibm.com> wrote:
From: Roopa Prabhu<roprabhu@cisco.com>
This patch adds helper functions to derive the PF/VF relationship of an sriov network device
Signed-off-by: Roopa Prabhu<roprabhu@cisco.com> Signed-off-by: Christian Benvenuti<benve@cisco.com> Signed-off-by: David Wang<dwang2@cisco.com> --- src/util/interface.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++++++ src/util/interface.h | 8 +++ 2 files changed, 130 insertions(+), 0 deletions(-)
diff --git a/src/util/interface.c b/src/util/interface.c index f5eecfb..5ee5703 100644 --- a/src/util/interface.c +++ b/src/util/interface.c @@ -30,6 +30,7 @@ #include<sys/ioctl.h> #include<fcntl.h> #include<netinet/in.h> +#include<dirent.h>
#ifdef __linux__ # include<linux/if.h> @@ -45,6 +46,8 @@ #include "virfile.h" #include "memory.h" #include "netlink.h" +#include "pci.h" +#include "logging.h"
#define VIR_FROM_THIS VIR_FROM_NET
@@ -1197,3 +1200,122 @@ ifaceRestoreMacAddress(const char *linkdev,
return rc; } + +int ifaceIsVF(const char *ifname) +{ + char *if_sysfs_device_link = NULL; + int ret; + + if (virAsprintf(&if_sysfs_device_link, NET_SYSFS "%s/device/physfn", + ifname)< 0) { + virReportOOMError(); + return -1; + } + + ret = virFileExists(if_sysfs_device_link); + + VIR_FREE(if_sysfs_device_link); + + return ret; +} + +int ifaceGetVFIndex(const char *pfname, const char *vfname, + int *vf_index) +{ + int ret = -1; + DIR *dir = NULL; + struct dirent *entry = NULL; + char *pf_sysfs_device_link = NULL, *vf_sysfs_device_link = NULL; + char *vf_sysfs_device = NULL; + char errbuf[64]; + + if (virAsprintf(&pf_sysfs_device_link, NET_SYSFS "%s/device", + pfname)< 0) { + virReportOOMError(); + return -1; + } + + if (virAsprintf(&vf_sysfs_device_link, NET_SYSFS "%s/device", + vfname)< 0) { + VIR_FREE(pf_sysfs_device_link); + virReportOOMError(); + return -1; + } + + vf_sysfs_device = canonicalize_file_name(vf_sysfs_device_link); + if (vf_sysfs_device == NULL) { + memset(errbuf, '\0', sizeof(errbuf)); + VIR_ERROR(_("Failed to resolve device link '%s': '%s'"), + vf_sysfs_device_link, + virStrerror(errno, errbuf, sizeof(errbuf))); + VIR_FREE(pf_sysfs_device_link); + VIR_FREE(vf_sysfs_device_link); + return -1; + } + + dir = opendir(pf_sysfs_device_link); + if (dir == NULL) + goto out; + + while ((entry = readdir(dir))) { + if (STRPREFIX(entry->d_name, "virtfn")) { + char *device_link, *device_path; + + if (virBuildPath(&device_link, pf_sysfs_device_link, + entry->d_name) == -1) { + virReportOOMError(); + goto out; + } + + device_path = canonicalize_file_name(device_link); + if (device_path == NULL) { + memset(errbuf, '\0', sizeof(errbuf)); + VIR_ERROR(_("Failed to resolve device link '%s': '%s'"), + device_link, virStrerror(errno, errbuf, + sizeof(errbuf))); + VIR_FREE(device_link); + goto out; + } + + if (!strcmp(vf_sysfs_device, device_path)) { + *vf_index = atoi(entry->d_name + strlen("virtfn")); This looks odd. Can you explain? The PF device sysfs directory has links with the name "virtfn<vf_index>"
On 08/01/2011 07:57 PM, Roopa Prabhu wrote: pointing to the VF's pci device.
Example: # ls -l virtfn* lrwxrwxrwx. 1 root root 0 Jul 27 11:09 virtfn0 -> ../0000:1e:00.1 lrwxrwxrwx. 1 root root 0 Jul 27 11:09 virtfn1 -> ../0000:1e:00.2 lrwxrwxrwx. 1 root root 0 Jul 27 11:09 virtfn10 -> ../0000:1e:01.3 lrwxrwxrwx. 1 root root 0 Jul 27 11:09 virtfn11 -> ../0000:1e:01.4 lrwxrwxrwx. 1 root root 0 Jul 27 11:09 virtfn12 -> ../0000:1e:01.5
I see if the virt function link is pointing to the pci device of the VF I am looking for. If it is, then I try to get the vf index from the virtual function link name (I just need the number in "virtfn<vf_index>". This is one way to get the vf index. The other way is to get it by reading the pci config space of the PF, which the node device driver uses.
In that case I'd suggest to use 'if (sscanf(entry->d_name, "virtfn%u", &vf_index) != 1) { error }' Stefan
+ ret = 0; + } + + VIR_FREE(device_link); + VIR_FREE(device_path); + + if ( ret == 0 ) + break; + } + } + + if (dir) + closedir(dir); In this case the above test should be after the 'out:' label not to cause a memory leak with the goto's above.
Thanks for catching it. Exactly similar lines of code exists in the previous patch. There was a bug in the out label location in previous patch. When I fixed the previous patch I accidently made the change in this patch too introducing a leak here. :) Will fix it. Thanks stefan.
+ +out: + + VIR_FREE(pf_sysfs_device_link); + VIR_FREE(vf_sysfs_device_link); + VIR_FREE(vf_sysfs_device); + + return ret; +} + +int ifaceGetPFName(const char *ifname, char **pfname) +{ + char *physfn_sysfs_path = NULL; + int ret = -1; + + if (virAsprintf(&physfn_sysfs_path, NET_SYSFS "%s/device/physfn", + ifname)< 0) { + virReportOOMError(); + return -1; + } + + ret = pciDeviceGetNetName(physfn_sysfs_path, pfname); + + VIR_FREE(physfn_sysfs_path); + + return ret; +} diff --git a/src/util/interface.h b/src/util/interface.h index 9647653..f2c84f8 100644 --- a/src/util/interface.h +++ b/src/util/interface.h @@ -26,6 +26,8 @@ struct nlattr; # include "datatypes.h" # include "network.h"
+#define NET_SYSFS "/sys/class/net/" + int ifaceGetFlags(const char *name, short *flags); int ifaceIsUp(const char *name, bool *up);
@@ -76,4 +78,10 @@ int ifaceReplaceMacAddress(const unsigned char *macaddress, int ifaceRestoreMacAddress(const char *linkdev, const char *stateDir);
+int ifaceIsVF(const char *ifname); + +int ifaceGetVFIndex(const char *pfname, const char *vfname, int *vf_index); + +int ifaceGetPFName(const char *ifname, char **pfname); + #endif /* __VIR_INTERFACE_H__ */
Stefan

On 8/2/11 6:46 AM, "Stefan Berger" <stefanb@linux.vnet.ibm.com> wrote:
On 08/02/2011 08:46 AM, Roopa Prabhu wrote:
On 8/1/11 6:10 PM, "Stefan Berger"<stefanb@linux.vnet.ibm.com> wrote:
From: Roopa Prabhu<roprabhu@cisco.com>
This patch adds helper functions to derive the PF/VF relationship of an sriov network device
Signed-off-by: Roopa Prabhu<roprabhu@cisco.com> Signed-off-by: Christian Benvenuti<benve@cisco.com> Signed-off-by: David Wang<dwang2@cisco.com> --- src/util/interface.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++++++ src/util/interface.h | 8 +++ 2 files changed, 130 insertions(+), 0 deletions(-)
diff --git a/src/util/interface.c b/src/util/interface.c index f5eecfb..5ee5703 100644 --- a/src/util/interface.c +++ b/src/util/interface.c @@ -30,6 +30,7 @@ #include<sys/ioctl.h> #include<fcntl.h> #include<netinet/in.h> +#include<dirent.h>
#ifdef __linux__ # include<linux/if.h> @@ -45,6 +46,8 @@ #include "virfile.h" #include "memory.h" #include "netlink.h" +#include "pci.h" +#include "logging.h"
#define VIR_FROM_THIS VIR_FROM_NET
@@ -1197,3 +1200,122 @@ ifaceRestoreMacAddress(const char *linkdev,
return rc; } + +int ifaceIsVF(const char *ifname) +{ + char *if_sysfs_device_link = NULL; + int ret; + + if (virAsprintf(&if_sysfs_device_link, NET_SYSFS "%s/device/physfn", + ifname)< 0) { + virReportOOMError(); + return -1; + } + + ret = virFileExists(if_sysfs_device_link); + + VIR_FREE(if_sysfs_device_link); + + return ret; +} + +int ifaceGetVFIndex(const char *pfname, const char *vfname, + int *vf_index) +{ + int ret = -1; + DIR *dir = NULL; + struct dirent *entry = NULL; + char *pf_sysfs_device_link = NULL, *vf_sysfs_device_link = NULL; + char *vf_sysfs_device = NULL; + char errbuf[64]; + + if (virAsprintf(&pf_sysfs_device_link, NET_SYSFS "%s/device", + pfname)< 0) { + virReportOOMError(); + return -1; + } + + if (virAsprintf(&vf_sysfs_device_link, NET_SYSFS "%s/device", + vfname)< 0) { + VIR_FREE(pf_sysfs_device_link); + virReportOOMError(); + return -1; + } + + vf_sysfs_device = canonicalize_file_name(vf_sysfs_device_link); + if (vf_sysfs_device == NULL) { + memset(errbuf, '\0', sizeof(errbuf)); + VIR_ERROR(_("Failed to resolve device link '%s': '%s'"), + vf_sysfs_device_link, + virStrerror(errno, errbuf, sizeof(errbuf))); + VIR_FREE(pf_sysfs_device_link); + VIR_FREE(vf_sysfs_device_link); + return -1; + } + + dir = opendir(pf_sysfs_device_link); + if (dir == NULL) + goto out; + + while ((entry = readdir(dir))) { + if (STRPREFIX(entry->d_name, "virtfn")) { + char *device_link, *device_path; + + if (virBuildPath(&device_link, pf_sysfs_device_link, + entry->d_name) == -1) { + virReportOOMError(); + goto out; + } + + device_path = canonicalize_file_name(device_link); + if (device_path == NULL) { + memset(errbuf, '\0', sizeof(errbuf)); + VIR_ERROR(_("Failed to resolve device link '%s': '%s'"), + device_link, virStrerror(errno, errbuf, + sizeof(errbuf))); + VIR_FREE(device_link); + goto out; + } + + if (!strcmp(vf_sysfs_device, device_path)) { + *vf_index = atoi(entry->d_name + strlen("virtfn")); This looks odd. Can you explain? The PF device sysfs directory has links with the name "virtfn<vf_index>"
On 08/01/2011 07:57 PM, Roopa Prabhu wrote: pointing to the VF's pci device.
Example: # ls -l virtfn* lrwxrwxrwx. 1 root root 0 Jul 27 11:09 virtfn0 -> ../0000:1e:00.1 lrwxrwxrwx. 1 root root 0 Jul 27 11:09 virtfn1 -> ../0000:1e:00.2 lrwxrwxrwx. 1 root root 0 Jul 27 11:09 virtfn10 -> ../0000:1e:01.3 lrwxrwxrwx. 1 root root 0 Jul 27 11:09 virtfn11 -> ../0000:1e:01.4 lrwxrwxrwx. 1 root root 0 Jul 27 11:09 virtfn12 -> ../0000:1e:01.5
I see if the virt function link is pointing to the pci device of the VF I am looking for. If it is, then I try to get the vf index from the virtual function link name (I just need the number in "virtfn<vf_index>". This is one way to get the vf index. The other way is to get it by reading the pci config space of the PF, which the node device driver uses.
In that case I'd suggest to use 'if (sscanf(entry->d_name, "virtfn%u", &vf_index) != 1) { error }'
Ok sounds good. Thanks.

From: Roopa Prabhu <roprabhu@cisco.com> This patch adds code to the existing getPhysfn to get the PF device name and VF index if the linkdev is a sriov VF. The idea is to send the port profile message to a PF if the direct attach interface is a VF. Eventually interface.c is probably the right place for getPhysfn Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: Christian Benvenuti <benve@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> --- src/util/macvtap.c | 15 ++++++--------- 1 files changed, 6 insertions(+), 9 deletions(-) diff --git a/src/util/macvtap.c b/src/util/macvtap.c index 0b00776..7fcea0d 100644 --- a/src/util/macvtap.c +++ b/src/util/macvtap.c @@ -948,18 +948,15 @@ getPhysfn(const char *linkdev, char **physfndev) { int rc = 0; - bool virtfn = false; - if (virtfn) { + if (ifaceIsVF(linkdev)) { - /* XXX: if linkdev is SR-IOV VF, then set vf = VF index */ - /* XXX: and set linkdev = PF device */ - /* XXX: need to use get_physical_function_linux() or */ - /* XXX: something like that to get PF */ - /* XXX: device and figure out VF index */ - - rc = 1; + /* if linkdev is SR-IOV VF, then set vf = VF index */ + /* and set physfndev = PF device */ + rc = ifaceGetPFName(linkdev, physfndev); + if (!rc) + rc = ifaceGetVFIndex(*physfndev, linkdev, vf); } else { /* Not SR-IOV VF: physfndev is linkdev and VF index

On 08/01/2011 07:57 PM, Roopa Prabhu wrote:
From: Roopa Prabhu<roprabhu@cisco.com>
This patch adds code to the existing getPhysfn to get the PF device name and VF index if the linkdev is a sriov VF. The idea is to send the port profile message to a PF if the direct attach interface is a VF. Eventually interface.c is probably the right place for getPhysfn
Signed-off-by: Roopa Prabhu<roprabhu@cisco.com> Signed-off-by: Christian Benvenuti<benve@cisco.com> Signed-off-by: David Wang<dwang2@cisco.com> --- src/util/macvtap.c | 15 ++++++--------- 1 files changed, 6 insertions(+), 9 deletions(-)
diff --git a/src/util/macvtap.c b/src/util/macvtap.c index 0b00776..7fcea0d 100644 --- a/src/util/macvtap.c +++ b/src/util/macvtap.c @@ -948,18 +948,15 @@ getPhysfn(const char *linkdev, char **physfndev) { int rc = 0; - bool virtfn = false;
- if (virtfn) { + if (ifaceIsVF(linkdev)) {
- /* XXX: if linkdev is SR-IOV VF, then set vf = VF index */ - /* XXX: and set linkdev = PF device */ - /* XXX: need to use get_physical_function_linux() or */ - /* XXX: something like that to get PF */ - /* XXX: device and figure out VF index */ - - rc = 1; + /* if linkdev is SR-IOV VF, then set vf = VF index */ + /* and set physfndev = PF device */
+ rc = ifaceGetPFName(linkdev, physfndev); + if (!rc) + rc = ifaceGetVFIndex(*physfndev, linkdev, vf); } else {
/* Not SR-IOV VF: physfndev is linkdev and VF index
Looks good to me. ACK Stefan

On Mon, Aug 01, 2011 at 04:56:58PM -0700, Roopa Prabhu wrote:
This patch tries to fix getPhysFn in macvtap.c to get the physical function(PF) of the direct attach interface, if the interface is a SR-IOV VF. It derives the PF/VF relationship from sysfs.
There is some code to do this in node device driver. But it is local to the node device driver . I did not see a clean way to use some of the node device stuff here. The node device driver looks at PCI capabilities to get the same information.
We should not have two implementations of this functionality in the code. Either the node device code should be made to use this version or vice versa.
This implementation tries to not get into PCI capability details and just looks at sysfs paths to derive the PF-VF relationship
This patch series implements the following 01/3 - Add function to get the network interface name of a pci device 02/3 - Add functions to get sriov PF/VF relationship of a network interface 03/3 - macvtap: Fix getPhysfn to get the PF of a direct attach network interface
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: Christian Benvenuti <benve@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com>
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Mon, Aug 01, 2011 at 09:16:09PM -0400, Dave Allan wrote:
On Mon, Aug 01, 2011 at 04:56:58PM -0700, Roopa Prabhu wrote:
This patch tries to fix getPhysFn in macvtap.c to get the physical function(PF) of the direct attach interface, if the interface is a SR-IOV VF. It derives the PF/VF relationship from sysfs.
There is some code to do this in node device driver. But it is local to the node device driver . I did not see a clean way to use some of the node device stuff here. The node device driver looks at PCI capabilities to get the same information.
We should not have two implementations of this functionality in the code. Either the node device code should be made to use this version or vice versa.
We already have a file src/util/pci.c that contains a bunch of helper APIs for dealing with PCI devices. If we need some shared code between macvtap and the node device driver, then we ought to put it in the pci.c file Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 8/2/11 4:15 AM, "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Mon, Aug 01, 2011 at 09:16:09PM -0400, Dave Allan wrote:
On Mon, Aug 01, 2011 at 04:56:58PM -0700, Roopa Prabhu wrote:
This patch tries to fix getPhysFn in macvtap.c to get the physical function(PF) of the direct attach interface, if the interface is a SR-IOV VF. It derives the PF/VF relationship from sysfs.
There is some code to do this in node device driver. But it is local to the node device driver . I did not see a clean way to use some of the node device stuff here. The node device driver looks at PCI capabilities to get the same information.
We should not have two implementations of this functionality in the code. Either the node device code should be made to use this version or vice versa.
We already have a file src/util/pci.c that contains a bunch of helper APIs for dealing with PCI devices. If we need some shared code between macvtap and the node device driver, then we ought to put it in the pci.c file
From the patches that I posted, I still need the stuff that tries to derive
Yes, I looked at it and saw that I had to move quite a few node_device data structures and code in src/util/pci.c. Wasn't sure if there was a reason for it to not already exist in src/util/pci.c. So decided to post this patch as RFC for specific directions. the net device from the pci device. I will re-evaluate moving the node device stuff to get PF-VF relationship information into src/util/pci.c and repost the patches. Thanks for the comments.
participants (4)
-
Daniel P. Berrange
-
Dave Allan
-
Roopa Prabhu
-
Stefan Berger