[libvirt] [PATCH] pci: properly handle out-of-order SRIOV virtual functions

This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1025397 When libvirt creates the list of an SRIOV Physical Function's (PF) Virtual Functions (VF), it assumes that the order of "virtfn*" links returned by readdir() from the PF's sysfs directory is already in the correct order. Experience has shown that this is not always the case. This results in 1) incorrect assumptions made by consumers of the output of the virt_functions list of virsh nodedev-dumpxml, and 2) setting MAC address and vlan tag on the wrong VF (since libvirt uses netlink to set mac address and vlan tag, netlink requires the VF#, and the function virPCIGetVirtualFunctionIndex() returns the wrong index due to the improperly ordered VF list). See the bugzilla report for an example of this improper behavior. The solution provided by this patch is for virPCIGetVirtualFunctions to first gather all the "virtfn*" names from the PFs sysfs directory, then allocate a virtual_functions array large enough to hold all entries, and finally to create a device entry for each virtfn* name and place it into the virtual_functions array at the proper index (based on interpreting the value following "virtfn" in the name). Checks are introduced to ensure that 1) the virtfn list given in sysfs is not sparse (ie, there are no indexes larger than the length of the list), and that no two virtfn* entries decode to the same index. --- I briefly debated solving this problem by changing the virtual_functions array from being just a list of PCI device addresses into being an array of new objects that have both a PCI address and a virtual function index. I decided against it because it would have modified the XML format and required more substantial changes to other parts of the code (and anyway is unnecessary, since the virtual function array is never "sparse", so it can be appropriately represented by implying the virtual function index from a device's position in the list). src/util/virpci.c | 91 ++++++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 66 insertions(+), 25 deletions(-) diff --git a/src/util/virpci.c b/src/util/virpci.c index 148631f..e70fa3d 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -2379,6 +2379,8 @@ virPCIGetPhysicalFunction(const char *vf_sysfs_path, return ret; } +static const char *virtfnPrefix = "virtfn"; + /* * Returns virtual functions of a physical function */ @@ -2393,6 +2395,8 @@ virPCIGetVirtualFunctions(const char *sysfs_path, struct dirent *entry = NULL; char *device_link = NULL; char errbuf[64]; + char **virtfnNames = NULL; + size_t nVirtfnNames = 0; VIR_DEBUG("Attempting to get SR IOV virtual functions for device" "with sysfs path '%s'", sysfs_path); @@ -2411,51 +2415,88 @@ virPCIGetVirtualFunctions(const char *sysfs_path, while ((entry = readdir(dir))) { if (STRPREFIX(entry->d_name, "virtfn")) { - virPCIDeviceAddress *config_addr = NULL; + char *tempName; - if (virBuildPath(&device_link, sysfs_path, entry->d_name) == -1) { - virReportOOMError(); + /* save these to sort into virtual_functions array */ + if (VIR_STRDUP(tempName, entry->d_name) < 0) + goto error; + if (VIR_APPEND_ELEMENT(virtfnNames, nVirtfnNames, tempName) < 0) { + VIR_FREE(tempName); goto error; } + } + } - VIR_DEBUG("Number of virtual functions: %d", - *num_virtual_functions); + /* pre-allocate because we will be filling in out of order */ + if (nVirtfnNames && VIR_ALLOC_N(*virtual_functions, nVirtfnNames) < 0) + goto error; + *num_virtual_functions = nVirtfnNames; - if (virPCIGetDeviceAddressFromSysfsLink(device_link, - &config_addr) != - SRIOV_FOUND) { - VIR_WARN("Failed to get SRIOV function from device " - "link '%s'", device_link); - VIR_FREE(device_link); - continue; - } + for (i = 0; i < nVirtfnNames; i++) { + virPCIDeviceAddress *config_addr = NULL; + unsigned int virtfnIndex; - if (VIR_REALLOC_N(*virtual_functions, - *num_virtual_functions + 1) < 0) { - VIR_FREE(config_addr); - goto error; - } + if (virBuildPath(&device_link, sysfs_path, virtfnNames[i]) < 0) { + virReportOOMError(); + goto error; + } + + if (virStrToLong_ui(virtfnNames[i] + strlen(virtfnPrefix), + 0, 10, &virtfnIndex) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid virtual function link name '%s' " + "in physical function directory '%s'"), + virtfnNames[i], sysfs_path); + goto error; + } + + if (virtfnIndex >= nVirtfnNames) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Virtual function link name '%s' larger than " + "total count of virtual functions %zu " + "in physical function directory '%s'"), + virtfnNames[i], nVirtfnNames, sysfs_path); + goto error; + } - (*virtual_functions)[*num_virtual_functions] = config_addr; - (*num_virtual_functions)++; + if ((*virtual_functions)[virtfnIndex]) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Virtual function link name '%s' " + "generates duplicate index %zu " + "in physical function directory '%s'"), + virtfnNames[i], nVirtfnNames, sysfs_path); + goto error; + } + + if (virPCIGetDeviceAddressFromSysfsLink(device_link, &config_addr) != + SRIOV_FOUND) { + VIR_WARN("Failed to get SRIOV function from device link '%s'", + device_link); VIR_FREE(device_link); + continue; } + + VIR_DEBUG("Found virtual function %d", virtfnIndex); + (*virtual_functions)[virtfnIndex] = config_addr; + VIR_FREE(device_link); } + VIR_DEBUG("Number of virtual functions: %d", *num_virtual_functions); ret = 0; cleanup: + for (i = 0; i < nVirtfnNames; i++) + VIR_FREE(virtfnNames[i]); + VIR_FREE(virtfnNames); VIR_FREE(device_link); if (dir) closedir(dir); return ret; error: - if (*virtual_functions) { - for (i = 0; i < *num_virtual_functions; i++) - VIR_FREE((*virtual_functions)[i]); - VIR_FREE(*virtual_functions); - } + for (i = 0; i < *num_virtual_functions; i++) + VIR_FREE((*virtual_functions)[i]); + VIR_FREE(*virtual_functions); goto cleanup; } -- 1.8.3.1

On 11/05/2013 11:43 AM, Laine Stump wrote:
+ goto error; + }
- (*virtual_functions)[*num_virtual_functions] = config_addr; - (*num_virtual_functions)++; + if ((*virtual_functions)[virtfnIndex]) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Virtual function link name '%s' " + "generates duplicate index %zu " + "in physical function directory '%s'"), + virtfnNames[i], nVirtfnNames, sysfs_path); + goto error; + } + + if (virPCIGetDeviceAddressFromSysfsLink(device_link, &config_addr) != + SRIOV_FOUND) { + VIR_WARN("Failed to get SRIOV function from device link '%s'", + device_link); VIR_FREE(device_link); + continue;
(*virtual_functions)[virtfnIndex] will be left NULL here. virNetDevGetVirtualFunctions looks like it would segfault on such array.
} + + VIR_DEBUG("Found virtual function %d", virtfnIndex); + (*virtual_functions)[virtfnIndex] = config_addr; + VIR_FREE(device_link); }
Jan

On 11/06/2013 02:33 PM, Ján Tomko wrote:
On 11/05/2013 11:43 AM, Laine Stump wrote:
+ goto error; + }
- (*virtual_functions)[*num_virtual_functions] = config_addr; - (*num_virtual_functions)++; + if ((*virtual_functions)[virtfnIndex]) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Virtual function link name '%s' " + "generates duplicate index %zu " + "in physical function directory '%s'"), + virtfnNames[i], nVirtfnNames, sysfs_path); + goto error; + } + + if (virPCIGetDeviceAddressFromSysfsLink(device_link, &config_addr) != + SRIOV_FOUND) { + VIR_WARN("Failed to get SRIOV function from device link '%s'", + device_link); VIR_FREE(device_link); + continue; (*virtual_functions)[virtfnIndex] will be left NULL here. virNetDevGetVirtualFunctions looks like it would segfault on such array.
Ugh. Yes, previously this failure would lead to no entry added to the array, but now the array is pre-allocated, and the position of each element within the array is important so we can't do that. However, the only reason that a problem would be encountered getting a device address from a virtfn* link should not be considered an error is the case where some driver has decided to have a filename in the device directory that starts with "virtfn", but is something other than a link to a virtual function. If that were to happen, it would still be the case that all the virtual functions would be at the beginning of the array, and there should be no holes in the middle. For example, if we had: virtfn0 virtfn2 virtfn-poor-name-choice virtfn1 then the result would be an array of 4 items, with 0-2 filled in, and 3 left NULL. So, I believe a solution to this problem is to scan the array when were finished and do two things: 1) truncate any NULLs at the end by decreasing num_virtual_functions, and 2) fail and log an error if there are any NULLs earlier in the array than the highest non-NULL. I'll revise the patch to do that and resubmit.
participants (2)
-
Ján Tomko
-
Laine Stump