On 11/07/2013 02:08 PM, Martin Kletzander wrote:
On Thu, Nov 07, 2013 at 01:10:39PM +0200, Laine Stump wrote:
> This resolves:
>
>
https://bugzilla.redhat.com/show_bug.cgi?id=1025397
>
> When libvirt creates the list of an SRIOV Physical Function's (PF)
> Virtual Functions (VF), it assumes that the order of "virtfn*" links
> returned by readdir() from the PF's sysfs directory is already in the
> correct order. Experience has shown that this is not always the case.
>
> This results in 1) incorrect assumptions made by consumers of the
> output of the virt_functions list of virsh nodedev-dumpxml, and 2)
> setting MAC address and vlan tag on the wrong VF (since libvirt uses
> netlink to set mac address and vlan tag, netlink requires the VF#, and
> the function virPCIGetVirtualFunctionIndex() returns the wrong index
> due to the improperly ordered VF list). See the bugzilla report for an
> example of this improper behavior.
>
> The solution provided by this patch is for virPCIGetVirtualFunctions
> to first gather all the "virtfn*" names from the PFs sysfs directory,
> then allocate a virtual_functions array large enough to hold all
> entries, and finally to create a device entry for each virtfn* name
> and place it into the virtual_functions array at the proper index
> (based on interpreting the value following "virtfn" in the name).
>
> Checks are introduced to ensure that 1) the virtfn list given in sysfs
> is not sparse (ie, there are no indexes larger than the length of the
> list), and that no two virtfn* entries decode to the same index.
> ---
>
> Difference from V1: Based on jtomko's review, truncate trailing NULLs
> in virtual_functions array, and actually check for NULLs in the middle
> (a sparse array) and generate an error in that case, as we already
> claimed to do in the commit message.
>
> (Sorry for the lack of inter-diff, but the only changes from V1 are
> the addition of the lines starting with "truncate any trailing nulls"
> and continuing until the next VIR_DEBUG.)
>
> src/util/virpci.c | 115 ++++++++++++++++++++++++++++++++++++++++++------------
> 1 file changed, 90 insertions(+), 25 deletions(-)
>
Disclaimer: I don't want this to sound rude (just that it does to me
every time I read it), this is just a question that popped in my mind.
Not at all. Alternatives are always welcome.
Why don't we do it differently? If we have a while loop with a number
starting from 1 and access(, F_OK)'ing '"virtfn%d", number' in
that
loop, we'll have all those VFs sorted. And if that's not nice enough,
we should be able to read the number of VFs from the device's
"/config" (in offset 0x10 [1]) and do a nice, clean for loop over
already known filenames.
Or am I missing something? I just Took that from "drivers/pci/iov.c",
so this may be "Just Wrong".
No, that would work too. I was just looking at it from the inside rather
than the outside, so was only thinking of how to fix the current code
rather than how to rewrite it, and never considered that.
My one concern would be if someone actually pushed a patch into the
kernel similar to the one suggested by the original reporter of this
problem (which put leading 0's on all the numbers in the virtfn
filenames), but that seems unlikely, as it isn't scalable (what happens
when you get > 100 VFs?).
I like your idea enough to redo the patch using it. It may be tomorrow
before I get it posted though.