[libvirt] PATCH: pci-subsystem: ixgbe: SR-IOV: kernel way to order driver's virtfn -entries is odd causing libvirt failures.

From fac886b86099dca1937730026da24c34857c4077 Mon Sep 17 00:00:00 2001 From: Niilo Minkkinen <niilo.minkkinen@tieto.com> Date: Wed, 23 Oct 2013 14:50:07 +0300 Subject: [PATCH 1/1] Fixes ordering of virtfn -entries on drivers with more
Hi. As Bjorn Helgaas recommend, this might be the item to discuss in the wider area. ----------------------------------- There is a behavior effecting virtfn -entries in sysfs, when amount of them increases over 10. Run VM's through LIBVIRT -> QEMU/KVM, this causes : - MAC address setting by LIBVIRT disordered ie. setting targeted to wrong VF. - VLAN setting by LIBVIRT overall failed Basics of this are in "/libvirt-x.x.x/src/util/virpci.c" ; in function below, which don't order virtfn entries correctly. /* * Returns virtual functions of a physical function */ int virPCIGetVirtualFunctions(const char *sysfs_path, virPCIDeviceAddressPtr **virtual_functions, unsigned int *num_virtual_functions) { But I let you to decide which is best way to fix this, as if every application reads "virtfn" entries from PF's directory, they all need to sort entries in alphabet. order to avoid this influence. So personally I did get over this by adding pre-zeroes to names to have them in sorted order in PF's directory. ---------------------------- Tested : - kernel 3.8.0-21 - libvirt 1.0.2 & libvirt 1.1.3 - ixgbe 3.11.33-k, ixgbe 3.17-3 & ixgbe 3.18.7 After applying this MAC/VLAN seen to work and VM's get contacted using MAC's & VLAN's, which been checked using tcpdump. P.S. Tested with 3.8.0-21, but patch constructed against "https://github.com/torvalds/linux.git", so there is possibilty this works in latest kernel, but haven't checked it. At least ordering scheme seen in "iov.c" in same form as before. ----------------------------- than 10 entries ( virtfn0...virtfn9 ). This effects on drivers like Intel 82599 "ixgbe", which have max 31 VF's (Virtual Function) on one PF (Physical Function). Currently, ordering is "diffused", which in case causes MAC/VLAN -settings on libvirt based to PCI -addressing failed. Signed-off-by: Niilo Minkkinen <niilo.minkkinen@tieto.com> Signed-off-by: Tommy Varre <tommy.varre@tieto.com> --- drivers/pci/iov.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c index 21a7182..fbd1209 100644 --- a/drivers/pci/iov.c +++ b/drivers/pci/iov.c @@ -106,7 +106,7 @@ static int virtfn_add(struct pci_dev *dev, int id, int reset) mutex_unlock(&iov->dev->sriov->lock); rc = pci_bus_add_device(virtfn); - sprintf(buf, "virtfn%u", id); + sprintf(buf, "virtfn%02u", id); rc = sysfs_create_link(&dev->dev.kobj, &virtfn->dev.kobj, buf); if (rc) goto failed1; @@ -149,7 +149,7 @@ static void virtfn_remove(struct pci_dev *dev, int id, int reset) __pci_reset_function(virtfn); } - sprintf(buf, "virtfn%u", id); + sprintf(buf, "virtfn%02u", id); sysfs_remove_link(&dev->dev.kobj, buf); /* * pci_stop_dev() could have been called for this virtfn already, -- 1.8.1.2 ----------------------------- What you think ? Niilo Minkkinen niilo.minkkinen@tieto.com

On Thu, Oct 24, 2013 at 09:52:31AM +0300, Niilona wrote:
Hi.
As Bjorn Helgaas recommend, this might be the item to discuss in the wider area. -----------------------------------
There is a behavior effecting virtfn -entries in sysfs, when amount of them increases over 10. Run VM's through LIBVIRT -> QEMU/KVM, this causes : - MAC address setting by LIBVIRT disordered ie. setting targeted to wrong VF. - VLAN setting by LIBVIRT overall failed
Basics of this are in "/libvirt-x.x.x/src/util/virpci.c" ; in function below, which don't order virtfn entries correctly.
/* * Returns virtual functions of a physical function */ int virPCIGetVirtualFunctions(const char *sysfs_path, virPCIDeviceAddressPtr **virtual_functions, unsigned int *num_virtual_functions) {
But I let you to decide which is best way to fix this, as if every application reads "virtfn" entries from PF's directory, they all need to sort entries in alphabet. order to avoid this influence. So personally I did get over this by adding pre-zeroes to names to have them in sorted order in PF's directory.
Libvirt has to work correctly with all existing released kernels, so we have to fix libvirt to deal with ordering correctly. Thus changing the kernel naming here doesn't really help. We need to fix libvirt to deal with this. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 10/24/2013 11:12 AM, Daniel P. Berrange wrote:
On Thu, Oct 24, 2013 at 09:52:31AM +0300, Niilona wrote:
Hi.
As Bjorn Helgaas recommend, this might be the item to discuss in the wider area. -----------------------------------
There is a behavior effecting virtfn -entries in sysfs, when amount of them increases over 10. Run VM's through LIBVIRT -> QEMU/KVM, this causes : - MAC address setting by LIBVIRT disordered ie. setting targeted to wrong VF. - VLAN setting by LIBVIRT overall failed
Basics of this are in "/libvirt-x.x.x/src/util/virpci.c" ; in function below, which don't order virtfn entries correctly.
/* * Returns virtual functions of a physical function */ int virPCIGetVirtualFunctions(const char *sysfs_path, virPCIDeviceAddressPtr **virtual_functions, unsigned int *num_virtual_functions) {
But I let you to decide which is best way to fix this, as if every application reads "virtfn" entries from PF's directory, they all need to sort entries in alphabet. order to avoid this influence. So personally I did get over this by adding pre-zeroes to names to have them in sorted order in PF's directory. Libvirt has to work correctly with all existing released kernels, so we have to fix libvirt to deal with ordering correctly. Thus changing the kernel naming here doesn't really help. We need to fix libvirt to deal with this.
Right. Now that there is enough information to understand the problem wrt. libvirt, I can see that the main place this improper ordering of the device links causes an issue is in virPCIGetVirtualFunctionIndex() (and callers) - it assumes that the VF indexes will exactly match the index of each PCI device in the list returned by virPCIGetVirtualFunction(). (We've never made any guarantees about the ordering of virtual functions in the output of nodedev-dumpxml, but the return value from virPCIGetVirtualFunctionIndex is used by callers to fill in the netlink command to get/set a VF's mac address and vlan tag.) To fix this we need to modify virPCIGetVirtualFunctions to gather all virtfn%d entries in the PF's directory and sort them numerically (right now we just grab them in whatever order readdir() provides and put them in the list in that order). This will automatically fix virPCIGetVirtualFunctionIndex() and coincidentally make the ordering in nodedev-dumpxml "proper". Of course even this will only work as long as lists of virtual functions are guaranteed to never be "sparse" (e.g. there can't be a virtfn22 unless there is also a virtfn21). My guess is that this is always true, but we should check for that in the new function and log an appropriate error in the case that we encounter a sparse list.

On 10/29/2013 06:01 PM, Laine Stump wrote:
On Thu, Oct 24, 2013 at 09:52:31AM +0300, Niilona wrote:
There is a behavior effecting virtfn -entries in sysfs, when amount of them increases over 10. Run VM's through LIBVIRT -> QEMU/KVM, this causes : - MAC address setting by LIBVIRT disordered ie. setting targeted to wrong VF. - VLAN setting by LIBVIRT overall failed To fix this we need to modify virPCIGetVirtualFunctions to gather all virtfn%d entries in the PF's directory and sort them numerically (right now we just grab them in whatever order readdir() provides and put them in the list in that order). This will automatically fix virPCIGetVirtualFunctionIndex() and coincidentally make the ordering in nodedev-dumpxml "proper".
Of course even this will only work as long as lists of virtual functions are guaranteed to never be "sparse" (e.g. there can't be a virtfn22 unless there is also a virtfn21). My guess is that this is always true, but we should check for that in the new function and log an appropriate error in the case that we encounter a sparse list.
I've implemented such a patch, and have posted it to libvir-list: https://www.redhat.com/archives/libvir-list/2013-November/msg00125.html Please try it out and post feedback to that thread. Thanks for persevering enough to make me see the problem! :-)
participants (3)
-
Daniel P. Berrange
-
Laine Stump
-
Niilona