
18 Jun
2014
18 Jun
'14
11:58 a.m.
On 2014年06月18日 20:24, Laine Stump wrote: > On 06/17/2014 06:16 AM, vaughan wrote: >> Hi experts, >> >> Release: OL7 >> Kernel: 3.10.0-121.el7.x86_64 >> >> Noticed below error on OL7 server, while loading Intel 10gigabit nic >> driver module , ixgbe in syslog >> -------------------------------------------------------------------- >> journal: nl_recv returned with error: No buffer space available >> ------------------------------------------------------------------- >> >> Complete syslog content for ixgbe module load : >> >> un 16 20:46:10 ca-ostest432 kernel: ixgbe: Intel(R) 10 Gigabit PCI >> Express Network Driver - version 3.15.1-k >> Jun 16 20:46:10 ca-ostest432 kernel: ixgbe: Copyright (c) 1999-2013 >> Intel Corporation. >> Jun 16 20:46:10 ca-ostest432 kvm: 1 guest now active >> Jun 16 20:46:10 ca-ostest432 kvm: 0 guests now active >> Jun 16 20:46:10 ca-ostest432 kernel: ixgbe 0000:13:00.0: Multiqueue >> Enabled: Rx Queue count = 16, Tx Queue count = 16 >> Jun 16 20:46:10 ca-ostest432 kernel: ixgbe 0000:13:00.0: (PCI >> Express:5.0GT/s:Width x8) 00:1b:21:c8:24:74 >> Jun 16 20:46:10 ca-ostest432 kernel: ixgbe 0000:13:00.0: MAC: 2, PHY: 9, >> SFP+: 3, PBA No: E70856-007 >> Jun 16 20:46:10 ca-ostest432 kernel: ixgbe 0000:13:00.0: PCI Express >> bandwidth of 32GT/s available >> Jun 16 20:46:10 ca-ostest432 kernel: ixgbe 0000:13:00.0: (Speed:5.0GT/s, >> Width: x8, Encoding Loss:20%) >> Jun 16 20:46:10 ca-ostest432 journal: nl_recv returned with error: No >> buffer space available > A very similar problem (probably the same, but you're showing the kernel > error message rather than the error logged by libvirt) was reported and Can we distinguish the two problem just from libvirtd.log? I turned it on, and it just output the following messages, like common errors. If you would like to see the whole log, I can attach it. 2014-06-18 14:16:40.901+0000: 31891: info : virNetlinkEventServiceStart:520 : starting netlink event service with protocol 15 2014-06-18 14:16:40.901+0000: 31891: debug : virEventPollInterruptLocked:710 : Skip interrupt, 0 0 2014-06-18 14:16:40.901+0000: 31891: debug : virEventPollAddHandle:136 : EVENT_POLL_ADD_HANDLE: watch=6 fd=14 events=1 cb=0x7fdff9397680 opaque=0x7fdffb8ff960 ff=(nil) 2014-06-18 14:16:40.901+0000: 31891: debug : virNetlinkEventServiceStart:576 : netlink event listener on fd: 14 running 2014-06-18 14:16:40.901+0000: 31891: debug : virNetServerRun:1093 : srv=0x7fdffb8f0d70 quit=0 ......... 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=0 w=1, f=6 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=1 w=2, f=8 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=2 w=3, f=11 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=3 w=4, f=12 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=4 w=5, f=13 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=5 w=6, f=14 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=6 w=7, f=15 e=0 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=7 w=8, f=15 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=8 w=9, f=17 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollCalculateTimeout:332 : Calculate expiry of 0 timers 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollCalculateTimeout:361 : Timeout at 0 due in -1 ms 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollRunOnce:627 : EVENT_POLL_RUN: nhandles=8 timeout=-1 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollRunOnce:638 : Poll got 1 event(s) 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchTimeouts:423 : Dispatch 0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:468 : Dispatch 8 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:482 : i=0 w=1 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:482 : i=1 w=2 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:482 : i=2 w=3 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:482 : i=3 w=4 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:482 : i=4 w=5 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:482 : i=5 w=6 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:496 : EVENT_POLL_DISPATCH_HANDLE: watch=6 events=1 2014-06-18 14:16:48.993+0000: 31891: error : virNetlinkEventCallback:344 : nl_recv return ok length: 186: Success 2014-06-18 14:16:48.993+0000: 31891: debug : virNetlinkEventCallback:349 : dispatching to max 0 clients, called from event watch 6 2014-06-18 14:16:48.993+0000: 31891: debug : virNetlinkEventCallback:362 : event not handled. 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:482 : i=7 w=8 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:482 : i=8 w=9 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollCleanupTimeouts:514 : Cleanup 0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollCleanupTimeouts:550 : Found 0 out of 0 timeout slots used, releasing 0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollCleanupHandles:562 : Cleanup 9 2014-06-18 14:16:48.993+0000: 31891: debug : virEventRunDefaultImpl:270 : running default event implementation 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollCleanupTimeouts:514 : Cleanup 0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollCleanupTimeouts:550 : Found 0 out of 0 timeout slots used, releasing 0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollCleanupHandles:562 : Cleanup 9 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=0 w=1, f=6 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=1 w=2, f=8 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=2 w=3, f=11 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=3 w=4, f=12 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=4 w=5, f=13 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=5 w=6, f=14 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=6 w=7, f=15 e=0 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=7 w=8, f=15 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollMakePollFDs:391 : Prepare n=8 w=9, f=17 e=1 d=0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollCalculateTimeout:332 : Calculate expiry of 0 timers 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollCalculateTimeout:361 : Timeout at 0 due in -1 ms 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollRunOnce:627 : EVENT_POLL_RUN: nhandles=8 timeout=-1 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollRunOnce:638 : Poll got 1 event(s) 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchTimeouts:423 : Dispatch 0 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:468 : Dispatch 8 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:482 : i=0 w=1 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:482 : i=1 w=2 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:482 : i=2 w=3 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:482 : i=3 w=4 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:482 : i=4 w=5 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:482 : i=5 w=6 2014-06-18 14:16:48.993+0000: 31891: debug : virEventPollDispatchHandles:496 : EVENT_POLL_DISPATCH_HANDLE: watch=6 events=5 2014-06-18 14:16:48.993+0000: 31891: error : virNetlinkEventCallback:340 : nl_recv returned with error, -5: No buffer space available > addressed in libnl3 quite awhile back. libnl3 originally set the default > buffer size to 4096, which wasn't enough for SRIOV cards with lots of > VFs. So they increased it to 4 * 4096, which should be plenty for > anybody. That libnl3 patch is present in RHEL7.0 (currently at 3.2.21.6). > > Can you verify the version of libnl3 you are running, and that it > contains this code > > if (page_size == 0) > page_size = getpagesize() * 4; I'm using libnl3-3.2.21-6.0.1.el7.x86_64 which has the patch you mentioned, * <E4><B8><89> 2<E6><9C><88> 26 2014 Thomas Graf <tgraf@redhat.com> - 3.2.21-5 - nl-Increase-receive-buffer-size-to-4-pages.patch (rh #1040626) > in the function lib/nl.c:nl_recv() (previously it was just "page_size = > getpagesize();"). If you don't have that patch in your libnl3 package, > please backport the upstream commit that makes that change. If you do > have that patch in your libnl3, perhaps you have gotten a different > ixgbe driver from somewhere (we did test against ixgbe with the maximum > number of VFs, so there would have to be something different in your > driver). It would be good to figure out the source of the problem before > applying any fix anywhere - much better to understand the cause, and > right now I don't think we do; what is creating the need for such a > large buffer in your case, but not for others who use the same driver > with the same number of VFs?). Thank you to point this out. I admit that my patch is derived from the symptom, not the cause. I'll dig in this direction. filename: /lib/modules/3.10.0-121.el7.x86_64/kernel/drivers/net/ethernet/intel/ixgbe/ixgbe.ko version: 3.15.1-k license: GPL description: Intel(R) 10 Gigabit PCI Express Network Driver author: Intel Corporation, <linux.nics@intel.com> srcversion: FC629607FA8BAB4CDE3BCF0 ..... depends: mdio,ptp,dca intree: Y vermagic: 3.10.0-121.el7.x86_64 SMP mod_unload modversions signer: Red Hat Enterprise Linux kernel signing key sig_key: E3:7B:3F:4E:75:78:02:B9:12:00:CF:F3:C7:81:1A:51:2E:94:83:EF sig_hashalgo: sha256 parm: max_vfs:Maximum number of virtual functions to allocate per physical function - default is zero and maximum value is 63 (uint) parm: allow_unsupported_sfp:Allow unsupported and untested SFP+ modules on 82599-based adapters (uint) parm: debug:Debug level (0=none,...,16=all) (int) > > > Up to now our position has been that this problem should be fixed in > libnl, so we have preferred to not patch libvirt for it, but instead get > libnl fixed. If we do decide to patch libvirt, I think it would be > better to turn on message peeking for nl_recv > (nl_socket_enable_msg_peek()), as that would solve the problem totally > and permanently (the upstream maintainer of libnl is reluctant to turn > that on by default due to potential performance problems in other users > of libnl) > Affirmative.