On 06/17/2014 06:16 AM, vaughan wrote:
Hi experts,
Release: OL7
Kernel: 3.10.0-121.el7.x86_64
Noticed the error below in syslog on an OL7 server while loading the
Intel 10 Gigabit NIC driver module, ixgbe:
--------------------------------------------------------------------
journal: nl_recv returned with error: No buffer space available
-------------------------------------------------------------------
Complete syslog content for ixgbe module load :
Jun 16 20:46:10 ca-ostest432 kernel: ixgbe: Intel(R) 10 Gigabit PCI
Express Network Driver - version 3.15.1-k
Jun 16 20:46:10 ca-ostest432 kernel: ixgbe: Copyright (c) 1999-2013
Intel Corporation.
Jun 16 20:46:10 ca-ostest432 kvm: 1 guest now active
Jun 16 20:46:10 ca-ostest432 kvm: 0 guests now active
Jun 16 20:46:10 ca-ostest432 kernel: ixgbe 0000:13:00.0: Multiqueue
Enabled: Rx Queue count = 16, Tx Queue count = 16
Jun 16 20:46:10 ca-ostest432 kernel: ixgbe 0000:13:00.0: (PCI
Express:5.0GT/s:Width x8) 00:1b:21:c8:24:74
Jun 16 20:46:10 ca-ostest432 kernel: ixgbe 0000:13:00.0: MAC: 2, PHY: 9,
SFP+: 3, PBA No: E70856-007
Jun 16 20:46:10 ca-ostest432 kernel: ixgbe 0000:13:00.0: PCI Express
bandwidth of 32GT/s available
Jun 16 20:46:10 ca-ostest432 kernel: ixgbe 0000:13:00.0: (Speed:5.0GT/s,
Width: x8, Encoding Loss:20%)
Jun 16 20:46:10 ca-ostest432 journal: nl_recv returned with error: No
buffer space available
A very similar problem (probably the same, but you're showing the kernel
error message rather than the error logged by libvirt) was reported and
addressed in libnl3 quite a while back. libnl3 originally set the default
buffer size to 4096, which wasn't enough for SRIOV cards with lots of
VFs. So they increased it to 4 * 4096, which should be plenty for
anybody. That libnl3 patch is present in RHEL7.0 (currently at 3.2.21.6).
Can you verify the version of libnl3 you are running, and that it
contains this code
    if (page_size == 0)
        page_size = getpagesize() * 4;
in the function lib/nl.c:nl_recv() (previously it was just "page_size =
getpagesize();"). If you don't have that patch in your libnl3 package,
please backport the upstream commit that makes that change. If you do
have that patch in your libnl3, perhaps you have gotten a different
ixgbe driver from somewhere (we did test against ixgbe with the maximum
number of VFs, so there would have to be something different in your
driver). It would be good to figure out the source of the problem before
applying any fix anywhere. It is much better to understand the cause, and
right now I don't think we do: what is creating the need for such a
large buffer in your case, but not for others who use the same driver
with the same number of VFs?
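To make the size difference concrete, here is a minimal sketch of the two
defaults described above. The function names are hypothetical and are not
libnl symbols; the only thing taken from libnl is the expression
"getpagesize() * 4" quoted earlier. On a system with 4096-byte pages the old
default comes out at 4096 bytes and the patched one at 16384 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: the receive buffer default before the libnl3 fix,
 * i.e. a single page. */
size_t old_default_buflen(void)
{
    return (size_t)getpagesize();
}

/* Hypothetical helper: the receive buffer default after the fix, computed
 * once and cached, mirroring the "if (page_size == 0)" pattern quoted
 * from lib/nl.c. */
size_t patched_default_buflen(void)
{
    static size_t page_size = 0;

    if (page_size == 0)
        page_size = (size_t)getpagesize() * 4;

    /* The patched default is always four times the old one. */
    assert(page_size == 4 * old_default_buflen());
    return page_size;
}
```

Any single netlink message larger than the patched value would still not fit
in one read, which is why it matters to learn how big the messages from your
driver actually are.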
Up to now our position has been that this problem should be fixed in
libnl, so we have preferred to not patch libvirt for it, but instead get
libnl fixed. If we do decide to patch libvirt, I think it would be
better to turn on message peeking for nl_recv
(nl_socket_enable_msg_peek()), as that would solve the problem totally
and permanently (the upstream maintainer of libnl is reluctant to turn
that on by default due to potential performance problems in other users
of libnl).
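
For anyone unfamiliar with what message peeking buys us, the sketch below
illustrates the general idea using a plain AF_UNIX datagram socket rather
than libnl or netlink, so it can be built without libnl installed. It is not
libnl's implementation. The helper names are hypothetical, and it assumes
Linux, where recv() with MSG_PEEK | MSG_TRUNC reports the full length of the
queued datagram without consuming it. The receiver asks for the size first,
then allocates a buffer that is guaranteed to fit, at the cost of an extra
system call per message (presumably the performance concern mentioned above).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of libnl or libvirt): receive one whole
 * datagram from fd, whatever its size. On success returns its length and
 * stores a malloc'd buffer in *out; returns -1 on error. */
static ssize_t recv_whole_datagram(int fd, void **out)
{
    char probe;

    /* MSG_PEEK leaves the datagram queued; MSG_TRUNC makes Linux return
     * its real length even though the probe buffer is a single byte. */
    ssize_t need = recv(fd, &probe, sizeof(probe), MSG_PEEK | MSG_TRUNC);
    if (need < 0)
        return -1;

    void *buf = malloc(need > 0 ? (size_t)need : 1);
    if (buf == NULL)
        return -1;

    /* The second call dequeues the datagram into the right-sized buffer. */
    ssize_t got = recv(fd, buf, (size_t)need, 0);
    if (got < 0) {
        free(buf);
        return -1;
    }
    *out = buf;
    return got;
}

/* Demo: send one datagram of payload_len bytes over a socketpair and
 * return how many bytes the peeking receiver got back, or -1 on error. */
ssize_t peek_demo(size_t payload_len)
{
    int sv[2];
    void *received = NULL;
    ssize_t got = -1;

    assert(payload_len > 0);
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    char *payload = malloc(payload_len);
    if (payload != NULL) {
        memset(payload, 'x', payload_len);
        if (send(sv[0], payload, payload_len, 0) == (ssize_t)payload_len)
            got = recv_whole_datagram(sv[1], &received);
        free(payload);
    }

    free(received);
    close(sv[0]);
    close(sv[1]);
    return got;
}
```

Calling peek_demo(20000) pushes a datagram larger than the 4 * 4096 default
through the socket, and the receiver should still get all of it because the
buffer is sized from the peeked length instead of a fixed guess.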