[libvirt] Handling large amount of VF's with intelligent passthrough

Hi,

I recently ran into a rather interesting problem where an interaction between libnl3 and libvirt caused some difficult-to-debug issues. From libvirt's side, the issue was that a netlink response was much larger than the page size and was truncated by libnl3.

When virNetDevLinkDump() calls virNetlinkCommand(), nl_recv() is supposed to return a rather large structure with information for all the Virtual Functions. On a system where a given Physical Function has more than 30 PCIe Virtual Functions, the netlink response is larger than 4k, so the message is truncated. Unfortunately libnl3 truncates it silently, so a cryptic error pops up much later in virNetDevParseVfConfig(): "missing IFLA_VF_INFO in netlink response".

Aside from the error propagation (which might be fixable in libnl3), libvirt still needs to be able to function in cases like this. This can be done in two ways, both in virNetlinkCommand().

1. Message peeking can be enabled. In theory this slows down every netlink message by doing a two-stage query: query the buffer size, then allocate the receive buffer and receive the message. This is a reliability/performance tradeoff, I guess. This is as simple as adding:

   nl_socket_enable_msg_peek(nlhandle);

2. The receive buffer size can also be made larger:

   nl_socket_set_msg_buf_size(nlhandle, ARBITRARY_BUFFER_SIZE);

   This does not incur a performance penalty, but until libnl3 can propagate the truncation error, this merely postpones the error for future generations...

Jan
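The silent truncation and the two-stage "peek, then allocate" receive described above can be reproduced without libnl3. The following is only a minimal sketch under stated assumptions: it uses an AF_UNIX datagram socketpair as a stand-in for a netlink socket, it relies on the Linux-specific behaviour of MSG_TRUNC documented in recv(2), and the helper names recv_fixed(), recv_peeked() and demo() are hypothetical. It is not code from libvirt or libnl3.

```c
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Receive one datagram into a fixed-size buffer. *truncated is set when the
 * kernel reports MSG_TRUNC in msg_flags; ignoring that flag is what makes a
 * truncation "silent". */
static ssize_t recv_fixed(int fd, void *buf, size_t buflen, int *truncated)
{
    struct iovec iov = { .iov_base = buf, .iov_len = buflen };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
    ssize_t n = recvmsg(fd, &msg, 0);

    if (n >= 0)
        *truncated = (msg.msg_flags & MSG_TRUNC) != 0;
    return n;
}

/* Two-stage receive: peek to learn the full datagram length, then allocate a
 * buffer of that size and read the message. On Linux, passing MSG_TRUNC to
 * recv() makes it return the real length even if the buffer is smaller. */
static ssize_t recv_peeked(int fd, void **out)
{
    char probe;
    ssize_t full = recv(fd, &probe, sizeof(probe), MSG_PEEK | MSG_TRUNC);
    void *buf;
    ssize_t n;

    if (full < 0)
        return -1;
    buf = malloc(full > 0 ? (size_t)full : 1);
    if (!buf)
        return -1;
    n = recv(fd, buf, (size_t)full, 0);
    if (n < 0) {
        free(buf);
        return -1;
    }
    *out = buf;
    return n;
}

/* Send payload_len zero bytes over a socketpair and receive them either with
 * a fixed buffer of buflen bytes or with the peeking receive. */
static int demo(size_t payload_len, size_t buflen, int use_peek,
                ssize_t *got, int *truncated)
{
    int sv[2];
    int rc = -1;
    char *payload;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;
    payload = calloc(1, payload_len);
    *truncated = 0;
    *got = -1;
    if (payload &&
        send(sv[0], payload, payload_len, 0) == (ssize_t)payload_len) {
        if (use_peek) {
            void *buf = NULL;
            *got = recv_peeked(sv[1], &buf);
            free(buf);
        } else {
            char *buf = malloc(buflen);
            if (buf)
                *got = recv_fixed(sv[1], buf, buflen, truncated);
            free(buf);
        }
        rc = *got < 0 ? -1 : 0;
    }
    free(payload);
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

Calling demo(8000, 4096, 0, &got, &cut) models the page-sized receive buffer: only 4096 bytes arrive and the truncation is visible only through the MSG_TRUNC flag. With use_peek set to 1 the full 8000 bytes are received, at the cost of one extra system call per message, which is the reliability/performance tradeoff mentioned in option 1.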

On Tue, Dec 08, 2015 at 05:58:15PM +0200, Jan Gutter wrote:
> Hi,
> I've run into a rather interesting problem recently where a weird interaction between libnl3 and libvirt caused some difficult-to-debug issues. From libvirt's side, the issue was that a netlink response was much larger than the pagesize and truncated by libnl3. When virNetDevLinkDump() calls virNetlinkCommand(), nl_recv() is supposed to return a rather large structure with information for all the Virtual Functions. When called in a system where the number of PCIe Virtual Functions are more than 30 for a given Physical Function, the netlink response is larger than 4k, meaning that a message is truncated. Unfortunately libnl3 truncates this silently, meaning that a cryptic error pops up much later in virNetDevParseVfConfig(), "missing IFLA_VF_INFO in netlink response".
> Aside from the error propagation (which might be fixable in libnl3), there still remains the need to enable libvirt to function in cases like this. This can be done in two ways, both in virNetlinkCommand().
I wish we had never used libnl and had just done everything directly on netlink sockets. libnl has been a never-ending source of bugs and breakage.
> 1. Message peeking can be enabled. In theory this slows down any netlink messages by doing a two stage query: query the buffer size, then allocate the receive buffer and receive the message. This is a reliability/performance tradeoff, I guess.
Do you have a gauge of what kind of performance penalty we're talking about? If this is not in a hot path for libvirt, I'd be inclined to just accept the hit.
> This is as simple as adding:
> nl_socket_enable_msg_peek(nlhandle);
> 2. The receive buffer size can also be made larger:
> nl_socket_set_msg_buf_size(nlhandle, ARBITRARY_BUFFER_SIZE);
> This does not incur a performance penalty, but until libnl3 can propagate the truncation error, this merely postpones the error for future generations...
Regards,
Daniel

On 12/08/2015 10:58 AM, Jan Gutter wrote:
> Hi,
> I've run into a rather interesting problem recently where a weird interaction between libnl3 and libvirt caused some difficult-to-debug issues. From libvirt's side, the issue was that a netlink response was much larger than the pagesize and truncated by libnl3. When virNetDevLinkDump() calls virNetlinkCommand(), nl_recv() is supposed to return a rather large structure with information for all the Virtual Functions. When called in a system where the number of PCIe Virtual Functions are more than 30 for a given Physical Function, the netlink response is larger than 4k, meaning that a message is truncated. Unfortunately libnl3 truncates this silently, meaning that a cryptic error pops up much later in virNetDevParseVfConfig(), "missing IFLA_VF_INFO in netlink response".
> Aside from the error propagation (which might be fixable in libnl3), there still remains the need to enable libvirt to function in cases like this. This can be done in two ways, both in virNetlinkCommand().
What version of libvirt and libnl are you using? I thought that we had solved this problem, either in libvirt or in libnl at least a couple years ago. Did something not get pushed somewhere? (now I need to go spelunking in bugzilla again :-/)
> 1. Message peeking can be enabled. In theory this slows down any netlink messages by doing a two stage query: query the buffer size, then allocate the receive buffer and receive the message. This is a reliability/performance tradeoff, I guess.
> This is as simple as adding:
> nl_socket_enable_msg_peek(nlhandle);
> 2. The receive buffer size can also be made larger:
> nl_socket_set_msg_buf_size(nlhandle, ARBITRARY_BUFFER_SIZE);
> This does not incur a performance penalty, but until libnl3 can propagate the truncation error, this merely postpones the error for future generations...
> Jan
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Tue, Dec 8, 2015 at 9:52 PM, Laine Stump <laine@laine.org> wrote:
> What version of libvirt and libnl are you using? I thought that we had solved this problem, either in libvirt or in libnl at least a couple years ago. Did something not get pushed somewhere? (now I need to go spelunking in bugzilla again :-/)
I believe I checked on tip on both...
My literature study hit http://marc.info/?l=linux-netdev&m=117322322423467&w=2 and http://lists.infradead.org/pipermail/libnl/2012-May/000601.html
I also just found: https://centoros.wordpress.com/2014/09/12/ifla_vf_info/
I am not sure if this is the same bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1496942
Jan

On Wed, Dec 9, 2015 at 11:05 AM, Jan Gutter <jan.gutter@netronome.com> wrote:
> On Tue, Dec 8, 2015 at 9:52 PM, Laine Stump <laine@laine.org> wrote:
>> What version of libvirt and libnl are you using? I thought that we had solved this problem, either in libvirt or in libnl at least a couple years ago. Did something not get pushed somewhere? (now I need to go spelunking in bugzilla again :-/)
> I believe I checked on tip on both...
> My literature study hit http://marc.info/?l=linux-netdev&m=117322322423467&w=2 and http://lists.infradead.org/pipermail/libnl/2012-May/000601.html
> I also just found: https://centoros.wordpress.com/2014/09/12/ifla_vf_info/
> I am not sure if this is the same bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1496942
Ah, I believe I found the bug you were looking for: https://bugzilla.redhat.com/show_bug.cgi?id=889319
Jan

On Wed, Dec 9, 2015 at 11:09 AM, Jan Gutter <jan.gutter@netronome.com> wrote:
> On Wed, Dec 9, 2015 at 11:05 AM, Jan Gutter <jan.gutter@netronome.com> wrote:
>> On Tue, Dec 8, 2015 at 9:52 PM, Laine Stump <laine@laine.org> wrote:
>>> What version of libvirt and libnl are you using? I thought that we had solved this problem, either in libvirt or in libnl at least a couple years ago. Did something not get pushed somewhere? (now I need to go spelunking in bugzilla again :-/)
Looks like the libnl3 side has been fixed on master, at least: https://github.com/thom311/libnl/commit/bbdcaea9a779885fedc04817dcc1195
I was inspecting code from the previous release...
Jan