When an nwfilter rule sets the parameter CTRL_IP_LEARNING to
"dhcp",
this turns on the "dhcpsnoop" thread, which uses libpcap to monitor
traffic on the domain's tap device and extract the IP address from the
DHCP response.
If libpcap on the host is built with HAVE_TPACKET3 defined (to enable
support for TPACKET_V3), the dhcpsnoop code's initialization of the
libpcap socket would fail with the following error:
virNWFilterSnoopDHCPOpen:1134 : internal error: pcap_setfilter: can't remove kernel
filter: Bad file descriptor
It turns out that this was because TPACKET_V3 requires a larger buffer
size than libvirt was setting (we were setting it to 128k). Changing
the buffer size to 256k eliminates the error, and the dhcpsnoop thread
once again works properly.
A fuller explanation of why TPACKET_V3 requires such a large buffer,
for future git spelunkers:
libpcap calls setsockopt(... SOL_PACKET, PACKET_RX_RING...) to setup a
ring buffer for receiving packets; two of the attributes sent to this
API are called tp_frame_size, and tp_frame_nr. If libpcap was built
with HAVE_TPACKET3 defined, tp_trame_size is set to MAXIMUM_SNAPLEN
(defined in libpcap sources as 262144) and tp_frame_nr is set to:
[the buffer size we set, i.e. PCAP_BUFFERSIZE i.e. 262144] / tp_frame_size.
So if PCAP_BUFFERSIZE < MAXIMUM_SNAPLEN, then tp_frame_nr (the number
of frames in the ring buffer) is 0, which is nonsensical. This same
value is later used as a multiplier to determine the size for a call
to malloc() (which would also fail).
(NB: if HAVE_TPACKET3 is *not* defined, then tp_frame_size is set to
the snaplen set by the user (in our case 576) plus a small amount to
account for ethernet headers, so 256k is far more than adequate)
Since the TPACKET_V3 code in libpcap actually reads multiple packets
into each frame, it's not a problem to have only a single frame
(especially when we are monitoring such infrequent traffic), so it's
okay to set this relatively small buffer size (in comparison to the
default, which is 2MB), which is important since every guest using
dhcp snooping in a nwfilter rule will hold 2 of these buffers for the
entire life of the guest.
Thanks to Christian Ehrhardt <paelzer(a)gmail.com> for discovering that
buffer size was the problem (this was not at all obvious from the
error that was logged!)
Resolves:
https://bugzilla.redhat.com/1547237
Fixes:
https://bugs.launchpad.net/libvirt/+bug/1758037
Signed-off-by: Laine Stump <laine(a)laine.org>
---
change from V1: you asked for "big comment", you got it!
If you think this is overkill, or underkill, let me know. I decided
that the details about proper buffersize were better sitting right
next to the definition of PCAP_BUFFERSIZE, but that there should also
be a comment down in the code where any potential future error would
occur, as that would more easily catch the eye of the hapless keyboard
jockey who gets to rediscover this problem the next time libpcap
changes.
Also, I'm planning to request that the libpcap developers check for a
viable buffersize during the pcap_set_buffer_size() call so that in
the future this error is reported immediately rather than just having
some random-seeming failure at a later time when pcap_setfilter() is
called.
src/nwfilter/nwfilter_dhcpsnoop.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
Reviewed-by: John Ferlan <jferlan(a)redhat.com>
Safe for freeze (and it's a bugfix),
John
glad we didn't have to describe this with punch cards or paper tape ;-)
some day maybe we can make this "variable" based on some config value
and not have to worry about changing sizes resulting in strange and
unusual circumstances.