Re: [PATCH v2] network: add rule to nftables backend that zeroes checksum of DHCP responses

Tuesday, 29 October 2024

On 10/29/24 8:46 AM, Daniel P. Berrangé wrote:
...
 On Tue, Oct 29, 2024 at 12:22:42PM +0000, Daniel P. Berrangé wrote:
> On Tue, Oct 29, 2024 at 06:03:26AM -0500, Andrea Bolognani wrote:
>> On Mon, Oct 28, 2024 at 06:07:14PM +0000, Daniel P. Berrangé wrote:
>>> On Mon, Oct 28, 2024 at 10:32:55AM -0700, Andrea Bolognani wrote:
>>>> I did some testing of my own and I can confirm that FreeBSD and
>>>> OpenBSD are fine with this change, as are various Linux flavors
>>>> (Alpine, CirrOS, Debian, Fedora, openSUSE, Ubuntu).
>>>>
>>>> However, a few other operating systems aren't: namely GNU/Hurd,
>>>> Haiku and NetBSD break with this change. Interestingly, these were
>>>> all fine with the nftables backend before it.
>>>
>>> Well that's odd. I've checked NetBSD source code and found no less
>>> than 3 DHCP client impls, and all of them cope with checksum == 0.
>>>
>>> https://github.com/NetBSD/src/blob/trunk/usr.bin/rump_dhcpclient/net.c#L497
>>>
>>> https://github.com/NetBSD/src/blob/trunk/external/bsd/dhcpcd/dist/src/dhc...
>>>
>>> https://github.com/NetBSD/src/blob/trunk/external/mpl/dhcp/dist/common/pa...
>>>
>>> the middle impl also directly copes with partial checksums
>>
>> The boot log contains
>>
>>    Starting dhcpcd.
>>    wm0: checksum failure from 192.168.124.1
>>
>> so I guess the second implementation is the relevant one.
>
> I've just tested NetBSD 10.0 and get exactly the same failure
> as you.
>
> I've tried "tcpdump -vv -i vnetXXX port 68" on the host and
> on the guest and that is reporting that the checksum is bad.
> It is *not* getting set to zero.
>
> Meanwhile, if I run the same tcpdump with OpenBSD guest, I
> see tcpdump reporting a zero checksum as expected.
>
> WTF ?
>
> Somehow our nftables rule is not having an effect, or worse,
> it is having a non-deterministic effect where it works for
> packets on some guests, but not others.
>
> I checked the rule counters and packets are hitting the rule,
> but not getting their checksum zeroed.

 Further research with tcpdump shows that packets leaving 'virbr0'
 have the checksum correctly zeroed. Our nftables rule is working.

 A concurrent tcpdump on packets leaving 'vnetNNN' shows the
 checksum is mangled.

 With our old iptables rules, we set a valid checksum when leaving
 virbr0, and I presume this causes all subsequent code to not touch
 the checksum field.

 With our new nftables rules, we set a zero checksum when leaving
 virbr0, and "zero checksum" conceptually means "not present (yet)".

 I think there must be logic somewhere in the kernel/QEMU which
 sees "not present" and decides it needs to do <something> with
 the checksum field. 
Yikes!

...

 A key difference that is probably relevant is that NetBSD is
 using an e1000 NIC in QEMU, while OpenBSD is using a virtio-net
 NIC, at least when created by virt-manager.

 AFAIR, QEMU's magic checksum offload only happens for virtio-net,
 so presumably our rules are incompatible with non-virtio-net NICs
 in some way.
Double and triple yikes!

So something in the packet path for non-virtio-net NICs is noticing that 
the packet checksum is 0, and then "fixing" it with the *wrong* checksum?

But in the past when it already had the correct checksum, that same bit 
of code said "Huh. The checksum is already correct" and left it alone.

So when the extra rules are removed, do those same guests begin 
working? You can easily remove the checksum rules with:

   nft delete chain ip libvirt_network postroute_mangle

BTW, I just now tried an e1000e NIC on a Fedora guest and it continues to 
work with the 0-checksum rules removed. In this case tcpdump on virbr0 
shows "bad cksum", but tcpdump on the guest shows "udp cksum ok", so 
something else somewhere is setting the checksum to the correct value.
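
To make the three checksum states discussed in this thread concrete 
(zero, correct, and wrong), here is a rough Python sketch. It is not taken 
from libvirt, the kernel, QEMU, or dhcpcd; the function names and the guest 
address 192.168.124.50 are made up for illustration. It only relies on the 
RFC 768 rule that, for UDP over IPv4, a checksum field of zero means the 
sender did not compute one.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum over data (padded to an even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_checksum(src_ip: str, dst_ip: str, udp_segment: bytes) -> int:
    """Checksum a sender would store for this UDP segment over IPv4."""
    # Pseudo-header: source, destination, zero byte, protocol 17, UDP length.
    pseudo = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, 17, len(udp_segment))
    )
    # The checksum field itself (bytes 6-7) counts as zero while summing.
    zeroed = udp_segment[:6] + b"\x00\x00" + udp_segment[8:]
    csum = ~ones_complement_sum(pseudo + zeroed) & 0xFFFF
    # A computed value of 0 is sent as 0xFFFF, because 0 means "no checksum".
    return csum or 0xFFFF


def classify(src_ip: str, dst_ip: str, udp_segment: bytes) -> str:
    """Return 'absent', 'ok' or 'bad' for the stored checksum field."""
    (stored,) = struct.unpack("!H", udp_segment[6:8])
    if stored == 0:
        return "absent"
    expected = udp_checksum(src_ip, dst_ip, udp_segment)
    return "ok" if stored == expected else "bad"


if __name__ == "__main__":
    src, dst = "192.168.124.1", "192.168.124.50"  # dst is hypothetical
    payload = b"dhcp"  # stand-in for a real DHCP reply
    segment = struct.pack("!HHHH", 67, 68, 8 + len(payload), 0) + payload
    print(classify(src, dst, segment))
```

A receiver written this way accepts both "absent" and "ok". The NetBSD 
guest's "checksum failure" would correspond to the "bad" case, i.e. a 
non-zero value that does not match, which fits the mangled checksum seen 
on vnetNNN.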
