On 10/24/24 2:12 PM, Laine Stump wrote:
On 10/24/24 12:36 PM, Daniel P. Berrangé wrote:
>> [...]
>
> AFAIR, it isn't actually a bug with virtio-net usage as this last
> bit suggests. Rather it is a result of feature negotiation with QEMU
> on the host, whereby the guest & QEMU mutually agree to turn off
> checksums because they are redundant when the "link" is just local
> memory not a physical cable.
>
> IOW, packets don't arrive in the guest with a bad checksum. They
> arrive in the guest with no checksum *as requested* by the guest.
>
> The DHCP client decides this is a bad checksum, as it wasn't
> aware of the checksum offload usage.
>
> [...]
>
> ....and in Fedora/RHEL context it was fixed 18 years ago, as we
> first hit this when working on Xen integration in 2006 :-)
I think you would have to say "in Fedora/RHEL+Xen context it was fixed
18 years ago", since the specific test case that I recall working with
was a RHEL5 guest that couldn't get an address from DHCP. So RHEL still
had the problem, it just took switching to QEMU + virtio + vhost packet
processing to make it visible :-)
>
>> A few quick tests proved that it was the same old "bad checksum"
>> problem from 2010 come back to haunt us.
>
> 2006 :-)
Interesting - so my origin story is at least partially a false memory :-/
But if you had this problem with Xen in 2006, how was it fixed then (and
why was it a surprise when it came up again in 2010 after vhost
processing was turned on)? Or maybe it wasn't a surprise, and I just
thought so because I wasn't around in 2006 :-)
(The commit adding --checksum-fill rules to libvirt was fd5b15ff1, from
July 2010)
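
For anyone following along, here is a rough Python sketch of the mechanism
discussed above. It is an illustration only, not code from libvirt, QEMU, or
any DHCP client. It assumes that with checksum offload negotiated, the UDP
checksum field holds only the folded pseudo-header sum rather than a finished
checksum; the addresses, ports, payload, and all function names are made up
for the example. A client that verifies the checksum strictly rejects such a
packet, and recomputing the field (conceptually what a --checksum-fill rule
does) makes it verify again.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """Return the folded 16-bit one's-complement sum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: str, dst: str, udp_len: int) -> bytes:
    """IPv4 pseudo-header used for the UDP checksum (protocol 17)."""
    return (socket.inet_aton(src) + socket.inet_aton(dst)
            + struct.pack("!BBH", 0, 17, udp_len))


def udp_segment(sport: int, dport: int, payload: bytes, checksum: int) -> bytes:
    """Build a UDP header plus payload with the given checksum field."""
    return struct.pack("!HHHH", sport, dport, 8 + len(payload), checksum) + payload


def verifies(src: str, dst: str, segment: bytes) -> bool:
    """Strict check: pseudo-header plus segment must sum to 0xFFFF."""
    data = pseudo_header(src, dst, len(segment)) + segment
    return ones_complement_sum(data) == 0xFFFF


def fill_checksum(src: str, dst: str, segment: bytes) -> bytes:
    """Recompute the checksum field over the whole segment."""
    zeroed = segment[:6] + b"\x00\x00" + segment[8:]
    data = pseudo_header(src, dst, len(zeroed)) + zeroed
    csum = ~ones_complement_sum(data) & 0xFFFF
    # A computed value of 0 is sent as 0xFFFF, since 0 means "no checksum".
    return segment[:6] + struct.pack("!H", csum or 0xFFFF) + segment[8:]


# Hypothetical DHCP-style reply from host to guest.
SRC, DST = "192.168.122.1", "192.168.122.50"
PAYLOAD = b"dhcp-offer"

# Assumed offload state: the field carries only the pseudo-header sum.
partial = ones_complement_sum(pseudo_header(SRC, DST, 8 + len(PAYLOAD)))
offloaded = udp_segment(67, 68, PAYLOAD, partial)
filled = fill_checksum(SRC, DST, offloaded)

print("offloaded packet verifies:", verifies(SRC, DST, offloaded))
print("filled packet verifies:", verifies(SRC, DST, filled))
```

The payload is byte-for-byte identical in both packets; only the two checksum
bytes differ, which is why this looks like corruption to a client that isn't
aware of the offload.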