On 04.03.2016 19:27, Daniel P. Berrange wrote:
On Fri, Mar 04, 2016 at 07:17:57PM +0100, Dennis Jacobfeuerborn wrote:
> Hi,
> with recent guest installs (both CentOS 5 and 7) on CentOS 7 hosts I
> seem to have to disable checksum offloading using "ethtool -K eth0 tx
> off" in order to allow traffic to flow over a specific route.
>
> Basically the guest is installed with IP 192.168.21.10 and a default
> gateway of 192.168.21.254. Up until that point I can ssh into the system
> normally.
> There exists an OpenVPN system with the IP 192.168.21.1 that chooses
> client IPs for the VPN connections from the pool 192.168.20.0/24.
> In order to pass the responses back to the OpenVPN system I installed the
> route "192.168.20.0/24 via 192.168.21.1 dev eth0".
>
> When I now ping the IP 192.168.21.10 through the VPN connection this
> works fine but when I try to ssh into that system the connection just
> hangs. Looking at a tcpdump I noticed that the checksums for the packets
> weren't quite right so I issued an "ethtool -K eth0 tx off" and suddenly
> everything worked as expected.
>
> What is strange here is that I'm seeing this with both CentOS 5 and
> CentOS 7 guests and only when dealing with the routed traffic and not the
> regular one.
>
> Does anyone have an idea what is going on here? Could this be an issue
> with the virtio driver?
Things are optimized with virtio-net so that no checksum is ever
written until the packet reaches a physical NIC. So with communication
between 2 guests on the same physical host no checksums will ever be done,
and it is expected that tcpdump would show corrupt checksums in that case.
Normally this is just fine as almost no applications in the guest OS will
operate directly on the ethernet packets, so will never even realize that
no checksum is done. There has been one notable problem in the past though
where dhcp clients would get upset by the missing checksum.
When using libvirt virtual networks, we actually create a firewall rule
on the host OS that explicitly adds valid checksums for packets on the
DHCP port to avoid this problem.
I guess it is conceivable that some other applications may get upset by
the missing checksums if they operate at the ethernet layer instead of
the IP layer, which might explain what you see.
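
To make the checksum point concrete, here is a minimal Python sketch of
the TCP checksum (the RFC 1071 ones-complement sum over a pseudo-header
plus the segment). It compares a fully checksummed SYN,ACK with one whose
checksum field was left for later completion. The client address
192.168.20.6, the ports, and the sequence numbers are made up for
illustration. The sketch also assumes that an offloading sender leaves
only the pseudo-header sum in the checksum field, which is my
understanding of Linux behaviour rather than something stated in this
thread.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit ones-complement sum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: str, dst: str, tcp_len: int) -> bytes:
    """IPv4 pseudo-header: addresses, zero byte, protocol 6 (TCP), length."""
    return (socket.inet_aton(src) + socket.inet_aton(dst)
            + struct.pack("!BBH", 0, 6, tcp_len))


def tcp_checksum(src: str, dst: str, segment: bytes) -> int:
    """Checksum for a segment whose checksum field is currently zero."""
    data = pseudo_header(src, dst, len(segment)) + segment
    return ~ones_complement_sum(data) & 0xFFFF


def with_checksum(segment: bytes, value: int) -> bytes:
    """Return the segment with bytes 16-17 (the checksum field) replaced."""
    return segment[:16] + struct.pack("!H", value) + segment[18:]


def is_valid(src: str, dst: str, segment: bytes) -> bool:
    """A receiver accepts the segment if the full sum comes out as 0xFFFF."""
    data = pseudo_header(src, dst, len(segment)) + segment
    return ones_complement_sum(data) == 0xFFFF


# Hypothetical SYN,ACK from the guest to a VPN client (values made up).
SRC, DST = "192.168.21.10", "192.168.20.6"
SEGMENT = struct.pack("!HHIIBBHHH",
                      22, 40000,      # source port, destination port
                      1000, 2001,     # sequence and acknowledgement numbers
                      5 << 4, 0x12,   # data offset (5 words), flags SYN|ACK
                      29200, 0, 0)    # window, checksum placeholder, urgent

# Software checksumming fills in the complete checksum.
complete = with_checksum(SEGMENT, tcp_checksum(SRC, DST, SEGMENT))

# Assumed offload case: only the pseudo-header sum is in the field.
partial_sum = ones_complement_sum(pseudo_header(SRC, DST, len(SEGMENT)))
partial = with_checksum(SEGMENT, partial_sum)

if __name__ == "__main__":
    print("complete checksum valid:", is_valid(SRC, DST, complete))
    print("partial checksum valid: ", is_valid(SRC, DST, partial))
```

Anything on the path that verifies the checksum itself, rather than
trusting the offload metadata that travels with the packet inside the
host, would treat the second segment as corrupt.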
So apparently the issue is that the SYN,ACK from the ssh
connection gets sent out by the 192.168.21.10 system with a wrong
checksum, then arrives on eth0 at 192.168.21.1, and then gets dropped when
it is sent out the tun0 device created by OpenVPN.
What is the proper fix to deal with this? This behavior apparently has
changed only recently and it seems rather cumbersome to now have to
disable TCP checksum offloading for every guest I install in the future.
Regards,
Dennis