[libvirt] macvtap - no incoming ipv6 traffic processed on kvm host unless i start tcpdump on interface

Dear folks,
I'm using macvtap interfaces in bridged mode for my virtual machines for the first time.
VM -> HOST -> Router -> INTERNET
This works fine for IPv4 connectivity.
For IPv6, my virtual machines receive an appropriate v6 address from radvd but are not able to receive reply packets from outside (ping -t -6 google.de was started inside the VM).
I see the ping request/response on my router:
14:10:52.147834 IP6 2a01:198:200:8350:dc8b:cd82:144e:14eb > 2a00:1450:4001:806::1018: ICMP6, echo request, seq 108, length 40
14:10:52.182073 IP6 2a00:1450:4001:806::1018 > 2a01:198:200:8350:dc8b:cd82:144e:14eb: ICMP6, echo reply, seq 108, length 40
14:10:55.179874 IP6 2a01:198:200:350::2 > 2a00:1450:4001:806::1018: ICMP6, destination unreachable, unreachable address 2a01:198:200:8350:dc8b:cd82:144e:14eb, length 88
But I do not receive the reply on the VM.
However, on the KVM host, when I start a tcpdump on the macvtap interface with
root@s1:~# tcpdump -ni macvtap0 ip6
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on macvtap0, link-type EN10MB (Ethernet), capture size 262144 bytes
14:12:37.134516 IP6 2a01:198:200:8350:dc8b:cd82:144e:14eb > 2a00:1450:4001:806::1018: ICMP6, echo request, seq 129, length 40
14:12:37.188529 IP6 fe80::12fe:edff:fee6:cfa > ff02::1:ff4e:14eb: ICMP6, neighbor solicitation, who has 2a01:198:200:8350:dc8b:cd82:144e:14eb, length 32
14:12:37.189040 IP6 2a01:198:200:8350:dc8b:cd82:144e:14eb > fe80::12fe:edff:fee6:cfa: ICMP6, neighbor advertisement, tgt is 2a01:198:200:8350:dc8b:cd82:144e:14eb, length 32
14:12:37.189202 IP6 2a00:1450:4001:806::1018 > 2a01:198:200:8350:dc8b:cd82:144e:14eb: ICMP6, echo reply, seq 129, length 40
packets start to get processed and the VM receives replies. Any idea what is happening here?
Cheers,
Stefan

On Wed, Apr 08, 2015 at 02:13:49PM +0200, Stefan Bauer wrote:
packets start to get processed and the VM receives replies. Any idea what is happening here?
I'm guessing promiscuous mode plays a part here. You can try setting the interface to promiscuous mode manually using 'ip l set $dev promisc on' and see whether that helps without starting tcpdump. Also check sysctl -a | grep 'ipv6.*forward'. Disclaimer: all of the above is just a guess :)
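To check the suggestion above without relying on tcpdump side effects, one option is to read the interface flags word that Linux exposes in /sys/class/net/<dev>/flags and test the IFF_PROMISC bit (0x100). This is only a rough sketch: the helper names are made up for illustration, and whether promiscuity requested by a packet capture shows up in this flags word is an assumption that should be verified on the actual host.

```python
from pathlib import Path

IFF_PROMISC = 0x100  # promiscuous-mode bit from <linux/if.h>


def is_promisc(flags_text: str) -> bool:
    """Return True if the hex flags word has the IFF_PROMISC bit set."""
    return bool(int(flags_text.strip(), 16) & IFF_PROMISC)


def read_flags(dev: str) -> str:
    """Read the raw flags word for a device, e.g. 'macvtap0' (hypothetical helper)."""
    return Path(f"/sys/class/net/{dev}/flags").read_text()


if __name__ == "__main__":
    # Example flag words: the first has the promisc bit set, the second has not.
    print(is_promisc("0x1103"), is_promisc("0x1003"))
```

On a Linux host, is_promisc(read_flags("macvtap0")) could be compared before and after running 'ip l set macvtap0 promisc on'.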
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Wed, 2015-04-08 at 14:32 +0200, Martin Kletzander wrote:
I'm guessing promiscuous mode plays a part here. You can try setting the interface to promiscuous mode manually using 'ip l set $dev promisc on' and see whether that helps without starting tcpdump. Also check sysctl -a | grep 'ipv6.*forward'.
Thank you for your answer. IIRC forwarding is only required for routing, and I use bridge mode. You are right: it works with promiscuous mode enabled.
Without:
2a01:198:200:8350:98ec:1708:c82e:a4fb dev br-lan INCOMPLETE
With:
2a01:198:200:8350:98ec:1708:c82e:a4fb dev br-lan lladdr 52:54:00:8e:b9:eb REACHABLE
Is it really required to be enabled all the time to make it work?
Cheers,
Stefan

On 08.04.2015 14:13, Stefan Bauer wrote:
packets start to get processed and the VM receives replies. Any idea what is happening here?
I vaguely recall seeing a bug like this. The problem was in Intel's kernel driver. By default, a NIC checks whether incoming packets match the NIC's MAC. So if a TAP device was created over a NIC, the NIC had to be put into promisc mode (done automatically by the driver) to allow different MACs, and the check was then done in the kernel. But since this generates too many interrupts, NIC hardware was extended so that it can be programmed with multiple MACs to let through. However, I recall a bug where the Intel driver did not always program the TAP MAC into the NIC hardware correctly. Obviously, the bug was not visible if the NIC was put into promisc mode. And this may be what you are seeing. Let me see if I can find the bug.
Michal

On Thu, 2015-04-09 at 10:12 +0200, Michal Privoznik wrote:
I vaguely recall seeing a bug like this. The problem was in Intel's kernel driver. By default, a NIC checks whether incoming packets match the NIC's MAC. So if a TAP device was created over a NIC, the NIC had to be put into promisc mode (done automatically by the driver) to allow different MACs, and the check was then done in the kernel. But since this generates too many interrupts, NIC hardware was extended so that it can be programmed with multiple MACs to let through. However, I recall a bug where the Intel driver did not always program the TAP MAC into the NIC hardware correctly. Obviously, the bug was not visible if the NIC was put into promisc mode. And this may be what you are seeing. Let me see if I can find the bug.
That sounds like my problem, even though I do not have an Intel NIC:
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 10)
Cheers,
Stefan

On Thu, 2015-04-09 at 10:12 +0200, Michal Privoznik wrote:
[...] Obviously, the bug was not visible if the NIC was put into promisc mode. And this may be what you are seeing. Let me see if I can find the bug.
FYI: enabling promiscuous mode on eth0 does not help. I have to enable it on macvtap0, which is on top of eth0.
Stefan

On 09.04.2015 10:39, Stefan Bauer wrote:
FYI: enabling promiscuous mode on eth0 does not help. I have to enable it on macvtap0, which is on top of eth0.
Ah, so that's not the bug then. Have you perhaps changed the vNIC MAC inside the guest after it was started? Although libvirt nowadays adapts to such a change, older versions of libvirt did not implement this, causing guest connectivity to break.
Michal

On Thu, 2015-04-09 at 10:47 +0200, Michal Privoznik wrote:
Ah, so that's not the bug then. Have you perhaps changed the vNIC MAC inside the guest after it was started? Although libvirt nowadays adapts to such a change, older versions of libvirt did not implement this, causing guest connectivity to break.
Nope. I only changed the NIC mode from bridged to macvtap bridged mode, always followed by a shutdown and start of the VM. Anyway, it's only an IPv6 problem; I do not have this problem with IPv4!
Stefan

On 09.04.2015 10:50, Stefan Bauer wrote:
Nope. I only changed the NIC mode from bridged to macvtap bridged mode, always followed by a shutdown and start of the VM. Anyway, it's only an IPv6 problem; I do not have this problem with IPv4!
Maybe a firewall problem? Anything interesting in the iptables/ebtables output?
Michal

IPv4 uses ARP sent to the broadcast address (ff:ff:ff:ff:ff:ff); that works with macvtap without the card being in promiscuous mode (ip li does not show the PROMISC flag). IPv6 uses multicast to the solicited-node address; that does NOT work unless I manually enable promiscuous mode.
Stefan
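As an illustration of the difference described above, the solicited-node multicast group for an IPv6 address is formed by appending the low 24 bits of the address to ff02::1:ff00:0/104, and the group is carried on Ethernet with a destination MAC of 33:33 followed by the low 32 bits of the group address. A minimal sketch (the function names are illustrative, not from any tool mentioned in this thread):

```python
import ipaddress

SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node_multicast(addr: str) -> ipaddress.IPv6Address:
    """Solicited-node multicast group for a unicast IPv6 address."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_BASE | low24)


def multicast_mac(group: str) -> str:
    """Ethernet destination MAC used for an IPv6 multicast group."""
    low32 = int(ipaddress.IPv6Address(group)) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)


if __name__ == "__main__":
    vm = "2a01:198:200:8350:dc8b:cd82:144e:14eb"  # VM address from the first capture
    group = solicited_node_multicast(vm)
    print(group)                      # ff02::1:ff4e:14eb
    print(multicast_mac(str(group)))  # 33:33:ff:4e:14:eb
```

The computed group matches the neighbor solicitation destination seen in the tcpdump output at the start of the thread. A frame addressed to 33:33:ff:4e:14:eb is neither broadcast nor the guest's unicast MAC, so it only reaches the guest if that multicast MAC is accepted by the receive filter or the interface is promiscuous.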

Further investigation shows that it works as expected with Linux guests. Only my Win10 guests trigger the problem. Will dig deeper...
Stefan

I could narrow down the problem. Both Win7Prof and Windows 10 have this issue. The Windows client is a member of the multicast group and is listening on ff02::1:ffdf:fe88.
Traffic from router to client, captured on the router:
07:41:23.104567 IP6 fe80::12fe:edff:fee6:cfa > ff02::1:ffdf:fe88: ICMP6, neighbor solicitation, who has 2a01:198:200:8350:851e:21a7:28df:fe88, length 32
07:41:24.099764 IP6 fe80::12fe:edff:fee6:cfa > ff02::1:ffdf:fe88: ICMP6, neighbor solicitation, who has 2a01:198:200:8350:851e:21a7:28df:fe88, length 32
07:41:25.099772 IP6 fe80::12fe:edff:fee6:cfa > ff02::1:ffdf:fe88: ICMP6, neighbor solicitation, who has 2a01:198:200:8350:851e:21a7:28df:fe88, length 32
Traffic from router to client, captured on the KVM host:
09:41:23.106444 IP6 fe80::12fe:edff:fee6:cfa > ff02::1:ffdf:fe88: ICMP6, neighbor solicitation, who has 2a01:198:200:8350:851e:21a7:28df:fe88, length 32
09:41:24.101647 IP6 fe80::12fe:edff:fee6:cfa > ff02::1:ffdf:fe88: ICMP6, neighbor solicitation, who has 2a01:198:200:8350:851e:21a7:28df:fe88, length 32
09:41:25.101718 IP6 fe80::12fe:edff:fee6:cfa > ff02::1:ffdf:fe88: ICMP6, neighbor solicitation, who has 2a01:198:200:8350:851e:21a7:28df:fe88, length 32
No traffic is seen on the KVM guest (Win7 or Win10) with Wireshark.
Any ideas to debug this further?
Cheers
Stefan

It was the privacy extensions enabled on the Windows guests. They were sending from addresses whose multicast group the KVM host is not a member of. I disabled them with
netsh interface ipv6 set privacy state=disabled
Stefan
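To make the conclusion above more concrete: an address whose interface identifier is not derived from the MAC ends up in a different solicited-node multicast group than the MAC-derived (EUI-64) address would. The sketch below compares the two for the guest MAC and global address shown in the neighbor table earlier in the thread. That this mismatch is exactly what the host's multicast filtering tripped over is an assumption, not something the thread confirms; the helper names are illustrative.

```python
import ipaddress


def solicited_node_multicast(addr) -> ipaddress.IPv6Address:
    """ff02::1:ff00:0/104 plus the low 24 bits of the unicast address."""
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | (int(ipaddress.IPv6Address(addr)) & 0xFFFFFF))


def eui64_link_local(mac: str) -> ipaddress.IPv6Address:
    """Link-local address with a modified EUI-64 interface identifier."""
    b = bytearray(int(x, 16) for x in mac.split(":"))
    b[0] ^= 0x02  # flip the universal/local bit
    iid = bytes(b[:3]) + b"\xff\xfe" + bytes(b[3:])
    return ipaddress.IPv6Address(bytes.fromhex("fe80" + "00" * 6) + iid)


if __name__ == "__main__":
    mac = "52:54:00:8e:b9:eb"                           # lladdr from the neighbor table
    observed = "2a01:198:200:8350:98ec:1708:c82e:a4fb"  # address from the same entry
    print(solicited_node_multicast(eui64_link_local(mac)))  # ff02::1:ff8e:b9eb
    print(solicited_node_multicast(observed))               # ff02::1:ff2e:a4fb
```

Each additional address with a different low 24 bits adds another group, and another 33:33:ff:xx:xx:xx destination MAC, that must reach the guest for neighbor discovery to succeed.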
participants (3)
- Martin Kletzander
- Michal Privoznik
- Stefan Bauer