Hi,
Sorry for the delay.
On Tue, 11 Aug 2020 23:52:46 -0400
Laine Stump <laine(a)redhat.com> wrote:
On 8/10/20 11:23 PM, Ian Wienand wrote:
> Hello,
>
> Firstly THANK YOU for the IPv6 NAT support merged in 6.5. It has been
> almost impossible to get IPv6 into a VM on a laptop that switches
> between wifi and wired (dock) connections, because you can not add a
> wifi interface to a bridge. I know NAT is against the IPv6 end-to-end
> zen but it makes this "just work" for the vast majority of people like
> me who need to ssh/curl/talk to ipv6 only hosts!
>
> So I installed 6.6.0 from the virt-preview repos on Fedora 32 to
> eagerly test it out.
>
> My network config looks like
>
> <network>
> <name>network</name>
> <uuid> ... </uuid>
> <forward mode='nat'>
> <nat ipv6='yes'/>
> </forward>
> <bridge name='virbr0' stp='on' delay='0'/>
> <mac address=' ... '/>
> <domain name='network'/>
> <ip address='192.168.100.1' netmask='255.255.255.0'>
> <dhcp>
> <range start='192.168.100.128' end='192.168.100.254'/>
> </dhcp>
> </ip>
> <ip family='ipv6' address='fc00:dead:beef:55::' prefix='64'>
> </ip>
> </network>
>
> The first problem I hit was trying to start that network:
>
> error: internal error: Check the host setup: enabling IPv6 forwarding
> with RA routes without accept_ra set to 2 is likely to cause routes
> loss. Interfaces to look at: wlp4s0
>
> wlp4s0 is my wifi card that is configured by NetworkManager in a
> completely unremarkable fashion. By default it gets an ipv6 via SLAAC
> from my router. This feels a bit like the unresolved bug [1] which
> says that systemd-networkd is handling the RA's in userspace for
> ... reasons [2]. It's unclear to me if NetworkManager is doing
> similar.
Yes, and yes. The only reason I haven't done something about this is
that I'm undecided *what* to do. On one hand it seems many (most)
systems are handling RAs with a userspace process, so it doesn't matter
that it's disabled in the kernel. On the other hand, the person who
added this check must have had a valid reason for going to the trouble
of adding it (rather than just documenting that you needed to set
accept_ra to 2 for some set of interfaces; I forget right now exactly
which ones, and I'm trying to wind my brain down for the end of the day,
so don't want to go look it up :-).
The check comes from commit 00d28a78b5d1 ("network: check accept_ra
before enabling ipv6 forwarding"), and it's there because the accept_ra
flag works like this (from Documentation/networking/ip-sysctl.txt):
    0 Do not accept Router Advertisements.
    1 Accept Router Advertisements if forwarding is disabled.
    2 Overrule forwarding behaviour. Accept Router Advertisements
      even if forwarding is enabled.
Now, as libvirt enables IPv6 forwarding via
/proc/sys/net/ipv6/conf/all/forwarding (in my opinion, this could be
limited to the interfaces involved), router advertisements would start
being discarded on every interface where accept_ra is '1'.
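As a rough illustration of the accept_ra semantics quoted above, here is a minimal Python sketch. It only models the three documented values; it is not libvirt's actual check, and the function name is made up for this example.

```python
def kernel_accepts_ra(accept_ra: int, forwarding: bool) -> bool:
    """Return True if the kernel would process Router Advertisements.

    Models the accept_ra description quoted from ip-sysctl.txt:
    0 = never, 1 = only while forwarding is disabled, 2 = always.
    """
    if accept_ra == 0:
        return False
    if accept_ra == 1:
        return not forwarding
    if accept_ra == 2:
        return True
    raise ValueError(f"unexpected accept_ra value: {accept_ra}")


# Enabling forwarding changes the answer only for accept_ra == 1.
for value in (0, 1, 2):
    print(value, kernel_accepts_ra(value, False), kernel_accepts_ra(value, True))
```

This is why enabling forwarding matters only for interfaces left at 1: with 0 the kernel never processes RAs, and with 2 it always does.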
Another half-baked idea I was thinking about is: if there's at least one
address on a given interface with the 'noprefixroute' flag, that means
they are added by userspace. In that case,
virNetDevIPCheckIPv6ForwardingCallback() could set data->hasRARoutes to
false, and if userspace is explicitly handling RAs, don't worry at all
about accept_ra -- 0 is fine if it was set e.g. by NetworkManager.
Otherwise, just go ahead and set it to 2, we're not conflicting with
anything that would set addresses from RAs (other than the kernel).
I can see 3 possibilities:
1) completely remove the check, with the idea that while it was a good
thing at the time, it's now obsolete.
2) have a config item (in /etc/libvirt/network.conf (which doesn't
currently exist) maybe?) to let people manually disable the check.
3) try to make libvirt's code intelligent, and look for clues that RAs
are handled elsewhere (someone would need to figure out what those
"clues" are).
Yes, addresses with 'noprefixroute' should be a safe choice: userspace
agents need to create routes separately anyway, but it won't be set if
the kernel is setting up those addresses.
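The 'noprefixroute' heuristic could be prototyped outside libvirt along these lines. This is a sketch only: it assumes the input is the JSON printed by `ip -j -6 addr show`, and that iproute2 reports the flag as a boolean "noprefixroute" key in each addr_info entry. The function name and the sample data are hypothetical.

```python
import json


def userspace_managed_interfaces(ip_addr_json: str) -> set:
    """Return names of interfaces that have at least one IPv6 address
    flagged 'noprefixroute' (taken here as a hint that a userspace
    agent, not the kernel, configured the address)."""
    managed = set()
    for link in json.loads(ip_addr_json):
        for addr in link.get("addr_info", []):
            if addr.get("family") == "inet6" and addr.get("noprefixroute"):
                managed.add(link["ifname"])
    return managed


# Hand-written sample data, not captured from a real host.
sample = json.dumps([
    {"ifname": "wlp4s0", "addr_info": [
        {"family": "inet6", "local": "2001:db8::10", "prefixlen": 64,
         "scope": "global", "noprefixroute": True}]},
    {"ifname": "virbr0", "addr_info": [
        {"family": "inet6", "local": "fc00:dead:beef:55::", "prefixlen": 64,
         "scope": "global"}]},
])
print(userspace_managed_interfaces(sample))
```

Interfaces returned by such a check would be the ones for which the accept_ra warning could be skipped.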
> I feel like this must be a red-herring. My wired interface has the
> same setting of 0
>
> $ cat /proc/sys/net/ipv6/conf/enp0s31f6/accept_ra
> 0
>
> and is similarly just a very standard auto-configured NetworkManager
> interface. When I "net-start" the network whilst on wifi libvirt
> doesn't seem to care about that interface (I presume it only looks at
> the active one?). When I dock and turn off wifi, ipv6 connectivity
> continues to work through enp0s31f6, so I don't think the accept_ra
> really matters in this case.
Because you're using NetworkManager. I've confirmed with [some NM
person, I forget who or in what venue] that NM handles RAs itself, so
accept_ra should be turned off in the kernel (it's not harmful if it's
on as far as I know, it just does nothing useful)
>
> I feel like this message is incorrect, and being as I've done nothing
> special to my underlying interfaces probably going to be wrong for a
> lot of people trying this? Does anyone know the details of this
> message and see why it would be required in this situation?
It isn't. We just need to decide which of the ways listed above to fix it.
>
> The other thing that I'd like to expand the documentation on, if I can
> get some clarity, is the choice of network. It seems like it has to
> be a /64, and it seems like the best choice is within fc00::/7, or at
> least that is what has been assigned for private networks like this
> [3]?
"locally assigned" addresses in IPv6 are... different. I've been trying
to figure this out myself (in order to *automatically* assign a network
address to a libvirt virtual network, as Dan suggested in the cover
letter for the IPv6 NAT patches), and I *think* you need to at least set
the lowest bit of the first byte of the address (that's the "locally
assigned" bit). So that would mean that all networks should be somewhere
within FD00::/8 (but please correct me if I'm wrong!)
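To make the FC00::/7 versus FD00::/8 distinction concrete, here is a small Python sketch using the standard ipaddress module. The function name and the returned labels are illustrative only; the ranges themselves come from RFC 4193.

```python
import ipaddress

ULA = ipaddress.ip_network("fc00::/7")        # RFC 4193 unique local range
ULA_LOCAL = ipaddress.ip_network("fd00::/8")  # same range with the L bit set


def classify_ula(prefix: str) -> str:
    """Say whether a prefix is a unique local one and, if so, whether
    the 'locally assigned' (L) bit is set."""
    net = ipaddress.ip_network(prefix, strict=False)
    if net.version != 6 or not net.subnet_of(ULA):
        return "not ULA"
    if net.subnet_of(ULA_LOCAL):
        return "ULA, locally assigned"
    return "ULA, L bit clear"


for candidate in ("fc00:dead:beef:55::/64", "fd00:dead:beef:55::/64"):
    print(candidate, "->", classify_ula(candidate))
```

With this check, the prefix from the network definition at the top of the thread falls in the half of the range where the L bit is clear.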
>
> The only problem with this is that I think glibc filters this range so
> nothing prefers IPv6.
What?? Exactly what isn't preferring IPv6? Do you mean outbound
connections that would be to an IPv6 address will be nixed in favor of
an IPv4 address if the source IP of the connection was going to be in
FC00::/7? Or something else? Do you have a reference for this?
> Is this the range expected to be used for ipv6
> NAT? If so, would a patch to drop some documentation breadcrumbs
> about setting gai.conf or something be useful?
The man page for gai.conf *implies* that glibc is following the
preference rules suggested in RFC3484, which was written prior to
RFC4193, so it seems strange that it would give any special treatment to
addresses in that range. Does it behave in the same way if you use
FD00::... instead of FC00::...? (probably, but worth checking)
> Or are there better choices for the network?
I've Cc'ed Stefano Brivio, who has worked on IPv6 in the kernel, and (at
least based on the conversations I've had with him) has a much better
knowledge of IPv6. Maybe he can offer some advice.
(BTW, he was playing around with defining an IPv6 libvirt network that
used the same network as the host's physical interface, then turning on
ndp-proxy, and finally adding a host route for each guest IP; this
permits the guests to all be on the same IPv6 network as the host; if we
can get all of those steps automated in a libvirt virtual network, it
will be even better than IPv6 NAT!)
Yes, that would be ideal. I don't think NAT with IPv6 is a wise thing
to do, but my ISP just delegates a /64 prefix to me. So I need NDP
proxying because my guests need to appear on the same network. I do it
manually with something like:
    echo 1 > /proc/sys/net/ipv6/conf/<upstream interface>/proxy_ndp
    ip -6 neigh add proxy <guest address> dev <upstream interface>
and passing my network prefix to libvirt:
    <ip family='ipv6' address='<my prefix>::1' prefix='64'>
    </ip>
This works flawlessly: dnsmasq gets configured properly on the host, the
guest can use SLAAC, and DNS configuration (RFC 8106) works too.
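If someone wants to script the manual NDP proxy steps for several guests, a dry-run helper might look like the sketch below. It only builds the two commands shown above as strings and does not execute anything; the function name, the interface name "eth0" and the guest addresses are placeholders, not values from this thread.

```python
import ipaddress
import shlex


def ndp_proxy_commands(upstream: str, guests: list) -> list:
    """Build (but do not run) the shell commands that enable NDP
    proxying on the upstream interface and add one proxy neighbour
    entry per guest address."""
    dev = shlex.quote(upstream)
    cmds = [f"echo 1 > /proc/sys/net/ipv6/conf/{dev}/proxy_ndp"]
    for guest in guests:
        addr = ipaddress.IPv6Address(guest)  # raises ValueError on bad input
        cmds.append(f"ip -6 neigh add proxy {addr} dev {dev}")
    return cmds


# Placeholder interface and documentation-range addresses.
cmds = ndp_proxy_commands("eth0", ["2001:db8::a", "2001:db8::b"])
for line in cmds:
    print(line)
```

The commands would still need to be run as root, and the per-guest entries have to be kept in sync as guests come and go, which is the part that would be nice to have automated in libvirt.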
Other than NDP proxying, another slightly problematic item was that I
tried, as a hack, to pass a different prefix there (should never be
done, indeed). dnsmasq fails, but silently, and libvirt accepts it, also
silently -- it should probably warn instead, even just because it won't
work with dnsmasq.
--
Stefano