Re: [libvirt PATCH 0/3] network: support NAT with IPv6

Tuesday, 9 June 2020

On Mon, Jun 08, 2020 at 11:05:00PM -0400, Laine Stump wrote:
...
 On 6/8/20 10:51 AM, Daniel P. Berrangé wrote:
 > The virtual network has never supported NAT with IPv6 since this feature
 > didn't exist at the time. NAT has been available since RHEL-7 vintage
 > though, and it is desirable to be able to use it.
 > 
 > This series enables it with
 > 
 >    <forward mode="nat">
 >      <nat ipv6="yes"/>
 >    </forward>

 I've had this lurking on my "this is something I should do" list for a long
 time, but couldn't decide on the best name in XML (and also figured that the
 problem with accept_ra needed to be fixed first), so it never got to the
 top. So I'm glad to see you've done it, disappointed in myself that I never
 did it :-/

 I like your XML knob naming better than what I'd considered. I had thought
 of having <forward mode='supernat'> (or some other more reasonable extra
 mode), but your proposal is more orthogonal and matches with the existing
 ipv6='yes' at the toplevel of <network> (which is used to enable ipv6
 traffic between guests on the bridge even when there are no IPv6 addresses
 configured for the network.)

I considered mode="nat6" as an alternative, but it would have meant
updating many switch() statements, and it is a somewhat misleading
name.

...
 >    </network>
 > 
 > Conceptually this means
 > 
 >   - Try to gimme a subnet with IPv4 and DHCP
 >   - Try to gimme a subnet with IPv6 and RAs
 > 
 > Now when we start the virtual network
 > 
 >   - If IPv4 is not enabled on host, don't assign addr

 What will we use to check for this? Not just "no IP addresses configured", I
 guess, since it may be the case that libvirt has just happened to come up
 before NM or whoever has started any networks. (or maybe someone wants to
 use IPv6 on a libvirt virtual network, but have no IPv6 connectivity beyond
 the host).

IIUC, we can simply check whether it is possible to create a socket
with AF_INET or AF_INET6.  If the kernel supports it, then this
should succeed, even if NetworkManager isn't running yet.
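
A minimal sketch of that check, written in Python rather than libvirt's C
purely for brevity. The helper name family_supported() is made up for
illustration, and it only reports whether the kernel will create a socket
of the given family, nothing about configured addresses:

```python
import socket


def family_supported(family: int) -> bool:
    """Return True if the kernel lets us create a socket of this family."""
    try:
        sock = socket.socket(family, socket.SOCK_DGRAM)
    except OSError:
        # e.g. EAFNOSUPPORT when the address family is not available
        return False
    sock.close()
    return True


have_ipv4 = family_supported(socket.AF_INET)
have_ipv6 = family_supported(socket.AF_INET6)
```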

...
 >   - Else
 >     - Iterate N=1..254 to find a free range for IPv4
 >     - Use 192.168.N.0/24 for subnet
 >     - Use 192.168.N.1 for host IP
 >     - Use 192.168.N.2 -> 192.168.N.254 for guest DHCP
 > 
 >   - If IPv6 is not enabled on host, don't assign addr
 >   - Else
 >     - Generate NNNN:NNNN as 4 random bytes
 >     - Use fd00:add:f00d:NNNN:NNNN::0/64 for IPv6 subnet
 >     - Use fd00:add:f00d:NNNN:NNNN::1 for host IP
 >     - Use router advertisement for IPv6 zero-conf
 > 
 > With NNNN:NNNN, even with 1000 guests running, we have just a 0.02%
 > chance of clashing with a guest for IPv6.
 > 
 > The "live" XML would always reflect the currently assigned addresses
 > 
 > Proactively monitor the address allocations of the host. If we see
 > a conflicting address appear, take down the dnsmasq instance, generate
 > a new subnet, bring dnsmasq back online.
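
The IPv6 part of the scheme quoted above could be sketched roughly as
follows (Python used for brevity). The function names are hypothetical.
The estimate assumes "clash" means any two of n independently chosen
NNNN:NNNN values coinciding (a birthday-style bound), which is only one
possible reading of the figure quoted above:

```python
import ipaddress
import math
import secrets


def random_host_address() -> ipaddress.IPv6Address:
    """Build fd00:add:f00d:NNNN:NNNN::1 from 4 random bytes."""
    raw = secrets.token_bytes(4)
    hi = int.from_bytes(raw[:2], "big")
    lo = int.from_bytes(raw[2:], "big")
    return ipaddress.IPv6Address(f"fd00:add:f00d:{hi:x}:{lo:x}::1")


def birthday_clash_probability(n: int, bits: int = 32) -> float:
    """Approximate chance that any two of n random `bits`-bit values match."""
    pairs = n * (n - 1) / 2
    return -math.expm1(-pairs / 2 ** bits)


host_ip = random_host_address()
p_clash = birthday_clash_probability(1000)
```

Under that assumption, 1000 random picks come out at roughly one in ten
thousand, the same order of magnitude as the figure quoted above.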

 Hmm. How would you see this monitoring happening? We couldn't do it with an
 external script like I had done for simple "shut down on conflict" without
 adding extra functionality to libvirt's network driver. We *could* go back
 to the idea of monitoring netlink change messages ourselves within libvirtd
 and doing it all internally ourselves. Or maybe the NM script I proposed
 could go beyond simply destroying conflicting networks, and also restart any
 network that had autoaddr='yes'; to make this fully functional we would need
 to finally put in the proper stuff so that tap devices (and the underlying
 emulated NICs) would be set offline when their connected network was
 destroyed, and then reconnected/set online when the network was re-started.
 Getting the networks to behave this way would be useful in general anyway,
 even without thinking about the conflicting-networks problem. The one
 downside of externally controlling renumbering-on-conflict using an external
 script is that it would only work with NetworkManager...

Yeah, I'm trying to remember now why we went the NM hook route, rather
than listening for netlink events. I guess NM is much simpler to hook
into.  I'd honestly not thought about this too much though - just having
an automatically numbered network will already be a huge step forward
compared to current day.

In particular, if we instituted a rule that if we are NOT on a hypervisor,
we count from N=254 -> 0 when picking 192.168.N.0, and count from
N=0 -> 254 when we are on a hypervisor, then we'll trivially avoid the
host/guest clash in the simple case, even if the network is not yet online.
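
As a rough sketch of that rule (Python for brevity; pick_ipv4_subnet()
and its parameters are invented for illustration, and the caller is
assumed to supply the list of subnets already in use on the host):

```python
import ipaddress
from typing import Iterable, Optional


def pick_ipv4_subnet(
    in_use: Iterable[ipaddress.IPv4Network],
    on_hypervisor: bool,
) -> Optional[ipaddress.IPv4Network]:
    """Return the first 192.168.N.0/24 that overlaps nothing in `in_use`."""
    used = list(in_use)
    # Count up when on a hypervisor, down otherwise, as described above.
    order = range(0, 255) if on_hypervisor else range(254, -1, -1)
    for n in order:
        candidate = ipaddress.IPv4Network(f"192.168.{n}.0/24")
        if not any(candidate.overlaps(net) for net in used):
            return candidate
    return None  # nothing free in 192.168.0.0 - 192.168.254.0


used = [ipaddress.IPv4Network("192.168.0.0/24")]
counting_up = pick_ipv4_subnet(used, on_hypervisor=True)
counting_down = pick_ipv4_subnet(used, on_hypervisor=False)
host_ip = counting_up[1]                          # 192.168.N.1
dhcp_range = (counting_up[2], counting_up[254])   # .2 -> .254
```

With only 192.168.0.0/24 in use, the two directions pick 192.168.1.0/24
and 192.168.254.0/24 respectively, so the two levels do not collide.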

Don't anyone dare mention nested virt with 3 levels of libvirt... 

Seriously though, even without automatic teardown & restart, we'd
be way better off by simply not hardcoding 192.168.N.0 at RPM
install time when the network env is not the same as the run time
network env, e.g. cloud images.

...
 > Ideally we would have to bring the guest network links offline and
 > then online again to force DHCP re-assignment immediately.

 Yeah, I think it really makes sense that when a libvirt network is
 destroyed, all the tap devices are set offline, and the emulated NICs are
 set offline as well; then when a libvirt network is started, we would go
 through all devices that are supposed to be connected to that network,
 reconnect the taps, set them online, and set the emulated NIC online. We
 currently do the reconnection part when libvirtd is restarted but can't do
 it immediately when a *network* is restarted because the network driver has
 no access to the list of active guests and their interfaces....

 Hmm, we do now maintain the list of ports for each network though, and it
 would be possible to expand that to keep the name of the tap device
 associated with the port in addition to the other info (e.g. whether or not
 the NIC has been set offline via an API call), *but* when a network is
 destroyed, all ports registered with that network are also destroyed, so
 just expanding the attributes for the ports isn't going to get us where we
 need. So, do we want to 1) change it to maintain active ports for a network
 when it is destroyed so that they can be easily reactivated when the network
 is restarted? Or do we want to 2) change the network driver to make calls to
 all registered hypervisor drivers during a net-start to look for all guest
 interfaces that think they are connected to the network? The former sounds
 much more efficient, but I don't know how "dirty" it seems to maintain state
 for something that has been "destroyed"...

 Or maybe we instead need to also add a new API for networks
 virNetworkReconnect(), which will use newly expanded info in the network
 ports list to reconnect all guest interfaces.

Responsibility for enslaving a TAP device into a bridge still lives with
the virt drivers, not the network driver.

The virt drivers could listen for lifecycle events from the network driver
and auto-reconnect.

Alternatively, the virt driver could listen for netlink events and see the
virbr0 being deleted and re-created by the kernel.

...
 On a different sub-topic - it would be nice to provide some stability to the
 subnet used for an autoaddr='yes' network (think of the case where every
 time a host is booted, libvirt starts its default network when
 192.168.122.0/24 is available, but then a short time later a host interface
 is always started on the same subnet - that would mean every time the host
 booted the exact same destabilizing dance would take place even though it
 would be pretty easy to predict the eventually-used subnet based on past
 experience).

 Although we historically have avoided automatic changes to libvirt config
 files by libvirtd itself as much as possible (the only cases I can think of
 are when we're modifying the config to take care of some compatibility
 problem after an upgrade), what do you think about having the autoaddr='yes'
 networks automatically update the config with the current subnet info?
 (maybe this would need to only be done if not starting from a live image or
 something, or maybe it should just always be done). This would then be used
 as the first guess the next time the network was started. That way we would
 avoid the need to delay starting libvirt networks until after host
 networking was fully up; the subnet might bounce around a bit that first
 time, but once a stable address was found during that first run, it would
 then be used from the get-go during all subsequent boots (until/unless
 something changed and it had to be changed yet again).

We could stash the previously chosen subnet in /var/cache/libvirt/network
or /var/lib/libvirt/network, so there is no need to modify the inactive XML
config. This is like how dnsmasq "remembers" DHCP leases previously given
for guests.
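
Something along these lines, as a sketch only (Python for brevity). The
per-network JSON file and the save_subnet()/load_subnet() names are
assumptions for illustration, not an existing libvirt format or API:

```python
import json
import os
from pathlib import Path
from typing import Optional


def save_subnet(state_dir: Path, network: str, subnet: str) -> None:
    """Remember the subnet last chosen for `network` (hypothetical layout)."""
    state_dir.mkdir(parents=True, exist_ok=True)
    tmp = state_dir / f"{network}.json.tmp"
    tmp.write_text(json.dumps({"subnet": subnet}))
    # Rename over the old file so readers never see a partial write.
    os.replace(tmp, state_dir / f"{network}.json")


def load_subnet(state_dir: Path, network: str) -> Optional[str]:
    """Return the remembered subnet, or None if there is no usable record."""
    try:
        data = json.loads((state_dir / f"{network}.json").read_text())
        return data["subnet"]
    except (FileNotFoundError, ValueError, KeyError, TypeError):
        return None
```

At network start the remembered value would only be a first guess; if it
now conflicts, the normal search for a free subnet would still run.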

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
