[libvirt] UDP broadcasts vs. nat Masquerading issue

Hi all,

I'm observing an issue: as soon as libvirt starts, UDP broadcasts flowing through the physical network (and intended for other services, unrelated to any virtualization) get broken. Specifically, Windows neighbourhood browsing through Samba's nmbd starts suffering badly (Samba is running on this same box).

At the moment I'm running a quite outdated version of libvirt, 1.2.9, but because other than this issue it does its job pretty well, I'd first consider some patching/backporting rather than totally replacing it with a new one. Anyway, I first need to better understand what is going on and what is wrong with it.

This could also be somewhat related to
https://www.redhat.com/archives/libvir-list/2013-September/msg01311.html
but I suppose it is not exactly that thing, and besides, my version already includes the fix for the broadcast issue mentioned in msg01311.

I've already figured out that the source of the trouble is related to these added rules:

  -A POSTROUTING -o br0 -j MASQUERADE
  -A POSTROUTING -o enp0s25 -j MASQUERADE
  -A POSTROUTING -o virbr2_nic -j MASQUERADE
  -A POSTROUTING -o vnet0 -j MASQUERADE

Here, virbr2_nic and vnet0 are used by libvirt for arranging NAT-mode network configurations for VMs, unrelated to normal network stuff, so that looks ok. However, br0 (with enp0s25 in it) is the main interface of this host and carries its primary IP address. And enp0s25 is a physical NIC of this host, used for all sorts of regular (unrelated to virtualization) communications as well. Also, br0 is used for attaching some bridge-mode (as opposed to NAT-mode) VMs managed by libvirt, but bridge mode is not supposed to employ address translation anyway.

So, clearly, libvirt somehow chooses to set up masquerading for literally all existing network interfaces here (except lo), but I can't see a reason for the first two rules in the list above. Furthermore, they corrupt UDP broadcasts coming from outside and reaching this host (through enp0s25/br0), such that the source address gets replaced by this host's primary address (as per masquerading). I've verified this by arranging a hand-crafted UDP listener and printing the respective source addresses as seen by normal userspace. Obviously, a Samba server cannot work correctly under such conditions.

Now I've discovered that I can "eliminate" the problem by e.g.:

  1. Removing "-A POSTROUTING -o br0 -j MASQUERADE" (manual brute force)
  2. Inserting "-A POSTROUTING -s 192.168.0.0/24 -d 192.168.0.255/32 -j ACCEPT"

(Of course correcting rules by hand is not a solution, just a test.)

So the question is: what should the correct rules ideally look like? And is this issue known/fixed in the most current libvirt?

Thank you,
Regards,
Nikolai
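The post does not include the hand-crafted UDP listener it mentions. A minimal sketch of such a check, in Python, might look like the following. The function names and the port number are purely illustrative assumptions, not anything taken from the original test.

```python
import socket


def make_listener(host="", port=0):
    """Create a UDP socket bound to host:port (an empty host means all addresses)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    return sock


def receive_one(sock, bufsize=2048):
    """Block until one datagram arrives; return (source_ip, source_port, payload)."""
    payload, (src_ip, src_port) = sock.recvfrom(bufsize)
    return src_ip, src_port, payload


def run_listener(port):
    """Print the source address of every datagram, as seen by userspace."""
    sock = make_listener("", port)
    while True:
        src_ip, src_port, payload = receive_one(sock)
        print(f"{src_ip}:{src_port} sent {len(payload)} bytes")
```

Running `run_listener(13700)` on the affected host (13700 being an arbitrary, hypothetical port) and sending a broadcast datagram to that port from another machine on the LAN would show whether the printed source address is the real sender or the host's own primary address.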

On Wed, Jul 03, 2019 at 09:07:10PM +0300, Nikolai Zhubr wrote:
At the moment I'm running a quite outdated version 1.2.9 of libvirt, but because other than this issue it does its job pretty well I'd first consider some patching/backporting rather than totally replacing it with a new one. Anyway, I first need to better understand what is going on and what is wrong with it.
Not too bad an issue - the iptables rules libvirt creates have been almost entirely unchanged since they were first introduced, until I recently made some changes. My recent changes merely moved them down into libvirt chains instead of the root chains, which shouldn't be a functional change.
I've already figured out that the source of the trouble is related to these added rules:

  -A POSTROUTING -o br0 -j MASQUERADE
  -A POSTROUTING -o enp0s25 -j MASQUERADE
  -A POSTROUTING -o virbr2_nic -j MASQUERADE
  -A POSTROUTING -o vnet0 -j MASQUERADE
Is that really the full specification of the rules? AFAIK, libvirt would not create such rules - at the very least, a libvirt MASQUERADE rule should always have a source + destination IP mask.

If I start the libvirt 'default' virtual network, the MASQUERADE rules created by libvirt look like:

  -A LIBVIRT_PRT -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p tcp -j MASQUERADE --to-ports 1024-65535
  -A LIBVIRT_PRT -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p udp -j MASQUERADE --to-ports 1024-65535
  -A LIBVIRT_PRT -s 192.168.122.0/24 ! -d 192.168.122.0/24 -j MASQUERADE

In older libvirt it would be POSTROUTING instead of LIBVIRT_PRT, but otherwise the same.
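To make the difference between the two rule shapes concrete, here is a small Python sketch of their address matching. It is a deliberate simplification: it models only the `-s`/`-d`/`-o` matches and ignores protocol matches and connection tracking. The function names are made up for illustration.

```python
import ipaddress


def libvirt_style_match(src, dst, network="192.168.122.0/24"):
    """Model '-s NET ! -d NET': source inside the network, destination outside it."""
    net = ipaddress.ip_network(network)
    return ipaddress.ip_address(src) in net and ipaddress.ip_address(dst) not in net


def bare_output_match(out_iface, rule_iface="br0"):
    """Model a bare '-o IFACE' rule: only the egress interface is checked."""
    return out_iface == rule_iface


# Guest talking to the outside world: matched, so it would be masqueraded.
print(libvirt_style_match("192.168.122.10", "8.8.8.8"))         # True
# Guest-to-guest traffic inside the virtual network: not matched.
print(libvirt_style_match("192.168.122.10", "192.168.122.20"))  # False
# LAN broadcast from the 192.168.0.0/24 network: not matched.
print(libvirt_style_match("192.168.0.5", "192.168.0.255"))      # False
# A bare '-o br0' rule matches anything leaving via br0, whatever the addresses.
print(bare_output_match("br0"))                                  # True
```

In this model, a rule restricted to the virtual network's subnet never matches traffic between LAN addresses, whereas the bare `-o br0` form has no address restriction at all.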
Here, virbr2_nic and vnet0 are used by libvirt for arranging NAT-mode network configurations for VMs, unrelated to normal network stuff, so it looks ok. However, br0 (with enp0s25 in it) is a main interface of this host with primary ip address. And enp0s25 is a physical nic of this host, and it is used for all sorts of regular (unrelated to virtualization) communications as well. Also, br0 is used for attaching some bridge-mode (as opposed to NAT-mode) VMs managed by libvirt, but bridge mode is not supposed to employ address translation anyway.
So, clearly, libvirt somehow chooses to set up masquerading for literally all existing network interfaces here (except lo), but I can't see a reason for the first two rules in the list above. Furthermore, they corrupt UDP broadcasts coming from outside and reaching this host (through enp0s25/br0), such that the source address gets replaced by this host's primary address (as per masquerading). I've verified this by arranging a hand-crafted UDP listener and printing the respective source addresses as seen by normal userspace. Obviously, a Samba server cannot work correctly under such conditions.
Now I've discovered that I can "eliminate" the problem by e.g.:
  1. Removing "-A POSTROUTING -o br0 -j MASQUERADE" (manual brute force)
  2. Inserting "-A POSTROUTING -s 192.168.0.0/24 -d 192.168.0.255/32 -j ACCEPT"

(Of course correcting rules by hand is not a solution, just a test.)
So the question is: what should the correct rules ideally look like? And is this issue known/fixed in the most current libvirt?
I'm not convinced that libvirt is actually creating those rules. Is there any other software on your host that could be responsible?

Can you test:

 - Disable all libvirt virtual networks

     virsh net-destroy NETNAME
     virsh net-autostart --disable NETNAME

 - Reboot the host OS

Now look at the iptables rules. Are the MASQUERADE rules present?

If they are not present, then start a network with 'virsh net-start NETNAME'.

Are the MASQUERADE rules now present? If so, can you show the XML of the network (virsh net-dumpxml NETNAME)?

Also, do you have any hook scripts in /etc/libvirt/hooks that might be doing anything?

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
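When comparing the rule set before and after starting a network, it can help to pick out only the MASQUERADE rules that carry no source or destination match. The following Python sketch does that for text in `iptables-save` format. The helper name and the sample text are illustrative assumptions; the sample reuses rules quoted in this thread.

```python
def unrestricted_masquerade_rules(ruleset_text):
    """Return '-A' rules that jump to MASQUERADE without any -s/-d address match."""
    flagged = []
    for line in ruleset_text.splitlines():
        tokens = line.split()
        # Only appended rules are of interest; skip table/chain headers and COMMIT.
        if len(tokens) < 2 or tokens[0] != "-A" or "-j" not in tokens:
            continue
        j = tokens.index("-j")
        target = tokens[j + 1] if j + 1 < len(tokens) else ""
        if target != "MASQUERADE":
            continue
        has_address = any(
            t in ("-s", "-d", "--source", "--destination") for t in tokens
        )
        if not has_address:
            flagged.append(line.strip())
    return flagged


# Sample text assembled from rules quoted in this thread (not real host output).
SAMPLE = """*nat
:POSTROUTING ACCEPT [0:0]
-A POSTROUTING -o br0 -j MASQUERADE
-A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -j MASQUERADE
COMMIT
"""

for rule in unrestricted_masquerade_rules(SAMPLE):
    print(rule)
```

On a real host, the input would be the output of `iptables-save -t nat` (run as root), captured once with all virtual networks stopped and once after `virsh net-start NETNAME`.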
Participants (2): Daniel P. Berrangé, Nikolai Zhubr