[Libvir] Virtual network iptables rules

Warning, this is a long & complicated email with lots of horrible details :-)

I've long been a little confused with the way iptables & bridging interacts, so set out to do some experiments. I added a -j LOG rule to every single chain in both the filter & nat tables, and then tried various traffic patterns, to see which chains were traversed & in which order.

There are 2 types of config I considered - virtual networking, and shared physical device. For both these I tried with net.bridge.bridge-nf-call-iptables on & off. This gave 4 scenarios to test with. For the test I simply did 'ping -c 1 <ip addr>', which gives a simple roundtrip with a single packet in each direction. The results were as follows....

Scenario 1: Virtual network
===========================

  net.bridge.bridge-nf-call-iptables = 0

  Host:   eth0 -> Internet
          virbr0 -> MASQUERADE to eth0
  Guest:  vif1.0 -> virbr0

Traffic: Guest -> Google
------------------------

 Out:
    NAT-PREROUTING IN=virbr0 OUT= SRC=192.168.122.47 DST=64.233.167.99
    FORWARD IN=virbr0 OUT=eth0 SRC=192.168.122.47 DST=64.233.167.99
    NAT-POSTROUTING IN= OUT=eth0 SRC=192.168.122.47 DST=64.233.167.99
 Back:
    FORWARD IN=eth0 OUT=virbr0 SRC=64.233.167.99 DST=192.168.122.47

Traffic: Guest -> Host
----------------------

 Out:
    NAT-PREROUTING IN=virbr0 OUT= SRC=192.168.122.47 DST=192.168.122.1
    INPUT IN=virbr0 OUT= SRC=192.168.122.47 DST=192.168.122.1
 Back:
    OUTPUT IN= OUT=virbr0 SRC=192.168.122.1 DST=192.168.122.47

Traffic: Host -> Guest
----------------------

 Out:
    NAT-OUTPUT IN= OUT=virbr0 SRC=192.168.122.1 DST=192.168.122.47
    OUTPUT IN= OUT=virbr0 SRC=192.168.122.1 DST=192.168.122.47
    NAT-POSTROUTING IN= OUT=virbr0 SRC=192.168.122.1 DST=192.168.122.47
 Back:
    INPUT IN=virbr0 OUT= SRC=192.168.122.47 DST=192.168.122.1

Scenario 2: Virtual network
===========================

  net.bridge.bridge-nf-call-iptables = 1

  Host:   eth0 -> Internet
          virbr0 -> MASQUERADE to eth0
  Guest:  vif1.0 -> virbr0

Traffic: Guest -> Google
------------------------

 Out:
    NAT-PREROUTING IN=virbr0 OUT= PHYSIN=vif1.0 SRC=192.168.122.47 DST=64.233.167.99
    FORWARD IN=virbr0 OUT=eth0 PHYSIN=vif1.0 SRC=192.168.122.47 DST=64.233.167.99
    NAT-POSTROUTING IN= OUT=eth0 PHYSIN=vif1.0 SRC=192.168.122.47 DST=64.233.167.99
 Back:
    FORWARD IN=eth0 OUT=virbr0 SRC=64.233.167.99 DST=192.168.122.47

Traffic: Guest -> Host
----------------------

 Out:
    NAT-PREROUTING IN=virbr0 OUT= PHYSIN=vif1.0 SRC=192.168.122.47 DST=192.168.122.1
    INPUT IN=virbr0 OUT= PHYSIN=vif1.0 SRC=192.168.122.47 DST=192.168.122.1
 Back:
    OUTPUT IN= OUT=virbr0 SRC=192.168.122.1 DST=192.168.122.47

Traffic: Host -> Guest
----------------------

 Out:
    NAT-OUTPUT IN= OUT=virbr0 SRC=192.168.122.1 DST=192.168.122.47
    OUTPUT IN= OUT=virbr0 SRC=192.168.122.1 DST=192.168.122.47
    NAT-POSTROUTING IN= OUT=virbr0 SRC=192.168.122.1 DST=192.168.122.47
 Back:
    INPUT IN=virbr0 OUT= PHYSIN=vif1.0 SRC=192.168.122.47 DST=192.168.122.1

Scenario 3: Shared physical device
==================================

  net.bridge.bridge-nf-call-iptables = 0

  Host:   peth1 -> Internet
          xenbr0 -> peth1
  Guest:  vif2.0 -> xenbr0

Traffic: Guest -> Google
------------------------

    Nada

Traffic: Guest -> Host
----------------------

 Out:
    NAT-PREROUTING IN=eth1 OUT= SRC=192.168.254.120 DST=192.168.254.132
    INPUT IN=eth1 OUT= SRC=192.168.254.120 DST=192.168.254.132
 Back:
    OUTPUT IN= OUT=eth1 SRC=192.168.254.132 DST=192.168.254.120

Traffic: Host -> Guest
----------------------

 Out:
    NAT-OUTPUT IN= OUT=eth1 SRC=192.168.254.132 DST=192.168.254.120
    OUTPUT IN= OUT=eth1 SRC=192.168.254.132 DST=192.168.254.120
    NAT-POSTROUTING IN= OUT=eth1 SRC=192.168.254.132 DST=192.168.254.120
 Back:
    INPUT IN=eth1 OUT= SRC=192.168.254.120 DST=192.168.254.132

Scenario 4: Shared physical device
==================================

  net.bridge.bridge-nf-call-iptables = 1

  Host:   peth1 -> Internet
          xenbr0 -> peth1
  Guest:  vif2.0 -> xenbr0

Traffic: Guest -> Google
------------------------

 Out:
    NAT-PREROUTING IN=xenbr1 OUT= PHYSIN=vif2.0 SRC=192.168.254.120 DST=64.233.167.99
    FORWARD IN=xenbr1 OUT=xenbr1 PHYSIN=vif2.0 PHYSOUT=peth1 SRC=192.168.254.120 DST=64.233.167.99
    NAT-POSTROUTING IN= OUT=xenbr1 PHYSIN=vif2.0 PHYSOUT=peth1 SRC=192.168.254.120 DST=64.233.167.99
 Back:
    FORWARD IN=xenbr1 OUT=xenbr1 PHYSIN=peth1 PHYSOUT=vif2.0 SRC=64.233.167.99 DST=192.168.254.120

Traffic: Guest -> Host
----------------------

 Out:
    NAT-PREROUTING IN=xenbr1 OUT= PHYSIN=vif2.0 SRC=192.168.254.120 DST=192.168.254.132
    FORWARD IN=xenbr1 OUT=xenbr1 PHYSIN=vif2.0 PHYSOUT=vif0.1 SRC=192.168.254.120 DST=192.168.254.132
    NAT-POSTROUTING IN= OUT=xenbr1 PHYSIN=vif2.0 PHYSOUT=vif0.1 SRC=192.168.254.120 DST=192.168.254.132
    INPUT IN=eth1 OUT= SRC=192.168.254.120 DST=192.168.254.132
 Back:
    OUTPUT IN= OUT=eth1 SRC=192.168.254.132 DST=192.168.254.120
    FORWARD IN=xenbr1 OUT=xenbr1 PHYSIN=vif0.1 PHYSOUT=vif2.0 SRC=192.168.254.132 DST=192.168.254.120

Traffic: Host -> Guest
----------------------

 Out:
    NAT-OUTPUT IN= OUT=eth1 SRC=192.168.254.132 DST=192.168.254.120
    OUTPUT IN= OUT=eth1 SRC=192.168.254.132 DST=192.168.254.120
    NAT-POSTROUTING IN= OUT=eth1 SRC=192.168.254.132 DST=192.168.254.120
    FORWARD IN=xenbr1 OUT=xenbr1 PHYSIN=vif0.1 PHYSOUT=vif2.0 SRC=192.168.254.132 DST=192.168.254.120
 Back:
    FORWARD IN=xenbr1 OUT=xenbr1 PHYSIN=vif2.0 PHYSOUT=vif0.1 SRC=192.168.254.120 DST=192.168.254.132
    INPUT IN=eth1 OUT= SRC=192.168.254.120 DST=192.168.254.132

Now in this email I'm really only concerned with the first 2 virtual network scenarios. The shared physical device stuff can be ignored henceforth, basically because it 'just works(tm)'.

For virtual networks there are basically 3 types of networking config we need to represent in terms of iptables rules, and these need to work for scenarios 1 & 2 - ie regardless of the magic sysctl knob. Here is what we currently implement......

Type 1: Isolated virtual network
--------------------------------

 - We don't add anything here

Type 2: Forwarding to a specific NIC only
-----------------------------------------

Chain POSTROUTING (policy ACCEPT 345 packets, 32627 bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 MASQUERADE all  --  *      eth1   0.0.0.0/0          0.0.0.0/0          PHYSDEV match ! --physdev-is-bridged

Chain FORWARD (policy ACCEPT 29 packets, 2244 bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 ACCEPT     all  --  eth1   vnet0  0.0.0.0/0          0.0.0.0/0          state RELATED,ESTABLISHED
    0     0 ACCEPT     all  --  vnet0  eth1   0.0.0.0/0          0.0.0.0/0
    0     0 ACCEPT     all  --  *      eth1   0.0.0.0/0          0.0.0.0/0          PHYSDEV match --physdev-in vnet0

Chain INPUT (policy ACCEPT 80483 packets, 382M bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 ACCEPT     udp  --  vnet0  *      0.0.0.0/0          0.0.0.0/0          udp dpt:53
    0     0 ACCEPT     tcp  --  vnet0  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:53
    0     0 ACCEPT     udp  --  vnet0  *      0.0.0.0/0          0.0.0.0/0          udp dpt:67
    0     0 ACCEPT     tcp  --  vnet0  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:67

Type 3: Forwarding to any active NIC
------------------------------------

Chain POSTROUTING (policy ACCEPT 360 packets, 33843 bytes)
 pkts bytes target     prot opt in     out    source             destination
    2   476 MASQUERADE all  --  *      *      0.0.0.0/0          0.0.0.0/0          PHYSDEV match ! --physdev-is-bridged

Chain FORWARD (policy ACCEPT 29 packets, 2244 bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 ACCEPT     all  --  *      virbr0 0.0.0.0/0          0.0.0.0/0          state RELATED,ESTABLISHED
    0     0 ACCEPT     all  --  virbr0 *      0.0.0.0/0          0.0.0.0/0
    0     0 ACCEPT     all  --  *      *      0.0.0.0/0          0.0.0.0/0          PHYSDEV match --physdev-in virbr0

Chain INPUT (policy ACCEPT 80884 packets, 382M bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 ACCEPT     udp  --  virbr0 *      0.0.0.0/0          0.0.0.0/0          udp dpt:53
    0     0 ACCEPT     tcp  --  virbr0 *      0.0.0.0/0          0.0.0.0/0          tcp dpt:53
    0     0 ACCEPT     udp  --  virbr0 *      0.0.0.0/0          0.0.0.0/0          udp dpt:67
    0     0 ACCEPT     tcp  --  virbr0 *      0.0.0.0/0          0.0.0.0/0          tcp dpt:67

So how do these shape up, given the traversal scenarios & the overall desire to be as restrictive as possible with traffic?

 Problem: The INPUT rules are missing altogether for the isolated virtual network so potentially DHCP/DNS will be blocked
Solution: Add them - simple bug.

 Problem: The POSTROUTING rule is too generic so it matches pretty much any kind of traffic, from any virtual network, or even from VPN devices setup by VPNC.
Solution: Only masquerade traffic whose source address is within the netmask associated with the virtual network in question

 Problem: The FORWARD rule is too generic, forwarding traffic to/from the virtual network regardless of whether the dest/src IP address is within the netmask associated with the virtual network. Assuming the POSTROUTING rule is fixed to only masquerade valid IP addresses from the virtual network, this rule would still allow guests to spoof their IP and have it forwarded off-host.
Solution: Only forward packets whose IP address is within the netmask associated with the virtual network

 Problem: The policy of the FORWARD chain is ACCEPT, and/or later user defined rules may inadvertently match on traffic from the virtual network, again allowing through spoof traffic, or traffic from what should be an isolated virtual network
Solution: There needs to be a catch-all REJECT rule associated with every bridge device, in both directions

 Problem: There is an extra physdev match per bridge device, and per guest device. This is basically unnecessary since the previous rule sets will already have allowed through the traffic. The physdev matches also only work if net.bridge.bridge-nf-call-iptables = 1
Solution: Simply remove the per-device matches

 Problem: The POSTROUTING rule has a physdev match applied, which only works if net.bridge.bridge-nf-call-iptables = 1.
Solution: Remove physdev match & masquerade based on network address associated with the virtual network

If we apply all the solutions outlined here, we'll end up with a set of rules which look like this..........

Type 1: Isolated virtual network
--------------------------------

Chain POSTROUTING (policy ACCEPT 273 packets, 26341 bytes)
 pkts bytes target     prot opt in     out    source             destination

Chain FORWARD (policy ACCEPT 29 packets, 2244 bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 REJECT     all  --  *      vnet2  0.0.0.0/0          0.0.0.0/0          reject-with icmp-port-unreachable
    0     0 REJECT     all  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          reject-with icmp-port-unreachable

Chain INPUT (policy ACCEPT 76724 packets, 366M bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 ACCEPT     udp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          udp dpt:53
    0     0 ACCEPT     tcp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:53
    0     0 ACCEPT     udp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          udp dpt:67
    0     0 ACCEPT     tcp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:67

Type 2: Forwarding to a specific NIC only
-----------------------------------------

Chain POSTROUTING (policy ACCEPT 273 packets, 26341 bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 MASQUERADE all  --  *      eth1   192.168.200.0/24   0.0.0.0/0

Chain FORWARD (policy ACCEPT 29 packets, 2244 bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 ACCEPT     all  --  eth1   vnet3  0.0.0.0/0          192.168.200.0/24   state RELATED,ESTABLISHED
    0     0 ACCEPT     all  --  vnet3  eth1   192.168.200.0/24   0.0.0.0/0
    0     0 REJECT     all  --  *      vnet3  0.0.0.0/0          0.0.0.0/0          reject-with icmp-port-unreachable
    0     0 REJECT     all  --  vnet3  *      0.0.0.0/0          0.0.0.0/0          reject-with icmp-port-unreachable

Chain INPUT (policy ACCEPT 76724 packets, 366M bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 ACCEPT     udp  --  vnet3  *      0.0.0.0/0          0.0.0.0/0          udp dpt:53
    0     0 ACCEPT     tcp  --  vnet3  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:53
    0     0 ACCEPT     udp  --  vnet3  *      0.0.0.0/0          0.0.0.0/0          udp dpt:67
    0     0 ACCEPT     tcp  --  vnet3  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:67

Type 3: Forwarding to any active NIC
------------------------------------

Chain POSTROUTING (policy ACCEPT 273 packets, 26341 bytes)
 pkts bytes target     prot opt in     out    source             destination
   16  1292 MASQUERADE all  --  *      *      192.168.122.0/24   0.0.0.0/0

Chain FORWARD (policy ACCEPT 29 packets, 2244 bytes)
 pkts bytes target     prot opt in     out    source             destination
   44 20200 ACCEPT     all  --  *      virbr0 0.0.0.0/0          192.168.122.0/24   state RELATED,ESTABLISHED
   56  3676 ACCEPT     all  --  virbr0 *      192.168.122.0/24   0.0.0.0/0
    0     0 REJECT     all  --  *      virbr0 0.0.0.0/0          0.0.0.0/0          reject-with icmp-port-unreachable
    0     0 REJECT     all  --  virbr0 *      0.0.0.0/0          0.0.0.0/0          reject-with icmp-port-unreachable

Chain INPUT (policy ACCEPT 76724 packets, 366M bytes)
 pkts bytes target     prot opt in     out    source             destination
   28  1728 ACCEPT     udp  --  virbr0 *      0.0.0.0/0          0.0.0.0/0          udp dpt:53
    0     0 ACCEPT     tcp  --  virbr0 *      0.0.0.0/0          0.0.0.0/0          tcp dpt:53
    5  1640 ACCEPT     udp  --  virbr0 *      0.0.0.0/0          0.0.0.0/0          udp dpt:67
    0     0 ACCEPT     tcp  --  virbr0 *      0.0.0.0/0          0.0.0.0/0          tcp dpt:67

So in summary:

 - Every single network type has the 4 INPUT rules for DHCP/DNS
 - Every single network type has catch-all REJECT rules in the FORWARD chain for both directions of traffic
 - A network forwarding to any device has ACCEPT rules which allow through traffic associated with the virtual network IP range to/from any device
 - A network forwarding to a specific device has ACCEPT rules which allow through traffic associated with the virtual network IP range to/from that specific device.
 - A network forwarding to any device has a MASQUERADE rule to translate any source address which matches the virtual network & is destined for any device
 - A network forwarding to a specific device has a MASQUERADE rule to translate any source address which matches the virtual network & is destined for that specific device
 - There are no physdev matches needed.

Hopefully at least one person has read this far through the email and still understands what is going on.... I'm attaching a patch which implements all this.

BTW, there is also a bug in the vif-bridge script for Xen which adds a per-guest VIF physdev match rule. This needs to be removed too.

Regards,
Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
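
As a rough illustration of the summary above, the proposed rule sets for the three network types could be generated along these lines. This is only a sketch and not the attached patch: the function name `proposed_rules`, its parameters, and the choice of appending with `-A` are assumptions made for the example, and it does not address guest-to-guest traffic on the same bridge.

```python
import ipaddress


def proposed_rules(bridge, network=None, out_dev=None):
    """Build iptables argument lists for one virtual network (hypothetical helper).

    bridge  -- bridge device of the virtual network, e.g. "virbr0"
    network -- CIDR of the virtual network, or None for an isolated network
    out_dev -- physical NIC to forward to, or None for "any active NIC"
    """
    rules = []

    # DHCP/DNS from guests to the host: present for every network type.
    for port in ("53", "67"):
        for proto in ("udp", "tcp"):
            rules.append(["-A", "INPUT", "-i", bridge, "-p", proto,
                          "--dport", port, "-j", "ACCEPT"])

    if network is not None:
        # Normalises and validates the CIDR (raises ValueError if malformed).
        net = str(ipaddress.ip_network(network))

        inbound = ["-A", "FORWARD"]
        outbound = ["-A", "FORWARD", "-i", bridge]
        masq = ["-t", "nat", "-A", "POSTROUTING", "-s", net]
        if out_dev is not None:
            # Restrict forwarding and masquerading to one physical NIC.
            inbound += ["-i", out_dev]
            outbound += ["-o", out_dev]
            masq += ["-o", out_dev]
        inbound += ["-o", bridge, "-d", net, "-m", "state",
                    "--state", "RELATED,ESTABLISHED", "-j", "ACCEPT"]
        outbound += ["-s", net, "-j", "ACCEPT"]
        masq += ["-j", "MASQUERADE"]
        rules += [inbound, outbound, masq]

    # Catch-all REJECT in both directions, appended after the ACCEPT rules.
    rules.append(["-A", "FORWARD", "-o", bridge, "-j", "REJECT"])
    rules.append(["-A", "FORWARD", "-i", bridge, "-j", "REJECT"])
    return rules


if __name__ == "__main__":
    for args in proposed_rules("vnet3", "192.168.200.0/24", "eth1"):
        print("iptables " + " ".join(args))
```

Calling `proposed_rules("vnet2")` gives the isolated case (INPUT rules plus the two REJECT rules), while passing a network without `out_dev` gives the "any active NIC" case.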

Hi Dan,

On Thu, 2007-04-05 at 02:44 +0100, Daniel P. Berrange wrote:
Warning, this is a long & complicated email with lots of horrible details :-)
I've long been a little confused with the way iptables & bridging interacts, so set out to do some experiments. I added a -j LOG rule to every single chain in both the filter & nat tables, and then tried various traffic patterns, to see which chains were traversed & in which order.
Nice work ...
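
The logging experiment described in the quoted paragraph above could be reproduced roughly as follows. The exact commands used are not given in the thread, so this is a guess at their shape: the chain list, the use of `-I <chain> 1` to put the LOG rule first, and the prefix spelling (taken from the labels in the traces) are all assumptions.

```python
import shlex

# Built-in chains per table on kernels of that era (assumption: newer
# kernels also have an INPUT chain in the nat table, not covered here).
CHAINS = {
    "filter": ["INPUT", "FORWARD", "OUTPUT"],
    "nat": ["PREROUTING", "POSTROUTING", "OUTPUT"],
}


def log_rule_commands():
    """Return one 'iptables ... -j LOG' argument list per built-in chain."""
    commands = []
    for table, chains in CHAINS.items():
        for chain in chains:
            # Prefixes mirror the labels in the traces: the bare chain name
            # for the filter table, "NAT-" + chain name for the nat table.
            prefix = chain if table == "filter" else "NAT-" + chain
            commands.append(
                ["iptables", "-t", table, "-I", chain, "1",
                 "-j", "LOG", "--log-prefix", prefix + " "]
            )
    return commands


if __name__ == "__main__":
    # Print the commands rather than running them; running needs root.
    for cmd in log_rule_commands():
        print(shlex.join(cmd))
```

The script only prints the commands, so it can be inspected before anything is applied to a live firewall.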
Scenario 2: Virtual network
===========================

  net.bridge.bridge-nf-call-iptables = 1

  Host:   eth0 -> Internet
          virbr0 -> MASQUERADE to eth0
  Guest:  vif1.0 -> virbr0

Traffic: Guest -> Google
------------------------

 Out:
    NAT-PREROUTING IN=virbr0 OUT= PHYSIN=vif1.0 SRC=192.168.122.47 DST=64.233.167.99
    FORWARD IN=virbr0 OUT=eth0 PHYSIN=vif1.0 SRC=192.168.122.47 DST=64.233.167.99
    NAT-POSTROUTING IN= OUT=eth0 PHYSIN=vif1.0 SRC=192.168.122.47 DST=64.233.167.99

This really surprises me - I would have expected another one like:

    FORWARD IN=virbr0 OUT=virbr0 PHYSIN=vif1.0 PHYSOUT=virb0 SRC=192.168.122.47 DST=64.233.167.99

Is it because the packets are coming in on the bridge interface that we don't see any physdev matching? So, we would see it with Guest->Guest?
For virtual networks there are basically 3 types of networking config we need to represent in terms of iptables rules, and these need to work for scenarios 1 & 2 - ie regardless of the magic sysctl knob.
Well, IMHO, we should never be switching off the sysctl knob ourselves - i.e. we shouldn't have it in xen/scripts/network-bridge - but I take the point that a user might switch it off.
 Problem: The INPUT rules are missing altogether for the isolated virtual network so potentially DHCP/DNS will be blocked
Solution: Add them - simple bug.
I fixed this in CVS, didn't I?
 Problem: The POSTROUTING rule is too generic so it matches pretty much any kind of traffic, from any virtual network, or even from VPN devices setup by VPNC.
Solution: Only masquerade traffic whose source address is within the netmask associated with the virtual network in question

 Problem: The FORWARD rule is too generic, forwarding traffic to/from the virtual network regardless of whether the dest/src IP address is within the netmask associated with the virtual network. Assuming the POSTROUTING rule is fixed to only masquerade valid IP addresses from the virtual network, this rule would still allow guests to spoof their IP and have it forwarded off-host.
Solution: Only forward packets whose IP address is within the netmask associated with the virtual network

 Problem: The policy of the FORWARD chain is ACCEPT, and/or later user defined rules may inadvertently match on traffic from the virtual network, again allowing through spoof traffic, or traffic from what should be an isolated virtual network
Solution: There needs to be a catch-all REJECT rule associated with every bridge device, in both directions

 Problem: There is an extra physdev match per bridge device, and per guest device. This is basically unnecessary since the previous rule sets will already have allowed through the traffic. The physdev matches also only work if net.bridge.bridge-nf-call-iptables = 1
Solution: Simply remove the per-device matches

 Problem: The POSTROUTING rule has a physdev match applied, which only works if net.bridge.bridge-nf-call-iptables = 1.
Solution: Remove physdev match & masquerade based on network address associated with the virtual network
I guess the two main differences are 1) avoid physdev based rules because they don't work with net.bridge.bridge-nf-call-iptables = 0 and 2) use network address based rules, which I avoided because of pure superstition and the feeling that IP based matching on a bridge was just ugly.

I haven't spent long thinking about these changes, but on the face of it they all look well thought out and sensible. Definitely worth giving it a shot.

Cheers,
Mark.

On Thu, Apr 05, 2007 at 08:28:57AM +0100, Mark McLoughlin wrote:
Hi Dan,
On Thu, 2007-04-05 at 02:44 +0100, Daniel P. Berrange wrote:
Warning, this is a long & complicated email with lots of horrible details :-)
I've long been a little confused with the way iptables & bridging interacts, so set out to do some experiments. I added a -j LOG rule to every single chain in both the filter & nat tables, and then tried various traffic patterns, to see which chains were traversed & in which order.
Nice work ...
Scenario 2: Virtual network
===========================

  net.bridge.bridge-nf-call-iptables = 1

  Host:   eth0 -> Internet
          virbr0 -> MASQUERADE to eth0
  Guest:  vif1.0 -> virbr0

Traffic: Guest -> Google
------------------------

 Out:
    NAT-PREROUTING IN=virbr0 OUT= PHYSIN=vif1.0 SRC=192.168.122.47 DST=64.233.167.99
    FORWARD IN=virbr0 OUT=eth0 PHYSIN=vif1.0 SRC=192.168.122.47 DST=64.233.167.99
    NAT-POSTROUTING IN= OUT=eth0 PHYSIN=vif1.0 SRC=192.168.122.47 DST=64.233.167.99

This really surprises me - I would have expected another one like:

    FORWARD IN=virbr0 OUT=virbr0 PHYSIN=vif1.0 PHYSOUT=virb0 SRC=192.168.122.47 DST=64.233.167.99

Is it because the packets are coming in on the bridge interface that we don't see any physdev matching? So, we would see it with Guest->Guest?
I'll check up on the DomU<->DomU case - that may well exhibit a FORWARD traversal with both a PHYSIN & PHYSOUT match.
For virtual networks there are basically 3 types of networking config we need to represent in terms of iptables rules, and these need to work for scenarios 1 & 2 - ie regardless of the magic sysctl knob.
Well, IMHO, we should never be switching off the sysctl knob ourselves - i.e. we shouldn't have it in xen/scripts/network-bridge - but I take the point that a user might switch it off.
Yeah, I don't much like it either, but the Fedora Xen bridge scripts turn the setting off - principally so that traffic for bridged guests doesn't get hit by the Dom0 iptables rules.
 Problem: The INPUT rules are missing altogether for the isolated virtual network so potentially DHCP/DNS will be blocked
Solution: Add them - simple bug.
I fixed this in CVS, didn't I?
Yeah - I was comparing against the official 0.2.1 release, which I happen to have an RPM of installed.

Dan.

On Thu, Apr 05, 2007 at 08:28:57AM +0100, Mark McLoughlin wrote:
On Thu, 2007-04-05 at 02:44 +0100, Daniel P. Berrange wrote:
I guess the two main differences are 1) avoid physdev based rules because they don't work with net.bridge.bridge-nf-call-iptables = 0 and 2) use network address based rules, which I avoided because of pure superstition and the feeling that IP based matching on a bridge was just ugly.
Considering point #2 - I think it is not entirely unreasonable. We let VMs on the bridge use any IP addresses they like within the context of the virtual network for VM <-> VM communication. Although we'll hand out addresses via DHCP from the official range, they can also be manually configured with arbitrary addresses. For routing purposes we need to provide an IP address for the 'gateway router' (ie the Dom0 bridge device), and thus it is good practice to only route traffic associated with the network/mask of the router. If we were filtering traffic within the bridge based on IP, that would be ugly, but the forwarding / postrouting rules are concerned with traffic which is leaving the bridge & thus being routed, so IP based matching is good here.

Regards,
Dan.
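
The "only route traffic associated with the network/mask of the router" test is the same check that an iptables `-s <network>/<prefix>` match performs on the source address. A minimal sketch of that membership test, using Python's standard `ipaddress` module and a hypothetical helper name:

```python
import ipaddress


def in_virtual_network(src, network):
    """Return True if the source address lies inside the virtual network.

    Hypothetical helper: it mirrors what a source match such as
    '-s 192.168.122.0/24' would accept, nothing more.
    """
    return ipaddress.ip_address(src) in ipaddress.ip_network(network)


if __name__ == "__main__":
    # A guest using its DHCP-assigned address would be routed.
    print(in_virtual_network("192.168.122.47", "192.168.122.0/24"))
    # A guest manually configured with an arbitrary address would not.
    print(in_virtual_network("10.0.0.5", "192.168.122.0/24"))
```

A guest with an arbitrary address can still talk to other guests on the bridge; the check only decides whether its traffic is forwarded and masqueraded off the bridge.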

On Thu, Apr 05, 2007 at 02:44:46AM +0100, Daniel P. Berrange wrote:
Warning, this is a long & complicated email with lots of horrible details :-)
That reminds me that we really ought to have a page in the documentation providing more high level explanations of the virtual network capabilities (http://libvirt.org/format.html#Net1 is a bit on the low level details side).

Daniel

--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/

Daniel P. Berrange wrote: [...]
Scenario 2: Virtual network
===========================
net.bridge.bridge-nf-call-iptables = 1
As far as I could tell, this case is exactly the same as scenario 1, except PHYSIN is available.
Type 1: Isolated virtual network
--------------------------------

Chain POSTROUTING (policy ACCEPT 273 packets, 26341 bytes)
 pkts bytes target     prot opt in     out    source             destination

Chain FORWARD (policy ACCEPT 29 packets, 2244 bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 REJECT     all  --  *      vnet2  0.0.0.0/0          0.0.0.0/0          reject-with icmp-port-unreachable
    0     0 REJECT     all  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          reject-with icmp-port-unreachable
So the thinking here is that FORWARD will only apply to packets from DomU to the internet. Since this is an isolated network, all packets trying to go out should be rejected. I'm a bit confused as to what "vnet2" is here. It seems that any traffic to/from virbr0 should be rejected. The rules above seem like they might match the DomU <-> DomU case (wouldn't these go through the FORWARD chain also?) If DomUs should be allowed to talk to each other (and that in itself is a policy decision) then perhaps adding a rule to allow when in = virbr0 & out = virbr0?
Chain INPUT (policy ACCEPT 76724 packets, 366M bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 ACCEPT     udp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          udp dpt:53
    0     0 ACCEPT     tcp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:53
    0     0 ACCEPT     udp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          udp dpt:67
    0     0 ACCEPT     tcp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:67
So we have ACCEPT rules on a chain whose default policy is ACCEPT? Is there a later catch-all REJECT rule which I'm not seeing?
Type 2: Forwarding to a specific NIC only
-----------------------------------------

Chain POSTROUTING (policy ACCEPT 273 packets, 26341 bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 MASQUERADE all  --  *      eth1   192.168.200.0/24   0.0.0.0/0
Seems OK.
Chain FORWARD (policy ACCEPT 29 packets, 2244 bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 ACCEPT     all  --  eth1   vnet3  0.0.0.0/0          192.168.200.0/24   state RELATED,ESTABLISHED
    0     0 ACCEPT     all  --  vnet3  eth1   192.168.200.0/24   0.0.0.0/0
    0     0 REJECT     all  --  *      vnet3  0.0.0.0/0          0.0.0.0/0          reject-with icmp-port-unreachable
    0     0 REJECT     all  --  vnet3  *      0.0.0.0/0          0.0.0.0/0          reject-with icmp-port-unreachable
Seems OK, except for the DomU <-> DomU case as above.
Chain INPUT (policy ACCEPT 76724 packets, 366M bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 ACCEPT     udp  --  vnet3  *      0.0.0.0/0          0.0.0.0/0          udp dpt:53
    0     0 ACCEPT     tcp  --  vnet3  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:53
    0     0 ACCEPT     udp  --  vnet3  *      0.0.0.0/0          0.0.0.0/0          udp dpt:67
    0     0 ACCEPT     tcp  --  vnet3  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:67
Again I don't understand ACCEPT rules on a chain with default policy ACCEPT.
Type 3: Forwarding to any active NIC
------------------------------------
Same comments as for the type 2 case above.
Hopefully at least one person has read this far through the email and still understands what is going on....
To some extent ...

Rich.

--
Emerging Technologies, Red Hat  http://et.redhat.com/~rjones/
64 Baker Street, London, W1U 7DF     Mobile: +44 7866 314 421

Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, United Kingdom.
Registered in England and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (USA), Charlie Peters (USA) and David Owens (Ireland)

BTW, while researching 'net.bridge.bridge-nf-call-iptables', I came across this scary diagram:

http://l7-filter.sourceforge.net/PacketFlow.png

Be sure to resize your browser window to the maximum it will go :-)

Rich.

On Thu, Apr 05, 2007 at 11:43:56AM +0100, Richard W.M. Jones wrote:
BTW, while researching 'net.bridge.bridge-nf-call-iptables', I came across this scary diagram:
http://l7-filter.sourceforge.net/PacketFlow.png
Be sure to resize your browser window to the maximum it will go :-)
Haha & I thought my email would be hard to understand !

Dan.

On Thu, 2007-04-05 at 11:43 +0100, Richard W.M. Jones wrote:
BTW, while researching 'net.bridge.bridge-nf-call-iptables', I came across this scary diagram:
http://l7-filter.sourceforge.net/PacketFlow.png
Be sure to resize your browser window to the maximum it will go :-)
Yeah, Dan pointed this out before - it does actually make some sense if you stare at it long enough :-)

Cheers,
Mark.

On Thu, Apr 05, 2007 at 11:38:42AM +0100, Richard W.M. Jones wrote:
Daniel P. Berrange wrote: [...]
Scenario 2: Virtual network
===========================
net.bridge.bridge-nf-call-iptables = 1
As far as I could tell, this case is exactly the same as scenario 1, except PHYSIN is available.
Yep, that is correct. The net.bridge.bridge-nf-call-iptables has a much more significant impact on scenario 4 with shared physical NICs, because with bridging to the physical NIC you'd ordinarily not hit iptables at all in many cases.
Type 1: Isolated virtual network
--------------------------------

Chain POSTROUTING (policy ACCEPT 273 packets, 26341 bytes)
 pkts bytes target     prot opt in     out    source             destination

Chain FORWARD (policy ACCEPT 29 packets, 2244 bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 REJECT     all  --  *      vnet2  0.0.0.0/0          0.0.0.0/0          reject-with icmp-port-unreachable
    0     0 REJECT     all  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          reject-with icmp-port-unreachable
So the thinking here is that FORWARD will only apply to packets from DomU to the internet. Since this is an isolated network, all packets trying to go out should be rejected. I'm a bit confused as to what "vnet2" is here. It seems that any traffic to/from virbr0 should be rejected.
I should have explained that vnet2, vnet3 & virbr0 are all just the bridge devices associated with each virtual network. I actually had all 3 virtual networks running at once, which is why each example uses a different bridge device.
The rules above seem like they might match the DomU <-> DomU case (wouldn't these go through the FORWARD chain also?) If DomUs should be allowed to talk to each other (and that in itself is a policy decision) then perhaps adding a rule to allow when in = virbr0 & out = virbr0?
Hmm, that's a good question. I didn't test the DomU<->DomU case. I'll check up on that shortly & let you know about that. I suspect you are correct that this would accidentally block DomU<->DomU comms.
Chain INPUT (policy ACCEPT 76724 packets, 366M bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 ACCEPT     udp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          udp dpt:53
    0     0 ACCEPT     tcp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:53
    0     0 ACCEPT     udp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          udp dpt:67
    0     0 ACCEPT     tcp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:67
So we have ACCEPT rules on a chain whose default policy is ACCEPT? Is there a later catch-all REJECT rule which I'm not seeing?
Basically assume the policy of the chain could be anything. I just happened to have it as ACCEPT, but the user may well have other rules added by the OS tools (eg system-config-securitylevel) which would otherwise block our traffic. So in coming up with the rules I tried to be as explicit as possible about what to ACCEPT/REJECT.
Chain FORWARD (policy ACCEPT 29 packets, 2244 bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 ACCEPT     all  --  eth1   vnet3  0.0.0.0/0          192.168.200.0/24   state RELATED,ESTABLISHED
    0     0 ACCEPT     all  --  vnet3  eth1   192.168.200.0/24   0.0.0.0/0
    0     0 REJECT     all  --  *      vnet3  0.0.0.0/0          0.0.0.0/0          reject-with icmp-port-unreachable
    0     0 REJECT     all  --  vnet3  *      0.0.0.0/0          0.0.0.0/0          reject-with icmp-port-unreachable
Seems OK, except for the DomU <-> DomU case as above.
Will investigate this too.
Chain INPUT (policy ACCEPT 76724 packets, 366M bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 ACCEPT     udp  --  vnet3  *      0.0.0.0/0          0.0.0.0/0          udp dpt:53
    0     0 ACCEPT     tcp  --  vnet3  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:53
    0     0 ACCEPT     udp  --  vnet3  *      0.0.0.0/0          0.0.0.0/0          udp dpt:67
    0     0 ACCEPT     tcp  --  vnet3  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:67
Again I don't understand ACCEPT rules on a chain with default policy ACCEPT.
As above - it's a 'just in case'.

Dan.

Daniel P. Berrange wrote:
Chain INPUT (policy ACCEPT 76724 packets, 366M bytes)
 pkts bytes target     prot opt in     out    source             destination
    0     0 ACCEPT     udp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          udp dpt:53
    0     0 ACCEPT     tcp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:53
    0     0 ACCEPT     udp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          udp dpt:67
    0     0 ACCEPT     tcp  --  vnet2  *      0.0.0.0/0          0.0.0.0/0          tcp dpt:67

So we have ACCEPT rules on a chain whose default policy is ACCEPT? Is there a later catch-all REJECT rule which I'm not seeing?
Basically assume the policy of the chain could be anything. I just happened to have it as ACCEPT, but the user may well have other rules added by the OS tools (eg system-config-securitylevel) which would otherwise block our traffic. So in coming up with the rules I tried to be as explicit as possible about what to ACCEPT/REJECT.
Understood. The rules seem fine in that case.

Rich.

--
Emerging Technologies, Red Hat  http://et.redhat.com/~rjones/
64 Baker Street, London, W1U 7DF     Mobile: +44 7866 314 421

Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod
Street, Windsor, Berkshire, SL4 1TE, United Kingdom.
Registered in England and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (USA), Charlie Peters (USA) and David
Owens (Ireland)
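
The "assume the policy of the chain could be anything" reasoning above can be illustrated with a small model of first-match rule evaluation. This is only a sketch, not libvirt code: the `Rule`, `Packet` and `verdict` names are made up for the illustration, and the model only looks at the input interface, protocol and destination port.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Packet:
    in_iface: str
    proto: str
    dport: int


@dataclass(frozen=True)
class Rule:
    target: str                      # "ACCEPT", "REJECT" or "DROP"
    in_iface: Optional[str] = None   # None matches any interface ("*")
    proto: Optional[str] = None      # None matches any protocol ("all")
    dport: Optional[int] = None      # None matches any destination port

    def matches(self, pkt: Packet) -> bool:
        return ((self.in_iface is None or self.in_iface == pkt.in_iface)
                and (self.proto is None or self.proto == pkt.proto)
                and (self.dport is None or self.dport == pkt.dport))


def verdict(rules: List[Rule], policy: str, pkt: Packet) -> str:
    """First matching rule wins; the chain policy applies only if none match."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.target
    return policy


# The four INPUT rules from the listing above (DNS and DHCP on vnet2).
LIBVIRT_INPUT = [Rule("ACCEPT", "vnet2", proto, port)
                 for port in (53, 67) for proto in ("udp", "tcp")]

dns_query = Packet("vnet2", "udp", 53)

if __name__ == "__main__":
    for policy in ("ACCEPT", "DROP"):
        print(policy, "->", verdict(LIBVIRT_INPUT, policy, dns_query))
```

With the explicit rules at the head of the chain, the guest's DNS query is accepted whether the policy is ACCEPT or DROP, and also when some other tool has appended a catch-all REJECT after them.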

On Thu, 2007-04-05 at 11:55 +0100, Daniel P. Berrange wrote:
On Thu, Apr 05, 2007 at 11:38:42AM +0100, Richard W.M. Jones wrote:
Daniel P. Berrange wrote: [...]
Scenario 2: Virtual network
===========================
net.bridge.bridge-nf-call-iptables = 1
As far as I could tell, this case is exactly the same as scenario 1, except PHYSIN is available.
Yep, that is correct. The net.bridge.bridge-nf-call-iptables setting has a much more significant impact on scenario 4 with shared physical NICs, because with bridging to the physical NIC you'd ordinarily not hit iptables at all in many cases.
What's happening is that even though we're bridging here, we don't see iptables being invoked as packets traverse the bridge, because the packet isn't actually traversing the bridge. In that packet flow diagram, we go into the link layer and hit NAT PREROUTING, but then the bridging decision sends us up to the routing decision at the network layer before we can hit the FORWARD filter at the link layer.

If, instead of assigning an IP address to the bridge, we connected a loopback device to the bridge and assigned the IP address to that, then we would hit the link layer FORWARD filter even for the Guest->Host case.

Cheers,
Mark.
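
A rough way to picture the bridging decision described above, assuming net.bridge.bridge-nf-call-iptables = 1. This is a deliberately simplified sketch: it ignores broadcast/multicast frames and ebtables, and the MAC addresses and the host address 192.168.200.1 are hypothetical values chosen for the illustration, not taken from the thread.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical addresses, for illustration only.
BRIDGE_MAC = "fe:ff:ff:ff:ff:ff"
GUEST_B_MAC = "00:16:3e:00:00:02"
HOST_IPS = frozenset({"192.168.200.1"})


def chains_traversed(dst_mac: str, dst_ip: str,
                     bridge_mac: str = BRIDGE_MAC,
                     host_ips: FrozenSet[str] = HOST_IPS) -> Tuple[str, List[str]]:
    """Return (path, chains) for a frame sent by a guest on the bridge.

    A frame addressed to the bridge's own MAC is handed up to the network
    layer, so it never reaches the link-layer FORWARD hook. Anything else
    is bridged and does reach it.
    """
    if dst_mac != bridge_mac:
        # Stays at the link layer: guest <-> guest traffic.
        return "bridged", ["NAT-PREROUTING", "FORWARD", "NAT-POSTROUTING"]
    if dst_ip in host_ips:
        # Passed up and delivered locally: guest -> host.
        return "routed", ["NAT-PREROUTING", "INPUT"]
    # Passed up and forwarded by the IP stack: guest -> outside world.
    return "routed", ["NAT-PREROUTING", "FORWARD", "NAT-POSTROUTING"]


if __name__ == "__main__":
    print(chains_traversed(BRIDGE_MAC, "192.168.200.1"))     # guest -> host
    print(chains_traversed(GUEST_B_MAC, "192.168.200.242"))  # guest -> guest
    print(chains_traversed(BRIDGE_MAC, "64.233.167.99"))     # guest -> world
```

In this model, moving the host's IP address onto a separate device attached to the bridge corresponds to the destination MAC no longer being the bridge's own, which puts Guest->Host traffic on the bridged path as well.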

On Thu, Apr 05, 2007 at 11:55:30AM +0100, Daniel P. Berrange wrote:
On Thu, Apr 05, 2007 at 11:38:42AM +0100, Richard W.M. Jones wrote:
Type 1: Isolated virtual network
--------------------------------

Chain POSTROUTING (policy ACCEPT 273 packets, 26341 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain FORWARD (policy ACCEPT 29 packets, 2244 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 REJECT     all  --  *      vnet2   0.0.0.0/0            0.0.0.0/0           reject-with icmp-port-unreachable
    0     0 REJECT     all  --  vnet2  *       0.0.0.0/0            0.0.0.0/0           reject-with icmp-port-unreachable
So the thinking here is that FORWARD will only apply to packets from DomU to the internet. Since this is an isolated network, all packets trying to go out should be rejected. I'm a bit confused as to what "vnet2" is here. It seems that any traffic to/from virbr0 should be rejected.
I should have explained that vnet2, vnet3 & virbr0 are all just the bridge devices associated with each virtual network. I actually had all 3 virtual networks running at once, which is why each example uses a different NIC.
The rules above seem like they might match the DomU <-> DomU case (wouldn't these go through the FORWARD chain also?) If DomUs should be allowed to talk to each other (and that in itself is a policy decision) then perhaps adding a rule to allow when in = virbr0 & out = virbr0?
Hmm, that's a good question. I didn't test the DomU<->DomU case. I'll check up on that shortly & let you know about that. I suspect you are correct that this would accidentally block DomU<->DomU comms.
In scenario 1 we have net.bridge.bridge-nf-call-iptables = 0, so the DomU <-> DomU traffic is handled entirely at the network bridging layer and iptables never gets involved at all. So we don't need any extra rules in this case to allow DomU <-> DomU.
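
Since the need for an extra rule depends on this sysctl, a quick way to check which scenario a host is in is to read it from /proc. A minimal sketch, assuming the usual /proc/sys mapping of the sysctl name; the function name is made up, and the file is only present when the kernel's bridge netfilter support is loaded, hence the None case.

```python
from pathlib import Path
from typing import Optional

# /proc/sys path corresponding to net.bridge.bridge-nf-call-iptables
SYSCTL_PATH = Path("/proc/sys/net/bridge/bridge-nf-call-iptables")


def bridge_nf_call_iptables(path: Path = SYSCTL_PATH) -> Optional[bool]:
    """Return True/False for the sysctl value, or None if it is unavailable."""
    try:
        return path.read_text().strip() == "1"
    except OSError:
        # No such file (bridge netfilter support not loaded) or not readable.
        return None


if __name__ == "__main__":
    state = bridge_nf_call_iptables()
    if state is None:
        print("bridge-nf-call-iptables not available on this host")
    elif state:
        print("bridged traffic is passed through iptables (scenario 2)")
    else:
        print("bridged traffic bypasses iptables (scenario 1)")
```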
Chain FORWARD (policy ACCEPT 29 packets, 2244 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 ACCEPT     all  --  eth1   vnet3   0.0.0.0/0            192.168.200.0/24    state RELATED,ESTABLISHED
    0     0 ACCEPT     all  --  vnet3  eth1    192.168.200.0/24     0.0.0.0/0
    0     0 REJECT     all  --  *      vnet3   0.0.0.0/0            0.0.0.0/0           reject-with icmp-port-unreachable
    0     0 REJECT     all  --  vnet3  *       0.0.0.0/0            0.0.0.0/0           reject-with icmp-port-unreachable
Seems OK, except for the DomU <-> DomU case as above.
Will investigate this too.
In this scenario 2, we have net.bridge.bridge-nf-call-iptables = 1, so even though the DomU <-> DomU traffic is being bridged, it still enters the iptables PREROUTING, FORWARD & POSTROUTING chains. So, yes, we do need an extra rule to allow DomU <-> DomU traffic here, matching in=vnet3 & out=vnet3. The trace looks like:

 Out:
    NAT-PREROUTING   IN=vnet3 OUT=      PHYSIN=vif7.0              SRC=192.168.200.204 DST=192.168.200.242
    FORWARD          IN=vnet3 OUT=vnet3 PHYSIN=vif7.0 PHYSOUT=tap0 SRC=192.168.200.204 DST=192.168.200.242
    NAT-POSTROUTING  IN=      OUT=vnet3 PHYSIN=vif7.0 PHYSOUT=tap0 SRC=192.168.200.204 DST=192.168.200.242

 Back:
    FORWARD          IN=vnet3 OUT=vnet3 PHYSIN=tap0 PHYSOUT=vif7.0 SRC=192.168.200.242 DST=192.168.200.204

I'm addressing this by adding an extra rule:

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
   56 10899 ACCEPT     all  --  vnet3  vnet3   0.0.0.0/0            0.0.0.0/0

So with all this I've now tested:

 - Isolated network
     - Allowed SSH host -> guest
     - Allowed SSH guest -> host
     - Allowed SSH guest -> guest
     - Denied SSH guest -> world

 - Forwarding to specific NIC
     - Allowed SSH host -> guest
     - Allowed SSH guest -> host
     - Allowed SSH guest -> guest
     - Allowed SSH guest -> host on specific NIC
     - Denied SSH guest -> host on other NIC

 - Forwarding to world
     - Allowed SSH host -> guest
     - Allowed SSH guest -> host
     - Allowed SSH guest -> guest
     - Allowed SSH guest -> world regardless of active NIC

Attaching the latest patch with this.

Regards,
Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
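
The effect of the extra vnet3 -> vnet3 rule can be checked against a toy model of the FORWARD chain. This is a sketch under stated assumptions rather than the actual patch: it matches on interfaces only (the source/destination subnets and the RELATED,ESTABLISHED return rule are left out), the `Rule` and `forward_verdict` names are invented for the example, and it assumes the new ACCEPT rule sits ahead of the two REJECT rules, which first-match evaluation requires for it to have any effect.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    target: str
    in_iface: Optional[str] = None    # None plays the role of "*"
    out_iface: Optional[str] = None

    def matches(self, in_iface: str, out_iface: str) -> bool:
        return ((self.in_iface is None or self.in_iface == in_iface)
                and (self.out_iface is None or self.out_iface == out_iface))


def forward_verdict(rules: List[Rule], in_iface: str, out_iface: str,
                    policy: str = "ACCEPT") -> str:
    """First matching rule wins; otherwise the chain policy applies."""
    for rule in rules:
        if rule.matches(in_iface, out_iface):
            return rule.target
    return policy


# "Forwarding to specific NIC" rules for vnet3, interface matches only.
ORIGINAL = [
    Rule("ACCEPT", "vnet3", "eth1"),
    Rule("REJECT", None, "vnet3"),
    Rule("REJECT", "vnet3", None),
]

# Assumption: the guest <-> guest rule is placed before the REJECT rules.
FIXED = [Rule("ACCEPT", "vnet3", "vnet3")] + ORIGINAL

if __name__ == "__main__":
    for name, rules in (("original", ORIGINAL), ("fixed", FIXED)):
        print(name, "guest -> guest:", forward_verdict(rules, "vnet3", "vnet3"))
```

Guest -> guest traffic is rejected by the original rule set and accepted once the extra rule is in place, while traffic from vnet3 to any NIC other than eth1 is still rejected.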
participants (4)
 - Daniel P. Berrange
 - Daniel Veillard
 - Mark McLoughlin
 - Richard W.M. Jones