---
Daniel, how's this as a first go?
It's just your content converted to HTML in the same style as the other
pages, with the occasional typo fix. If this is workable, then I'll create
another patch after this to hook it into the overall website menu structure.
docs/firewall.html.in | 477 +++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 477 insertions(+), 0 deletions(-)
create mode 100644 docs/firewall.html.in
diff --git a/docs/firewall.html.in b/docs/firewall.html.in
new file mode 100644
index 0000000..a6dbec2
--- /dev/null
+++ b/docs/firewall.html.in
@@ -0,0 +1,477 @@
+<?xml version="1.0"?>
+<html>
+ <body>
+ <h1 >Firewall / network filtering in libvirt</h1>
+ <p>There are three pieces of libvirt functionality which do network
+ filtering of some type.
+ <br /><br />
+ At a high level they are:
+ </p>
+ <ul>
+ <li>The virtual network driver
+ <br /><br />
+ This provides an isolated bridge device (i.e. no physical NICs
+ enslaved). Guest TAP devices are attached to this bridge.
+ Guests can talk to each other and the host, and optionally the
+ wider world.
+ <br /><br />
+ </li>
+ <li>The QEMU driver MAC filtering
+ <br /><br />
+ This provides a generic filtering of MAC addresses to prevent
+ the guest spoofing its MAC address. This is mostly obsoleted by
+ the next item, so won't be discussed further.
+ <br /><br />
+ </li>
+ <li>The network filter driver
+ <br /><br />
+ This provides fully configurable, arbitrary network filtering
+ of traffic on guest NICs. Generic rulesets are defined at the
+ host level to control traffic in some manner. Rule sets are
+ then associated with individual NICs of a guest. While not as
+ expressive as directly using iptables/ebtables, this can still
+ do nearly everything you would want to on a guest NIC filter.
+ </li>
+ </ul>
+
+ <h3><a name="name-fw-virtual-network-driver"
+ id="id-fw-virtual-network-driver">The virtual network driver</a>
+ </h3>
+ <p>The typical configuration for guests is to use bridging of the
+ physical NIC on the host to connect the guest directly to the LAN.
+ In RHEL6 there is also the possibility of using macvtap/sr-iov
+ and VEPA connectivity. None of this stuff plays nicely with wireless
+ NICs, since they will typically silently drop any traffic with a
+ MAC address that doesn't match that of the physical NIC.
+ </p>
+ <p>Thus the virtual network driver in libvirt was invented. This takes
+ the form of an isolated bridge device (i.e. one with no physical NICs
+ enslaved). The TAP devices associated with the guest NICs are attached
+ to the bridge device. This immediately allows guests on a single host
+ to talk to each other and to the host OS (modulo host iptables rules).
+ </p>
+ <p>libvirt then uses iptables to control what further connectivity is
+ available. There are three configurations possible for a virtual
+ network at time of writing:
+ </p>
+ <ul>
+ <li>isolated: all off-node traffic is completely blocked</li>
+ <li>nat: outbound traffic to the LAN is allowed, but MASQUERADED</li>
+ <li>forward: outbound traffic to the LAN is allowed</li>
+ </ul>
+ <p>The latter 'forward' case requires the virtual network be on a
+ separate sub-net from the main LAN, and that the LAN admin has
+ configured routing for this subnet. In the future we intend to
+ add support for IP subnetting and/or proxy-arp. This allows for
+ the virtual network to use the same subnet as the main LAN and
+ should avoid need for the LAN admin to configure special routing.
+ </p>
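+ <p>As a rough sketch of how these modes are selected, a NAT network
+ might be defined with XML along the following lines. The network name
+ is hypothetical, and the bridge and subnet simply reuse the values
+ from the examples on this page. In the network XML the modes are
+ spelled <code>nat</code> and <code>route</code>; leaving out the
+ <code>&lt;forward&gt;</code> element entirely gives an isolated network.
+ </p>
+ <pre>
+&lt;network&gt;
+ &lt;name&gt;example-nat&lt;/name&gt;
+ &lt;bridge name='virbr0'/&gt;
+ &lt;forward mode='nat'/&gt;
+ &lt;ip address='192.168.122.1' netmask='255.255.255.0'/&gt;
+&lt;/network&gt;</pre>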
+ <p>libvirt will optionally also provide DHCP services to the virtual
+ network using dnsmasq. In all cases, we need to allow DNS/DHCP
+ queries to the host OS. Since we can't predict whether the host
+ firewall setup is already allowing this, we insert 4 rules into
+ the head of the INPUT chain:
+ </p>
+ <pre>
+target prot opt in out source destination
+ACCEPT udp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:53
+ACCEPT tcp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:53
+ACCEPT udp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 udp dpt:67
+ACCEPT tcp -- virbr0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:67</pre>
+ <p>Note we have restricted our rules to just the bridge associated
+ with the virtual network, to avoid opening undesirable holes in
+ the host firewall wrt the LAN/WAN.
+ </p>
+ <p>The next rules depend on the type of connectivity allowed, and go
+ in the main FORWARD chain:
+ </p>
+ <ul>
+ <li>type=isolated
+ <br /><br />
+Allow traffic between guests. Deny inbound. Deny outbound.
+ <pre>
+target prot opt in out source destination
+ACCEPT all -- virbr1 virbr1 0.0.0.0/0 0.0.0.0/0
+REJECT all -- * virbr1 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable
+REJECT all -- virbr1 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable</pre>
+ </li>
+ <li>type=nat
+ <br /><br />
+Allow inbound related to an established connection. Allow
+outbound, but only from our expected subnet. Allow traffic
+between guests. Deny all other inbound. Deny all other outbound.
+ <pre>
+target prot opt in out source destination
+ACCEPT all -- * virbr0 0.0.0.0/0 192.168.122.0/24 state RELATED,ESTABLISHED
+ACCEPT all -- virbr0 * 192.168.122.0/24 0.0.0.0/0
+ACCEPT all -- virbr0 virbr0 0.0.0.0/0 0.0.0.0/0
+REJECT all -- * virbr0 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable
+REJECT all -- virbr0 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable</pre>
+ </li>
+ <li>type=routed
+ <br /><br />
+Allow inbound, but only to our expected subnet. Allow
+outbound, but only from our expected subnet. Allow traffic
+between guests. Deny all other inbound. Deny all other outbound.
+ <pre>
+target prot opt in out source destination
+ACCEPT all -- * virbr2 0.0.0.0/0 192.168.124.0/24
+ACCEPT all -- virbr2 * 192.168.124.0/24 0.0.0.0/0
+ACCEPT all -- virbr2 virbr2 0.0.0.0/0 0.0.0.0/0
+REJECT all -- * virbr2 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable
+REJECT all -- virbr2 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable</pre>
+ </li>
+ <li>Finally, with type=nat, there is also an entry in the POSTROUTING
+chain to apply masquerading:
+ <pre>
+target prot opt in out source destination
+MASQUERADE all -- * * 192.168.122.0/24 !192.168.122.0/24</pre>
+ </li>
+ </ul>
+
+ <h3><a name="name-fw-network-filter-driver"
+ id="id-fw-network-filter-driver">The network filter driver</a>
+ </h3>
+ <p>This driver provides a fully configurable network filtering capability
+ that leverages ebtables, iptables and ip6tables. This was written by
+ the libvirt guys at IBM and although its XML schema is defined by libvirt,
+ the conceptual model is closely aligned with the DMTF CIM schema for
+ network filtering:
+ </p>
+ <p><a
href="http://www.dmtf.org/standards/cim/cim_schema_v2230/CIM_Network...
+ <p>The filters are managed in libvirt as a top level, standalone object.
+ This allows the filters to then be referenced by any libvirt object
+ that requires their functionality, instead of tying them only to use
+ by guest NICs. In the current implementation, filters can be associated
+ with individual guest NICs via the libvirt domain XML format. In the
+ future we might allow filters to be associated with the virtual network
+ objects. Further, we're expecting to define a new 'virtual switch' object
+ to remove the complexity of configuring bridge/sriov/vepa networking
+ modes. This may also end up making use of network filters.
+ </p>
+ <p>There is a new set of virsh commands for managing network filters:</p>
+ <ul>
+ <li>virsh nwfilter-define
+ <br /><br />
+ define or update a network filter from an XML file
+ <br /><br />
+ </li>
+ <li>virsh nwfilter-undefine
+ <br /><br />
+ undefine a network filter
+ <br /><br />
+ </li>
+ <li>virsh nwfilter-dumpxml
+ <br /><br />
+ network filter information in XML
+ <br /><br />
+ </li>
+ <li>virsh nwfilter-list
+ <br /><br />
+ list network filters
+ <br /><br />
+ </li>
+ <li>virsh nwfilter-edit
+ <br /><br />
+ edit XML configuration for a network filter
+ </li>
+ </ul>
+ <p>There are equivalently named C APIs for each of these commands.</p>
+ <p>As with all objects libvirt manages, network filters are configured
+using an XML format. At a high level the format looks like this:
+ </p>
+<pre>
+&lt;filter name='no-spamming' chain='XXXX'&gt;
+ &lt;uuid&gt;d217f2d7-5a04-0e01-8b98-ec2743436b74&lt;/uuid&gt;
+
+ &lt;rule ...&gt;
+ ....
+ &lt;/rule&gt;
+
+ &lt;filterref filter='XXXX'/&gt;
+&lt;/filter&gt;</pre>
+ <p>Every filter has a name and UUID which serve as unique identifiers.
+ A filter can have zero-or-more <code>&lt;rule&gt;</code> elements which
+ are used to actually define network controls. Filters can be arranged
+ into a DAG, so zero-or-more <code>&lt;filterref/&gt;</code> elements are
+ also allowed. Cycles in the graph are not allowed.
+ </p>
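+ <p>For instance, a filter built purely from references to other filters
+ might look like the following sketch. The name 'example-composite' is
+ made up for illustration and the referenced names are taken from the
+ default filter list shown later on this page; this is not the actual
+ definition of any shipped filter.
+ </p>
+ <pre>
+&lt;filter name='example-composite' chain='root'&gt;
+ &lt;filterref filter='no-mac-spoofing'/&gt;
+ &lt;filterref filter='no-ip-spoofing'/&gt;
+ &lt;filterref filter='no-arp-spoofing'/&gt;
+&lt;/filter&gt;</pre>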
+ <p>The <code>&lt;rule&gt;</code> element is where all the interesting stuff
+ happens. It has three attributes: an action, a traffic direction and an
+ optional priority, e.g.:
+ </p>
+ <pre>&lt;rule action='drop' direction='out' priority='500'&gt;</pre>
+ <p>Within the rule there are a wide variety of elements allowed, which
+ do protocol specific matching. Supported protocols currently include
+ <code>mac</code>, <code>arp</code>, <code>rarp</code>, <code>ip</code>,
+ <code>ipv6</code>, <code>tcp/ip</code>, <code>icmp/ip</code>,
+ <code>igmp/ip</code>, <code>udp/ip</code>, <code>udplite/ip</code>,
+ <code>esp/ip</code>, <code>ah/ip</code>, <code>sctp/ip</code>,
+ <code>tcp/ipv6</code>, <code>icmp/ipv6</code>, <code>igmp/ipv6</code>,
+ <code>udp/ipv6</code>, <code>udplite/ipv6</code>, <code>esp/ipv6</code>,
+ <code>ah/ipv6</code>, <code>sctp/ipv6</code>. Each protocol defines what
+ is valid inside the <code>&lt;rule&gt;</code> element. The general pattern though is:
+ </p>
+ <pre>
+&lt;protocol match='yes|no' attribute1='value1' attribute2='value2'/&gt;</pre>
+ <p>So, e.g., a TCP match on source ports 0-1023 would be expressed as:</p>
+ <pre>&lt;tcp match='yes' srcportstart='0' srcportend='1023'/&gt;</pre>
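+ <p>Putting the two pieces together, the protocol element nests inside
+ the <code>&lt;rule&gt;</code> element. A minimal sketch, simply combining
+ the attribute values from the two examples above:
+ </p>
+ <pre>
+&lt;rule action='drop' direction='out' priority='500'&gt;
+ &lt;tcp match='yes' srcportstart='0' srcportend='1023'/&gt;
+&lt;/rule&gt;</pre>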
+ <p>Attributes can include references to variables defined by the
+ object using the rule. So the guest XML format allows each NIC
+ to have a MAC address and IP address defined. These are made
+ available to filters via the variables <code><b>$IP</b></code> and
+ <code><b>$MAC</b></code>.
+ </p>
+ <p>So to define a filter that prevents IP address spoofing we can
+ simply match on source IP address <code>!= $IP</code> like this:
+ </p>
+ <pre>
+&lt;filter name='no-ip-spoofing' chain='ipv4'&gt;
+ &lt;rule action='drop' direction='out'&gt;
+ &lt;ip match='no' srcipaddr='<b>$IP</b>' /&gt;
+ &lt;/rule&gt;
+&lt;/filter&gt;</pre>
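+ <p>The same pattern should carry over to MAC addresses. The following
+ is only a sketch: the filter name is hypothetical, and it assumes the
+ <code>mac</code> protocol element accepts a <code>srcmacaddr</code>
+ attribute as described in the network filter format documentation.
+ It is not necessarily how the shipped 'no-mac-spoofing' filter is written.
+ </p>
+ <pre>
+&lt;filter name='example-no-mac-spoofing' chain='root'&gt;
+ &lt;rule action='drop' direction='out'&gt;
+ &lt;mac match='no' srcmacaddr='<b>$MAC</b>' /&gt;
+ &lt;/rule&gt;
+&lt;/filter&gt;</pre>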
+ <p>I'm not going to go into details on all the other protocol
+ matches you can do, because it'll take far too much space.
+ You can read about the options
+ <a
href="http://libvirt.org/formatnwfilter.html#nwfelemsRulesProto"...;.
+ </p>
+ <p>Out of the box in RHEL6/Fedora rawhide, libvirt ships with a
+ set of default useful rules:
+ </p>
+ <pre>
+# virsh nwfilter-list
+UUID Name
+----------------------------------------------------------------
+15b1ab2b-b1ac-1be2-ed49-2042caba4abb allow-arp
+6c51a466-8d14-6d11-46b0-68b1a883d00f allow-dhcp
+7517ad6c-bd90-37c8-26c9-4eabcb69848d allow-dhcp-server
+3d38b406-7cf0-8335-f5ff-4b9add35f288 allow-incoming-ipv4
+5ff06320-9228-2899-3db0-e32554933415 allow-ipv4
+db0b1767-d62b-269b-ea96-0cc8b451144e clean-traffic
+f88f1932-debf-4aa1-9fbe-f10d3aa4bc95 no-arp-spoofing
+772f112d-52e4-700c-0250-e178a3d91a7a no-ip-multicast
+7ee20370-8106-765d-f7ff-8a60d5aaf30b no-ip-spoofing
+d5d3c490-c2eb-68b1-24fc-3ee362fc8af3 no-mac-broadcast
+fb57c546-76dc-a372-513f-e8179011b48a no-mac-spoofing
+dba10ea7-446d-76de-346f-335bd99c1d05 no-other-l2-traffic
+f5c78134-9da4-0c60-a9f0-fb37bc21ac1f no-other-rarp-traffic
+7637e405-4ccf-42ac-5b41-14f8d03d8cf3 qemu-announce-self
+9aed52e7-f0f3-343e-fe5c-7dcb27b594e5 qemu-announce-self-rarp</pre>
+ <p>Most of these are just building blocks. The interesting one here
+ is 'clean-traffic'. This pulls together all the building blocks
+ into one filter that you can then associate with a guest NIC.
+ This stops the most common bad things a guest might try: IP
+ spoofing, ARP spoofing and MAC spoofing. To look at the rules for
+ any of these just do:
+ </p>
+ <pre>virsh nwfilter-dumpxml FILTERNAME|UUID</pre>
+ <p>They are all stored in <code>/etc/libvirt/nwfilter</code>, but don't
+ edit the files there directly. Use <code>virsh nwfilter-define</code>
+ to update them. This ensures the guests have their iptables/ebtables
+ rules recreated.
+ </p>
+ <p>To associate the clean-traffic filter with a guest, edit the
+ guest XML config and change the <code>&lt;interface&gt;</code> element
+ to include a <code>&lt;filterref&gt;</code> and also specify the
+ whitelisted <code>&lt;ip address/&gt;</code> the guest is allowed to
+ use:
+ </p>
+ <pre>
+&lt;interface type='bridge'&gt;
+ &lt;mac address='52:54:00:56:44:32'/&gt;
+ &lt;source bridge='br1'/&gt;
+ &lt;ip address='10.33.8.131'/&gt;
+ &lt;target dev='vnet0'/&gt;
+ &lt;model type='virtio'/&gt;
+ &lt;filterref filter='clean-traffic'/&gt;
+&lt;/interface&gt;</pre>
+ <p>If no <code>&lt;ip address&gt;</code> is included, the network filter
+ driver will activate its 'learning mode'. This uses libpcap to snoop on
+ network traffic the guest sends and attempts to identify the
+ first IP address it uses. It then locks traffic to this address.
+ Obviously this isn't entirely secure, but it does offer some
+ protection against the guest being trojaned once up and running.
+ In the future we intend to enhance the learning mode so that it
+ looks for DHCPOFFERS from a trusted DHCP server and only allows
+ the offered IP address to be used.
+ </p>
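+ <p>The network filter format documentation also describes passing
+ variables as parameters of the <code>&lt;filterref&gt;</code> element
+ itself. Assuming a libvirt version that supports that form, the IP
+ address could be pinned roughly as follows (a sketch reusing the
+ example values from above, not a tested configuration):
+ </p>
+ <pre>
+&lt;interface type='bridge'&gt;
+ &lt;mac address='52:54:00:56:44:32'/&gt;
+ &lt;source bridge='br1'/&gt;
+ &lt;filterref filter='clean-traffic'&gt;
+  &lt;parameter name='IP' value='10.33.8.131'/&gt;
+ &lt;/filterref&gt;
+&lt;/interface&gt;</pre>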
+ <p>Now, how is all this implemented...?</p>
+ <p>The network filter driver uses a combination of ebtables, iptables and
+ ip6tables, depending on which protocols are referenced in a filter. The
+ out of the box 'clean-traffic' filter rules only require use of
+ ebtables. If you want to do matching at tcp/udp/etc protocols (eg to add
+ a new filter 'no-email-spamming' to block port 25), then iptables will
+ also be used.
+ </p>
+ <p>The driver attempts to keep its rules separate from those that
+ the host admin might already have configured. So the first thing
+ it does with ebtables, is to add two hooks in POSTROUTING and
+ PREROUTING chains, to redirect traffic to custom chains. These
+ hooks match on the TAP device name of the guest NIC, so they
+ should not interact badly with any administrator defined rules:
+ </p>
+ <pre>
+Bridge chain: PREROUTING, entries: 1, policy: ACCEPT
+-i vnet0 -j libvirt-I-vnet0
+
+Bridge chain: POSTROUTING, entries: 1, policy: ACCEPT
+-o vnet0 -j libvirt-O-vnet0</pre>
+ <p>To keep things manageable and easy to follow, the driver will then
+ create further sub-chains for each protocol that it needs to match
+ against:
+ </p>
+ <pre>
+Bridge chain: libvirt-I-vnet0, entries: 5, policy: ACCEPT
+-p IPv4 -j I-vnet0-ipv4
+-p ARP -j I-vnet0-arp
+-p 0x8035 -j I-vnet0-rarp
+-p 0x835 -j ACCEPT
+-j DROP
+
+Bridge chain: libvirt-O-vnet0, entries: 4, policy: ACCEPT
+-p IPv4 -j O-vnet0-ipv4
+-p ARP -j O-vnet0-arp
+-p 0x8035 -j O-vnet0-rarp
+-j DROP</pre>
+ <p>Finally, here comes the actual implementation of the filters. This
+ example shows the 'clean-traffic' filter implementation.
+ I'm not going to explain what this is doing now. :-)
+ </p>
+ <pre>
+Bridge chain: I-vnet0-ipv4, entries: 2, policy: ACCEPT
+-s ! 52:54:0:56:44:32 -j DROP
+-p IPv4 --ip-src ! 10.33.8.131 -j DROP
+
+Bridge chain: O-vnet0-ipv4, entries: 1, policy: ACCEPT
+-j ACCEPT
+
+Bridge chain: I-vnet0-arp, entries: 6, policy: ACCEPT
+-s ! 52:54:0:56:44:32 -j DROP
+-p ARP --arp-mac-src ! 52:54:0:56:44:32 -j DROP
+-p ARP --arp-ip-src ! 10.33.8.131 -j DROP
+-p ARP --arp-op Request -j ACCEPT
+-p ARP --arp-op Reply -j ACCEPT
+-j DROP
+
+Bridge chain: O-vnet0-arp, entries: 5, policy: ACCEPT
+-p ARP --arp-op Reply --arp-mac-dst ! 52:54:0:56:44:32 -j DROP
+-p ARP --arp-ip-dst ! 10.33.8.131 -j DROP
+-p ARP --arp-op Request -j ACCEPT
+-p ARP --arp-op Reply -j ACCEPT
+-j DROP
+
+Bridge chain: I-vnet0-rarp, entries: 2, policy: ACCEPT
+-p 0x8035 -s 52:54:0:56:44:32 -d Broadcast --arp-op Request_Reverse --arp-ip-src 0.0.0.0 --arp-ip-dst 0.0.0.0 --arp-mac-src 52:54:0:56:44:32 --arp-mac-dst 52:54:0:56:44:32 -j ACCEPT
+-j DROP
+
+Bridge chain: O-vnet0-rarp, entries: 2, policy: ACCEPT
+-p 0x8035 -d Broadcast --arp-op Request_Reverse --arp-ip-src 0.0.0.0 --arp-ip-dst 0.0.0.0 --arp-mac-src 52:54:0:56:44:32 --arp-mac-dst 52:54:0:56:44:32 -j ACCEPT
+-j DROP</pre>
+ <p>NB, we would have liked to include the prefix 'libvirt-' in all
+ of our chain names, but unfortunately the kernel limits names
+ to a very short maximum length. So only the first two custom
+ chains can include that prefix. The others just include the
+ TAP device name + protocol name.
+ </p>
+ <p>If I define a new filter 'no-spamming' and then add this to the
+ 'clean-traffic' filter, I can illustrate how iptables usage works:
+ </p>
+ <pre>
+# cat &gt; /root/spamming.xml &lt;&lt;EOF
+&lt;filter name='no-spamming' chain='root'&gt;
+ &lt;uuid&gt;d217f2d7-5a04-0e01-8b98-ec2743436b74&lt;/uuid&gt;
+ &lt;rule action='drop' direction='out' priority='500'&gt;
+ &lt;tcp dstportstart='25' dstportend='25'/&gt;
+ &lt;/rule&gt;
+&lt;/filter&gt;
+EOF
+# virsh nwfilter-define /root/spamming.xml
+# virsh nwfilter-edit clean-traffic</pre>
+
+ <p>...add <code>&lt;filterref filter='no-spamming'/&gt;</code></p>
+ <p>All active guests immediately have their iptables/ebtables rules
+ rebuilt.
+ </p>
+ <p>The network filter driver deals with iptables in a very similar
+ way. First it separates out its rules from those the admin may
+ have defined, by adding a couple of hooks into the INPUT/FORWARD
+ chains:
+ </p>
+ <pre>
+Chain INPUT (policy ACCEPT 13M packets, 21G bytes)
+target prot opt in out source destination
+libvirt-host-in all -- * * 0.0.0.0/0 0.0.0.0/0
+
+Chain FORWARD (policy ACCEPT 5532K packets, 3010M bytes)
+target prot opt in out source destination
+libvirt-in all -- * * 0.0.0.0/0 0.0.0.0/0
+libvirt-out all -- * * 0.0.0.0/0 0.0.0.0/0
+libvirt-in-post all -- * * 0.0.0.0/0 0.0.0.0/0</pre>
+ <p>These custom chains then do matching based on the TAP device
+ name, so they won't open holes in the admin defined matches for
+ the LAN/WAN (if any).
+ </p>
+ <pre>
+Chain libvirt-host-in (1 references)
+ target prot opt in out source destination
+ HI-vnet0 all -- * * 0.0.0.0/0 0.0.0.0/0 [goto] PHYSDEV match --physdev-in vnet0
+
+Chain libvirt-in (1 references)
+ target prot opt in out source destination
+ FI-vnet0 all -- * * 0.0.0.0/0 0.0.0.0/0 [goto] PHYSDEV match --physdev-in vnet0
+
+Chain libvirt-in-post (1 references)
+ target prot opt in out source destination
+ ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 PHYSDEV match --physdev-in vnet0
+
+Chain libvirt-out (1 references)
+ target prot opt in out source destination
+ FO-vnet0 all -- * * 0.0.0.0/0 0.0.0.0/0 [goto] PHYSDEV match --physdev-out vnet0</pre>
+ <p>Finally, we can see the interesting bit which is the actual
+ implementation of my filter to block port 25 access:
+ </p>
+ <pre>
+Chain FI-vnet0 (1 references)
+ target prot opt in out source destination
+ DROP tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:25
+
+Chain FO-vnet0 (1 references)
+ target prot opt in out source destination
+ DROP tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp spt:25
+
+Chain HI-vnet0 (1 references)
+ target prot opt in out source destination
+ DROP tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:25</pre>
+ <p>One thing you may notice when looking at this is that if there
+ are many guests all using the same filters, we will be duplicating
+ the iptables rules over and over for each guest. This is merely a
+ limitation of the current rules engine implementation. At the libvirt
+ object modelling level you can clearly see we've designed the model
+ so filter rules are defined in one place, and indirectly referenced
+ by guests. Thus it should be possible to change the implementation in
+ the future so we can share the actual iptables/ebtables rules for
+ each guest to create a more scalable system. The stuff in current libvirt
+ is more or less the very first working implementation we've had of this,
+ so there's not been much optimization work done yet.
+ </p>
+ <p>Also notice that at the XML level we don't expose the fact we
+ are using iptables or ebtables at all. The rule definition is done in
+ terms of network protocols. Thus if we ever find a need, we could
+ plug in an alternative implementation that calls out to a different
+ firewall implementation instead of ebtables/iptables (providing that
+ implementation was suitably expressive, of course).
+ </p>
+ <p>Finally, in terms of problems we have in deployment, the biggest
+ problem is that if the admin does <code>service iptables restart</code>
+ all our work gets blown away. We've experimented with using lokkit
+ to record our custom rules in a persistent config file, but that
+ caused a different problem. Admins who were not using lokkit for
+ their config found that all their own rules got blown away. So
+ we threw away our lokkit code. Instead we document that if you
+ run <code>service iptables restart</code>, you need to send SIGHUP to
+ libvirt to make it recreate its rules.
+ </p>
+ <p>More in depth documentation on this is
+ <a href="http://libvirt.org/formatnwfilter.html">here</a>...
+ </p>
+ </body>
+</html>
--
1.7.0.1