[libvirt] [PATCH] network: bridge: Don't start network if it collides with host routing

Fedora bug https://bugzilla.redhat.com/show_bug.cgi?id=235961 If using the default virtual network, an easy way to lose guest network connectivity is to install libvirt inside the VM. The autostarted default network inside the guest collides with host virtual network routing. This is a long standing issue that has caused users quite a bit of pain and confusion. On network startup, parse /proc/net/route and compare the requested IP+netmask against host routing destinations: if any matches are found, refuse to start the network. Caveat: I'm not very networking savvy Signed-off-by: Cole Robinson <crobinso@redhat.com> --- src/network/bridge_driver.c | 92 +++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 92 insertions(+), 0 deletions(-) diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 5d7ef19..79da757 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -42,6 +42,8 @@ #include <stdio.h> #include <sys/wait.h> #include <sys/ioctl.h> +#include <netinet/in.h> +#include <arpa/inet.h> #include "virterror_internal.h" #include "datatypes.h" @@ -908,6 +910,92 @@ cleanup: return ret; } +#define PROC_NET_ROUTE "/proc/net/route" + +static int networkCheckRouteCollision(virNetworkObjPtr network) +{ + int ret = -1, len; + char *cur, *buf = NULL; + enum {MAX_ROUTE_SIZE = 1024*64}; + struct in_addr inaddress, innetmask; + char *netaddr = NULL; + + if (!network->def->ipAddress || !network->def->netmask) + return 0; + + if (inet_pton(AF_INET, network->def->ipAddress, &inaddress) <= 0) { + networkReportError(VIR_ERR_INTERNAL_ERROR, + _("cannot parse IP address '%s'"), + network->def->ipAddress); + goto error; + } + if (inet_pton(AF_INET, network->def->netmask, &innetmask) <= 0) { + networkReportError(VIR_ERR_INTERNAL_ERROR, + _("cannot parse netmask '%s'"), + network->def->netmask); + goto error; + } + + inaddress.s_addr &= innetmask.s_addr; + netaddr = strdup(inet_ntoa(inaddress)); + if (!netaddr) { + virReportOOMError(); + goto error; + } + + /* Read whole routing table into memory */ + if ((len = virFileReadAll(PROC_NET_ROUTE, MAX_ROUTE_SIZE, &buf)) < 0) + goto error; + + /* Dropping the last character shouldn't hurt */ + buf[len-1] = '\0'; + + /* First line is just headings, skip it */ + cur = strchr(buf, '\n'); + + while (cur) { + char iface[17]; + char dest_buf[128]; + char *dest_ip; + struct in_addr in; + int num; + unsigned int addr_val; + + cur++; + num = sscanf(cur, "%16s %127s", iface, dest_buf); + if (num != 2) { + networkReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to parse %s"), PROC_NET_ROUTE); + goto error; + } + + if (virStrToLong_ui(dest_buf, NULL, 16, &addr_val) < 0) { + networkReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed not convert network address %s"), + dest_buf); + goto error; + } + + in.s_addr = addr_val; + dest_ip = inet_ntoa(in); + + if (STREQ(netaddr, dest_ip)) { + networkReportError(VIR_ERR_INTERNAL_ERROR, + _("Network destination %s is already in use " + "by interface %s"), netaddr, iface); + goto error; + } + + cur = strchr(cur, '\n'); + } + + ret = 0; +error: + VIR_FREE(buf); + VIR_FREE(netaddr); + return ret; +} + static int networkStartNetworkDaemon(struct network_driver *driver, virNetworkObjPtr network) { @@ -919,6 +1007,10 @@ static int networkStartNetworkDaemon(struct network_driver *driver, return -1; } + /* Check to see if network collides with an existing route */ + if (networkCheckRouteCollision(network) < 0) + return -1; + if ((err = brAddBridge(driver->brctl, network->def->bridge))) { virReportSystemError(err, _("cannot create bridge '%s'"), -- 1.6.6.1

On 05/20/2010 07:41 PM, Cole Robinson wrote:
Fedora bug https://bugzilla.redhat.com/show_bug.cgi?id=235961
If using the default virtual network, an easy way to lose guest network connectivity is to install libvirt inside the VM. The autostarted default network inside the guest collides with host virtual network routing. This is a long standing issue that has caused users quite a bit of pain and confusion.
On network startup, parse /proc/net/route
Any interest in doing this with netlink instead? (I've got this "thing" against parsing text files to get information if it can be retrieved via a nice clean API). If so, I think I can whip up the equivalent code with libnl calls, but probably not until later in the afternoon. If I'm the only one who feels uneasy about parsing stuff out of /proc, then I'm fine with that too.
and compare the requested IP+netmask against host routing destinations: if any matches are found, refuse to start the network.
Do we maybe want to check for any networks that would encompass existing host interface IP addresses, or existing route destination IPs?

On Fri, May 21, 2010 at 01:23:32AM -0400, Laine Stump wrote:
On 05/20/2010 07:41 PM, Cole Robinson wrote:
Fedora bug https://bugzilla.redhat.com/show_bug.cgi?id=235961
If using the default virtual network, an easy way to lose guest network connectivity is to install libvirt inside the VM. The autostarted default network inside the guest collides with host virtual network routing. This is a long standing issue that has caused users quite a bit of pain and confusion.
Indeed; thanks for taking this on. Would it also be useful to have an API that simply sets the network and mask of the default network?
On network startup, parse /proc/net/route
Any interest in doing this with netlink instead? (I've got this "thing" against parsing text files to get information if it can be retrieved via a nice clean API). If so, I think I can whip up the equivalent code with libnl calls, but probably not until later in the afternoon.
I agree, using netlink would be a bit cleaner. Dave

On 05/21/2010 01:23 AM, Laine Stump wrote:
On 05/20/2010 07:41 PM, Cole Robinson wrote:
Fedora bug https://bugzilla.redhat.com/show_bug.cgi?id=235961
If using the default virtual network, an easy way to lose guest network connectivity is to install libvirt inside the VM. The autostarted default network inside the guest collides with host virtual network routing. This is a long standing issue that has caused users quite a bit of pain and confusion.
On network startup, parse /proc/net/route
Any interest in doing this with netlink instead? (I've got this "thing" against parsing text files to get information if it can be retrieved via a nice clean API). If so, I think I can whip up the equivalent code with libnl calls, but probably not until later in the afternoon.
If I'm the only one who feels uneasy about parsing stuff out of /proc, then I'm fine with that too.
My feeling is it depends on what else libnl buys us. A library dependency for a single safety check doesn't seem worth it. If we use the lib, we definitely don't want it to be optional, as debugging network issues in the future shouldn't require asking 'is libvirt compiled against libnl'. Also, this code is pretty simple, and I think /proc/net/route is as stable as it gets.
and compare the requested IP+netmask against host routing destinations: if any matches are found, refuse to start the network.
Do we maybe want to check for any networks that would encompass existing host interface IP addresses, or existing route destination IPs?
I'm not really knowledgeable enough to say, but we should try and detect any common case that may cause host networking malfunction. Thanks, Cole

On 05/21/2010 09:28 AM, Cole Robinson wrote:
Any interest in doing this with netlink instead? (I've got this "thing" against parsing text files to get information if it can be retrieved via a nice clean API). If so, I think I can whip up the equivalent code with libnl calls, but probably not until later in the afternoon.
If I'm the only one who feels uneasy about parsing stuff out of /proc, then I'm fine with that too.
My feeling is it depends on what else libnl buys us. A library dependency for a single safety check doesn't seem worth it. If we use the lib, we definitely don't want it to be optional, as debugging network issues in the future shouldn't require asking 'is libvirt compiled against libnl'.
Stefan just posted a patch series that would also add a dependency on libnl for vepa; as long as we are using the library, we might as well use it for more than one thing. -- Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org

On 05/21/2010 01:21 PM, Eric Blake wrote:
On 05/21/2010 09:28 AM, Cole Robinson wrote:
Any interest in doing this with netlink instead? (I've got this "thing" against parsing text files to get information if it can be retrieved via a nice clean API). If so, I think I can whip up the equivalent code with libnl calls, but probably not until later in the afternoon.
If I'm the only one who feels uneasy about parsing stuff out of /proc, then I'm fine with that too.
My feeling is it depends on what else libnl buys us. A library dependency for a single safety check doesn't seem worth it. If we use the lib, we definitely don't want it to be optional, as debugging network issues in the future shouldn't require asking 'is libvirt compiled against libnl'.
Stefan just posted a patch series that would also add a dependency on libnl for vepa; as long as we are using the library, we might as well use it for more than one thing.
Yes and no. I just learned a few days ago that RHEL5 has libnl-1.0 installed, and it is incompatible with libnl-1.1 (different API, can't be installed side by side). On the other hand, netcf is already dependent on libnl (and because of this will no longer build in EPEL5), so maybe that isn't an issue (ie maybe we'll need to figure out how to fix it anyway - NetworkManager fixes it by building a private copy of libnl-1.1 when building for RHEL5!) I still prefer using libnl to /proc, but don't want to rush into something that breaks something that can't stand being broken.

On Fri, May 21, 2010 at 11:21:34AM -0600, Eric Blake wrote:
On 05/21/2010 09:28 AM, Cole Robinson wrote:
Any interest in doing this with netlink instead? (I've got this "thing" against parsing text files to get information if it can be retrieved via a nice clean API). If so, I think I can whip up the equivalent code with libnl calls, but probably not until later in the afternoon.
If I'm the only one who feels uneasy about parsing stuff out of /proc, then I'm fine with that too.
My feeling is it depends on what else libnl buys us. A library dependency for a single safety check doesn't seem worth it. If we use the lib, we definitely don't want it to be optional, as debugging network issues in the future shouldn't require asking 'is libvirt compiled against libnl'.
Stefan just posted a patch series that would also add a dependency on libnl for vepa; as long as we are using the library, we might as well use it for more than one thing.
We already depend on it via netcf in any case. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

I'm stepping back, for now, on my suggestion of using libnl to do this. The proposed method works, and solves a serious problem. Using libnl would also work, but would create a new dependency (which can be a bit hairy with libnl - eg RHEL5 has libnl-1.0 which is API-incompatible with libnl-1.1, and can't be installed in parallel), and would take longer to get right. I still have a couple nits, but will ACK conditional on those. On 05/20/2010 07:41 PM, Cole Robinson wrote:
Fedora bug https://bugzilla.redhat.com/show_bug.cgi?id=235961
If using the default virtual network, an easy way to lose guest network connectivity is to install libvirt inside the VM. The autostarted default network inside the guest collides with host virtual network routing. This is a long standing issue that has caused users quite a bit of pain and confusion.
On network startup, parse /proc/net/route and compare the requested IP+netmask against host routing destinations: if any matches are found, refuse to start the network.
Caveat: I'm not very networking savvy
Signed-off-by: Cole Robinson<crobinso@redhat.com> --- src/network/bridge_driver.c | 92 +++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 92 insertions(+), 0 deletions(-)
diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 5d7ef19..79da757 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -42,6 +42,8 @@ #include<stdio.h> #include<sys/wait.h> #include<sys/ioctl.h> +#include<netinet/in.h> +#include<arpa/inet.h>
#include "virterror_internal.h" #include "datatypes.h" @@ -908,6 +910,92 @@ cleanup: return ret; }
+#define PROC_NET_ROUTE "/proc/net/route" + +static int networkCheckRouteCollision(virNetworkObjPtr network) +{ + int ret = -1, len; + char *cur, *buf = NULL; + enum {MAX_ROUTE_SIZE = 1024*64}; + struct in_addr inaddress, innetmask; + char *netaddr = NULL; + + if (!network->def->ipAddress || !network->def->netmask) + return 0; + + if (inet_pton(AF_INET, network->def->ipAddress,&inaddress)<= 0) { + networkReportError(VIR_ERR_INTERNAL_ERROR, + _("cannot parse IP address '%s'"), + network->def->ipAddress); + goto error; + } + if (inet_pton(AF_INET, network->def->netmask,&innetmask)<= 0) { + networkReportError(VIR_ERR_INTERNAL_ERROR, + _("cannot parse netmask '%s'"), + network->def->netmask); + goto error; + } + + inaddress.s_addr&= innetmask.s_addr; + netaddr = strdup(inet_ntoa(inaddress)); + if (!netaddr) { + virReportOOMError(); + goto error; + } + + /* Read whole routing table into memory */ + if ((len = virFileReadAll(PROC_NET_ROUTE, MAX_ROUTE_SIZE,&buf))< 0) + goto error; + + /* Dropping the last character shouldn't hurt */ + buf[len-1] = '\0'; + + /* First line is just headings, skip it */ + cur = strchr(buf, '\n'); + + while (cur) { + char iface[17]; + char dest_buf[128]; + char *dest_ip; + struct in_addr in; + int num; + unsigned int addr_val; + + cur++; + num = sscanf(cur, "%16s %127s", iface, dest_buf);
sscanf is still allowed, but there has been an effort to eliminate it. How about replacing it with something like this? (it has the added side effect of not chopping off interface names that are > 16 characters long) ... char *dest_raw; char *iface; cur++; iface = dest_raw = cur; while (*dest_raw > ' ') { /* skip until we hit a space or control character (probably a tab) */ dest_raw++l } *dest_raw++ = 0; /* null terminate interface name */ ... if (virStrToLong_ui(dest_raw, NULL, 16, &addr_val) < 0) { ...
+ if (num != 2) { + networkReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to parse %s"), PROC_NET_ROUTE); + goto error; + } + + if (virStrToLong_ui(dest_buf, NULL, 16,&addr_val)< 0) { + networkReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed not convert network address %s"), + dest_buf);
Some bad English in the error message there! ;-)
+ goto error; + } + + in.s_addr = addr_val; + dest_ip = inet_ntoa(in); + + if (STREQ(netaddr, dest_ip)) { + networkReportError(VIR_ERR_INTERNAL_ERROR, + _("Network destination %s is already in use " + "by interface %s"), netaddr, iface); + goto error; + } + + cur = strchr(cur, '\n'); + } + + ret = 0; +error: + VIR_FREE(buf); + VIR_FREE(netaddr); + return ret; +} + static int networkStartNetworkDaemon(struct network_driver *driver, virNetworkObjPtr network) { @@ -919,6 +1007,10 @@ static int networkStartNetworkDaemon(struct network_driver *driver, return -1; }
+ /* Check to see if network collides with an existing route */ + if (networkCheckRouteCollision(network)< 0) + return -1; + if ((err = brAddBridge(driver->brctl, network->def->bridge))) { virReportSystemError(err, _("cannot create bridge '%s'"),
ACK, modulo the error message typo and the use of sscanf, and with a note somewhere that we should refine the check later to possibly take into account route netmasks and gateways, and host interface addresses (we need to be careful not to put in too much restriction, and kill useful configurations, so I think we should take time to think about it before acting further).

On 05/21/2010 01:35 PM, Laine Stump wrote:
I'm stepping back, for now, on my suggestion of using libnl to do this. The proposed method works, and solves a serious problem. Using libnl would also work, but would create a new dependency (which can be a bit hairy with libnl - eg RHEL5 has libnl-1.0 which is API-incompatible with libnl-1.1, and can't be installed in parallel), and would take longer to get right.
...
ACK, modulo the error message typo and the use of sscanf, and with a note somewhere that we should refine the check later to possibly take into account route netmasks and gateways, and host interface addresses (we need to be careful not to put in too much restriction, and kill useful configurations, so I think we should take time to think about it before acting further).
Thanks for the feedback, resending shortly with your comments addressed. - Cole
participants (5)
-
Cole Robinson
-
Daniel P. Berrange
-
Dave Allan
-
Eric Blake
-
Laine Stump