[libvirt] [PATCH] Fix dnsmasq/radvd on bridges not being able to be bound to IPv6 address on "recent" kernels

Hi, I hit this problem recently when trying to create a bridge with an IPv6 address on a 3.2 kernel: dnsmasq (and, further, radvd) would not bind to the given address (resp. interface), waiting 20s and then giving up with -EADDRNOTAVAIL (resp. exiting immediately with "error parsing or activating the config file", without libvirt noticing it, BTW). This can be reproduced with (I think) any kernel >= 2.6.39 and the following XML (to be used with "virsh net-create"): <network> <name>test-bridge</name> <bridge name='testbr0' /> <ip family='ipv6' address='fd00::1' prefix='64'> </ip> </network> (it happens even when you have an IPv4, too) The problem is that since commit [1] (which, ironically, was made to “help IPv6 autoconfiguration”) the linux bridge code makes bridges behave like “real” devices regarding carrier detection. This makes the bridges created by libvirt, which are started without any up devices, stay with the NO-CARRIER flag set, and thus prevents DAD (Duplicate address detection) from happening, thus letting the IPv6 address flagged as “tentative”. Such addresses cannot be bound to (see RFC 2462), so dnsmasq fails binding to it (for radvd, it detects that "interface XXX is not RUNNING", thus that "interface XXX does not exist, ignoring the interface" (sic)). It seems that this behavior was enhanced somehow with commit [2] by avoiding setting NO-CARRIER on empty bridges, but I couldn't reproduce this behavior on my kernel. Anyway, with the “dummy tap to set MAC address” trick, this wouldn't work. To fix this, the idea is to get the bridge's attached device to be up so that DAD can happen (deactivating DAD altogether is not a good idea, I think). Currently, libvirt creates a dummy TAP device to set the MAC address of the bridge, keeping it down. But even if we set this device up, it is not RUNNING as soon as the tap file descriptor attached to it is closed, thus still preventing DAD. So, we must modify the API a bit, so that we can get the fd, keep the tap device persistent, run the daemons, and close it after DAD has taken place. After that, the bridge will be flagged NO-CARRIER again, but the daemons will be running, even if not happy about the device's state (but we don't really care about the bridge's daemons doing anything when no up interface is connected to it). Other solutions that I envisioned were: * Keeping the *-nic interface up: this would waste an fd for each bridge during all its life. May be acceptable, I don't really know. * Stop using the dummy tap trick, and set the MAC address directly on the bridge: it is possible since quite some time it seems, even if then there is the problem of the bridge not being RUNNING when empty, contrary to what [2] says, so this will need fixing (and this fix only happened in 3.1, so it wouldn't work for 2.6.39) * Using the --interface option of dnsmasq, but I saw somewhere that it's not used by libvirt for backward compatibility. I am not sure this would solve this problem, though, as I don't know how dnsmasq binds itself to it with this option. This is why this patch does what's described earlier. I see radvd yelling quite often in the logs when the interface is NO-CARRIER, but it's ok, it keeps running. Still, this patch is not exactly correct, as radvd get daemonized “too soon” most of the time (i.e. when not debugging libvirtd…) and probes the bridge once it has been set down (even if started before setting it down), thus failing as before (and libvirt lets it be like that: this would need some more checking, maybe). One /may/ introduce some delay between networkStartRadvd() and setting the dummy tap down to solve it, but it seemed too hackish to me. But I couldn't come with a better solution. I would welcome suggestions here. BTW, I removed the “up” argument from virNetDevTapCreateInBridgePort() and virNetDevTapCreate() as all TAP devices are created up now, and I fixed a function name in the docstring. [1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=1faa... [2] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=b64b... --- src/network/bridge_driver.c | 15 ++++++++++++++- src/qemu/qemu_command.c | 2 +- src/uml/uml_conf.c | 2 +- src/util/virnetdevtap.c | 29 +++++++++++++++++------------ src/util/virnetdevtap.h | 7 ++++--- 5 files changed, 37 insertions(+), 18 deletions(-) diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 63338a2..a13efe6 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -62,6 +62,7 @@ #include "virnetdev.h" #include "virnetdevbridge.h" #include "virnetdevtap.h" +#include "virfile.h" #define NETWORK_PID_DIR LOCALSTATEDIR "/run/libvirt/network" #define NETWORK_STATE_DIR LOCALSTATEDIR "/lib/libvirt/network" @@ -1693,6 +1694,7 @@ networkStartNetworkVirtual(struct network_driver *driver, virErrorPtr save_err = NULL; virNetworkIpDefPtr ipdef; char *macTapIfName = NULL; + int tapfd = -1; /* Check to see if any network IP collides with an existing route */ if (networkCheckRouteCollision(network) < 0) @@ -1714,8 +1716,9 @@ networkStartNetworkVirtual(struct network_driver *driver, virReportOOMError(); goto err0; } + /* Keep tun fd open and interface up to allow for IPv6 DAD to happen */ if (virNetDevTapCreateInBridgePort(network->def->bridge, - &macTapIfName, network->def->mac, 0, false, NULL) < 0) { + &macTapIfName, network->def->mac, 0, &tapfd, true) < 0) { VIR_FREE(macTapIfName); goto err0; } @@ -1775,6 +1778,16 @@ networkStartNetworkVirtual(struct network_driver *driver, if (v6present && networkStartRadvd(network) < 0) goto err4; + /* DAD has happened (dnsmasq waits for it), dnsmasq is now bound to the + * bridge's IPv6 address, and radvd to the interface, so we can now set the + * dummy tun down. + */ + if (tapfd >= 0) { + if (virNetDevSetOnline(macTapIfName, false) < 0) + goto err4; + VIR_FORCE_CLOSE(tapfd); + } + if (virNetDevBandwidthSet(network->def->bridge, network->def->bandwidth) < 0) { networkReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 22dc871..08457e7 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -247,7 +247,7 @@ qemuNetworkIfaceConnect(virDomainDefPtr def, memcpy(tapmac, net->mac, VIR_MAC_BUFLEN); tapmac[0] = 0xFE; /* Discourage bridge from using TAP dev MAC */ err = virNetDevTapCreateInBridgePort(brname, &net->ifname, tapmac, - vnet_hdr, true, &tapfd); + vnet_hdr, &tapfd, false); virDomainAuditNetDevice(def, net, "/dev/net/tun", tapfd >= 0); if (err < 0) { if (template_ifname) diff --git a/src/uml/uml_conf.c b/src/uml/uml_conf.c index b878622..bd838a6 100644 --- a/src/uml/uml_conf.c +++ b/src/uml/uml_conf.c @@ -141,7 +141,7 @@ umlConnectTapDevice(virConnectPtr conn, memcpy(tapmac, net->mac, VIR_MAC_BUFLEN); tapmac[0] = 0xFE; /* Discourage bridge from using TAP dev MAC */ if (virNetDevTapCreateInBridgePort(bridge, &net->ifname, tapmac, - 0, true, NULL) < 0) { + 0, NULL, true) < 0) { if (template_ifname) VIR_FREE(net->ifname); goto error; diff --git a/src/util/virnetdevtap.c b/src/util/virnetdevtap.c index 2ed53c6..3e2c2cc 100644 --- a/src/util/virnetdevtap.c +++ b/src/util/virnetdevtap.c @@ -109,18 +109,21 @@ virNetDevProbeVnetHdr(int tapfd) * @ifname: the interface name * @vnet_hr: whether to try enabling IFF_VNET_HDR * @tapfd: file descriptor return value for the new tap device + * @persist: if the device persists after the file descriptor is closed * * Creates a tap interface. * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use brDeleteTap to remove - * a persistent TAP devices when it is no longer needed. + * descriptor will be returned, otherwise the TAP device will be closed. The + * TAP device will persist after closing the file descriptor if @persist is + * true. The caller must use virNetDevTapDelete to remove a persistent TAP + * devices when it is no longer needed. * * Returns 0 in case of success or an errno code in case of failure. */ int virNetDevTapCreate(char **ifname, int vnet_hdr ATTRIBUTE_UNUSED, - int *tapfd) + int *tapfd, + bool persist) { int fd; struct ifreq ifr; @@ -156,7 +159,7 @@ int virNetDevTapCreate(char **ifname, goto cleanup; } - if (!tapfd && + if (persist && (errno = ioctl(fd, TUNSETPERSIST, 1))) { virReportSystemError(errno, _("Unable to set tap device %s to persistent"), @@ -249,14 +252,16 @@ int virNetDevTapDelete(const char *ifname ATTRIBUTE_UNUSED) * @macaddr: desired MAC address (VIR_MAC_BUFLEN long) * @vnet_hdr: whether to try enabling IFF_VNET_HDR * @tapfd: file descriptor return value for the new tap device + * @persist: if the device persists after the file descriptor is closed * * This function creates a new tap device on a bridge. @ifname can be either * a fixed name or a name template with '%d' for dynamic name allocation. * in either case the final name for the bridge will be stored in @ifname. * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use brDeleteTap to remove - * a persistent TAP devices when it is no longer needed. + * descriptor will be returned, otherwise the TAP device will be closed. The + * TAP device will persist after closing the file descriptor if @persist is + * true. The caller must use virNetDevTapDelete to remove a persistent TAP + * devices when it is no longer needed. * * Returns 0 in case of success or -1 on failure */ @@ -264,10 +269,10 @@ int virNetDevTapCreateInBridgePort(const char *brname, char **ifname, const unsigned char *macaddr, int vnet_hdr, - bool up, - int *tapfd) + int *tapfd, + bool persist) { - if (virNetDevTapCreate(ifname, vnet_hdr, tapfd) < 0) + if (virNetDevTapCreate(ifname, vnet_hdr, tapfd, persist) < 0) return -1; /* We need to set the interface MAC before adding it @@ -289,7 +294,7 @@ int virNetDevTapCreateInBridgePort(const char *brname, if (virNetDevBridgeAddPort(brname, *ifname) < 0) goto error; - if (virNetDevSetOnline(*ifname, up) < 0) + if (virNetDevSetOnline(*ifname, true) < 0) goto error; return 0; diff --git a/src/util/virnetdevtap.h b/src/util/virnetdevtap.h index fb35ac5..6aff641 100644 --- a/src/util/virnetdevtap.h +++ b/src/util/virnetdevtap.h @@ -27,7 +27,8 @@ int virNetDevTapCreate(char **ifname, int vnet_hdr, - int *tapfd) + int *tapfd, + bool persist) ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK; int virNetDevTapDelete(const char *ifname) @@ -37,8 +38,8 @@ int virNetDevTapCreateInBridgePort(const char *brname, char **ifname, const unsigned char *macaddr, int vnet_hdr, - bool up, - int *tapfd) + int *tapfd, + bool persist) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) ATTRIBUTE_RETURN_CHECK; -- Benjamin Cama <benjamin.cama@telecom-bretagne.eu>

Hi again, I forgot to port my patch to the latest git master… Here is a try, not tested, not even compiled; but it's less intrusive regarding the API thanks to the new “flags” argument. --- src/network/bridge_driver.c | 19 +++++++++++++++++-- src/uml/uml_conf.c | 2 +- src/util/virnetdevtap.c | 24 ++++++++++++++---------- src/util/virnetdevtap.h | 2 ++ 4 files changed, 34 insertions(+), 13 deletions(-) diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 79d3010..8f45bd7 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -62,6 +62,7 @@ #include "virnetdev.h" #include "virnetdevbridge.h" #include "virnetdevtap.h" +#include "virfile.h" #define NETWORK_PID_DIR LOCALSTATEDIR "/run/libvirt/network" #define NETWORK_STATE_DIR LOCALSTATEDIR "/lib/libvirt/network" @@ -1744,6 +1745,7 @@ networkStartNetworkVirtual(struct network_driver *driver, virErrorPtr save_err = NULL; virNetworkIpDefPtr ipdef; char *macTapIfName = NULL; + int tapfd = -1; /* Check to see if any network IP collides with an existing route */ if (networkCheckRouteCollision(network) < 0) @@ -1765,10 +1767,13 @@ networkStartNetworkVirtual(struct network_driver *driver, virReportOOMError(); goto err0; } + /* Keep tun fd open and interface up to allow for IPv6 DAD to happen */ if (virNetDevTapCreateInBridgePort(network->def->bridge, &macTapIfName, network->def->mac, - NULL, NULL, NULL, - VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE) < 0) { + NULL, &tapfd, NULL, + VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE + |VIR_NETDEV_TAP_CREATE_IFUP + |VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { VIR_FREE(macTapIfName); goto err0; } @@ -1828,6 +1833,16 @@ networkStartNetworkVirtual(struct network_driver *driver, if (v6present && networkStartRadvd(network) < 0) goto err4; + /* DAD has happened (dnsmasq waits for it), dnsmasq is now bound to the + * bridge's IPv6 address, and radvd to the interface, so we can now set the + * dummy tun down. + */ + if (tapfd >= 0) { + if (virNetDevSetOnline(macTapIfName, false) < 0) + goto err4; + VIR_FORCE_CLOSE(tapfd); + } + if (virNetDevBandwidthSet(network->def->bridge, network->def->bandwidth) < 0) { networkReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), diff --git a/src/uml/uml_conf.c b/src/uml/uml_conf.c index 79b249d..c9b5044 100644 --- a/src/uml/uml_conf.c +++ b/src/uml/uml_conf.c @@ -141,7 +141,7 @@ umlConnectTapDevice(virConnectPtr conn, if (virNetDevTapCreateInBridgePort(bridge, &net->ifname, net->mac, vm->uuid, NULL, virDomainNetGetActualVirtPortProfile(net), - VIR_NETDEV_TAP_CREATE_IFUP) < 0) { + VIR_NETDEV_TAP_CREATE_IFUP|VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { if (template_ifname) VIR_FREE(net->ifname); goto error; diff --git a/src/util/virnetdevtap.c b/src/util/virnetdevtap.c index 5d21164..45ff20f 100644 --- a/src/util/virnetdevtap.c +++ b/src/util/virnetdevtap.c @@ -118,18 +118,20 @@ virNetDevProbeVnetHdr(int tapfd) * * VIR_NETDEV_TAP_CREATE_VNET_HDR * - Enable IFF_VNET_HDR on the tap device + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * Creates a tap interface. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to - * remove a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or an errno code in case of failure. */ int virNetDevTapCreate(char **ifname, int *tapfd, - unsigned int flags ATTRIBUTE_UNUSED) + unsigned int flags) { int fd; struct ifreq ifr; @@ -166,7 +168,7 @@ int virNetDevTapCreate(char **ifname, goto cleanup; } - if (!tapfd && + if ((flags & VIR_NETDEV_TAP_CREATE_PERSIST) && (errno = ioctl(fd, TUNSETPERSIST, 1))) { virReportSystemError(errno, _("Unable to set tap device %s to persistent"), @@ -267,14 +269,16 @@ int virNetDevTapDelete(const char *ifname ATTRIBUTE_UNUSED) * - Enable IFF_VNET_HDR on the tap device * VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE * - Set this interface's MAC as the bridge's MAC address + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * This function creates a new tap device on a bridge. @ifname can be either * a fixed name or a name template with '%d' for dynamic name allocation. * in either case the final name for the bridge will be stored in @ifname. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to remove - * a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or -1 on failure */ diff --git a/src/util/virnetdevtap.h b/src/util/virnetdevtap.h index d9a3593..f1355ba 100644 --- a/src/util/virnetdevtap.h +++ b/src/util/virnetdevtap.h @@ -42,6 +42,8 @@ typedef enum { VIR_NETDEV_TAP_CREATE_VNET_HDR = 1 << 1, /* Set this interface's MAC as the bridge's MAC address */ VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE = 1 << 2, + /* The device will persist after the file descriptor is closed */ + VIR_NETDEV_TAP_CREATE_PERSIST = 1 << 3, } virNetDevTapCreateFlags; int virNetDevTapCreateInBridgePort(const char *brname, -- Benjamin Cama <benjamin.cama@telecom-bretagne.eu>

On Tue, Jun 19, 2012 at 07:52:43PM +0200, Benjamin Cama wrote:
Hi again,
I forgot to port my patch to the latest git master… Here is a try, not tested, not even compiled; but it's less intrusive regarding the API thanks to the new “flags” argument.
--- src/network/bridge_driver.c | 19 +++++++++++++++++-- src/uml/uml_conf.c | 2 +- src/util/virnetdevtap.c | 24 ++++++++++++++---------- src/util/virnetdevtap.h | 2 ++ 4 files changed, 34 insertions(+), 13 deletions(-)
diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 79d3010..8f45bd7 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -62,6 +62,7 @@ #include "virnetdev.h" #include "virnetdevbridge.h" #include "virnetdevtap.h" +#include "virfile.h"
#define NETWORK_PID_DIR LOCALSTATEDIR "/run/libvirt/network" #define NETWORK_STATE_DIR LOCALSTATEDIR "/lib/libvirt/network" @@ -1744,6 +1745,7 @@ networkStartNetworkVirtual(struct network_driver *driver, virErrorPtr save_err = NULL; virNetworkIpDefPtr ipdef; char *macTapIfName = NULL; + int tapfd = -1;
/* Check to see if any network IP collides with an existing route */ if (networkCheckRouteCollision(network) < 0) @@ -1765,10 +1767,13 @@ networkStartNetworkVirtual(struct network_driver *driver, virReportOOMError(); goto err0; } + /* Keep tun fd open and interface up to allow for IPv6 DAD to happen */ if (virNetDevTapCreateInBridgePort(network->def->bridge, &macTapIfName, network->def->mac, - NULL, NULL, NULL, - VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE) < 0) { + NULL, &tapfd, NULL, + VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE + |VIR_NETDEV_TAP_CREATE_IFUP + |VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { VIR_FREE(macTapIfName); goto err0; } @@ -1828,6 +1833,16 @@ networkStartNetworkVirtual(struct network_driver *driver, if (v6present && networkStartRadvd(network) < 0) goto err4;
+ /* DAD has happened (dnsmasq waits for it), dnsmasq is now bound to the + * bridge's IPv6 address, and radvd to the interface, so we can now set the + * dummy tun down. + */ + if (tapfd >= 0) { + if (virNetDevSetOnline(macTapIfName, false) < 0) + goto err4; + VIR_FORCE_CLOSE(tapfd); + } + if (virNetDevBandwidthSet(network->def->bridge, network->def->bandwidth) < 0) { networkReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"),
I think you also need to have a VIR_FOFRCE_CLOSE(tapfd) in the very last of the cleanup jump labels. Since between the time we open the tapfd & close it, we might have done a 'goto err1' or similar.
diff --git a/src/uml/uml_conf.c b/src/uml/uml_conf.c index 79b249d..c9b5044 100644 --- a/src/uml/uml_conf.c +++ b/src/uml/uml_conf.c @@ -141,7 +141,7 @@ umlConnectTapDevice(virConnectPtr conn, if (virNetDevTapCreateInBridgePort(bridge, &net->ifname, net->mac, vm->uuid, NULL, virDomainNetGetActualVirtPortProfile(net), - VIR_NETDEV_TAP_CREATE_IFUP) < 0) { + VIR_NETDEV_TAP_CREATE_IFUP|VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { if (template_ifname) VIR_FREE(net->ifname); goto error; diff --git a/src/util/virnetdevtap.c b/src/util/virnetdevtap.c index 5d21164..45ff20f 100644 --- a/src/util/virnetdevtap.c +++ b/src/util/virnetdevtap.c @@ -118,18 +118,20 @@ virNetDevProbeVnetHdr(int tapfd) * * VIR_NETDEV_TAP_CREATE_VNET_HDR * - Enable IFF_VNET_HDR on the tap device + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * Creates a tap interface. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to - * remove a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or an errno code in case of failure. */ int virNetDevTapCreate(char **ifname, int *tapfd, - unsigned int flags ATTRIBUTE_UNUSED) + unsigned int flags) { int fd; struct ifreq ifr; @@ -166,7 +168,7 @@ int virNetDevTapCreate(char **ifname, goto cleanup; }
- if (!tapfd && + if ((flags & VIR_NETDEV_TAP_CREATE_PERSIST) && (errno = ioctl(fd, TUNSETPERSIST, 1))) { virReportSystemError(errno, _("Unable to set tap device %s to persistent"), @@ -267,14 +269,16 @@ int virNetDevTapDelete(const char *ifname ATTRIBUTE_UNUSED) * - Enable IFF_VNET_HDR on the tap device * VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE * - Set this interface's MAC as the bridge's MAC address + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * This function creates a new tap device on a bridge. @ifname can be either * a fixed name or a name template with '%d' for dynamic name allocation. * in either case the final name for the bridge will be stored in @ifname. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to remove - * a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or -1 on failure */ diff --git a/src/util/virnetdevtap.h b/src/util/virnetdevtap.h index d9a3593..f1355ba 100644 --- a/src/util/virnetdevtap.h +++ b/src/util/virnetdevtap.h @@ -42,6 +42,8 @@ typedef enum { VIR_NETDEV_TAP_CREATE_VNET_HDR = 1 << 1, /* Set this interface's MAC as the bridge's MAC address */ VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE = 1 << 2, + /* The device will persist after the file descriptor is closed */ + VIR_NETDEV_TAP_CREATE_PERSIST = 1 << 3, } virNetDevTapCreateFlags;
int virNetDevTapCreateInBridgePort(const char *brname,
Looks reasonable apart from the cleanup bug. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Le mercredi 20 juin 2012 à 11:05 +0100, Daniel P. Berrange a écrit :
I think you also need to have a VIR_FOFRCE_CLOSE(tapfd) in the very last of the cleanup jump labels. Since between the time we open the tapfd & close it, we might have done a 'goto err1' or similar.
Yes, I forgot that. New patch following.
Looks reasonable apart from the cleanup bug.
Thanks. BTW, if this patch is commited, I'm already in AUTHORS under another email address, which is OK. --- src/network/bridge_driver.c | 22 ++++++++++++++++++++-- src/uml/uml_conf.c | 2 +- src/util/virnetdevtap.c | 24 ++++++++++++++---------- src/util/virnetdevtap.h | 2 ++ 4 files changed, 37 insertions(+), 13 deletions(-) diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 79d3010..e9f3c90 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -62,6 +62,7 @@ #include "virnetdev.h" #include "virnetdevbridge.h" #include "virnetdevtap.h" +#include "virfile.h" #define NETWORK_PID_DIR LOCALSTATEDIR "/run/libvirt/network" #define NETWORK_STATE_DIR LOCALSTATEDIR "/lib/libvirt/network" @@ -840,6 +841,7 @@ networkStartRadvd(virNetworkObjPtr network) " AdvSendAdvert on;\n" " AdvManagedFlag off;\n" " AdvOtherConfigFlag off;\n" + " IgnoreIfMissing on;\n" "\n", network->def->bridge); for (ii = 0; @@ -1744,6 +1746,7 @@ networkStartNetworkVirtual(struct network_driver *driver, virErrorPtr save_err = NULL; virNetworkIpDefPtr ipdef; char *macTapIfName = NULL; + int tapfd = -1; /* Check to see if any network IP collides with an existing route */ if (networkCheckRouteCollision(network) < 0) @@ -1765,10 +1768,13 @@ networkStartNetworkVirtual(struct network_driver *driver, virReportOOMError(); goto err0; } + /* Keep tun fd open and interface up to allow for IPv6 DAD to happen */ if (virNetDevTapCreateInBridgePort(network->def->bridge, &macTapIfName, network->def->mac, - NULL, NULL, NULL, - VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE) < 0) { + NULL, &tapfd, NULL, + VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE + |VIR_NETDEV_TAP_CREATE_IFUP + |VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { VIR_FREE(macTapIfName); goto err0; } @@ -1828,6 +1834,16 @@ networkStartNetworkVirtual(struct network_driver *driver, if (v6present && networkStartRadvd(network) < 0) goto err4; + /* DAD has happened (dnsmasq waits for it), dnsmasq is now bound to the + * bridge's IPv6 address, and radvd to the interface, so we can now set the + * dummy tun down. + */ + if (tapfd >= 0) { + if (virNetDevSetOnline(macTapIfName, false) < 0) + goto err4; + VIR_FORCE_CLOSE(tapfd); + } + if (virNetDevBandwidthSet(network->def->bridge, network->def->bandwidth) < 0) { networkReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), @@ -1866,6 +1882,8 @@ networkStartNetworkVirtual(struct network_driver *driver, save_err = virSaveLastError(); if (macTapIfName) { + if (tapfd >= 0) + VIR_FORCE_CLOSE(tapfd); ignore_value(virNetDevTapDelete(macTapIfName)); VIR_FREE(macTapIfName); } diff --git a/src/uml/uml_conf.c b/src/uml/uml_conf.c index 79b249d..c9b5044 100644 --- a/src/uml/uml_conf.c +++ b/src/uml/uml_conf.c @@ -141,7 +141,7 @@ umlConnectTapDevice(virConnectPtr conn, if (virNetDevTapCreateInBridgePort(bridge, &net->ifname, net->mac, vm->uuid, NULL, virDomainNetGetActualVirtPortProfile(net), - VIR_NETDEV_TAP_CREATE_IFUP) < 0) { + VIR_NETDEV_TAP_CREATE_IFUP|VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { if (template_ifname) VIR_FREE(net->ifname); goto error; diff --git a/src/util/virnetdevtap.c b/src/util/virnetdevtap.c index 5d21164..45ff20f 100644 --- a/src/util/virnetdevtap.c +++ b/src/util/virnetdevtap.c @@ -118,18 +118,20 @@ virNetDevProbeVnetHdr(int tapfd) * * VIR_NETDEV_TAP_CREATE_VNET_HDR * - Enable IFF_VNET_HDR on the tap device + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * Creates a tap interface. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to - * remove a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or an errno code in case of failure. */ int virNetDevTapCreate(char **ifname, int *tapfd, - unsigned int flags ATTRIBUTE_UNUSED) + unsigned int flags) { int fd; struct ifreq ifr; @@ -166,7 +168,7 @@ int virNetDevTapCreate(char **ifname, goto cleanup; } - if (!tapfd && + if ((flags & VIR_NETDEV_TAP_CREATE_PERSIST) && (errno = ioctl(fd, TUNSETPERSIST, 1))) { virReportSystemError(errno, _("Unable to set tap device %s to persistent"), @@ -267,14 +269,16 @@ int virNetDevTapDelete(const char *ifname ATTRIBUTE_UNUSED) * - Enable IFF_VNET_HDR on the tap device * VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE * - Set this interface's MAC as the bridge's MAC address + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * This function creates a new tap device on a bridge. @ifname can be either * a fixed name or a name template with '%d' for dynamic name allocation. * in either case the final name for the bridge will be stored in @ifname. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to remove - * a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or -1 on failure */ diff --git a/src/util/virnetdevtap.h b/src/util/virnetdevtap.h index d9a3593..f1355ba 100644 --- a/src/util/virnetdevtap.h +++ b/src/util/virnetdevtap.h @@ -42,6 +42,8 @@ typedef enum { VIR_NETDEV_TAP_CREATE_VNET_HDR = 1 << 1, /* Set this interface's MAC as the bridge's MAC address */ VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE = 1 << 2, + /* The device will persist after the file descriptor is closed */ + VIR_NETDEV_TAP_CREATE_PERSIST = 1 << 3, } virNetDevTapCreateFlags; int virNetDevTapCreateInBridgePort(const char *brname, -- Benjamin Cama <benjamin.cama@telecom-bretagne.eu>

On 06/20/2012 05:49 AM, Benjamin Cama wrote:
Le mercredi 20 juin 2012 à 11:05 +0100, Daniel P. Berrange a écrit :
I think you also need to have a VIR_FOFRCE_CLOSE(tapfd) in the very last of the cleanup jump labels. Since between the time we open the tapfd & close it, we might have done a 'goto err1' or similar.
Yes, I forgot that. New patch following.
Looks reasonable apart from the cleanup bug.
Thanks. BTW, if this patch is commited, I'm already in AUTHORS under another email address, which is OK.
If you have two git commits under different email addresses, then we should update .mailmap to set up the aliases so that your preferred address in AUTHORS is attributed to all your other commits on any other address. Otherwise, 'make syntax-check' complains about a mismatch between commit authorship. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Le mercredi 20 juin 2012 à 06:13 -0600, Eric Blake a écrit :
On 06/20/2012 05:49 AM, Benjamin Cama wrote:
Thanks. BTW, if this patch is commited, I'm already in AUTHORS under another email address, which is OK.
If you have two git commits under different email addresses, then we should update .mailmap to set up the aliases so that your preferred address in AUTHORS is attributed to all your other commits on any other address. Otherwise, 'make syntax-check' complains about a mismatch between commit authorship.
Right. See the following patch. --- diff --git a/.mailmap b/.mailmap index e16c3c9..1b9e7e5 100644 --- a/.mailmap +++ b/.mailmap @@ -29,6 +29,7 @@ <neil@aldur.co.uk> <neil@brightbox.co.uk> <stefanb@us.ibm.com> <stefanb@linux.vnet.ibm.com> <josh.durgin@dreamhost.com> <joshd@hq.newdream.net> +<benoar@dolka.fr> <benjamin.cama@telecom-bretagne.eu> # Name consolidation: # Preferred author spelling <preferred email> -- Benjamin Cama <benjamin.cama@telecom-bretagne.eu>

On Wed, Jun 20, 2012 at 01:49:30PM +0200, Benjamin Cama wrote:
Le mercredi 20 juin 2012 à 11:05 +0100, Daniel P. Berrange a écrit :
I think you also need to have a VIR_FOFRCE_CLOSE(tapfd) in the very last of the cleanup jump labels. Since between the time we open the tapfd & close it, we might have done a 'goto err1' or similar.
Yes, I forgot that. New patch following.
Looks reasonable apart from the cleanup bug.
Thanks. BTW, if this patch is commited, I'm already in AUTHORS under another email address, which is OK.
--- src/network/bridge_driver.c | 22 ++++++++++++++++++++-- src/uml/uml_conf.c | 2 +- src/util/virnetdevtap.c | 24 ++++++++++++++---------- src/util/virnetdevtap.h | 2 ++ 4 files changed, 37 insertions(+), 13 deletions(-)
diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 79d3010..e9f3c90 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -62,6 +62,7 @@ #include "virnetdev.h" #include "virnetdevbridge.h" #include "virnetdevtap.h" +#include "virfile.h"
#define NETWORK_PID_DIR LOCALSTATEDIR "/run/libvirt/network" #define NETWORK_STATE_DIR LOCALSTATEDIR "/lib/libvirt/network" @@ -840,6 +841,7 @@ networkStartRadvd(virNetworkObjPtr network) " AdvSendAdvert on;\n" " AdvManagedFlag off;\n" " AdvOtherConfigFlag off;\n" + " IgnoreIfMissing on;\n" "\n", network->def->bridge); for (ii = 0; @@ -1744,6 +1746,7 @@ networkStartNetworkVirtual(struct network_driver *driver, virErrorPtr save_err = NULL; virNetworkIpDefPtr ipdef; char *macTapIfName = NULL; + int tapfd = -1;
/* Check to see if any network IP collides with an existing route */ if (networkCheckRouteCollision(network) < 0) @@ -1765,10 +1768,13 @@ networkStartNetworkVirtual(struct network_driver *driver, virReportOOMError(); goto err0; } + /* Keep tun fd open and interface up to allow for IPv6 DAD to happen */ if (virNetDevTapCreateInBridgePort(network->def->bridge, &macTapIfName, network->def->mac, - NULL, NULL, NULL, - VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE) < 0) { + NULL, &tapfd, NULL, + VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE + |VIR_NETDEV_TAP_CREATE_IFUP + |VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { VIR_FREE(macTapIfName); goto err0; } @@ -1828,6 +1834,16 @@ networkStartNetworkVirtual(struct network_driver *driver, if (v6present && networkStartRadvd(network) < 0) goto err4;
+ /* DAD has happened (dnsmasq waits for it), dnsmasq is now bound to the + * bridge's IPv6 address, and radvd to the interface, so we can now set the + * dummy tun down. + */ + if (tapfd >= 0) { + if (virNetDevSetOnline(macTapIfName, false) < 0) + goto err4; + VIR_FORCE_CLOSE(tapfd); + } + if (virNetDevBandwidthSet(network->def->bridge, network->def->bandwidth) < 0) { networkReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), @@ -1866,6 +1882,8 @@ networkStartNetworkVirtual(struct network_driver *driver, save_err = virSaveLastError();
if (macTapIfName) { + if (tapfd >= 0) + VIR_FORCE_CLOSE(tapfd); ignore_value(virNetDevTapDelete(macTapIfName)); VIR_FREE(macTapIfName); } diff --git a/src/uml/uml_conf.c b/src/uml/uml_conf.c index 79b249d..c9b5044 100644 --- a/src/uml/uml_conf.c +++ b/src/uml/uml_conf.c @@ -141,7 +141,7 @@ umlConnectTapDevice(virConnectPtr conn, if (virNetDevTapCreateInBridgePort(bridge, &net->ifname, net->mac, vm->uuid, NULL, virDomainNetGetActualVirtPortProfile(net), - VIR_NETDEV_TAP_CREATE_IFUP) < 0) { + VIR_NETDEV_TAP_CREATE_IFUP|VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { if (template_ifname) VIR_FREE(net->ifname); goto error; diff --git a/src/util/virnetdevtap.c b/src/util/virnetdevtap.c index 5d21164..45ff20f 100644 --- a/src/util/virnetdevtap.c +++ b/src/util/virnetdevtap.c @@ -118,18 +118,20 @@ virNetDevProbeVnetHdr(int tapfd) * * VIR_NETDEV_TAP_CREATE_VNET_HDR * - Enable IFF_VNET_HDR on the tap device + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * Creates a tap interface. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to - * remove a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or an errno code in case of failure. */ int virNetDevTapCreate(char **ifname, int *tapfd, - unsigned int flags ATTRIBUTE_UNUSED) + unsigned int flags) { int fd; struct ifreq ifr; @@ -166,7 +168,7 @@ int virNetDevTapCreate(char **ifname, goto cleanup; }
- if (!tapfd && + if ((flags & VIR_NETDEV_TAP_CREATE_PERSIST) && (errno = ioctl(fd, TUNSETPERSIST, 1))) { virReportSystemError(errno, _("Unable to set tap device %s to persistent"), @@ -267,14 +269,16 @@ int virNetDevTapDelete(const char *ifname ATTRIBUTE_UNUSED) * - Enable IFF_VNET_HDR on the tap device * VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE * - Set this interface's MAC as the bridge's MAC address + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * This function creates a new tap device on a bridge. @ifname can be either * a fixed name or a name template with '%d' for dynamic name allocation. * in either case the final name for the bridge will be stored in @ifname. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to remove - * a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or -1 on failure */ diff --git a/src/util/virnetdevtap.h b/src/util/virnetdevtap.h index d9a3593..f1355ba 100644 --- a/src/util/virnetdevtap.h +++ b/src/util/virnetdevtap.h @@ -42,6 +42,8 @@ typedef enum { VIR_NETDEV_TAP_CREATE_VNET_HDR = 1 << 1, /* Set this interface's MAC as the bridge's MAC address */ VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE = 1 << 2, + /* The device will persist after the file descriptor is closed */ + VIR_NETDEV_TAP_CREATE_PERSIST = 1 << 3, } virNetDevTapCreateFlags;
int virNetDevTapCreateInBridgePort(const char *brname,
ACK Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Ping? I just stumbled over this again today: vanillia Ubuntu 12.04, set up a network bridge with an IPv6 address, and it stays “tentative” and can't be used (it doesn't block when starting the network, though, contrary to my Debian experience). On my Debian squeeze, it's still blocking when doing net-start, with the latest backport (0.9.12) with kernel 3.2.0. Le mercredi 20 juin 2012 à 14:27 +0100, Daniel P. Berrange a écrit :
On Wed, Jun 20, 2012 at 01:49:30PM +0200, Benjamin Cama wrote:
Le mercredi 20 juin 2012 à 11:05 +0100, Daniel P. Berrange a écrit :
I think you also need to have a VIR_FOFRCE_CLOSE(tapfd) in the very last of the cleanup jump labels. Since between the time we open the tapfd & close it, we might have done a 'goto err1' or similar.
Yes, I forgot that. New patch following.
Looks reasonable apart from the cleanup bug.
Thanks. BTW, if this patch is commited, I'm already in AUTHORS under another email address, which is OK.
--- src/network/bridge_driver.c | 22 ++++++++++++++++++++-- src/uml/uml_conf.c | 2 +- src/util/virnetdevtap.c | 24 ++++++++++++++---------- src/util/virnetdevtap.h | 2 ++ 4 files changed, 37 insertions(+), 13 deletions(-)
diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 79d3010..e9f3c90 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -62,6 +62,7 @@ #include "virnetdev.h" #include "virnetdevbridge.h" #include "virnetdevtap.h" +#include "virfile.h"
#define NETWORK_PID_DIR LOCALSTATEDIR "/run/libvirt/network" #define NETWORK_STATE_DIR LOCALSTATEDIR "/lib/libvirt/network" @@ -840,6 +841,7 @@ networkStartRadvd(virNetworkObjPtr network) " AdvSendAdvert on;\n" " AdvManagedFlag off;\n" " AdvOtherConfigFlag off;\n" + " IgnoreIfMissing on;\n" "\n", network->def->bridge); for (ii = 0; @@ -1744,6 +1746,7 @@ networkStartNetworkVirtual(struct network_driver *driver, virErrorPtr save_err = NULL; virNetworkIpDefPtr ipdef; char *macTapIfName = NULL; + int tapfd = -1;
/* Check to see if any network IP collides with an existing route */ if (networkCheckRouteCollision(network) < 0) @@ -1765,10 +1768,13 @@ networkStartNetworkVirtual(struct network_driver *driver, virReportOOMError(); goto err0; } + /* Keep tun fd open and interface up to allow for IPv6 DAD to happen */ if (virNetDevTapCreateInBridgePort(network->def->bridge, &macTapIfName, network->def->mac, - NULL, NULL, NULL, - VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE) < 0) { + NULL, &tapfd, NULL, + VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE + |VIR_NETDEV_TAP_CREATE_IFUP + |VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { VIR_FREE(macTapIfName); goto err0; } @@ -1828,6 +1834,16 @@ networkStartNetworkVirtual(struct network_driver *driver, if (v6present && networkStartRadvd(network) < 0) goto err4;
+ /* DAD has happened (dnsmasq waits for it), dnsmasq is now bound to the + * bridge's IPv6 address, and radvd to the interface, so we can now set the + * dummy tun down. + */ + if (tapfd >= 0) { + if (virNetDevSetOnline(macTapIfName, false) < 0) + goto err4; + VIR_FORCE_CLOSE(tapfd); + } + if (virNetDevBandwidthSet(network->def->bridge, network->def->bandwidth) < 0) { networkReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), @@ -1866,6 +1882,8 @@ networkStartNetworkVirtual(struct network_driver *driver, save_err = virSaveLastError();
if (macTapIfName) { + if (tapfd >= 0) + VIR_FORCE_CLOSE(tapfd); ignore_value(virNetDevTapDelete(macTapIfName)); VIR_FREE(macTapIfName); } diff --git a/src/uml/uml_conf.c b/src/uml/uml_conf.c index 79b249d..c9b5044 100644 --- a/src/uml/uml_conf.c +++ b/src/uml/uml_conf.c @@ -141,7 +141,7 @@ umlConnectTapDevice(virConnectPtr conn, if (virNetDevTapCreateInBridgePort(bridge, &net->ifname, net->mac, vm->uuid, NULL, virDomainNetGetActualVirtPortProfile(net), - VIR_NETDEV_TAP_CREATE_IFUP) < 0) { + VIR_NETDEV_TAP_CREATE_IFUP|VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { if (template_ifname) VIR_FREE(net->ifname); goto error; diff --git a/src/util/virnetdevtap.c b/src/util/virnetdevtap.c index 5d21164..45ff20f 100644 --- a/src/util/virnetdevtap.c +++ b/src/util/virnetdevtap.c @@ -118,18 +118,20 @@ virNetDevProbeVnetHdr(int tapfd) * * VIR_NETDEV_TAP_CREATE_VNET_HDR * - Enable IFF_VNET_HDR on the tap device + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * Creates a tap interface. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to - * remove a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or an errno code in case of failure. */ int virNetDevTapCreate(char **ifname, int *tapfd, - unsigned int flags ATTRIBUTE_UNUSED) + unsigned int flags) { int fd; struct ifreq ifr; @@ -166,7 +168,7 @@ int virNetDevTapCreate(char **ifname, goto cleanup; }
- if (!tapfd && + if ((flags & VIR_NETDEV_TAP_CREATE_PERSIST) && (errno = ioctl(fd, TUNSETPERSIST, 1))) { virReportSystemError(errno, _("Unable to set tap device %s to persistent"), @@ -267,14 +269,16 @@ int virNetDevTapDelete(const char *ifname ATTRIBUTE_UNUSED) * - Enable IFF_VNET_HDR on the tap device * VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE * - Set this interface's MAC as the bridge's MAC address + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * This function creates a new tap device on a bridge. @ifname can be either * a fixed name or a name template with '%d' for dynamic name allocation. * in either case the final name for the bridge will be stored in @ifname. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to remove - * a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or -1 on failure */ diff --git a/src/util/virnetdevtap.h b/src/util/virnetdevtap.h index d9a3593..f1355ba 100644 --- a/src/util/virnetdevtap.h +++ b/src/util/virnetdevtap.h @@ -42,6 +42,8 @@ typedef enum { VIR_NETDEV_TAP_CREATE_VNET_HDR = 1 << 1, /* Set this interface's MAC as the bridge's MAC address */ VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE = 1 << 2, + /* The device will persist after the file descriptor is closed */ + VIR_NETDEV_TAP_CREATE_PERSIST = 1 << 3, } virNetDevTapCreateFlags;
int virNetDevTapCreateInBridgePort(const char *brname,
ACK
Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Regards, -- Benjamin Cama <benjamin.cama@telecom-bretagne.eu>

On 09/21/2012 11:04 AM, Benjamin Cama wrote:
Ping?
Thanks. BTW, if this patch is commited, I'm already in AUTHORS under another email address, which is OK.
Our syntax checker wants us to add a .mailmap listing, so that your preferred email address is in AUTHORS. Let me know if you want to pivot which address of yours is treated as preferred.
ACK
Sorry that this got overlooked. In the meantime, the code base has changed, making it harder to apply your patch. Would you mind rebasing and posting a v3? -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

From: Benjamin Cama <benjamin.cama@telecom-bretagne.eu> Le mercredi 20 juin 2012 à 11:05 +0100, Daniel P. Berrange a écrit :
I think you also need to have a VIR_FOFRCE_CLOSE(tapfd) in the very last of the cleanup jump labels. Since between the time we open the tapfd & close it, we might have done a 'goto err1' or similar.
Yes, I forgot that. New patch following.
Looks reasonable apart from the cleanup bug.
Thanks. BTW, if this patch is commited, I'm already in AUTHORS under another email address, which is OK. --- This is my attempt at resolving the merge conflict, but I'm not confident enough with the original patch, nor with my test environment, to see if I actually solved anything (not to mention the commit message used by 'git am' is horrible). Is it worth using this version? src/network/bridge_driver.c | 22 ++++++++++++++++++++-- src/uml/uml_conf.c | 3 ++- src/util/virnetdevtap.c | 24 ++++++++++++++---------- src/util/virnetdevtap.h | 2 ++ 4 files changed, 38 insertions(+), 13 deletions(-) diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index fce1739..33aba74 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -65,6 +65,7 @@ #include "virnetdevtap.h" #include "virnetdevvportprofile.h" #include "virdbus.h" +#include "virfile.h" #define NETWORK_PID_DIR LOCALSTATEDIR "/run/libvirt/network" #define NETWORK_STATE_DIR LOCALSTATEDIR "/lib/libvirt/network" @@ -993,6 +994,7 @@ networkRadvdConfContents(virNetworkObjPtr network, char **configstr) " AdvSendAdvert on;\n" " AdvManagedFlag off;\n" " AdvOtherConfigFlag off;\n" + " IgnoreIfMissing on;\n" "\n", network->def->bridge); @@ -2061,6 +2063,7 @@ networkStartNetworkVirtual(struct network_driver *driver, virErrorPtr save_err = NULL; virNetworkIpDefPtr ipdef; char *macTapIfName = NULL; + int tapfd = -1; /* Check to see if any network IP collides with an existing route */ if (networkCheckRouteCollision(network) < 0) @@ -2082,10 +2085,13 @@ networkStartNetworkVirtual(struct network_driver *driver, virReportOOMError(); goto err0; } + /* Keep tun fd open and interface up to allow for IPv6 DAD to happen */ if (virNetDevTapCreateInBridgePort(network->def->bridge, &macTapIfName, &network->def->mac, - NULL, NULL, NULL, NULL, - VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE) < 0) { + NULL, &tapfd, NULL, NULL, + VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE | + VIR_NETDEV_TAP_CREATE_IFUP | + VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { VIR_FREE(macTapIfName); goto err0; } @@ -2149,6 +2155,16 @@ networkStartNetworkVirtual(struct network_driver *driver, if (v6present && networkStartRadvd(network) < 0) goto err4; + /* DAD has happened (dnsmasq waits for it), dnsmasq is now bound to the + * bridge's IPv6 address, and radvd to the interface, so we can now set the + * dummy tun down. + */ + if (tapfd >= 0) { + if (virNetDevSetOnline(macTapIfName, false) < 0) + goto err4; + VIR_FORCE_CLOSE(tapfd); + } + if (virNetDevBandwidthSet(network->def->bridge, network->def->bandwidth) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), @@ -2187,6 +2203,8 @@ networkStartNetworkVirtual(struct network_driver *driver, save_err = virSaveLastError(); if (macTapIfName) { + if (tapfd >= 0) + VIR_FORCE_CLOSE(tapfd); ignore_value(virNetDevTapDelete(macTapIfName)); VIR_FREE(macTapIfName); } diff --git a/src/uml/uml_conf.c b/src/uml/uml_conf.c index a317bcc..e8b8779 100644 --- a/src/uml/uml_conf.c +++ b/src/uml/uml_conf.c @@ -142,7 +142,8 @@ umlConnectTapDevice(virConnectPtr conn, vm->uuid, NULL, virDomainNetGetActualVirtPortProfile(net), virDomainNetGetActualVlan(net), - VIR_NETDEV_TAP_CREATE_IFUP) < 0) { + VIR_NETDEV_TAP_CREATE_IFUP | + VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { if (template_ifname) VIR_FREE(net->ifname); goto error; diff --git a/src/util/virnetdevtap.c b/src/util/virnetdevtap.c index 9f2dca1..0eadd6c 100644 --- a/src/util/virnetdevtap.c +++ b/src/util/virnetdevtap.c @@ -112,18 +112,20 @@ virNetDevProbeVnetHdr(int tapfd) * * VIR_NETDEV_TAP_CREATE_VNET_HDR * - Enable IFF_VNET_HDR on the tap device + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * Creates a tap interface. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to - * remove a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or -1 on failure. */ int virNetDevTapCreate(char **ifname, int *tapfd, - unsigned int flags ATTRIBUTE_UNUSED) + unsigned int flags) { int fd; struct ifreq ifr; @@ -160,7 +162,7 @@ int virNetDevTapCreate(char **ifname, goto cleanup; } - if (!tapfd && + if ((flags & VIR_NETDEV_TAP_CREATE_PERSIST) && (errno = ioctl(fd, TUNSETPERSIST, 1))) { virReportSystemError(errno, _("Unable to set tap device %s to persistent"), @@ -261,14 +263,16 @@ int virNetDevTapDelete(const char *ifname ATTRIBUTE_UNUSED) * - Enable IFF_VNET_HDR on the tap device * VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE * - Set this interface's MAC as the bridge's MAC address + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * This function creates a new tap device on a bridge. @ifname can be either * a fixed name or a name template with '%d' for dynamic name allocation. * in either case the final name for the bridge will be stored in @ifname. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to remove - * a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or -1 on failure */ diff --git a/src/util/virnetdevtap.h b/src/util/virnetdevtap.h index fa6a647..980db61 100644 --- a/src/util/virnetdevtap.h +++ b/src/util/virnetdevtap.h @@ -43,6 +43,8 @@ typedef enum { VIR_NETDEV_TAP_CREATE_VNET_HDR = 1 << 1, /* Set this interface's MAC as the bridge's MAC address */ VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE = 1 << 2, + /* The device will persist after the file descriptor is closed */ + VIR_NETDEV_TAP_CREATE_PERSIST = 1 << 3, } virNetDevTapCreateFlags; int virNetDevTapCreateInBridgePort(const char *brname, -- 1.7.11.4

On 09/21/2012 02:45 PM, Eric Blake wrote:
From: Benjamin Cama <benjamin.cama@telecom-bretagne.eu>
Le mercredi 20 juin 2012 à 11:05 +0100, Daniel P. Berrange a écrit :
I think you also need to have a VIR_FOFRCE_CLOSE(tapfd) in the very last of the cleanup jump labels. Since between the time we open the tapfd & close it, we might have done a 'goto err1' or similar. Yes, I forgot that. New patch following.
Looks reasonable apart from the cleanup bug. Thanks. BTW, if this patch is commited, I'm already in AUTHORS under another email address, which is OK.
This is my attempt at resolving the merge conflict, but I'm not confident enough with the original patch, nor with my test environment, to see if I actually solved anything (not to mention the commit message used by 'git am' is horrible). Is it worth using this version?
Whatever we push should have the commit log message from the first version of the patch. That gave a very good description of the problem, and why the patch is IFUPing a tap device that's connected to the bridge, but has no IP address (v6 or v4), no backend connection, and no traffic flowing over it. It seems very counterintuitive until you read the explanation in the original version.
src/network/bridge_driver.c | 22 ++++++++++++++++++++-- src/uml/uml_conf.c | 3 ++- src/util/virnetdevtap.c | 24 ++++++++++++++---------- src/util/virnetdevtap.h | 2 ++ 4 files changed, 38 insertions(+), 13 deletions(-)
diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index fce1739..33aba74 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -65,6 +65,7 @@ #include "virnetdevtap.h" #include "virnetdevvportprofile.h" #include "virdbus.h" +#include "virfile.h"
#define NETWORK_PID_DIR LOCALSTATEDIR "/run/libvirt/network" #define NETWORK_STATE_DIR LOCALSTATEDIR "/lib/libvirt/network" @@ -993,6 +994,7 @@ networkRadvdConfContents(virNetworkObjPtr network, char **configstr) " AdvSendAdvert on;\n" " AdvManagedFlag off;\n" " AdvOtherConfigFlag off;\n" + " IgnoreIfMissing on;\n"
This was added to the patch after V2, and I don't see any explanation why. Does this just silence the complaint log messages when the dummy tap is down?
"\n", network->def->bridge);
@@ -2061,6 +2063,7 @@ networkStartNetworkVirtual(struct network_driver *driver, virErrorPtr save_err = NULL; virNetworkIpDefPtr ipdef; char *macTapIfName = NULL; + int tapfd = -1;
/* Check to see if any network IP collides with an existing route */ if (networkCheckRouteCollision(network) < 0) @@ -2082,10 +2085,13 @@ networkStartNetworkVirtual(struct network_driver *driver, virReportOOMError(); goto err0; } + /* Keep tun fd open and interface up to allow for IPv6 DAD to happen */ if (virNetDevTapCreateInBridgePort(network->def->bridge, &macTapIfName, &network->def->mac, - NULL, NULL, NULL, NULL, - VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE) < 0) { + NULL, &tapfd, NULL, NULL, + VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE | + VIR_NETDEV_TAP_CREATE_IFUP | + VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { VIR_FREE(macTapIfName); goto err0; } @@ -2149,6 +2155,16 @@ networkStartNetworkVirtual(struct network_driver *driver, if (v6present && networkStartRadvd(network) < 0) goto err4;
+ /* DAD has happened (dnsmasq waits for it),
Hmm. dnsmasq waits for it because it wants to listen for DNS packets on the bridge's IPv6 address, right? Okay, so the timing problem mentioned in the earlier messages is a thing of the past.
dnsmasq is now bound to the + * bridge's IPv6 address, and radvd to the interface, so we can now set the + * dummy tun down. + */ + if (tapfd >= 0) { + if (virNetDevSetOnline(macTapIfName, false) < 0) + goto err4; + VIR_FORCE_CLOSE(tapfd); + } + if (virNetDevBandwidthSet(network->def->bridge, network->def->bandwidth) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), @@ -2187,6 +2203,8 @@ networkStartNetworkVirtual(struct network_driver *driver, save_err = virSaveLastError();
if (macTapIfName) { + if (tapfd >= 0) + VIR_FORCE_CLOSE(tapfd); ignore_value(virNetDevTapDelete(macTapIfName)); VIR_FREE(macTapIfName); } diff --git a/src/uml/uml_conf.c b/src/uml/uml_conf.c index a317bcc..e8b8779 100644 --- a/src/uml/uml_conf.c +++ b/src/uml/uml_conf.c @@ -142,7 +142,8 @@ umlConnectTapDevice(virConnectPtr conn, vm->uuid, NULL, virDomainNetGetActualVirtPortProfile(net), virDomainNetGetActualVlan(net), - VIR_NETDEV_TAP_CREATE_IFUP) < 0) { + VIR_NETDEV_TAP_CREATE_IFUP | + VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { if (template_ifname) VIR_FREE(net->ifname); goto error; diff --git a/src/util/virnetdevtap.c b/src/util/virnetdevtap.c index 9f2dca1..0eadd6c 100644 --- a/src/util/virnetdevtap.c +++ b/src/util/virnetdevtap.c @@ -112,18 +112,20 @@ virNetDevProbeVnetHdr(int tapfd) * * VIR_NETDEV_TAP_CREATE_VNET_HDR * - Enable IFF_VNET_HDR on the tap device + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * Creates a tap interface. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to - * remove a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or -1 on failure. */ int virNetDevTapCreate(char **ifname, int *tapfd, - unsigned int flags ATTRIBUTE_UNUSED) + unsigned int flags) { int fd; struct ifreq ifr; @@ -160,7 +162,7 @@ int virNetDevTapCreate(char **ifname, goto cleanup; }
- if (!tapfd && + if ((flags & VIR_NETDEV_TAP_CREATE_PERSIST) && (errno = ioctl(fd, TUNSETPERSIST, 1))) { virReportSystemError(errno, _("Unable to set tap device %s to persistent"), @@ -261,14 +263,16 @@ int virNetDevTapDelete(const char *ifname ATTRIBUTE_UNUSED) * - Enable IFF_VNET_HDR on the tap device * VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE * - Set this interface's MAC as the bridge's MAC address + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * This function creates a new tap device on a bridge. @ifname can be either * a fixed name or a name template with '%d' for dynamic name allocation. * in either case the final name for the bridge will be stored in @ifname. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to remove - * a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or -1 on failure */ diff --git a/src/util/virnetdevtap.h b/src/util/virnetdevtap.h index fa6a647..980db61 100644 --- a/src/util/virnetdevtap.h +++ b/src/util/virnetdevtap.h @@ -43,6 +43,8 @@ typedef enum { VIR_NETDEV_TAP_CREATE_VNET_HDR = 1 << 1, /* Set this interface's MAC as the bridge's MAC address */ VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE = 1 << 2, + /* The device will persist after the file descriptor is closed */ + VIR_NETDEV_TAP_CREATE_PERSIST = 1 << 3, } virNetDevTapCreateFlags;
int virNetDevTapCreateInBridgePort(const char *brname,
This has my ACK as long as: 1) the original commit message (with some modifications) is used, 2) the reason for adding the "IgnoreIfMissing" line is added to that commit message (and maybe as a comment in the code).

Hi, Le vendredi 21 septembre 2012 à 16:28 -0400, Laine Stump a écrit :
On 09/21/2012 02:45 PM, Eric Blake wrote:
From: Benjamin Cama <benjamin.cama@telecom-bretagne.eu>
Le mercredi 20 juin 2012 à 11:05 +0100, Daniel P. Berrange a écrit :
I think you also need to have a VIR_FOFRCE_CLOSE(tapfd) in the very last of the cleanup jump labels. Since between the time we open the tapfd & close it, we might have done a 'goto err1' or similar. Yes, I forgot that. New patch following.
Looks reasonable apart from the cleanup bug. Thanks. BTW, if this patch is commited, I'm already in AUTHORS under another email address, which is OK.
This is my attempt at resolving the merge conflict, but I'm not confident enough with the original patch, nor with my test environment, to see if I actually solved anything (not to mention the commit message used by 'git am' is horrible). Is it worth using this version?
Thanks Eric for your attempt, this is also what I came too. I will add Laine's comment suggestion, though.
Whatever we push should have the commit log message from the first version of the patch. That gave a very good description of the problem, and why the patch is IFUPing a tap device that's connected to the bridge, but has no IP address (v6 or v4), no backend connection, and no traffic flowing over it. It seems very counterintuitive until you read the explanation in the original version.
I will repost with the original (updated) description.
src/network/bridge_driver.c | 22 ++++++++++++++++++++-- src/uml/uml_conf.c | 3 ++- src/util/virnetdevtap.c | 24 ++++++++++++++---------- src/util/virnetdevtap.h | 2 ++ 4 files changed, 38 insertions(+), 13 deletions(-)
diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index fce1739..33aba74 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -65,6 +65,7 @@ #include "virnetdevtap.h" #include "virnetdevvportprofile.h" #include "virdbus.h" +#include "virfile.h"
#define NETWORK_PID_DIR LOCALSTATEDIR "/run/libvirt/network" #define NETWORK_STATE_DIR LOCALSTATEDIR "/lib/libvirt/network" @@ -993,6 +994,7 @@ networkRadvdConfContents(virNetworkObjPtr network, char **configstr) " AdvSendAdvert on;\n" " AdvManagedFlag off;\n" " AdvOtherConfigFlag off;\n" + " IgnoreIfMissing on;\n"
This was added to the patch after V2, and I don't see any explanation why. Does this just silence the complaint log messages when the dummy tap is down?
No, it allows radvd to start even when the bridge is not RUNNING, as sometimes radvd binds to it _after_ we set it “down” (by closing the tap fd). Contrary to what dnsmasq does, radvd daemonizes itself before binding itself to the interface, so if it comes “too late”, it fails and stops (and BTW, libvirt doesn't detect that it fails, which may be a problem in some other cases). Now that I re-read the code, maybe we could just start radvd before dnsmasq (it's the contrary currently), as the latter adds a delay that should be enough for radvd to start and bind itself. But with my patch, we play on the safe side.
"\n", network->def->bridge);
@@ -2061,6 +2063,7 @@ networkStartNetworkVirtual(struct network_driver *driver, virErrorPtr save_err = NULL; virNetworkIpDefPtr ipdef; char *macTapIfName = NULL; + int tapfd = -1;
/* Check to see if any network IP collides with an existing route */ if (networkCheckRouteCollision(network) < 0) @@ -2082,10 +2085,13 @@ networkStartNetworkVirtual(struct network_driver *driver, virReportOOMError(); goto err0; } + /* Keep tun fd open and interface up to allow for IPv6 DAD to happen */ if (virNetDevTapCreateInBridgePort(network->def->bridge, &macTapIfName, &network->def->mac, - NULL, NULL, NULL, NULL, - VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE) < 0) { + NULL, &tapfd, NULL, NULL, + VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE | + VIR_NETDEV_TAP_CREATE_IFUP | + VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { VIR_FREE(macTapIfName); goto err0; } @@ -2149,6 +2155,16 @@ networkStartNetworkVirtual(struct network_driver *driver, if (v6present && networkStartRadvd(network) < 0) goto err4;
+ /* DAD has happened (dnsmasq waits for it),
Hmm. dnsmasq waits for it because it wants to listen for DNS packets on the bridge's IPv6 address, right? Okay, so the timing problem mentioned in the earlier messages is a thing of the past.
I am not sure if you are talking about the radvd timing issue; if this is it, see my above remark.
dnsmasq is now bound to the + * bridge's IPv6 address, and radvd to the interface, so we can now set the + * dummy tun down. + */ + if (tapfd >= 0) { + if (virNetDevSetOnline(macTapIfName, false) < 0) + goto err4; + VIR_FORCE_CLOSE(tapfd); + } + if (virNetDevBandwidthSet(network->def->bridge, network->def->bandwidth) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), @@ -2187,6 +2203,8 @@ networkStartNetworkVirtual(struct network_driver *driver, save_err = virSaveLastError();
if (macTapIfName) { + if (tapfd >= 0) + VIR_FORCE_CLOSE(tapfd); ignore_value(virNetDevTapDelete(macTapIfName)); VIR_FREE(macTapIfName); } diff --git a/src/uml/uml_conf.c b/src/uml/uml_conf.c index a317bcc..e8b8779 100644 --- a/src/uml/uml_conf.c +++ b/src/uml/uml_conf.c @@ -142,7 +142,8 @@ umlConnectTapDevice(virConnectPtr conn, vm->uuid, NULL, virDomainNetGetActualVirtPortProfile(net), virDomainNetGetActualVlan(net), - VIR_NETDEV_TAP_CREATE_IFUP) < 0) { + VIR_NETDEV_TAP_CREATE_IFUP | + VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { if (template_ifname) VIR_FREE(net->ifname); goto error; diff --git a/src/util/virnetdevtap.c b/src/util/virnetdevtap.c index 9f2dca1..0eadd6c 100644 --- a/src/util/virnetdevtap.c +++ b/src/util/virnetdevtap.c @@ -112,18 +112,20 @@ virNetDevProbeVnetHdr(int tapfd) * * VIR_NETDEV_TAP_CREATE_VNET_HDR * - Enable IFF_VNET_HDR on the tap device + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * Creates a tap interface. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to - * remove a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or -1 on failure. */ int virNetDevTapCreate(char **ifname, int *tapfd, - unsigned int flags ATTRIBUTE_UNUSED) + unsigned int flags) { int fd; struct ifreq ifr; @@ -160,7 +162,7 @@ int virNetDevTapCreate(char **ifname, goto cleanup; }
- if (!tapfd && + if ((flags & VIR_NETDEV_TAP_CREATE_PERSIST) && (errno = ioctl(fd, TUNSETPERSIST, 1))) { virReportSystemError(errno, _("Unable to set tap device %s to persistent"), @@ -261,14 +263,16 @@ int virNetDevTapDelete(const char *ifname ATTRIBUTE_UNUSED) * - Enable IFF_VNET_HDR on the tap device * VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE * - Set this interface's MAC as the bridge's MAC address + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * This function creates a new tap device on a bridge. @ifname can be either * a fixed name or a name template with '%d' for dynamic name allocation. * in either case the final name for the bridge will be stored in @ifname. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to remove - * a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or -1 on failure */ diff --git a/src/util/virnetdevtap.h b/src/util/virnetdevtap.h index fa6a647..980db61 100644 --- a/src/util/virnetdevtap.h +++ b/src/util/virnetdevtap.h @@ -43,6 +43,8 @@ typedef enum { VIR_NETDEV_TAP_CREATE_VNET_HDR = 1 << 1, /* Set this interface's MAC as the bridge's MAC address */ VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE = 1 << 2, + /* The device will persist after the file descriptor is closed */ + VIR_NETDEV_TAP_CREATE_PERSIST = 1 << 3, } virNetDevTapCreateFlags;
int virNetDevTapCreateInBridgePort(const char *brname,
This has my ACK as long as:
1) the original commit message (with some modifications) is used,
2) the reason for adding the "IgnoreIfMissing" line is added to that commit message (and maybe as a comment in the code).
I will do both. Patch coming shortly after this message. I have tested it here (even if it's getting harder and harder to get it compiled & packaged for squeeze…) and it works OK. Regards, -- Benjamin Cama <benjamin.cama@telecom-bretagne.eu>

Le vendredi 21 septembre 2012 à 12:24 -0600, Eric Blake a écrit :
On 09/21/2012 11:04 AM, Benjamin Cama wrote:
Ping?
Thanks. BTW, if this patch is commited, I'm already in AUTHORS under another email address, which is OK.
Our syntax checker wants us to add a .mailmap listing, so that your preferred email address is in AUTHORS. Let me know if you want to pivot which address of yours is treated as preferred.
Forgot to re-send that. --- diff --git a/.mailmap b/.mailmap index 3ec5a3f..76e6513 100644 --- a/.mailmap +++ b/.mailmap @@ -33,6 +33,7 @@ <josh.durgin@inktank.com> <josh.durgin@dreamhost.com> <gerd@egidy.de> <lists@egidy.de> <gerd@egidy.de> <gerd.von.egidy@intra2net.com> +<benoar@dolka.fr> <benjamin.cama@telecom-bretagne.eu> # Name consolidation: # Preferred author spelling <preferred email> -- Benjamin Cama <benjamin.cama@telecom-bretagne.eu>

On 06/19/2012 01:00 PM, Benjamin Cama wrote:
I hit this problem recently when trying to create a bridge with an IPv6 address on a 3.2 kernel: [*very* detailed analysis removed for brevity]
Wow, very thorough research! Thanks for figuring this out! I'm leaving the country in 48 hours and may not be able to review this before I go. If anyone else wants to take a crack at it so it can go into this release, that would be wonderful. Otherwise I'll get to it as soon as I possibly can.

Wow, very thorough research! Thanks for figuring this out!
Thanks. It took me almost two days to figure this out and find a working solution. Interesting debugging, though, as I learned quite a lot on bridges and IPv6 DAD.
I'm leaving the country in 48 hours and may not be able to review this before I go. If anyone else wants to take a crack at it so it can go into this release, that would be wonderful. Otherwise I'll get to it as soon as I possibly can.
Thanks. And I've found a solution for radvd: use the “IgnoreIfMissing” flag in its config so that it continues running even if it find the interface down on startup. BTW, I'm also hesitating to set the VIR_NETDEV_TAP_CREATE_PERSIST flag automatically when no tapfd pointer is given, as currently, with my patch, if you call virNetDevTapCreate() without VIR_NETDEV_TAP_CREATE_PERSIST and without tapfd argument, the device will disappear immediately. I don't know if implicitly setting a flag is a good idea, though. Anyway, here's the updated patch (without mangled newlines, hopefully). A summary for this patch would be: When creating a network bridge with an IPv6 address on kernel >= 2.6.39 (because of kernel's commit 1faa4356a3bd89ea11fb92752d897cff3a20ec0e), dnsmasq and radvd fail to bind to this address as the interface is not IFF_RUNNING and thus DAD (Duplicate Address Detection) cannot happen. This keeps the address flagged “tentative” and no daemon can bind to it (RFC 2462). This happens because the dummy TAP interface, linked to the bridge to set its MAC address, has its file descriptor closed and this prevents the interface from being IFF_RUNNING. This patch makes the bridge creation code keep the dummy TAP file descriptor open until DAD has happened on the bridge, allowing dnsmasq to bind to the address (and we tell radvd to ignore the interface if it is “missing” on startup), and then sets the interface down and closes the fd, as before. Oh, and please Cc me, I'm not subscribed. --- src/network/bridge_driver.c | 20 ++++++++++++++++++-- src/uml/uml_conf.c | 2 +- src/util/virnetdevtap.c | 24 ++++++++++++++---------- src/util/virnetdevtap.h | 2 ++ 4 files changed, 35 insertions(+), 13 deletions(-) diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 79d3010..b21d86b 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -62,6 +62,7 @@ #include "virnetdev.h" #include "virnetdevbridge.h" #include "virnetdevtap.h" +#include "virfile.h" #define NETWORK_PID_DIR LOCALSTATEDIR "/run/libvirt/network" #define NETWORK_STATE_DIR LOCALSTATEDIR "/lib/libvirt/network" @@ -840,6 +841,7 @@ networkStartRadvd(virNetworkObjPtr network) " AdvSendAdvert on;\n" " AdvManagedFlag off;\n" " AdvOtherConfigFlag off;\n" + " IgnoreIfMissing on;\n" "\n", network->def->bridge); for (ii = 0; @@ -1744,6 +1746,7 @@ networkStartNetworkVirtual(struct network_driver *driver, virErrorPtr save_err = NULL; virNetworkIpDefPtr ipdef; char *macTapIfName = NULL; + int tapfd = -1; /* Check to see if any network IP collides with an existing route */ if (networkCheckRouteCollision(network) < 0) @@ -1765,10 +1768,13 @@ networkStartNetworkVirtual(struct network_driver *driver, virReportOOMError(); goto err0; } + /* Keep tun fd open and interface up to allow for IPv6 DAD to happen */ if (virNetDevTapCreateInBridgePort(network->def->bridge, &macTapIfName, network->def->mac, - NULL, NULL, NULL, - VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE) < 0) { + NULL, &tapfd, NULL, + VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE + |VIR_NETDEV_TAP_CREATE_IFUP + |VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { VIR_FREE(macTapIfName); goto err0; } @@ -1828,6 +1834,16 @@ networkStartNetworkVirtual(struct network_driver *driver, if (v6present && networkStartRadvd(network) < 0) goto err4; + /* DAD has happened (dnsmasq waits for it), dnsmasq is now bound to the + * bridge's IPv6 address, and radvd to the interface, so we can now set the + * dummy tun down. + */ + if (tapfd >= 0) { + if (virNetDevSetOnline(macTapIfName, false) < 0) + goto err4; + VIR_FORCE_CLOSE(tapfd); + } + if (virNetDevBandwidthSet(network->def->bridge, network->def->bandwidth) < 0) { networkReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), diff --git a/src/uml/uml_conf.c b/src/uml/uml_conf.c index 79b249d..c9b5044 100644 --- a/src/uml/uml_conf.c +++ b/src/uml/uml_conf.c @@ -141,7 +141,7 @@ umlConnectTapDevice(virConnectPtr conn, if (virNetDevTapCreateInBridgePort(bridge, &net->ifname, net->mac, vm->uuid, NULL, virDomainNetGetActualVirtPortProfile(net), - VIR_NETDEV_TAP_CREATE_IFUP) < 0) { + VIR_NETDEV_TAP_CREATE_IFUP|VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { if (template_ifname) VIR_FREE(net->ifname); goto error; diff --git a/src/util/virnetdevtap.c b/src/util/virnetdevtap.c index 5d21164..45ff20f 100644 --- a/src/util/virnetdevtap.c +++ b/src/util/virnetdevtap.c @@ -118,18 +118,20 @@ virNetDevProbeVnetHdr(int tapfd) * * VIR_NETDEV_TAP_CREATE_VNET_HDR * - Enable IFF_VNET_HDR on the tap device + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * Creates a tap interface. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to - * remove a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or an errno code in case of failure. */ int virNetDevTapCreate(char **ifname, int *tapfd, - unsigned int flags ATTRIBUTE_UNUSED) + unsigned int flags) { int fd; struct ifreq ifr; @@ -166,7 +168,7 @@ int virNetDevTapCreate(char **ifname, goto cleanup; } - if (!tapfd && + if ((flags & VIR_NETDEV_TAP_CREATE_PERSIST) && (errno = ioctl(fd, TUNSETPERSIST, 1))) { virReportSystemError(errno, _("Unable to set tap device %s to persistent"), @@ -267,14 +269,16 @@ int virNetDevTapDelete(const char *ifname ATTRIBUTE_UNUSED) * - Enable IFF_VNET_HDR on the tap device * VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE * - Set this interface's MAC as the bridge's MAC address + * VIR_NETDEV_TAP_CREATE_PERSIST + * - The device will persist after the file descriptor is closed * * This function creates a new tap device on a bridge. @ifname can be either * a fixed name or a name template with '%d' for dynamic name allocation. * in either case the final name for the bridge will be stored in @ifname. - * If the @tapfd parameter is supplied, the open tap device file - * descriptor will be returned, otherwise the TAP device will be made - * persistent and closed. The caller must use virNetDevTapDelete to remove - * a persistent TAP devices when it is no longer needed. + * If the @tapfd parameter is supplied, the open tap device file descriptor + * will be returned, otherwise the TAP device will be closed. The caller must + * use virNetDevTapDelete to remove a persistent TAP device when it is no + * longer needed. * * Returns 0 in case of success or -1 on failure */ diff --git a/src/util/virnetdevtap.h b/src/util/virnetdevtap.h index d9a3593..f1355ba 100644 --- a/src/util/virnetdevtap.h +++ b/src/util/virnetdevtap.h @@ -42,6 +42,8 @@ typedef enum { VIR_NETDEV_TAP_CREATE_VNET_HDR = 1 << 1, /* Set this interface's MAC as the bridge's MAC address */ VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE = 1 << 2, + /* The device will persist after the file descriptor is closed */ + VIR_NETDEV_TAP_CREATE_PERSIST = 1 << 3, } virNetDevTapCreateFlags; int virNetDevTapCreateInBridgePort(const char *brname, -- Benjamin Cama <benjamin.cama@telecom-bretagne.eu>

On Wed, Jun 20, 2012 at 12:00:37PM +0200, Benjamin Cama wrote:
Wow, very thorough research! Thanks for figuring this out!
Thanks. It took me almost two days to figure this out and find a working solution. Interesting debugging, though, as I learned quite a lot on bridges and IPv6 DAD.
I'm leaving the country in 48 hours and may not be able to review this before I go. If anyone else wants to take a crack at it so it can go into this release, that would be wonderful. Otherwise I'll get to it as soon as I possibly can.
Thanks.
And I've found a solution for radvd: use the “IgnoreIfMissing” flag in its config so that it continues running even if it find the interface down on startup.
BTW, I'm also hesitating to set the VIR_NETDEV_TAP_CREATE_PERSIST flag automatically when no tapfd pointer is given, as currently, with my patch, if you call virNetDevTapCreate() without VIR_NETDEV_TAP_CREATE_PERSIST and without tapfd argument, the device will disappear immediately. I don't know if implicitly setting a flag is a good idea, though.
There are only a few callers of this API, so I think it is reasonable to just update the callers as needed, rather than implicitly setting the flag. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Tue, Jun 19, 2012 at 07:00:58PM +0200, Benjamin Cama wrote:
Hi,
I hit this problem recently when trying to create a bridge with an IPv6 address on a 3.2 kernel: dnsmasq (and, further, radvd) would not bind to the given address (resp. interface), waiting 20s and then giving up with -EADDRNOTAVAIL (resp. exiting immediately with "error parsing or activating the config file", without libvirt noticing it, BTW). This can be reproduced with (I think) any kernel >= 2.6.39 and the following XML (to be used with "virsh net-create"):
<network> <name>test-bridge</name> <bridge name='testbr0' /> <ip family='ipv6' address='fd00::1' prefix='64'> </ip> </network>
(it happens even when you have an IPv4, too)
The problem is that since commit [1] (which, ironically, was made to “help IPv6 autoconfiguration”) the linux bridge code makes bridges behave like “real” devices regarding carrier detection. This makes the bridges created by libvirt, which are started without any up devices, stay with the NO-CARRIER flag set, and thus prevents DAD (Duplicate address detection) from happening, thus letting the IPv6 address flagged as “tentative”. Such addresses cannot be bound to (see RFC 2462), so dnsmasq fails binding to it (for radvd, it detects that "interface XXX is not RUNNING", thus that "interface XXX does not exist, ignoring the interface" (sic)). It seems that this behavior was enhanced somehow with commit [2] by avoiding setting NO-CARRIER on empty bridges, but I couldn't reproduce this behavior on my kernel. Anyway, with the “dummy tap to set MAC address” trick, this wouldn't work.
To fix this, the idea is to get the bridge's attached device to be up so that DAD can happen (deactivating DAD altogether is not a good idea, I think). Currently, libvirt creates a dummy TAP device to set the MAC address of the bridge, keeping it down. But even if we set this device up, it is not RUNNING as soon as the tap file descriptor attached to it is closed, thus still preventing DAD. So, we must modify the API a bit, so that we can get the fd, keep the tap device persistent, run the daemons, and close it after DAD has taken place. After that, the bridge will be flagged NO-CARRIER again, but the daemons will be running, even if not happy about the device's state (but we don't really care about the bridge's daemons doing anything when no up interface is connected to it).
Other solutions that I envisioned were: * Keeping the *-nic interface up: this would waste an fd for each bridge during all its life. May be acceptable, I don't really know.
Is that because closing the TAP device FD makes it go into offline state ? In the great scheme of things, I don't think one extra file descriptor per network is too onerous - there would only ever be a handful of networks per host.
* Stop using the dummy tap trick, and set the MAC address directly on the bridge: it is possible since quite some time it seems, even if then there is the problem of the bridge not being RUNNING when empty, contrary to what [2] says, so this will need fixing (and this fix only happened in 3.1, so it wouldn't work for 2.6.39)
While you can set the MAC address on a bridge, there are some problems with doing so. The MAC address you set on the bridge must match the MAC address of one of the enslaved interfaces or no traffic will flow. We can't use the MAC addr of any of the guest TAP devices, so we must have a dummy TAP device present in the bridge.
* Using the --interface option of dnsmasq, but I saw somewhere that it's not used by libvirt for backward compatibility. I am not sure this would solve this problem, though, as I don't know how dnsmasq binds itself to it with this option.
I don't know either.
This is why this patch does what's described earlier. I see radvd yelling quite often in the logs when the interface is NO-CARRIER, but it's ok, it keeps running.
Still, this patch is not exactly correct, as radvd get daemonized “too soon” most of the time (i.e. when not debugging libvirtd…) and probes the bridge once it has been set down (even if started before setting it down), thus failing as before (and libvirt lets it be like that: this would need some more checking, maybe). One /may/ introduce some delay between networkStartRadvd() and setting the dummy tap down to solve it, but it seemed too hackish to me. But I couldn't come with a better solution. I would welcome suggestions here.
Introducing delays is indeed hackish & potentially unreliable if starting a network on a highly loaded system. The only real way to avoid that race would be to leave the dummy tap device in the online state permanently AFAICT. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Le mercredi 20 juin 2012 à 11:10 +0100, Daniel P. Berrange a écrit :
On Tue, Jun 19, 2012 at 07:00:58PM +0200, Benjamin Cama wrote:
Other solutions that I envisioned were: * Keeping the *-nic interface up: this would waste an fd for each bridge during all its life. May be acceptable, I don't really know.
Is that because closing the TAP device FD makes it go into offline state ?
Yes.
In the great scheme of things, I don't think one extra file descriptor per network is too onerous - there would only ever be a handful of networks per host.
It depends on the usage. I am using libvirt to simulate networks, linking some bridges and VMs together, and it *could* grow quite large if one wanted to simulate big networks. For me, it's not about the memory cost of an fd, which is small compared to a VM, but about limits of the OS (1024 open fds by default; OK, it's still a quite large number).
* Stop using the dummy tap trick, and set the MAC address directly on the bridge: it is possible since quite some time it seems, even if then there is the problem of the bridge not being RUNNING when empty, contrary to what [2] says, so this will need fixing (and this fix only happened in 3.1, so it wouldn't work for 2.6.39)
While you can set the MAC address on a bridge, there are some problems with doing so. The MAC address you set on the bridge must match the MAC address of one of the enslaved interfaces or no traffic will flow. We can't use the MAC addr of any of the guest TAP devices, so we must have a dummy TAP device present in the bridge.
It looks like it has changed recently, according to this commit http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=4359... (introduced in 3.3) if I understand it well. […]
Still, this patch is not exactly correct, as radvd get daemonized “too soon” most of the time (i.e. when not debugging libvirtd…) and probes the bridge once it has been set down (even if started before setting it down), thus failing as before (and libvirt lets it be like that: this would need some more checking, maybe). One /may/ introduce some delay between networkStartRadvd() and setting the dummy tap down to solve it, but it seemed too hackish to me. But I couldn't come with a better solution. I would welcome suggestions here.
Introducing delays is indeed hackish & potentially unreliable if starting a network on a highly loaded system. The only real way to avoid that race would be to leave the dummy tap device in the online state permanently AFAICT.
Does the solution I posted earlier seem OK to avoid that? Regarding keeping the device operationally up, I don't really know if it's the best solution or not. I don't know if setting the bridge to a “disconnected” state when no VM is connected to it has a meaning. Regards, -- Benjamin Cama <benjamin.cama@telecom-bretagne.eu>

On Wed, Jun 20, 2012 at 02:48:55PM +0200, Benjamin Cama wrote:
Le mercredi 20 juin 2012 à 11:10 +0100, Daniel P. Berrange a écrit :
On Tue, Jun 19, 2012 at 07:00:58PM +0200, Benjamin Cama wrote:
Other solutions that I envisioned were: * Keeping the *-nic interface up: this would waste an fd for each bridge during all its life. May be acceptable, I don't really know.
Is that because closing the TAP device FD makes it go into offline state ?
Yes.
In the great scheme of things, I don't think one extra file descriptor per network is too onerous - there would only ever be a handful of networks per host.
It depends on the usage. I am using libvirt to simulate networks, linking some bridges and VMs together, and it *could* grow quite large if one wanted to simulate big networks. For me, it's not about the memory cost of an fd, which is small compared to a VM, but about limits of the OS (1024 open fds by default; OK, it's still a quite large number).
* Stop using the dummy tap trick, and set the MAC address directly on the bridge: it is possible since quite some time it seems, even if then there is the problem of the bridge not being RUNNING when empty, contrary to what [2] says, so this will need fixing (and this fix only happened in 3.1, so it wouldn't work for 2.6.39)
While you can set the MAC address on a bridge, there are some problems with doing so. The MAC address you set on the bridge must match the MAC address of one of the enslaved interfaces or no traffic will flow. We can't use the MAC addr of any of the guest TAP devices, so we must have a dummy TAP device present in the bridge.
It looks like it has changed recently, according to this commit http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=4359... (introduced in 3.3) if I understand it well.
Ok, that's a bit too recent for us to be able to rely on in libvirt.
[…]
Still, this patch is not exactly correct, as radvd get daemonized “too soon” most of the time (i.e. when not debugging libvirtd…) and probes the bridge once it has been set down (even if started before setting it down), thus failing as before (and libvirt lets it be like that: this would need some more checking, maybe). One /may/ introduce some delay between networkStartRadvd() and setting the dummy tap down to solve it, but it seemed too hackish to me. But I couldn't come with a better solution. I would welcome suggestions here.
Introducing delays is indeed hackish & potentially unreliable if starting a network on a highly loaded system. The only real way to avoid that race would be to leave the dummy tap device in the online state permanently AFAICT.
Does the solution I posted earlier seem OK to avoid that?
If you're happy that your latest patch is good enough, then so am I. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Le mercredi 20 juin 2012 à 14:05 +0100, Daniel P. Berrange a écrit :
On Wed, Jun 20, 2012 at 02:48:55PM +0200, Benjamin Cama wrote:
Does the solution I posted earlier seem OK to avoid that?
If you're happy that your latest patch is good enough, then so am I.
It's the least disturbing solution regarding how it's done today, and it works fine here. So it looks good enough to me. -- Benjamin Cama <benjamin.cama@telecom-bretagne.eu>
participants (4)
-
Benjamin Cama
-
Daniel P. Berrange
-
Eric Blake
-
Laine Stump