[libvirt] [PATCH 0/3] allow fixing tap-bridge connections when network is restarted

It's been a long-standing problem that when you stop and restart a libvirt network, the guest tap devices are no longer connected to the network. Until now the only way to recover from this was to either shutdown and restart all the affected guests, or to detach and then re-attach all affected guest network devices. These patches permit the situation to be remedied by just restarting libvirtd - when the hypervisor drivers are notifying the network drivers of each connection, the network can retrieve the current "master" for the tap device and compare it to the network's bridge device. If they don't match, then it disconnects the tap from any incorrect bridge and connects to the correct bridge. *** For now, that's as far as we go - *semi*-automated (since you need to restart libvirtd for it to happen). My intent is for this to be completely automated, but the logic to do that is a bit "convoluted" so I've deferred trying to implement it until I've given it more thought - a few different trains of thought have led to dead-ends, and so far only one seems reasonably doable: * add a networkStartCallback list to the network driver state object. * as each hypervisor driver that uses the network driver is initialized, it will call an internal-only function in the network driver to register a callback. * whenever a network is started, it will call all the registered callback functions (sending the name of the network being started, and some HV-specific context pointer). * Each hypervisor's callback function will look through all of its active domains for interfaces that are using the given network, and for each such interface, will call networkNotifyActualDevice() (the function that has been updated in Patch 3/3 to reconnect taps to bridges). The "convoluted" part here is that we need to make sure there is enough (but not too much!) locking/refcounting both when adding items to the callback list (which should only be done single threaded, since it happens during the driver initializations) and when the networkStart() function (in some state of lockedness/refcountedness) calls over to some e.g. qemu function that will then need to do some amount of locking (at least on each domain as it is processed) and then calls back into the network driver (which will need "some amount" of locking/refcounting). Since we're calling from network driver into qemu into network driver, while the normal nesting is just calling from qemu into network, I want to be sure there is no possibility for a deadlock. (Also, I want to avoid making the list of callbacks any more complicated than absolutely necessary - the "in" thing to do these days seems to be to allocate a hash table, but there's a lot of extra code required for that (see util/virclosecallbacks.[hc]) which seems like overkill for a list of maximum 4-5 items that will *never* change after the driver initialization - if anyone has ideas/opinions about that, please speak up. Laine Stump (2): util: new function virNetDevGetMaster() util: new function virNetDevTapAttachBridge() root (1): network: reconnect tap devices during networkNotifyActualDevice src/libvirt_private.syms | 2 + src/network/bridge_driver.c | 30 +++++++++++- src/util/virnetdev.c | 49 +++++++++++++++++++ src/util/virnetdev.h | 3 ++ src/util/virnetdevtap.c | 111 +++++++++++++++++++++++++++++--------------- src/util/virnetdevtap.h | 12 +++++ 6 files changed, 169 insertions(+), 38 deletions(-) -- 2.9.3

This function provides the bridge/bond device that the given network device is attached to. The return value is 0 or -1, and the master device is a char** argument to the function - this is needed in order to allow for a "success" return from a device that has no master. --- src/libvirt_private.syms | 1 + src/util/virnetdev.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virnetdev.h | 3 +++ 3 files changed, 53 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index c96fe2a..1d783b6 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1994,6 +1994,7 @@ virNetDevGetFeatures; virNetDevGetIndex; virNetDevGetLinkInfo; virNetDevGetMAC; +virNetDevGetMaster; virNetDevGetMTU; virNetDevGetName; virNetDevGetOnline; diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 0d19432..756dcdd 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -931,6 +931,55 @@ int virNetDevGetIndex(const char *ifname ATTRIBUTE_UNUSED, #endif /* ! SIOCGIFINDEX */ +#if defined(__linux__) && defined(HAVE_LIBNL) +/** + * virNetDevGetMaster: + * @ifname: name of interface we're interested in + * @master: used to return a string containing the name of @ifname's "master" + * (this is the bridge or bond device that this device is attached to) + * + * Returns 0 on success, -1 on failure (if @ifname has no master + * @master will be NULL, but return value will still be 0 (success)). + */ +int +virNetDevGetMaster(const char *ifname, char **master) +{ + int ret = -1; + void *nlData = NULL; + struct nlattr *tb[IFLA_MAX + 1] = {NULL, }; + + *master = NULL; + + if (virNetlinkDumpLink(ifname, -1, &nlData, tb, 0, 0) < 0) + goto cleanup; + + if (tb[IFLA_MASTER]) { + if (!(*master = virNetDevGetName(*(int *)RTA_DATA(tb[IFLA_MASTER])))) + goto cleanup; + } + + ret = 0; + cleanup: + VIR_FREE(nlData); + return ret; +} + + +#else + + +int +virNetDevGetMaster(const char *ifname, char **master) +{ + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Unable to get device master from netlink on this platform")); + return -1; +} + + +#endif /* defined(__linux__) && defined(HAVE_LIBNL) */ + + #if defined(SIOCGIFVLAN) && defined(HAVE_STRUCT_IFREQ) && HAVE_DECL_GET_VLAN_VID_CMD int virNetDevGetVLanID(const char *ifname, int *vlanid) { diff --git a/src/util/virnetdev.h b/src/util/virnetdev.h index 0551af6..9da5d85 100644 --- a/src/util/virnetdev.h +++ b/src/util/virnetdev.h @@ -165,6 +165,9 @@ int virNetDevGetIndex(const char *ifname, int *ifindex) int virNetDevGetVLanID(const char *ifname, int *vlanid) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_RETURN_CHECK; +int virNetDevGetMaster(const char *ifname, char **master) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_RETURN_CHECK; + int virNetDevValidateConfig(const char *ifname, const virMacAddr *macaddr, int ifindex) ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK; -- 2.9.3

This patch splits out the part of virNetDevTapCreateInBridgePort() that would need to be re-done if an existing tap device had to be re-attached to a bridge, and puts it into a separate function. This can be used both when an existing domain interface config is updated to change its connection, and also to re-attach to the "same" bridge when a network has been stopped and restarted. So far it is used for nothing. --- src/libvirt_private.syms | 1 + src/util/virnetdevtap.c | 111 +++++++++++++++++++++++++++++++---------------- src/util/virnetdevtap.h | 12 +++++ 3 files changed, 87 insertions(+), 37 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 1d783b6..905d55a 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2106,6 +2106,7 @@ virNetDevOpenvswitchSetTimeout; # util/virnetdevtap.h +virNetDevTapAttachBridge; virNetDevTapCreate; virNetDevTapCreateInBridgePort; virNetDevTapDelete; diff --git a/src/util/virnetdevtap.c b/src/util/virnetdevtap.c index 41e68f4..02ef7fd 100644 --- a/src/util/virnetdevtap.c +++ b/src/util/virnetdevtap.c @@ -505,6 +505,77 @@ int virNetDevTapDelete(const char *ifname ATTRIBUTE_UNUSED, /** + * virNetDevTapAttachBridge: + * @tapname: the tap interface name (or name template) + * @brname: the bridge name + * @macaddr: desired MAC address + * @virtPortProfile: bridge/port specific configuration + * @virtVlan: vlan tag info + * @mtu: requested MTU for port (or 0 for "default") + * @actualMTU: MTU actually set for port (after accounting for bridge's MTU) + * + * This attaches an existing tap device (@tapname) to a bridge + * (@brname). + * + * Returns 0 in case of success or -1 on failure + */ +int +virNetDevTapAttachBridge(const char *tapname, + const char *brname, + const virMacAddr *macaddr, + const unsigned char *vmuuid, + virNetDevVPortProfilePtr virtPortProfile, + virNetDevVlanPtr virtVlan, + unsigned int mtu, + unsigned int *actualMTU) +{ + /* If an MTU is specified for the new device, set it before + * attaching the device to the bridge, as it may affect the MTU of + * the bridge (in particular if it is the first device attached to + * the bridge, or if it is smaller than the current MTU of the + * bridge). If MTU isn't specified for the new device (i.e. 0), + * we need to set the interface MTU to the current MTU of the + * bridge (to avoid inadvertantly changing the bridge's MTU). + */ + if (mtu > 0) { + if (virNetDevSetMTU(tapname, mtu) < 0) + goto error; + } else { + if (virNetDevSetMTUFromDevice(tapname, brname) < 0) + goto error; + } + if (actualMTU) { + int retMTU = virNetDevGetMTU(tapname); + + if (retMTU < 0) + goto error; + + *actualMTU = retMTU; + } + + + if (virtPortProfile) { + if (virtPortProfile->virtPortType == VIR_NETDEV_VPORT_PROFILE_MIDONET) { + if (virNetDevMidonetBindPort(tapname, virtPortProfile) < 0) + goto error; + } else if (virtPortProfile->virtPortType == VIR_NETDEV_VPORT_PROFILE_OPENVSWITCH) { + if (virNetDevOpenvswitchAddPort(brname, tapname, macaddr, vmuuid, + virtPortProfile, virtVlan) < 0) + goto error; + } + } else { + if (virNetDevBridgeAddPort(brname, tapname) < 0) + goto error; + } + + return 0; + + error: + return -1; +} + + +/** * virNetDevTapCreateInBridgePort: * @brname: the bridge name * @ifname: the interface name (or name template) @@ -582,43 +653,9 @@ int virNetDevTapCreateInBridgePort(const char *brname, if (virNetDevSetMAC(*ifname, &tapmac) < 0) goto error; - /* If an MTU is specified for the new device, set it before - * attaching the device to the bridge, as it may affect the MTU of - * the bridge (in particular if it is the first device attached to - * the bridge, or if it is smaller than the current MTU of the - * bridge). If MTU isn't specified for the new device (i.e. 0), - * we need to set the interface MTU to the current MTU of the - * bridge (to avoid inadvertantly changing the bridge's MTU). - */ - if (mtu > 0) { - if (virNetDevSetMTU(*ifname, mtu) < 0) - goto error; - } else { - if (virNetDevSetMTUFromDevice(*ifname, brname) < 0) - goto error; - } - if (actualMTU) { - int retMTU = virNetDevGetMTU(*ifname); - - if (retMTU < 0) - goto error; - - *actualMTU = retMTU; - } - - - if (virtPortProfile) { - if (virtPortProfile->virtPortType == VIR_NETDEV_VPORT_PROFILE_MIDONET) { - if (virNetDevMidonetBindPort(*ifname, virtPortProfile) < 0) - goto error; - } else if (virtPortProfile->virtPortType == VIR_NETDEV_VPORT_PROFILE_OPENVSWITCH) { - if (virNetDevOpenvswitchAddPort(brname, *ifname, macaddr, vmuuid, - virtPortProfile, virtVlan) < 0) - goto error; - } - } else { - if (virNetDevBridgeAddPort(brname, *ifname) < 0) - goto error; + if (virNetDevTapAttachBridge(*ifname, brname, macaddr, vmuuid, + virtPortProfile, virtVlan, mtu, actualMTU) < 0) { + goto error; } if (virNetDevSetOnline(*ifname, !!(flags & VIR_NETDEV_TAP_CREATE_IFUP)) < 0) diff --git a/src/util/virnetdevtap.h b/src/util/virnetdevtap.h index 6441df4..6bb3b88 100644 --- a/src/util/virnetdevtap.h +++ b/src/util/virnetdevtap.h @@ -62,6 +62,18 @@ typedef enum { VIR_NETDEV_TAP_CREATE_PERSIST = 1 << 3, } virNetDevTapCreateFlags; +int +virNetDevTapAttachBridge(const char *tapname, + const char *brname, + const virMacAddr *macaddr, + const unsigned char *vmuuid, + virNetDevVPortProfilePtr virtPortProfile, + virNetDevVlanPtr virtVlan, + unsigned int mtu, + unsigned int *actualMTU) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) + ATTRIBUTE_RETURN_CHECK; + int virNetDevTapCreateInBridgePort(const char *brname, char **ifname, const virMacAddr *macaddr, -- 2.9.3

From: root <root@vhost2.laine.org> If a network is destroyed and restarted, or its bridge is changed, any tap devices that had previously been connected to the bridge will no longer be connected. As a first step in automating a reconnection of all tap devices when this happens, this patch modifies networkNotifyActualDevice() (which is called once for every <interface> of every active domain whenever libvirtd is restarted) to reconnect any tap devices that it finds disconnected. With this patch in place, you will need to restart libvirtd to reconnect all the taps to their proper bridges. A future patch will add a callback that hypervisor drivers can register with the network driver to that the network driver can trigger this behavior automatically whenever a network is started. --- src/network/bridge_driver.c | 30 +++++++++++++++++++++++++++++- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 32c5ab7..6f2d03a 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -4605,7 +4605,8 @@ networkAllocateActualDevice(virDomainDefPtr dom, * Called to notify the network driver when libvirtd is restarted and * finds an already running domain. If appropriate it will force an * allocation of the actual->direct.linkdev to get everything back in - * order. + * order, or re-attach the interface's tap device to the network's + * bridge. * * Returns 0 on success, -1 on failure. */ @@ -4620,6 +4621,7 @@ networkNotifyActualDevice(virDomainDefPtr dom, virNetworkForwardIfDefPtr dev = NULL; size_t i; int ret = -1; + char *master = NULL; if (iface->type != VIR_DOMAIN_NET_TYPE_NETWORK) return 0; @@ -4644,6 +4646,31 @@ networkNotifyActualDevice(virDomainDefPtr dom, netdef->bridge) < 0)) goto error; + /* see if we're connected to the correct bridge */ + if (netdef->bridge) { + if (virNetDevGetMaster(iface->ifname, &master) < 0) + goto error; + + if (STRNEQ_NULLABLE(netdef->bridge, master)) { + /* disconnect from current (incorrect) bridge */ + if (master) + ignore_value(virNetDevBridgeRemovePort(master, iface->ifname)); + + /* attach/reattach to correct bridge. + * NB: we can't notify the guest of any MTU change anyway, + * so there is no point in trying to learn the actualMTU + * (final arg to virNetDevTapAttachBridge()) + */ + if (virNetDevTapAttachBridge(iface->ifname, netdef->bridge, + &iface->mac, dom->uuid, + virDomainNetGetActualVirtPortProfile(iface), + virDomainNetGetActualVlan(iface), + iface->mtu, NULL) < 0) { + goto error; + } + } + } + if (!iface->data.network.actual || (actualType != VIR_DOMAIN_NET_TYPE_DIRECT && actualType != VIR_DOMAIN_NET_TYPE_HOSTDEV)) { @@ -4776,6 +4803,7 @@ networkNotifyActualDevice(virDomainDefPtr dom, ret = 0; cleanup: virNetworkObjEndAPI(&network); + VIR_FREE(master); return ret; error: -- 2.9.3

On 03/21/2017 04:03 PM, Laine Stump wrote:
It's been a long-standing problem that when you stop and restart a libvirt network, the guest tap devices are no longer connected to the network. Until now the only way to recover from this was to either shutdown and restart all the affected guests, or to detach and then re-attach all affected guest network devices.
These patches permit the situation to be remedied by just restarting libvirtd - when the hypervisor drivers are notifying the network drivers of each connection, the network can retrieve the current "master" for the tap device and compare it to the network's bridge device. If they don't match, then it disconnects the tap from any incorrect bridge and connects to the correct bridge.
***
For now, that's as far as we go - *semi*-automated (since you need to restart libvirtd for it to happen). My intent is for this to be completely automated, but the logic to do that is a bit "convoluted" so I've deferred trying to implement it until I've given it more thought - a few different trains of thought have led to dead-ends, and so far only one seems reasonably doable:
* add a networkStartCallback list to the network driver state object.
* as each hypervisor driver that uses the network driver is initialized, it will call an internal-only function in the network driver to register a callback.
* whenever a network is started, it will call all the registered callback functions (sending the name of the network being started, and some HV-specific context pointer).
* Each hypervisor's callback function will look through all of its active domains for interfaces that are using the given network, and for each such interface, will call networkNotifyActualDevice() (the function that has been updated in Patch 3/3 to reconnect taps to bridges).
The "convoluted" part here is that we need to make sure there is enough (but not too much!) locking/refcounting both when adding items to the callback list (which should only be done single threaded, since it happens during the driver initializations) and when the networkStart() function (in some state of lockedness/refcountedness) calls over to some e.g. qemu function that will then need to do some amount of locking (at least on each domain as it is processed) and then calls back into the network driver (which will need "some amount" of locking/refcounting). Since we're calling from network driver into qemu into network driver, while the normal nesting is just calling from qemu into network, I want to be sure there is no possibility for a deadlock. (Also, I want to avoid making the list of callbacks any more complicated than absolutely necessary - the "in" thing to do these days seems to be to allocate a hash table, but there's a lot of extra code required for that (see util/virclosecallbacks.[hc]) which seems like overkill for a list of maximum 4-5 items that will *never* change after the driver initialization - if anyone has ideas/opinions about that, please speak up.
Laine Stump (2): util: new function virNetDevGetMaster() util: new function virNetDevTapAttachBridge()
root (1): network: reconnect tap devices during networkNotifyActualDevice
You might want to reset the owner here ;-)
src/libvirt_private.syms | 2 + src/network/bridge_driver.c | 30 +++++++++++- src/util/virnetdev.c | 49 +++++++++++++++++++ src/util/virnetdev.h | 3 ++ src/util/virnetdevtap.c | 111 +++++++++++++++++++++++++++++--------------- src/util/virnetdevtap.h | 12 +++++ 6 files changed, 169 insertions(+), 38 deletions(-)
ACK series. Michal
participants (2)
-
Laine Stump
-
Michal Privoznik