[libvirt] [PATCH 0/2] Fix bridge creation/deletion on 2.6.x kernels

These two patches fix this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1252780 which was also separate reported in libvirt-users a couple days ago. Most people running CentOS6 or RHEL6 also run the downstream version of libvirt-0.10.2 that has many patches backported from later upstream releases, but not the patch described in the commit logs of these proposed patches. And the problem being fixed here only occurs on a 2.6.x kernel, while most people running upstream libvirt are running something much more modern. That's why not many people have seen the bug that these patches fix. Still, we claim to fully support upstream on RHEL6/CentOS6, so we'd better do it right! Two notes: 1) I tried timing 20 iterations of net-start...net-delete with these patches vs. a libvirt build that simply calls the ioctls in the beginnng (i.e. it doesn't try netlink then fall back to ioctl if that fails), and the time to completion was within 0.25%, so the extra overhead of trying one then the other isn't significant even on the older machines that will always fall back to ioctl. 2) There are 2 different versions of Patch 2/2. I did the first one as an exercise to see how ugly it would be to *not* duplicate the entire virNetlinkDelLink() function inside the new virNetDevBridgeDelete(). It was pretty ugly, so I made a cleaner version that duplicates the code instead. I would be okay pushing either version, but I think I slightly prefer the one that duplicates code and remains uncomplicated. Laine Stump (2): util: fallback to ioctl(SIOCBRADDBR) if netlink RTM_NEWLINK fails util: fallback to ioctl(SIOCBRDELBR) if netlink RTM_DELLINK fails src/util/virnetdevbridge.c | 112 ++++++++++++++++++++++++++++++++------------ src/util/virnetdevmacvlan.c | 2 +- src/util/virnetlink.c | 17 +++++-- src/util/virnetlink.h | 2 +- 4 files changed, 98 insertions(+), 35 deletions(-) -- 2.1.0

commit fc7b23db switched from using ioctl(SIOCBRADDBR) to using a netlink RTM_NEWLINK message with an IFLA_INFO_KIND of "bridge", which is the more modern way to create a bridge. However, although older kernels (e.g. 2.6.32, in RHEL6/CentOS6) support creating *some* link types with RTM_NEWLINK, they don't support creating bridges, and there is no compile-time way to figure this out (since the "type" isn't an enum, but rather a character string). This patch moves the body of the SIOCBRADDBR version of virNetDevBridgeCreate() into a static function, calls the new function from the original, and also calls the new function from the RTM_NEWLINK version if the RTM_NEWLINK message generates an EOPNOTSUPP error. This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1252780 --- src/util/virnetdevbridge.c | 64 ++++++++++++++++++++++++++++++++-------------- 1 file changed, 45 insertions(+), 19 deletions(-) diff --git a/src/util/virnetdevbridge.c b/src/util/virnetdevbridge.c index aa255d6..ae38901 100644 --- a/src/util/virnetdevbridge.c +++ b/src/util/virnetdevbridge.c @@ -394,8 +394,33 @@ virNetDevBridgePortSetUnicastFlood(const char *brname ATTRIBUTE_UNUSED, * * Returns 0 in case of success or -1 on failure */ +#if defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRADDBR) +static int +virNetDevBridgeCreateWithIoctl(const char *brname) +{ + int fd = -1; + int ret = -1; + + if ((fd = virNetDevSetupControl(NULL, NULL)) < 0) + return -1; + + if (ioctl(fd, SIOCBRADDBR, brname) < 0) { + virReportSystemError(errno, + _("Unable to create bridge %s"), brname); + goto cleanup; + } + + ret = 0; + + cleanup: + VIR_FORCE_CLOSE(fd); + return ret; +} +#endif + #if defined(__linux__) && defined(HAVE_LIBNL) -int virNetDevBridgeCreate(const char *brname) +int +virNetDevBridgeCreate(const char *brname) { /* use a netlink RTM_NEWLINK message to create the bridge */ const char *type = "bridge"; @@ -441,6 +466,17 @@ int virNetDevBridgeCreate(const char *brname) switch (err->error) { case 0: break; + case -EOPNOTSUPP: +# if defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRADDBR) + /* fallback to ioctl if netlink doesn't support creating + * bridges + */ + rc = virNetDevBridgeCreateWithIoctl(brname); + goto cleanup; +# endif + /* intentionally fall through if virNetDevBridgeCreateWithIoctl() + * isn't available. + */ default: virReportSystemError(-err->error, _("error creating bridge interface %s"), @@ -470,29 +506,19 @@ int virNetDevBridgeCreate(const char *brname) _("allocated netlink buffer is too small")); goto cleanup; } -#elif defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRADDBR) -int virNetDevBridgeCreate(const char *brname) -{ - int fd = -1; - int ret = -1; - if ((fd = virNetDevSetupControl(NULL, NULL)) < 0) - return -1; - if (ioctl(fd, SIOCBRADDBR, brname) < 0) { - virReportSystemError(errno, - _("Unable to create bridge %s"), brname); - goto cleanup; - } +#elif defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRADDBR) +int +virNetDevBridgeCreate(const char *brname) +{ + return virNetDevBridgeCreateWithIoctl(brname); +} - ret = 0; - cleanup: - VIR_FORCE_CLOSE(fd); - return ret; -} #elif defined(HAVE_STRUCT_IFREQ) && defined(SIOCIFCREATE2) -int virNetDevBridgeCreate(const char *brname) +int +virNetDevBridgeCreate(const char *brname) { int s; struct ifreq ifr; -- 2.1.0

On Tue, Aug 25, 2015 at 11:47:54PM -0400, Laine Stump wrote:
commit fc7b23db switched from using ioctl(SIOCBRADDBR) to using a netlink RTM_NEWLINK message with an IFLA_INFO_KIND of "bridge", which is the more modern way to create a bridge. However, although older kernels (e.g. 2.6.32, in RHEL6/CentOS6) support creating *some* link types with RTM_NEWLINK, they don't support creating bridges, and there is no compile-time way to figure this out (since the "type" isn't an enum, but rather a character string).
This patch moves the body of the SIOCBRADDBR version of virNetDevBridgeCreate() into a static function, calls the new function from the original, and also calls the new function from the RTM_NEWLINK version if the RTM_NEWLINK message generates an EOPNOTSUPP error.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1252780 --- src/util/virnetdevbridge.c | 64 ++++++++++++++++++++++++++++++++-------------- 1 file changed, 45 insertions(+), 19 deletions(-)
diff --git a/src/util/virnetdevbridge.c b/src/util/virnetdevbridge.c index aa255d6..ae38901 100644 --- a/src/util/virnetdevbridge.c +++ b/src/util/virnetdevbridge.c @@ -394,8 +394,33 @@ virNetDevBridgePortSetUnicastFlood(const char *brname ATTRIBUTE_UNUSED, * * Returns 0 in case of success or -1 on failure */ +#if defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRADDBR) +static int +virNetDevBridgeCreateWithIoctl(const char *brname) +{ + int fd = -1; + int ret = -1; + + if ((fd = virNetDevSetupControl(NULL, NULL)) < 0) + return -1; + + if (ioctl(fd, SIOCBRADDBR, brname) < 0) { + virReportSystemError(errno, + _("Unable to create bridge %s"), brname); + goto cleanup; + } + + ret = 0; + + cleanup: + VIR_FORCE_CLOSE(fd); + return ret; +} +#endif + #if defined(__linux__) && defined(HAVE_LIBNL) -int virNetDevBridgeCreate(const char *brname) +int +virNetDevBridgeCreate(const char *brname) { /* use a netlink RTM_NEWLINK message to create the bridge */ const char *type = "bridge"; @@ -441,6 +466,17 @@ int virNetDevBridgeCreate(const char *brname) switch (err->error) { case 0: break; + case -EOPNOTSUPP: +# if defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRADDBR)
The problem mentioned in te 2/2 (v3) doesn't exist here, so I think the second patch should really be changed. Anyway, would it make sense to also allow fallback to the version with SIOCIFCREATE2? ACK if the answer to the question is "No". Martin

On 26.08.2015 05:47, Laine Stump wrote:
commit fc7b23db switched from using ioctl(SIOCBRADDBR) to using a netlink RTM_NEWLINK message with an IFLA_INFO_KIND of "bridge", which is the more modern way to create a bridge. However, although older kernels (e.g. 2.6.32, in RHEL6/CentOS6) support creating *some* link types with RTM_NEWLINK, they don't support creating bridges, and there is no compile-time way to figure this out (since the "type" isn't an enum, but rather a character string).
This patch moves the body of the SIOCBRADDBR version of virNetDevBridgeCreate() into a static function, calls the new function from the original, and also calls the new function from the RTM_NEWLINK version if the RTM_NEWLINK message generates an EOPNOTSUPP error.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1252780 --- src/util/virnetdevbridge.c | 64 ++++++++++++++++++++++++++++++++-------------- 1 file changed, 45 insertions(+), 19 deletions(-)
ACK Michal

commit 09778e09 switched from using ioctl(SIOCBRDELBR) for bridge device deletion to using a netlink RTM_DELLINK message, which is the more modern way to delete a bridge (and also doesn't require the bridge to be ~IFF_UP to succeed). However, although older kernels (e.g. 2.6.32, in RHEL6/CentOS6) support deleting *some* link types with RTM_NEWLINK, they don't support deleting bridges, and there is no compile-time way to figure this out. This patch moves the body of the SIOCBRDELBR version of virNetDevBridgeDelete() into a static function, calls the new function from the original, and also calls the new function from the RTM_DELLINK version if the RTM_DELLINK message generates an EOPNOTSUPP error. (Ugly Hack Alert: because the RTM_DELLINK message is sent from a different function (virNetDevDelLink()), not directly from virNetDevBridgeDelete(), and the other caller of that function needs to *not* have special behavior on EOPNOTSUPP, virNetlinkDelLink() has an added "silentEOPNOTSUPP" argument, which is set to true when virNetlinkDelLink should be silent when encountering EOPNOTSUPP, and instead return -2 so that the caller can try to recover before logging an error. This was done to avoid duplicating the entire body of virNetlinkDelLink() inside virNetDevBridgeDelete(). I am also preparing a patch that does this duplication, but avoids the tricky special boolean and return value. If anyone objects in the slightest to this version, I will gladly use the other. Mostly I wrote this version to illustrate how ugly it would be to eliminate the duplicate code :-)) This resolves a similar issue to that reported in: https://bugzilla.redhat.com/show_bug.cgi?id=1252780 --- src/util/virnetdevbridge.c | 48 +++++++++++++++++++++++++++++++++++---------- src/util/virnetdevmacvlan.c | 2 +- src/util/virnetlink.c | 17 ++++++++++++---- src/util/virnetlink.h | 2 +- 4 files changed, 53 insertions(+), 16 deletions(-) diff --git a/src/util/virnetdevbridge.c b/src/util/virnetdevbridge.c index ae38901..19eaf7b 100644 --- a/src/util/virnetdevbridge.c +++ b/src/util/virnetdevbridge.c @@ -558,16 +558,9 @@ int virNetDevBridgeCreate(const char *brname) * * Returns 0 in case of success or an errno code in case of failure. */ -#if defined(__linux__) && defined(HAVE_LIBNL) -int virNetDevBridgeDelete(const char *brname) -{ - /* If netlink is available, use it, as it is successful at - * deleting a bridge even if it is currently IFF_UP. - */ - return virNetlinkDelLink(brname); -} -#elif defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR) -int virNetDevBridgeDelete(const char *brname) +#if defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR) +static int +virNetDevBridgeDeleteWithIoctl(const char *brname) { int fd = -1; int ret = -1; @@ -587,6 +580,41 @@ int virNetDevBridgeDelete(const char *brname) VIR_FORCE_CLOSE(fd); return ret; } +#endif + + +#if defined(__linux__) && defined(HAVE_LIBNL) +int +virNetDevBridgeDelete(const char *brname) +{ + /* If netlink is available, use it, as it is successful at + * deleting a bridge even if it is currently IFF_UP. If it fails + * with EOPNOTSUPP, itwill return a special "retry with something + * older" code, -2. + */ + int ret; + +# if defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR) + ret = virNetlinkDelLink(brname, true); + if (ret == -2) { + ignore_value(virNetDevSetOnline(brname, false)); + ret = virNetDevBridgeDeleteWithIoctl(brname); + } +# else + ret = virNetlinkDelLink(brname, false); +# endif + return ret; +} + + +#elif defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR) +int +virNetDevBridgeDelete(const char *brname) +{ + return virNetDevBridgeDeleteWithIoctl(brname); +} + + #elif defined(HAVE_STRUCT_IFREQ) && defined(SIOCIFDESTROY) int virNetDevBridgeDelete(const char *brname) { diff --git a/src/util/virnetdevmacvlan.c b/src/util/virnetdevmacvlan.c index 213b8eb..48403b3 100644 --- a/src/util/virnetdevmacvlan.c +++ b/src/util/virnetdevmacvlan.c @@ -220,7 +220,7 @@ virNetDevMacVLanCreate(const char *ifname, */ int virNetDevMacVLanDelete(const char *ifname) { - return virNetlinkDelLink(ifname); + return virNetlinkDelLink(ifname, false); } diff --git a/src/util/virnetlink.c b/src/util/virnetlink.c index 0052ef9..ae7b9c8 100644 --- a/src/util/virnetlink.c +++ b/src/util/virnetlink.c @@ -281,15 +281,17 @@ int virNetlinkCommand(struct nl_msg *nl_msg, * virNetlinkDelLink: * * @ifname: Name of the link + * @silentEOPNOTSUPP * * delete a network "link" (aka interface aka device) with the given * name. This works for many different types of network devices, * including macvtap and bridges. * - * Returns 0 on success, -1 on fatal error. + * Returns 0 on success, -1 on fatal error, -2 if EOPNOTSUPP (meaning + * the caller will need to retry the operation with an alternate method) */ int -virNetlinkDelLink(const char *ifname) +virNetlinkDelLink(const char *ifname, bool silentEOPNOTSUPP) { int rc = -1; struct nlmsghdr *resp = NULL; @@ -324,7 +326,13 @@ virNetlinkDelLink(const char *ifname) err = (struct nlmsgerr *)NLMSG_DATA(resp); if (resp->nlmsg_len < NLMSG_LENGTH(sizeof(*err))) goto malformed_resp; - + if (err->error == -EOPNOTSUPP && silentEOPNOTSUPP) { + /* give the caller a chance to recover without + * logging an error. + */ + rc = -2; + goto cleanup; + } if (err->error) { virReportSystemError(-err->error, _("error destroying network device %s"), @@ -886,7 +894,8 @@ int virNetlinkCommand(struct nl_msg *nl_msg ATTRIBUTE_UNUSED, int -virNetlinkDelLink(const char *ifname ATTRIBUTE_UNUSED) +virNetlinkDelLink(const char *ifname ATTRIBUTE_UNUSED, + bool silentEOPNOTSUPP ATTRIBUTE_UNUSED) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _(unsupported)); return -1; diff --git a/src/util/virnetlink.h b/src/util/virnetlink.h index 06c3cd0..bf9f914 100644 --- a/src/util/virnetlink.h +++ b/src/util/virnetlink.h @@ -51,7 +51,7 @@ int virNetlinkCommand(struct nl_msg *nl_msg, struct nlmsghdr **resp, unsigned int *respbuflen, uint32_t src_pid, uint32_t dst_pid, unsigned int protocol, unsigned int groups); -int virNetlinkDelLink(const char *ifname); +int virNetlinkDelLink(const char *ifname, bool silentEOPNOTSUPP); int virNetlinkGetErrorCode(struct nlmsghdr *resp, unsigned int recvbuflen); -- 2.1.0

commit 09778e09 switched from using ioctl(SIOCBRDELBR) for bridge device deletion to using a netlink RTM_DELLINK message, which is the more modern way to delete a bridge (and also doesn't require the bridge to be ~IFF_UP to succeed). However, although older kernels (e.g. 2.6.32, in RHEL6/CentOS6) support deleting *some* link types with RTM_NEWLINK, they don't support deleting bridges, and there is no compile-time way to figure this out. This patch moves the body of the SIOCBRDELBR version of virNetDevBridgeDelete() into a static function, calls the new function from the original, and also calls the new function from the RTM_DELLINK version if the RTM_DELLINK message generates an EOPNOTSUPP error (this version of the function had to be copied essentially verbatim from virNetlinkDelLink() rather than just calling virNetlinkDelLink() in order to avoid logging the "Operation Not Supported" error prior to retrying with SIOCBRDELBR) This resolves a similar issue to that reported in: https://bugzilla.redhat.com/show_bug.cgi?id=1252780 --- Note: this is the version that is simpler, but duplicates an entire function src/util/virnetdevbridge.c | 114 ++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 103 insertions(+), 11 deletions(-) diff --git a/src/util/virnetdevbridge.c b/src/util/virnetdevbridge.c index ae38901..779b068 100644 --- a/src/util/virnetdevbridge.c +++ b/src/util/virnetdevbridge.c @@ -558,16 +558,9 @@ int virNetDevBridgeCreate(const char *brname) * * Returns 0 in case of success or an errno code in case of failure. */ -#if defined(__linux__) && defined(HAVE_LIBNL) -int virNetDevBridgeDelete(const char *brname) -{ - /* If netlink is available, use it, as it is successful at - * deleting a bridge even if it is currently IFF_UP. - */ - return virNetlinkDelLink(brname); -} -#elif defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR) -int virNetDevBridgeDelete(const char *brname) +#if defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR) +static int +virNetDevBridgeDeleteWithIoctl(const char *brname) { int fd = -1; int ret = -1; @@ -587,8 +580,107 @@ int virNetDevBridgeDelete(const char *brname) VIR_FORCE_CLOSE(fd); return ret; } +#endif + + +#if defined(__linux__) && defined(HAVE_LIBNL) +int +virNetDevBridgeDelete(const char *brname) +{ + /* If netlink is available, use it, as it is successful at + * deleting a bridge even if it is currently IFF_UP. + */ + int rc = -1; + struct nlmsghdr *resp = NULL; + struct nlmsgerr *err; + struct ifinfomsg ifinfo = { .ifi_family = AF_UNSPEC }; + unsigned int recvbuflen; + struct nl_msg *nl_msg; + + nl_msg = nlmsg_alloc_simple(RTM_DELLINK, + NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL); + if (!nl_msg) { + virReportOOMError(); + return -1; + } + + if (nlmsg_append(nl_msg, &ifinfo, sizeof(ifinfo), NLMSG_ALIGNTO) < 0) + goto buffer_too_small; + + if (nla_put(nl_msg, IFLA_IFNAME, strlen(brname)+1, brname) < 0) + goto buffer_too_small; + + if (virNetlinkCommand(nl_msg, &resp, &recvbuflen, 0, 0, + NETLINK_ROUTE, 0) < 0) { + goto cleanup; + } + + if (recvbuflen < NLMSG_LENGTH(0) || resp == NULL) + goto malformed_resp; + + switch (resp->nlmsg_type) { + case NLMSG_ERROR: + err = (struct nlmsgerr *)NLMSG_DATA(resp); + if (resp->nlmsg_len < NLMSG_LENGTH(sizeof(*err))) + goto malformed_resp; + + switch (err->error) { + case 0: + break; + case -EOPNOTSUPP: +# if defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR) + /* fallback to ioctl if netlink doesn't support deleting + * bridges + */ + ignore_value(virNetDevSetOnline(brname, false)); + rc = virNetDevBridgeDeleteWithIoctl(brname); + goto cleanup; +# endif + /* intentionally fall through if virNetDevBridgeDeleteWithIoctl() + * isn't available. + */ + default: + virReportSystemError(-err->error, + _("error destroying bridge interface %s"), + brname); + goto cleanup; + } + break; + + case NLMSG_DONE: + break; + default: + goto malformed_resp; + } + + rc = 0; + cleanup: + nlmsg_free(nl_msg); + VIR_FREE(resp); + return rc; + + malformed_resp: + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("malformed netlink response message")); + goto cleanup; + buffer_too_small: + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("allocated netlink buffer is too small")); + goto cleanup; +} + + +#elif defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR) +int +virNetDevBridgeDelete(const char *brname) +{ + return virNetDevBridgeDeleteWithIoctl(brname); +} + + #elif defined(HAVE_STRUCT_IFREQ) && defined(SIOCIFDESTROY) -int virNetDevBridgeDelete(const char *brname) +int +virNetDevBridgeDelete(const char *brname) { int s; struct ifreq ifr; -- 2.1.0

On Tue, Aug 25, 2015 at 11:47:53PM -0400, Laine Stump wrote:
These two patches fix this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1252780
which was also separate reported in libvirt-users a couple days ago.
Most people running CentOS6 or RHEL6 also run the downstream version of libvirt-0.10.2 that has many patches backported from later upstream releases, but not the patch described in the commit logs of these proposed patches. And the problem being fixed here only occurs on a 2.6.x kernel, while most people running upstream libvirt are running something much more modern. That's why not many people have seen the bug that these patches fix. Still, we claim to fully support upstream on RHEL6/CentOS6, so we'd better do it right!
Two notes:
1) I tried timing 20 iterations of net-start...net-delete with these patches vs. a libvirt build that simply calls the ioctls in the beginnng (i.e. it doesn't try netlink then fall back to ioctl if that fails), and the time to completion was within 0.25%, so the extra overhead of trying one then the other isn't significant even on the older machines that will always fall back to ioctl.
2) There are 2 different versions of Patch 2/2. I did the first one as an exercise to see how ugly it would be to *not* duplicate the entire virNetlinkDelLink() function inside the new virNetDevBridgeDelete(). It was pretty ugly, so I made a cleaner version that duplicates the code instead. I would be okay pushing either version, but I think I slightly prefer the one that duplicates code and remains uncomplicated.
To be honest, I dislike both approaches. Sure, the second one is not that messy, but looking at those patches I have another idea. At first I thought that since there are not that many callers of virNetlinkDelLink(), what if we did not report the error inside that function at all and just return the errno and callers would report errors themselves. But that's not good either, virNetlinkDelLink() has very nice error reporting. So what if we took the second version, but the code that's duplicated (e.g. preparing netlink messages) that can fail by itself could be moved into its own functions that would be called in both virNetlinkDelLink() and virNetDevBridgeDelete() so that the duplication would be minimal and more prone to differentiation in the future.
Laine Stump (2): util: fallback to ioctl(SIOCBRADDBR) if netlink RTM_NEWLINK fails util: fallback to ioctl(SIOCBRDELBR) if netlink RTM_DELLINK fails
src/util/virnetdevbridge.c | 112 ++++++++++++++++++++++++++++++++------------ src/util/virnetdevmacvlan.c | 2 +- src/util/virnetlink.c | 17 +++++-- src/util/virnetlink.h | 2 +- 4 files changed, 98 insertions(+), 35 deletions(-)
-- 2.1.0
-- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 08/26/2015 05:01 AM, Martin Kletzander wrote:
On Tue, Aug 25, 2015 at 11:47:53PM -0400, Laine Stump wrote:
These two patches fix this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1252780
which was also separate reported in libvirt-users a couple days ago.
Most people running CentOS6 or RHEL6 also run the downstream version of libvirt-0.10.2 that has many patches backported from later upstream releases, but not the patch described in the commit logs of these proposed patches. And the problem being fixed here only occurs on a 2.6.x kernel, while most people running upstream libvirt are running something much more modern. That's why not many people have seen the bug that these patches fix. Still, we claim to fully support upstream on RHEL6/CentOS6, so we'd better do it right!
Two notes:
1) I tried timing 20 iterations of net-start...net-delete with these patches vs. a libvirt build that simply calls the ioctls in the beginnng (i.e. it doesn't try netlink then fall back to ioctl if that fails), and the time to completion was within 0.25%, so the extra overhead of trying one then the other isn't significant even on the older machines that will always fall back to ioctl.
2) There are 2 different versions of Patch 2/2. I did the first one as an exercise to see how ugly it would be to *not* duplicate the entire virNetlinkDelLink() function inside the new virNetDevBridgeDelete(). It was pretty ugly, so I made a cleaner version that duplicates the code instead. I would be okay pushing either version, but I think I slightly prefer the one that duplicates code and remains uncomplicated.
To be honest, I dislike both approaches. Sure, the second one is not that messy, but looking at those patches I have another idea. At first I thought that since there are not that many callers of virNetlinkDelLink(), what if we did not report the error inside that function at all and just return the errno and callers would report errors themselves. But that's not good either, virNetlinkDelLink() has very nice error reporting.
Right - there is too much inormation that would be lost by leaving error reporting up to the caller, and that leads to bug reports of the type "This is broken." rather than "This has a problem with XYZ".
So what if we took the second version, but the code that's duplicated (e.g. preparing netlink messages) that can fail by itself could be moved into its own functions
The "s" on the end of that is the crucial part - it would take two functions, one for the part before checking the error code, and one for the part after. How about this: give virNetlinkDelLink() a new argument "fallbackDeleteFunc" which would either be NULL, or a pointer to virNetDevBridgeDeleteWithIoctl() (in this case)? That avoids needing to split virNetlinkDelLink into multiple parts as well as the duplication of code, and I think it's much less obtuse than the boolean thing of V1.

On 26.08.2015 17:53, Laine Stump wrote:
<snip/>
How about this: give virNetlinkDelLink() a new argument "fallbackDeleteFunc" which would either be NULL, or a pointer to virNetDevBridgeDeleteWithIoctl() (in this case)? That avoids needing to split virNetlinkDelLink into multiple parts as well as the duplication of code, and I think it's much less obtuse than the boolean thing of V1.
D'oh! I mean, if someone calls virNetlink*() he can assume that the function will talk to kernel via netlink. This would break the assumption. But I because I am unable to come up with a better idea, I think it's our best option. Michal

On Wed, Aug 26, 2015 at 07:22:10PM +0200, Michal Privoznik wrote:
On 26.08.2015 17:53, Laine Stump wrote:
<snip/>
How about this: give virNetlinkDelLink() a new argument "fallbackDeleteFunc" which would either be NULL, or a pointer to virNetDevBridgeDeleteWithIoctl() (in this case)? That avoids needing to split virNetlinkDelLink into multiple parts as well as the duplication of code, and I think it's much less obtuse than the boolean thing of V1.
D'oh! I mean, if someone calls virNetlink*() he can assume that the function will talk to kernel via netlink. This would break the assumption. But I because I am unable to come up with a better idea, I think it's our best option.
As hacky as the previous one, but at least clean and understandable, in my opinion.

commit 09778e09 switched from using ioctl(SIOCBRDELBR) for bridge device deletion to using a netlink RTM_DELLINK message, which is the more modern way to delete a bridge (and also doesn't require the bridge to be ~IFF_UP to succeed). However, although older kernels (e.g. 2.6.32, in RHEL6/CentOS6) support deleting *some* link types with RTM_NEWLINK, they don't support deleting bridges, and there is no compile-time way to figure this out. This patch moves the body of the SIOCBRDELBR version of virNetDevBridgeDelete() into a static function, calls the new function from the original, and also calls the new function from the RTM_DELLINK version if the RTM_DELLINK message generates an EOPNOTSUPP error. Since RTM_DELLINK is done from the subordinate function virNetlinkDelLink, which is also called for other purposes (deleting a macvtap interface), a function pointer called "fallback" has been added to the arglist of virNetlinkDelLink() - if that arg != NULL, the provided function will be called when (and only when) RTM_DELLINK fails with EOPNOTSUPP. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1252780 (part 2) --- Another idea for doing this I came up with during review of the others... src/util/virnetdevbridge.c | 41 ++++++++++++++++++++++++++++++----------- src/util/virnetdevmacvlan.c | 2 +- src/util/virnetlink.c | 13 +++++++++++-- src/util/virnetlink.h | 5 ++++- 4 files changed, 46 insertions(+), 15 deletions(-) diff --git a/src/util/virnetdevbridge.c b/src/util/virnetdevbridge.c index ae38901..3b06829 100644 --- a/src/util/virnetdevbridge.c +++ b/src/util/virnetdevbridge.c @@ -558,20 +558,15 @@ int virNetDevBridgeCreate(const char *brname) * * Returns 0 in case of success or an errno code in case of failure. */ -#if defined(__linux__) && defined(HAVE_LIBNL) -int virNetDevBridgeDelete(const char *brname) -{ - /* If netlink is available, use it, as it is successful at - * deleting a bridge even if it is currently IFF_UP. - */ - return virNetlinkDelLink(brname); -} -#elif defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR) -int virNetDevBridgeDelete(const char *brname) +#if defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR) +static int +virNetDevBridgeDeleteWithIoctl(const char *brname) { int fd = -1; int ret = -1; + ignore_value(virNetDevSetOnline(brname, false)); + if ((fd = virNetDevSetupControl(NULL, NULL)) < 0) return -1; @@ -587,8 +582,32 @@ int virNetDevBridgeDelete(const char *brname) VIR_FORCE_CLOSE(fd); return ret; } +#endif + + +#if defined(__linux__) && defined(HAVE_LIBNL) +int +virNetDevBridgeDelete(const char *brname) +{ + /* If netlink is available, use it, as it is successful at + * deleting a bridge even if it is currently IFF_UP. fallback to + * using ioctl(SIOCBRDELBR) if netlink fails with EOPNOTSUPP. + */ + return virNetlinkDelLink(brname, virNetDevBridgeDeleteWithIoctl); +} + + +#elif defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR) +int +virNetDevBridgeDelete(const char *brname) +{ + return virNetDevBridgeDeleteWithIoctl(brname); +} + + #elif defined(HAVE_STRUCT_IFREQ) && defined(SIOCIFDESTROY) -int virNetDevBridgeDelete(const char *brname) +int +virNetDevBridgeDelete(const char *brname) { int s; struct ifreq ifr; diff --git a/src/util/virnetdevmacvlan.c b/src/util/virnetdevmacvlan.c index 9c4da1f..954736a 100644 --- a/src/util/virnetdevmacvlan.c +++ b/src/util/virnetdevmacvlan.c @@ -220,7 +220,7 @@ virNetDevMacVLanCreate(const char *ifname, */ int virNetDevMacVLanDelete(const char *ifname) { - return virNetlinkDelLink(ifname); + return virNetlinkDelLink(ifname, NULL); } diff --git a/src/util/virnetlink.c b/src/util/virnetlink.c index 0052ef9..0276522 100644 --- a/src/util/virnetlink.c +++ b/src/util/virnetlink.c @@ -281,6 +281,10 @@ int virNetlinkCommand(struct nl_msg *nl_msg, * virNetlinkDelLink: * * @ifname: Name of the link + * @fallback: pointer to an alternate function that will + * be called to perform the delete if RTM_DELLINK fails + * with EOPNOTSUPP (any other error will simply be treated + * as an error). * * delete a network "link" (aka interface aka device) with the given * name. This works for many different types of network devices, @@ -289,7 +293,7 @@ int virNetlinkCommand(struct nl_msg *nl_msg, * Returns 0 on success, -1 on fatal error. */ int -virNetlinkDelLink(const char *ifname) +virNetlinkDelLink(const char *ifname, virNetlinkDelLinkFallback fallback) { int rc = -1; struct nlmsghdr *resp = NULL; @@ -325,6 +329,10 @@ virNetlinkDelLink(const char *ifname) if (resp->nlmsg_len < NLMSG_LENGTH(sizeof(*err))) goto malformed_resp; + if (-err->error == EOPNOTSUPP && fallback) { + rc = fallback(ifname); + goto cleanup; + } if (err->error) { virReportSystemError(-err->error, _("error destroying network device %s"), @@ -886,7 +894,8 @@ int virNetlinkCommand(struct nl_msg *nl_msg ATTRIBUTE_UNUSED, int -virNetlinkDelLink(const char *ifname ATTRIBUTE_UNUSED) +virNetlinkDelLink(const char *ifname ATTRIBUTE_UNUSED, + virNetlinkDelLinkFallback fallback ATTRIBUTE_UNUSED) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _(unsupported)); return -1; diff --git a/src/util/virnetlink.h b/src/util/virnetlink.h index 06c3cd0..0664a7a 100644 --- a/src/util/virnetlink.h +++ b/src/util/virnetlink.h @@ -51,7 +51,10 @@ int virNetlinkCommand(struct nl_msg *nl_msg, struct nlmsghdr **resp, unsigned int *respbuflen, uint32_t src_pid, uint32_t dst_pid, unsigned int protocol, unsigned int groups); -int virNetlinkDelLink(const char *ifname); + +typedef int (*virNetlinkDelLinkFallback)(const char *ifname); + +int virNetlinkDelLink(const char *ifname, virNetlinkDelLinkFallback fallback); int virNetlinkGetErrorCode(struct nlmsghdr *resp, unsigned int recvbuflen); -- 2.1.0

On 26.08.2015 21:05, Laine Stump wrote:
commit 09778e09 switched from using ioctl(SIOCBRDELBR) for bridge device deletion to using a netlink RTM_DELLINK message, which is the more modern way to delete a bridge (and also doesn't require the bridge to be ~IFF_UP to succeed). However, although older kernels (e.g. 2.6.32, in RHEL6/CentOS6) support deleting *some* link types with RTM_NEWLINK, they don't support deleting bridges, and there is no compile-time way to figure this out.
This patch moves the body of the SIOCBRDELBR version of virNetDevBridgeDelete() into a static function, calls the new function from the original, and also calls the new function from the RTM_DELLINK version if the RTM_DELLINK message generates an EOPNOTSUPP error. Since RTM_DELLINK is done from the subordinate function virNetlinkDelLink, which is also called for other purposes (deleting a macvtap interface), a function pointer called "fallback" has been added to the arglist of virNetlinkDelLink() - if that arg != NULL, the provided function will be called when (and only when) RTM_DELLINK fails with EOPNOTSUPP.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1252780 (part 2) --- Another idea for doing this I came up with during review of the others...
src/util/virnetdevbridge.c | 41 ++++++++++++++++++++++++++++++----------- src/util/virnetdevmacvlan.c | 2 +- src/util/virnetlink.c | 13 +++++++++++-- src/util/virnetlink.h | 5 ++++- 4 files changed, 46 insertions(+), 15 deletions(-)
ACK. But don't forget that the counterpart is broken too. Once you destroy network, on RHEL-6, you can't start it up, because of the same problem. Michal

On 08/26/2015 09:24 PM, Michal Privoznik wrote:
commit 09778e09 switched from using ioctl(SIOCBRDELBR) for bridge device deletion to using a netlink RTM_DELLINK message, which is the more modern way to delete a bridge (and also doesn't require the bridge to be ~IFF_UP to succeed). However, although older kernels (e.g. 2.6.32, in RHEL6/CentOS6) support deleting *some* link types with RTM_NEWLINK, they don't support deleting bridges, and there is no compile-time way to figure this out.
This patch moves the body of the SIOCBRDELBR version of virNetDevBridgeDelete() into a static function, calls the new function from the original, and also calls the new function from the RTM_DELLINK version if the RTM_DELLINK message generates an EOPNOTSUPP error. Since RTM_DELLINK is done from the subordinate function virNetlinkDelLink, which is also called for other purposes (deleting a macvtap interface), a function pointer called "fallback" has been added to the arglist of virNetlinkDelLink() - if that arg != NULL, the provided function will be called when (and only when) RTM_DELLINK fails with EOPNOTSUPP.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1252780 (part 2) --- Another idea for doing this I came up with during review of the others...
src/util/virnetdevbridge.c | 41 ++++++++++++++++++++++++++++++----------- src/util/virnetdevmacvlan.c | 2 +- src/util/virnetlink.c | 13 +++++++++++-- src/util/virnetlink.h | 5 ++++- 4 files changed, 46 insertions(+), 15 deletions(-) ACK. But don't forget that the counterpart is broken too. Once you destroy network, on RHEL-6, you can't start it up, because of the same
On 26.08.2015 21:05, Laine Stump wrote: problem.
Yeah, that was patch 1/2 (which hasn't been reviewed yet (hint hint :-)

On Wed, Aug 26, 2015 at 03:05:25PM -0400, Laine Stump wrote:
commit 09778e09 switched from using ioctl(SIOCBRDELBR) for bridge device deletion to using a netlink RTM_DELLINK message, which is the more modern way to delete a bridge (and also doesn't require the bridge to be ~IFF_UP to succeed). However, although older kernels (e.g. 2.6.32, in RHEL6/CentOS6) support deleting *some* link types with RTM_NEWLINK, they don't support deleting bridges, and there is no compile-time way to figure this out.
This patch moves the body of the SIOCBRDELBR version of virNetDevBridgeDelete() into a static function, calls the new function from the original, and also calls the new function from the RTM_DELLINK version if the RTM_DELLINK message generates an EOPNOTSUPP error. Since RTM_DELLINK is done from the subordinate function virNetlinkDelLink, which is also called for other purposes (deleting a macvtap interface), a function pointer called "fallback" has been added to the arglist of virNetlinkDelLink() - if that arg != NULL, the provided function will be called when (and only when) RTM_DELLINK fails with EOPNOTSUPP.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1252780 (part 2) --- Another idea for doing this I came up with during review of the others...
src/util/virnetdevbridge.c | 41 ++++++++++++++++++++++++++++++----------- src/util/virnetdevmacvlan.c | 2 +- src/util/virnetlink.c | 13 +++++++++++-- src/util/virnetlink.h | 5 ++++- 4 files changed, 46 insertions(+), 15 deletions(-)
diff --git a/src/util/virnetdevbridge.c b/src/util/virnetdevbridge.c index ae38901..3b06829 100644 --- a/src/util/virnetdevbridge.c +++ b/src/util/virnetdevbridge.c @@ -558,20 +558,15 @@ int virNetDevBridgeCreate(const char *brname) * * Returns 0 in case of success or an errno code in case of failure. */ -#if defined(__linux__) && defined(HAVE_LIBNL) -int virNetDevBridgeDelete(const char *brname) -{ - /* If netlink is available, use it, as it is successful at - * deleting a bridge even if it is currently IFF_UP. - */ - return virNetlinkDelLink(brname); -} -#elif defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR) -int virNetDevBridgeDelete(const char *brname) +#if defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR) +static int +virNetDevBridgeDeleteWithIoctl(const char *brname) { int fd = -1; int ret = -1;
+ ignore_value(virNetDevSetOnline(brname, false)); + if ((fd = virNetDevSetupControl(NULL, NULL)) < 0) return -1;
@@ -587,8 +582,32 @@ int virNetDevBridgeDelete(const char *brname) VIR_FORCE_CLOSE(fd); return ret; } +#endif + + +#if defined(__linux__) && defined(HAVE_LIBNL) +int +virNetDevBridgeDelete(const char *brname) +{ + /* If netlink is available, use it, as it is successful at + * deleting a bridge even if it is currently IFF_UP. fallback to + * using ioctl(SIOCBRDELBR) if netlink fails with EOPNOTSUPP. + */ + return virNetlinkDelLink(brname, virNetDevBridgeDeleteWithIoctl);
You're passing DeleteWithIoctl here, but that was defined only if defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR), does it mean that if you have libnl on linux, you also have those two? If not, then I suggest adding check for HAVE_STRUCT_IFREQ && SIOCBRDELBR definitions atop this function definition as well, just to make sure we don't run into similar problems later on with some weird distribution/custom builds/versions. No need to resend, this is just a hint before pushing. Martin

On 08/27/2015 06:07 AM, Martin Kletzander wrote:
On Wed, Aug 26, 2015 at 03:05:25PM -0400, Laine Stump wrote:
commit 09778e09 switched from using ioctl(SIOCBRDELBR) for bridge device deletion to using a netlink RTM_DELLINK message, which is the more modern way to delete a bridge (and also doesn't require the bridge to be ~IFF_UP to succeed). However, although older kernels (e.g. 2.6.32, in RHEL6/CentOS6) support deleting *some* link types with RTM_NEWLINK, they don't support deleting bridges, and there is no compile-time way to figure this out.
This patch moves the body of the SIOCBRDELBR version of virNetDevBridgeDelete() into a static function, calls the new function from the original, and also calls the new function from the RTM_DELLINK version if the RTM_DELLINK message generates an EOPNOTSUPP error. Since RTM_DELLINK is done from the subordinate function virNetlinkDelLink, which is also called for other purposes (deleting a macvtap interface), a function pointer called "fallback" has been added to the arglist of virNetlinkDelLink() - if that arg != NULL, the provided function will be called when (and only when) RTM_DELLINK fails with EOPNOTSUPP.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1252780 (part 2) --- Another idea for doing this I came up with during review of the others...
src/util/virnetdevbridge.c | 41 ++++++++++++++++++++++++++++++----------- src/util/virnetdevmacvlan.c | 2 +- src/util/virnetlink.c | 13 +++++++++++-- src/util/virnetlink.h | 5 ++++- 4 files changed, 46 insertions(+), 15 deletions(-)
diff --git a/src/util/virnetdevbridge.c b/src/util/virnetdevbridge.c index ae38901..3b06829 100644 --- a/src/util/virnetdevbridge.c +++ b/src/util/virnetdevbridge.c @@ -558,20 +558,15 @@ int virNetDevBridgeCreate(const char *brname) * * Returns 0 in case of success or an errno code in case of failure. */ -#if defined(__linux__) && defined(HAVE_LIBNL) -int virNetDevBridgeDelete(const char *brname) -{ - /* If netlink is available, use it, as it is successful at - * deleting a bridge even if it is currently IFF_UP. - */ - return virNetlinkDelLink(brname); -} -#elif defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR) -int virNetDevBridgeDelete(const char *brname) +#if defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR) +static int +virNetDevBridgeDeleteWithIoctl(const char *brname) { int fd = -1; int ret = -1;
+ ignore_value(virNetDevSetOnline(brname, false)); + if ((fd = virNetDevSetupControl(NULL, NULL)) < 0) return -1;
@@ -587,8 +582,32 @@ int virNetDevBridgeDelete(const char *brname) VIR_FORCE_CLOSE(fd); return ret; } +#endif + + +#if defined(__linux__) && defined(HAVE_LIBNL) +int +virNetDevBridgeDelete(const char *brname) +{ + /* If netlink is available, use it, as it is successful at + * deleting a bridge even if it is currently IFF_UP. fallback to + * using ioctl(SIOCBRDELBR) if netlink fails with EOPNOTSUPP. + */ + return virNetlinkDelLink(brname, virNetDevBridgeDeleteWithIoctl);
You're passing DeleteWithIoctl here, but that was defined only if defined(HAVE_STRUCT_IFREQ) && defined(SIOCBRDELBR), does it mean that if you have libnl on linux, you also have those two? If not, then I suggest adding check for HAVE_STRUCT_IFREQ && SIOCBRDELBR definitions atop this function definition as well, just to make sure we don't run into similar problems later on with some weird distribution/custom builds/versions. No need to resend, this is just a hint before pushing.
Yeah, thanks for pointing that out. I had that check in v2, but forgot it here.
participants (3)
-
Laine Stump
-
Martin Kletzander
-
Michal Privoznik