[libvirt] [PATCH 0/2] account for change in netlink kernel ABI

The first of these two patches changes libvirt to account for a change in netlink's kernel ABI that happened upstream several months ago. The second patch adds in error logs to the functions that failed after the kernel change. Laine Stump (2): util: fix functions that retrieve SRIOV VF info util: add missing error log messages when failing to get netlink VFINFO src/util/virnetdev.c | 87 +++++++++++++++++++++++++++++----------------------- 1 file changed, 49 insertions(+), 38 deletions(-) -- 1.7.11.7

This patch resolves: https://bugzilla.redhat.com/show_bug.cgi?id=889319 When assigning an SRIOV virtual function to a guest using "intelligent PCI passthrough" (<interface type='hostdev'>, which sets the MAC address and vlan tag of the VF before passing its info to qemu), libvirt first learns the current MAC address and vlan tag by sending an NLM_F_REQUEST message for the VF's PF (physical function) to the kernel via a NETLINK_ROUTE socket (see virNetDevLinkDump()); the response message's IFLA_VFINFO_LIST section is examined to extract the info for the particular VF being assigned. This worked fine with kernels up until kernel commit 115c9b81928360d769a76c632bae62d15206a94a (first appearing in upstream kernel 3.3) which changed the ABI to not return IFLA_VFINFO_LIST in the response until a newly introduced IFLA_EXT_MASK field was included in the request, with the (newly introduced, of course) RTEXT_FILTER_VF flag set. The justification for this ABI change was that new fields had been added to the VFINFO, causing NLM_F_REQUEST messages to fail on systems with large numbers of VFs if the requesting application didn't have a large enough buffer for all the info. The idea is that most applications doing an NLM_F_REQUEST don't care about VFINFO anyway, so eliminating it from the response would lower the requirements on buffer size. Apparently, the people who pushed this patch made the mistaken assumption that iproute2 (the "ip" command) was the only package that used IFLA_VFINFO_LIST, so it wouldn't break anything else (and they made sure that iproute2 was fixed. The logic of this "fix" is debatable at best (one could claim that the proper fix would be for the applications in question to be fixed so that they properly sized the buffer, which is what libvirt does (purely by virtue of using libnl), but it is what it is and we have to deal with it. In order for <interface type='hostdev'> to work properly on systems with a kernel 3.3 or later, libvirt needs to add the afore-mentioned IFLA_EXT_MASK field with RTEXT_FILTER_VF set. Of course we also need to continue working on systems with older kernels, so that one bit of code is compiled conditionally. The one time this could cause problems is if the libvirt binary was built on a system without IFLA_EXT_MASK which was subsequently updated to a kernel that *did* have it. That could be solved by manually providing the values of IFLA_EXT_MASK and RTEXT_FILTER_VF and adding it to the message anyway, but I'm uncertain what that might actually do on a system that didn't support the message, so for the time being we'll just fail in that case (which will very likely never happen anyway). --- src/util/virnetdev.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index c345013..205fd93 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1275,6 +1275,15 @@ virNetDevLinkDump(const char *ifname, int ifindex, goto buffer_too_small; } +# if defined(IFLA_EXT_MASK) && defined(RTEXT_FILTER_VF) + /* if this filter exists in the kernel's netlink implementation, + * we need to set it, otherwise the response message will not + * contain the IFLA_VFINFO_LIST that we're looking for. + */ + if (nla_put(nl_msg, IFLA_EXT_MASK, RTEXT_FILTER_VF) < 0) + goto buffer_too_small; +# endif + if (virNetlinkCommand(nl_msg, recvbuf, &recvbuflen, src_pid, dst_pid, NETLINK_ROUTE, 0) < 0) goto cleanup; -- 1.7.11.7

On 12/20/2012 01:01 PM, Laine Stump wrote:
This patch resolves:
Of course we also need to continue working on systems with older kernels, so that one bit of code is compiled conditionally. The one time this could cause problems is if the libvirt binary was built on a system without IFLA_EXT_MASK which was subsequently updated to a kernel that *did* have it. That could be solved by manually providing the values of IFLA_EXT_MASK and RTEXT_FILTER_VF and adding it to the message anyway, but I'm uncertain what that might actually do on a system that didn't support the message, so for the time being we'll just fail in that case (which will very likely never happen anyway).
Well, there are indeed people that like to compile against the oldest supported kernel then run that same binary across a range of kernels [1]; but if someone IS interested in that scenario, they can go ahead and submit a followon patch at that time. Besides, most people that use pre-built libvirt get it from a distro, and distros happen to match their builds to their kernels; not to mention that with open source, you can always recompile yourself to pick up any differences in the build dependencies you want to have. [1] I've noticed that the most common case of building against the oldest supported version then reusing that binary happens in the case of non-free software - after all, it is easier to ship binary-only executables that work across the widest range of systems than it is to have one binary per system, if you aren't going to give your users the freedom of recompiling from source.
+# if defined(IFLA_EXT_MASK) && defined(RTEXT_FILTER_VF) + /* if this filter exists in the kernel's netlink implementation, + * we need to set it, otherwise the response message will not + * contain the IFLA_VFINFO_LIST that we're looking for. + */ + if (nla_put(nl_msg, IFLA_EXT_MASK, RTEXT_FILTER_VF) < 0) + goto buffer_too_small; +# endif +
ACK. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

This patch fixes the lack of error messages when libvirt fails to find VFINFO in a returned netlinke response message. https://bugzilla.redhat.com/show_bug.cgi?id=827519#c10 is an example of the error message that was previously logged when the IFLA_VFINFO_LIST object was missing from the netlink response. The reason for this failure is detailed in https://bugzilla.redhat.com/show_bug.cgi?id=889319 Even though that root problem has been fixed, the experience of finding the root cause shows us how important it is to properly log an error message in these cases. This patch *seems* to replace the entire function, but really most of the changes are due to moving code that was previously inside an if() statement out to the top level of the function (the original if() was reversed and made to log an error and return). --- src/util/virnetdev.c | 78 +++++++++++++++++++++++++++------------------------- 1 file changed, 40 insertions(+), 38 deletions(-) diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 205fd93..f5f4562 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1465,53 +1465,55 @@ static int virNetDevParseVfConfig(struct nlattr **tb, int32_t vf, virMacAddrPtr mac, int *vlanid) { - const char *msg = NULL; int rc = -1; + struct ifla_vf_mac *vf_mac; + struct ifla_vf_vlan *vf_vlan; + struct nlattr *tb_vf_info = {NULL, }; + struct nlattr *tb_vf[IFLA_VF_MAX+1]; + int found = 0; + int rem; + + if (!tb[IFLA_VFINFO_LIST]) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("missing IFLA_VF_INFO in netlink response")); + goto cleanup; + } - if (tb[IFLA_VFINFO_LIST]) { - struct ifla_vf_mac *vf_mac; - struct ifla_vf_vlan *vf_vlan; - struct nlattr *tb_vf_info = {NULL, }; - struct nlattr *tb_vf[IFLA_VF_MAX+1]; - int found = 0; - int rem; - - nla_for_each_nested(tb_vf_info, tb[IFLA_VFINFO_LIST], rem) { - if (nla_type(tb_vf_info) != IFLA_VF_INFO) - continue; - - if (nla_parse_nested(tb_vf, IFLA_VF_MAX, tb_vf_info, - ifla_vf_policy)) { - msg = _("error parsing IFLA_VF_INFO"); - goto cleanup; - } + nla_for_each_nested(tb_vf_info, tb[IFLA_VFINFO_LIST], rem) { + if (nla_type(tb_vf_info) != IFLA_VF_INFO) + continue; - if (tb[IFLA_VF_MAC]) { - vf_mac = RTA_DATA(tb_vf[IFLA_VF_MAC]); - if (vf_mac && vf_mac->vf == vf) { - virMacAddrSetRaw(mac, vf_mac->mac); - found = 1; - } - } + if (nla_parse_nested(tb_vf, IFLA_VF_MAX, tb_vf_info, + ifla_vf_policy)) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("error parsing IFLA_VF_INFO")); + goto cleanup; + } - if (tb[IFLA_VF_VLAN]) { - vf_vlan = RTA_DATA(tb_vf[IFLA_VF_VLAN]); - if (vf_vlan && vf_vlan->vf == vf) { - *vlanid = vf_vlan->vlan; - found = 1; - } + if (tb[IFLA_VF_MAC]) { + vf_mac = RTA_DATA(tb_vf[IFLA_VF_MAC]); + if (vf_mac && vf_mac->vf == vf) { + virMacAddrSetRaw(mac, vf_mac->mac); + found = 1; } - if (found) { - rc = 0; - break; + } + + if (tb[IFLA_VF_VLAN]) { + vf_vlan = RTA_DATA(tb_vf[IFLA_VF_VLAN]); + if (vf_vlan && vf_vlan->vf == vf) { + *vlanid = vf_vlan->vlan; + found = 1; } } + if (found) { + rc = 0; + goto cleanup; + } } - + virReportError(VIR_ERR_INTERNAL_ERROR, + _("couldn't find IFLA_VF_INFO for VF %d " + "in netlink response"), vf); cleanup: - if (msg) - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", msg); - return rc; } -- 1.7.11.7

On 12/20/2012 01:01 PM, Laine Stump wrote:
This patch fixes the lack of error messages when libvirt fails to find VFINFO in a returned netlinke response message.
https://bugzilla.redhat.com/show_bug.cgi?id=827519#c10 is an example of the error message that was previously logged when the IFLA_VFINFO_LIST object was missing from the netlink response. The reason for this failure is detailed in
https://bugzilla.redhat.com/show_bug.cgi?id=889319
Even though that root problem has been fixed, the experience of finding the root cause shows us how important it is to properly log an error message in these cases. This patch *seems* to replace the entire function, but really most of the changes are due to moving code that was previously inside an if() statement out to the top level of the function (the original if() was reversed and made to log an error and return). --- src/util/virnetdev.c | 78 +++++++++++++++++++++++++++------------------------- 1 file changed, 40 insertions(+), 38 deletions(-)
'git diff -b' to the rescue. I'm reviewing this simpler version that ignores whitespace:
diff --git c/src/util/virnetdev.c w/src/util/virnetdev.c index 205fd93..f5f4562 100644 --- c/src/util/virnetdev.c +++ w/src/util/virnetdev.c @@ -1465,10 +1465,7 @@ static int virNetDevParseVfConfig(struct nlattr **tb, int32_t vf, virMacAddrPtr mac, int *vlanid) { - const char *msg = NULL; int rc = -1; - - if (tb[IFLA_VFINFO_LIST]) { struct ifla_vf_mac *vf_mac; struct ifla_vf_vlan *vf_vlan; struct nlattr *tb_vf_info = {NULL, }; @@ -1476,13 +1473,20 @@ virNetDevParseVfConfig(struct nlattr **tb, int32_t vf, virMacAddrPtr mac, int found = 0; int rem;
+ if (!tb[IFLA_VFINFO_LIST]) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("missing IFLA_VF_INFO in netlink response")); + goto cleanup; + } + nla_for_each_nested(tb_vf_info, tb[IFLA_VFINFO_LIST], rem) { if (nla_type(tb_vf_info) != IFLA_VF_INFO) continue;
if (nla_parse_nested(tb_vf, IFLA_VF_MAX, tb_vf_info, ifla_vf_policy)) { - msg = _("error parsing IFLA_VF_INFO"); + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("error parsing IFLA_VF_INFO")); goto cleanup;
Yeah, reporting the error as soon as it's known instead of squirreling away a non-NULL msg to report later is nicer.
}
@@ -1503,15 +1507,13 @@ virNetDevParseVfConfig(struct nlattr **tb, int32_t vf, virMacAddrPtr mac, } if (found) { rc = 0; - break; - } + goto cleanup;
Okay, so ignoring whitespace makes this hunk look a bit odd :) But the logic is correct.
} } - + virReportError(VIR_ERR_INTERNAL_ERROR, + _("couldn't find IFLA_VF_INFO for VF %d " + "in netlink response"), vf); cleanup: - if (msg) - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", msg); - return rc; }
Sometimes, when doing a patch like this, I will intentionally break it into two commits, one that does the logic change but leaves indents wrong, and the next to fix just indents, as it is easier to review that way. But no need to rework it, you have my: ACK. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
participants (2)
-
Eric Blake
-
Laine Stump