[libvirt PATCH v8 0/3] Ignore EPERM on implicit clearing of VF VLAN ID

SmartNIC DPUs may not expose some privileged eswitch operations to the hypervisor hosts. For example, this happens with Bluefield devices running in the ECPF (default) mode [1] for security reasons. While VF MAC address programming is possible via an RTM_SETLINK operation, trying to set a VLAN ID in the same operation may fail with EPERM. Specifically for the mlx5 driver this behavior was altered in the Linux kernel upstream [2] to avoid getting EPERM when trying to program VLAN 0. This may also get backported to older downstream kernels (e.g. [3]). However, Libvirt could potentially handle this case gracefully without needing a specific kernel version or depending on a specific driver fix. In the kernel a relevant call chain may look like do_setlink -> do_setvfinfo -> dev->netdev_ops->set_vf_vlan which calls a driver-specific function like [4] eventually. The equivalent ip link commands below provide an illustration: 1. This will work without an error: sudo ip link set enp130s0f0 vf 2 mac de:ad:be:ef:ca:fe 2. Setting (or clearing) a VLAN will fail with EPERM: sudo ip link set enp130s0f0 vf 2 vlan 0 RTNETLINK answers: Operation not permitted 3. This is what Libvirt attempts to do today when trying to clear a VF VLAN at the same time as programming a VF MAC: sudo ip link set enp130s0f0 vf 2 vlan 0 mac de:ad:be:ef:ca:fe RTNETLINK answers: Operation not permitted If setting an explicit VLAN ID results in an EPERM, clearing a VLAN (setting a VLAN ID to 0) can be handled gracefully by ignoring the EPERM error with the rationale being that if we cannot set this state in the first place, we cannot clear it either. Thus, virNetDevSetVfConfig is split into two distinct functions. If clearing a VLAN ID fails with EPERM when clearing is implicit, the error is simply ignored. For explicit clearing EPERM is still a fatal error. Both new functions rely virNetDevSendVfSetLinkRequest that implements common functionality related to formatting a request, sending it and handling error conditions and returns 0 or an error since in both cases the payload is either NLMSG_DONE (no error) or NLMSG_ERROR where an error message is needed by the caller to handle known cases appropriately. This function allows the conditional code to be unit tested. An alternative to this could be providing a higher level control plane mechanism that would provide metadata about a device being remotely managed in which case Libvirt would avoid trying to set or clear a VLAN ID. This would be more complicated since other software (like Nova in the OpenStack case) would have to annotate every guest device with an attribute indicating whether a device is remotely managed or not based on operator provided configuration so that Libvirt can act on this and avoid VLAN programming. https://gitlab.com/dmitriis/libvirt/-/pipelines/460293963 v8 change: * Rebased on top of the latest changes to Libvirt; * Added relevant upstream Linux and downstream kernel references to this cover letter. [1] https://docs.mellanox.com/display/BlueFieldSWv35111601/Modes+of+Operation#Mo... [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... [3] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1957753 [4] https://github.com/torvalds/linux/blob/v5.15/drivers/net/ethernet/mellanox/m... Dmitrii Shcherbakov (3): Set VF MAC and VLAN ID in two different operations Allow VF vlanid to be passed as a pointer Ignore EPERM on implicit clearing of VF VLAN ID NEWS.rst | 14 ++ src/hypervisor/virhostdev.c | 4 +- src/libvirt_private.syms | 7 + src/util/virnetdev.c | 256 +++++++++++++++++++++++++----------- src/util/virnetdevpriv.h | 44 +++++++ tests/virnetdevtest.c | 249 ++++++++++++++++++++++++++++++++++- 6 files changed, 496 insertions(+), 78 deletions(-) create mode 100644 src/util/virnetdevpriv.h -- 2.32.0

This has a benefit of being able to handle error codes for those operations separately which is useful when drivers allow setting a MAC address but do not allow setting a VLAN (which is the case with some SmartNIC DPUs). Signed-off-by: Dmitrii Shcherbakov <dmitrii.shcherbakov@canonical.com> --- src/libvirt_private.syms | 7 ++ src/util/virnetdev.c | 226 ++++++++++++++++++++++++++------------ src/util/virnetdevpriv.h | 44 ++++++++ tests/virnetdevtest.c | 230 ++++++++++++++++++++++++++++++++++++++- 4 files changed, 434 insertions(+), 73 deletions(-) create mode 100644 src/util/virnetdevpriv.h diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index bc6fa191bf..e6493c2176 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2847,6 +2847,13 @@ virNetDevOpenvswitchSetTimeout; virNetDevOpenvswitchUpdateVlan; +# util/virnetdevpriv.h +virNetDevSendVfSetLinkRequest; +virNetDevSetVfConfig; +virNetDevSetVfMac; +virNetDevSetVfVlan; + + # util/virnetdevtap.h virNetDevTapAttachBridge; virNetDevTapCreate; diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index d93b2c6a83..043225d31f 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -19,7 +19,9 @@ #include <config.h> #include <math.h> -#include "virnetdev.h" +#define LIBVIRT_VIRNETDEVPRIV_H_ALLOW + +#include "virnetdevpriv.h" #include "viralloc.h" #include "virnetlink.h" #include "virmacaddr.h" @@ -1509,16 +1511,12 @@ static struct nla_policy ifla_vfstats_policy[IFLA_VF_STATS_MAX+1] = { [IFLA_VF_STATS_MULTICAST] = { .type = NLA_U64 }, }; - -static int -virNetDevSetVfConfig(const char *ifname, int vf, - const virMacAddr *macaddr, int vlanid, - bool *allowRetry) +int +virNetDevSendVfSetLinkRequest(const char *ifname, int vfInfoType, + const void *payload, const size_t payloadLen) { - int rc = -1; - char macstr[VIR_MAC_STRING_BUFLEN]; g_autofree struct nlmsghdr *resp = NULL; - struct nlmsgerr *err; + struct nlmsgerr *err = NULL; unsigned int recvbuflen = 0; struct nl_msg *nl_msg; struct nlattr *vfinfolist, *vfinfo; @@ -1526,9 +1524,11 @@ virNetDevSetVfConfig(const char *ifname, int vf, .ifi_family = AF_UNSPEC, .ifi_index = -1, }; + int rc = 0; - if (!macaddr && vlanid < 0) + if (payload == NULL || payloadLen == 0) { return -1; + } nl_msg = virNetlinkMsgNew(RTM_SETLINK, NLM_F_REQUEST); @@ -1546,30 +1546,8 @@ virNetDevSetVfConfig(const char *ifname, int vf, if (!(vfinfo = nla_nest_start(nl_msg, IFLA_VF_INFO))) goto buffer_too_small; - if (macaddr) { - struct ifla_vf_mac ifla_vf_mac = { - .vf = vf, - .mac = { 0, }, - }; - - virMacAddrGetRaw(macaddr, ifla_vf_mac.mac); - - if (nla_put(nl_msg, IFLA_VF_MAC, sizeof(ifla_vf_mac), - &ifla_vf_mac) < 0) - goto buffer_too_small; - } - - if (vlanid >= 0) { - struct ifla_vf_vlan ifla_vf_vlan = { - .vf = vf, - .vlan = vlanid, - .qos = 0, - }; - - if (nla_put(nl_msg, IFLA_VF_VLAN, sizeof(ifla_vf_vlan), - &ifla_vf_vlan) < 0) - goto buffer_too_small; - } + if (nla_put(nl_msg, vfInfoType, payloadLen, payload) < 0) + goto buffer_too_small; nla_nest_end(nl_msg, vfinfo); nla_nest_end(nl_msg, vfinfolist); @@ -1582,48 +1560,20 @@ virNetDevSetVfConfig(const char *ifname, int vf, goto malformed_resp; switch (resp->nlmsg_type) { - case NLMSG_ERROR: - err = (struct nlmsgerr *)NLMSG_DATA(resp); - if (resp->nlmsg_len < NLMSG_LENGTH(sizeof(*err))) + case NLMSG_ERROR: + err = (struct nlmsgerr *)NLMSG_DATA(resp); + if (resp->nlmsg_len < NLMSG_LENGTH(sizeof(*err))) + goto malformed_resp; + rc = err->error; + break; + case NLMSG_DONE: + rc = 0; + break; + default: goto malformed_resp; - - /* if allowRetry is true and the error was EINVAL, then - * silently return a failure so the caller can retry with a - * different MAC address - */ - if (err->error == -EINVAL && *allowRetry && - macaddr && !virMacAddrCmp(macaddr, &zeroMAC)) { - goto cleanup; - } else if (err->error) { - /* other errors are permanent */ - virReportSystemError(-err->error, - _("Cannot set interface MAC/vlanid to %s/%d " - "for ifname %s vf %d"), - (macaddr - ? virMacAddrFormat(macaddr, macstr) - : "(unchanged)"), - vlanid, - ifname ? ifname : "(unspecified)", - vf); - *allowRetry = false; /* no use retrying */ - goto cleanup; - } - break; - - case NLMSG_DONE: - break; - - default: - goto malformed_resp; } - rc = 0; cleanup: - VIR_DEBUG("RTM_SETLINK %s vf %d MAC=%s vlanid=%d - %s", - ifname, vf, - macaddr ? virMacAddrFormat(macaddr, macstr) : "(unchanged)", - vlanid, rc < 0 ? "Fail" : "Success"); - nlmsg_free(nl_msg); return rc; @@ -1638,6 +1588,100 @@ virNetDevSetVfConfig(const char *ifname, int vf, goto cleanup; } +int +virNetDevSetVfVlan(const char *ifname, int vf, int vlanid) +{ + int rc = 0; + int requestError = 0; + + struct ifla_vf_vlan ifla_vf_vlan = { + .vf = vf, + .vlan = vlanid, + .qos = 0, + }; + + /* VLAN ids 0 and 4095 are reserved per 802.1Q but are valid values. */ + if ((vlanid < 0 || vlanid > 4095)) { + return -ERANGE; + } + + requestError = virNetDevSendVfSetLinkRequest(ifname, IFLA_VF_VLAN, + &ifla_vf_vlan, sizeof(ifla_vf_vlan)); + + if (requestError) { + virReportSystemError(-requestError, + _("Cannot set interface vlanid to %d " + "for ifname %s vf %d"), + vlanid, ifname ? ifname : "(unspecified)", vf); + rc = requestError; + } + VIR_DEBUG("RTM_SETLINK %s vf %d vlanid=%d - %s", + ifname, vf, + vlanid, rc < 0 ? "Fail" : "Success"); + return rc; +} + +int +virNetDevSetVfMac(const char *ifname, int vf, + const virMacAddr *macaddr, + bool *allowRetry) +{ + int rc = 0; + int requestError = 0; + char macstr[VIR_MAC_STRING_BUFLEN]; + + struct ifla_vf_mac ifla_vf_mac = { + .vf = vf, + .mac = { 0, }, + }; + + if (macaddr == NULL || allowRetry == NULL) + return -EINVAL; + + virMacAddrGetRaw(macaddr, ifla_vf_mac.mac); + + requestError = virNetDevSendVfSetLinkRequest(ifname, IFLA_VF_MAC, + &ifla_vf_mac, sizeof(ifla_vf_mac)); + /* if allowRetry is true and the error was EINVAL, then + * silently return a failure so the caller can retry with a + * different MAC address. */ + if (requestError == -EINVAL && *allowRetry && !virMacAddrCmp(macaddr, &zeroMAC)) { + rc = requestError; + } else if (requestError) { + /* other errors are permanent */ + virReportSystemError(-requestError, + _("Cannot set interface MAC to %s " + "for ifname %s vf %d"), + (macaddr + ? virMacAddrFormat(macaddr, macstr) + : "(unchanged)"), + ifname ? ifname : "(unspecified)", + vf); + *allowRetry = false; /* no use retrying */ + rc = requestError; + } else { + rc = 0; + } + VIR_DEBUG("RTM_SETLINK %s vf %d MAC=%s - %s", + ifname, vf, + macaddr ? virMacAddrFormat(macaddr, macstr) : "(unchanged)", + rc < 0 ? "Fail" : "Success"); + return rc; +} + +int +virNetDevSetVfConfig(const char *ifname, int vf, + const virMacAddr *macaddr, int vlanid, + bool *allowRetry) +{ + int rc = 0; + if ((rc = virNetDevSetVfMac(ifname, vf, macaddr, allowRetry)) < 0) + return rc; + if ((rc = virNetDevSetVfVlan(ifname, vf, vlanid)) < 0) + return rc; + return rc; +} + /** * virNetDevParseVfInfo: * Get the VF interface information from kernel by netlink, To make netlink @@ -2391,6 +2435,44 @@ virNetDevVFInterfaceStats(virPCIDeviceAddress *vfAddr G_GNUC_UNUSED, return -1; } +int +virNetDevSendVfSetLinkRequest(const char *ifname G_GNUC_UNUSED, int vfInfoType G_GNUC_UNUSED, + const void *payload G_GNUC_UNUSED, + const size_t payloadLen G_GNUC_UNUSED) +{ + virReportSystemError(ENOSYS, "%s", + _("Unable to send a VF SETLINK request on this platform")); + return -1; +} + +int +virNetDevSetVfVlan(const char *ifname G_GNUC_UNUSED, int vf G_GNUC_UNUSED, int vlanid G_GNUC_UNUSED) +{ + virReportSystemError(ENOSYS, "%s", + _("Unable to set a VF VLAN on this platform")); + return -1; +} + +int +virNetDevSetVfMac(const char *ifname G_GNUC_UNUSED, int vf G_GNUC_UNUSED, + const virMacAddr *macaddr G_GNUC_UNUSED, + bool *allowRetry G_GNUC_UNUSED) +{ + virReportSystemError(ENOSYS, "%s", + _("Unable to set a VF MAC on this platform")); + return -1; +} + +int +virNetDevSetVfConfig(const char *ifname G_GNUC_UNUSED, int vf G_GNUC_UNUSED, + const virMacAddr *macaddr G_GNUC_UNUSED, int vlanid G_GNUC_UNUSED, + bool *allowRetry G_GNUC_UNUSED) +{ + virReportSystemError(ENOSYS, "%s", + _("Unable to set a VF config on this platform")); + return -1; +} + #endif /* defined(WITH_LIBNL) */ diff --git a/src/util/virnetdevpriv.h b/src/util/virnetdevpriv.h new file mode 100644 index 0000000000..7990bf5269 --- /dev/null +++ b/src/util/virnetdevpriv.h @@ -0,0 +1,44 @@ +/* + * virnetdevpriv.h: private virnetdev header for unit testing + * + * Copyright (C) 2021 Canonical Ltd. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; If not, see + * <http://www.gnu.org/licenses/>. + */ + +#ifndef LIBVIRT_VIRNETDEVPRIV_H_ALLOW +# error "virnetdevpriv.h may only be included by virnetdev.c or test suites" +#endif /* LIBVIRT_VIRNETDEVPRIV_H_ALLOW */ + +#pragma once + +#include "virnetdev.h" + +int +virNetDevSendVfSetLinkRequest(const char *ifname, int vfInfoType, + const void *payload, const size_t payloadLen); + +int +virNetDevSetVfVlan(const char *ifname, int vf, int vlanid); + +int +virNetDevSetVfMac(const char *ifname, int vf, + const virMacAddr *macaddr, + bool *allowRetry); + +int +virNetDevSetVfConfig(const char *ifname, int vf, + const virMacAddr *macaddr, int vlanid, + bool *allowRetry); diff --git a/tests/virnetdevtest.c b/tests/virnetdevtest.c index aadbeb1ef4..c8dba05c31 100644 --- a/tests/virnetdevtest.c +++ b/tests/virnetdevtest.c @@ -18,11 +18,17 @@ #include <config.h> +#include "internal.h" #include "testutils.h" +#include "virnetlink.h" + +#define LIBVIRT_VIRNETDEVPRIV_H_ALLOW + #ifdef __linux__ -# include "virnetdev.h" +# include "virmock.h" +# include "virnetdevpriv.h" # define VIR_FROM_THIS VIR_FROM_NONE @@ -59,6 +65,215 @@ testVirNetDevGetLinkInfo(const void *opaque) return 0; } +# if defined(WITH_LIBNL) + +int +(*real_virNetDevSendVfSetLinkRequest)(const char *ifname, int vfInfoType, + const void *payload, const size_t payloadLen); + +int +(*real_virNetDevSetVfMac)(const char *ifname, int vf, const virMacAddr *macaddr, bool *allowRetry); + +int +(*real_virNetDevSetVfVlan)(const char *ifname, int vf, int vlanid); + +static void +init_syms(void) +{ + VIR_MOCK_REAL_INIT(virNetDevSendVfSetLinkRequest); + VIR_MOCK_REAL_INIT(virNetDevSetVfMac); + VIR_MOCK_REAL_INIT(virNetDevSetVfVlan); +} + +int +virNetDevSetVfMac(const char *ifname, int vf, + const virMacAddr *macaddr, + bool *allowRetry) +{ + init_syms(); + + if (STREQ_NULLABLE(ifname, "fakeiface-macerror")) { + return -EBUSY; + } else if (STREQ_NULLABLE(ifname, "fakeiface-altmacerror")) { + return -EINVAL; + } else if (STREQ_NULLABLE(ifname, "fakeiface-macerror-novlanerror")) { + return -EAGAIN; + } else if (STREQ_NULLABLE(ifname, "fakeiface-macerror-vlanerror")) { + return -ENODEV; + } else if (STREQ_NULLABLE(ifname, "fakeiface-nomacerror-vlanerror")) { + return 0; + } else if (STREQ_NULLABLE(ifname, "fakeiface-nomacerror-novlanerror")) { + return 0; + } + return real_virNetDevSetVfMac(ifname, vf, macaddr, allowRetry); +} + +int +virNetDevSetVfVlan(const char *ifname, int vf, int vlanid) +{ + init_syms(); + + if (STREQ_NULLABLE(ifname, "fakeiface-macerror-vlanerror")) { + return -EPERM; + } else if (STREQ_NULLABLE(ifname, "fakeiface-nomacerror-vlanerror")) { + return -EPERM; + } else if (STREQ_NULLABLE(ifname, "fakeiface-macerror-novlanerror")) { + return 0; + } else if (STREQ_NULLABLE(ifname, "fakeiface-nomacerror-novlanerror")) { + return 0; + } + return real_virNetDevSetVfVlan(ifname, vf, vlanid); +} + +int +virNetDevSendVfSetLinkRequest(const char *ifname, int vfInfoType, + const void *payload, const size_t payloadLen) +{ + init_syms(); + + if (STREQ_NULLABLE(ifname, "fakeiface-eperm")) { + return -EPERM; + } else if (STREQ_NULLABLE(ifname, "fakeiface-eagain")) { + return -EAGAIN; + } else if (STREQ_NULLABLE(ifname, "fakeiface-einval")) { + return -EINVAL; + } else if (STREQ_NULLABLE(ifname, "fakeiface-ok")) { + return 0; + } + return real_virNetDevSendVfSetLinkRequest(ifname, vfInfoType, payload, payloadLen); +} + +static int +testVirNetDevSetVfMac(const void *opaque G_GNUC_UNUSED) +{ + struct testCase { + const char *ifname; + const int vf_num; + const virMacAddr macaddr; + bool allow_retry; + const int rc; + }; + size_t i = 0; + int rc = 0; + struct testCase testCases[] = { + { .ifname = "fakeiface-ok", .vf_num = 1, + .macaddr = { .addr = { 0, 0, 0, 0, 0, 0 } }, .allow_retry = false, .rc = 0 }, + { .ifname = "fakeiface-ok", .vf_num = 2, + .macaddr = { .addr = { 0, 0, 0, 7, 7, 7 } }, .allow_retry = false, .rc = 0 }, + { .ifname = "fakeiface-ok", .vf_num = 3, + .macaddr = { .addr = { 0, 0, 0, 0, 0, 0 } }, .allow_retry = true, .rc = 0 }, + { .ifname = "fakeiface-ok", .vf_num = 4, + .macaddr = { .addr = { 0, 0, 0, 7, 7, 7 } }, .allow_retry = true, .rc = 0 }, + { .ifname = "fakeiface-eperm", .vf_num = 5, + .macaddr = { .addr = { 0, 0, 0, 0, 0, 0 } }, .allow_retry = false, .rc = -EPERM }, + { .ifname = "fakeiface-einval", .vf_num = 6, + .macaddr = { .addr = { 0, 0, 0, 0, 0, 0 } }, .allow_retry = false, .rc = -EINVAL }, + { .ifname = "fakeiface-einval", .vf_num = 7, + .macaddr = { .addr = { 0, 0, 0, 0, 0, 0 } }, .allow_retry = true, .rc = -EINVAL }, + { .ifname = "fakeiface-einval", .vf_num = 8, + .macaddr = { .addr = { 0, 0, 0, 7, 7, 7 } }, .allow_retry = false, .rc = -EINVAL }, + { .ifname = "fakeiface-einval", .vf_num = 9, + .macaddr = { .addr = { 0, 0, 0, 7, 7, 7 } }, .allow_retry = true, .rc = -EINVAL }, + }; + + for (i = 0; i < sizeof(testCases) / sizeof(struct testCase); ++i) { + rc = virNetDevSetVfMac(testCases[i].ifname, testCases[i].vf_num, + &testCases[i].macaddr, &testCases[i].allow_retry); + if (rc != testCases[i].rc) { + return -1; + } + } + return 0; +} + +static int +testVirNetDevSetVfMissingMac(const void *opaque G_GNUC_UNUSED) +{ + bool allowRetry = false; + /* NULL MAC pointer. */ + if (virNetDevSetVfMac("fakeiface-ok", 1, NULL, &allowRetry) != -EINVAL) { + return -1; + } + allowRetry = true; + if (virNetDevSetVfMac("fakeiface-ok", 1, NULL, &allowRetry) != -EINVAL) { + return -1; + } + return 0; +} + +static int +testVirNetDevSetVfVlan(const void *opaque G_GNUC_UNUSED) +{ + struct testCase { + const char *ifname; + const int vf_num; + const int vlan_id; + const int rc; + }; + size_t i = 0; + int rc = 0; + const struct testCase testCases[] = { + /* VLAN ID is out of range of valid values (0-4095). */ + { .ifname = "enxdeadbeefcafe", .vf_num = 1, .vlan_id = 4096, .rc = -ERANGE }, + { .ifname = "enxdeadbeefcafe", .vf_num = 1, .vlan_id = -1, .rc = -ERANGE }, + { .ifname = "fakeiface-eperm", .vf_num = 1, .vlan_id = 0, .rc = -EPERM }, + { .ifname = "fakeiface-eagain", .vf_num = 1, .vlan_id = 0, .rc = -EAGAIN }, + /* Successful requests with vlan id 0 need to have a zero return code. */ + { .ifname = "fakeiface-ok", .vf_num = 1, .vlan_id = 0, .rc = 0 }, + /* Requests with a non-zero VLAN ID that result in an EPERM need to result in failures. + * failures. */ + { .ifname = "fakeiface-eperm", .vf_num = 1, .vlan_id = 42, .rc = -EPERM }, + /* Requests with a non-zero VLAN ID that result in some other errors need to result in + * failures. */ + { .ifname = "fakeiface-eagain", .vf_num = 1, .vlan_id = 42, .rc = -EAGAIN }, + /* Successful requests with a non-zero VLAN ID */ + { .ifname = "fakeiface-ok", .vf_num = 1, .vlan_id = 42, .rc = 0 }, + }; + + for (i = 0; i < sizeof(testCases) / sizeof(struct testCase); ++i) { + rc = virNetDevSetVfVlan(testCases[i].ifname, testCases[i].vf_num, testCases[i].vlan_id); + if (rc != testCases[i].rc) { + return -1; + } + } + + return 0; +} + +static int +testVirNetDevSetVfConfig(const void *opaque G_GNUC_UNUSED) +{ + struct testCase { + const char *ifname; + const int rc; + }; + int rc = 0; + size_t i = 0; + /* Nested functions are mocked so dummy values are used. */ + const virMacAddr mac = { .addr = { 0xDE, 0xAD, 0xBE, 0xEF, 0xCA, 0xFE }}; + const int vfNum = 1; + const int vlanid = 0; + bool *allowRetry = NULL; + + const struct testCase testCases[] = { + { .ifname = "fakeiface-macerror", .rc = -EBUSY }, + { .ifname = "fakeiface-altmacerror", .rc = -EINVAL }, + { .ifname = "fakeiface-macerror-novlanerror", .rc = -EAGAIN }, + { .ifname = "fakeiface-macerror-vlanerror", .rc = -ENODEV }, + { .ifname = "fakeiface-nomacerror-novlanerror", .rc = 0 }, + }; + + for (i = 0; i < sizeof(testCases) / sizeof(struct testCase); ++i) { + rc = virNetDevSetVfConfig(testCases[i].ifname, vfNum, &mac, vlanid, allowRetry); + if (rc != testCases[i].rc) { + return -1; + } + } + return 0; +} + +# endif /* defined(WITH_LIBNL) */ + static int mymain(void) { @@ -76,6 +291,19 @@ mymain(void) DO_TEST_LINK("lo", VIR_NETDEV_IF_STATE_UNKNOWN, 0); DO_TEST_LINK("eth0-broken", VIR_NETDEV_IF_STATE_DOWN, 0); +# if defined(WITH_LIBNL) + + if (virTestRun("Set VF MAC", testVirNetDevSetVfMac, NULL) < 0) + ret = -1; + if (virTestRun("Set VF MAC: missing MAC pointer", testVirNetDevSetVfMissingMac, NULL) < 0) + ret = -1; + if (virTestRun("Set VF VLAN", testVirNetDevSetVfVlan, NULL) < 0) + ret = -1; + if (virTestRun("Set VF Config", testVirNetDevSetVfConfig, NULL) < 0) + ret = -1; + +# endif /* defined(WITH_LIBNL) */ + return ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE; } -- 2.32.0

On 2/1/22 09:28, Dmitrii Shcherbakov wrote:
This has a benefit of being able to handle error codes for those operations separately which is useful when drivers allow setting a MAC address but do not allow setting a VLAN (which is the case with some SmartNIC DPUs).
Signed-off-by: Dmitrii Shcherbakov <dmitrii.shcherbakov@canonical.com> --- src/libvirt_private.syms | 7 ++ src/util/virnetdev.c | 226 ++++++++++++++++++++++++++------------ src/util/virnetdevpriv.h | 44 ++++++++ tests/virnetdevtest.c | 230 ++++++++++++++++++++++++++++++++++++++- 4 files changed, 434 insertions(+), 73 deletions(-) create mode 100644 src/util/virnetdevpriv.h
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index bc6fa191bf..e6493c2176 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2847,6 +2847,13 @@ virNetDevOpenvswitchSetTimeout; virNetDevOpenvswitchUpdateVlan;
+# util/virnetdevpriv.h +virNetDevSendVfSetLinkRequest; +virNetDevSetVfConfig; +virNetDevSetVfMac; +virNetDevSetVfVlan; + + # util/virnetdevtap.h virNetDevTapAttachBridge; virNetDevTapCreate; diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index d93b2c6a83..043225d31f 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -19,7 +19,9 @@ #include <config.h> #include <math.h>
-#include "virnetdev.h" +#define LIBVIRT_VIRNETDEVPRIV_H_ALLOW + +#include "virnetdevpriv.h" #include "viralloc.h" #include "virnetlink.h" #include "virmacaddr.h" @@ -1509,16 +1511,12 @@ static struct nla_policy ifla_vfstats_policy[IFLA_VF_STATS_MAX+1] = { [IFLA_VF_STATS_MULTICAST] = { .type = NLA_U64 }, };
- -static int -virNetDevSetVfConfig(const char *ifname, int vf, - const virMacAddr *macaddr, int vlanid, - bool *allowRetry) +int +virNetDevSendVfSetLinkRequest(const char *ifname, int vfInfoType, + const void *payload, const size_t payloadLen)
Here and everywhere else, please put one argument per line.
{ - int rc = -1; - char macstr[VIR_MAC_STRING_BUFLEN]; g_autofree struct nlmsghdr *resp = NULL; - struct nlmsgerr *err; + struct nlmsgerr *err = NULL; unsigned int recvbuflen = 0; struct nl_msg *nl_msg; struct nlattr *vfinfolist, *vfinfo; @@ -1526,9 +1524,11 @@ virNetDevSetVfConfig(const char *ifname, int vf, .ifi_family = AF_UNSPEC, .ifi_index = -1, }; + int rc = 0;
We usually initialize return values to -1 so that they don't have to be overwritten in each error path.
- if (!macaddr && vlanid < 0) + if (payload == NULL || payloadLen == 0) { return -1; + }
This seems rather excessive. I know you just copied whatever was in the original code, but neither of callers passes NULL/0.
nl_msg = virNetlinkMsgNew(RTM_SETLINK, NLM_F_REQUEST);
@@ -1546,30 +1546,8 @@ virNetDevSetVfConfig(const char *ifname, int vf, if (!(vfinfo = nla_nest_start(nl_msg, IFLA_VF_INFO))) goto buffer_too_small;
- if (macaddr) { - struct ifla_vf_mac ifla_vf_mac = { - .vf = vf, - .mac = { 0, }, - }; - - virMacAddrGetRaw(macaddr, ifla_vf_mac.mac); - - if (nla_put(nl_msg, IFLA_VF_MAC, sizeof(ifla_vf_mac), - &ifla_vf_mac) < 0) - goto buffer_too_small; - } - - if (vlanid >= 0) { - struct ifla_vf_vlan ifla_vf_vlan = { - .vf = vf, - .vlan = vlanid, - .qos = 0, - }; - - if (nla_put(nl_msg, IFLA_VF_VLAN, sizeof(ifla_vf_vlan), - &ifla_vf_vlan) < 0) - goto buffer_too_small; - } + if (nla_put(nl_msg, vfInfoType, payloadLen, payload) < 0) + goto buffer_too_small;
nla_nest_end(nl_msg, vfinfo); nla_nest_end(nl_msg, vfinfolist); @@ -1582,48 +1560,20 @@ virNetDevSetVfConfig(const char *ifname, int vf, goto malformed_resp;
switch (resp->nlmsg_type) { - case NLMSG_ERROR: - err = (struct nlmsgerr *)NLMSG_DATA(resp); - if (resp->nlmsg_len < NLMSG_LENGTH(sizeof(*err))) + case NLMSG_ERROR: + err = (struct nlmsgerr *)NLMSG_DATA(resp); + if (resp->nlmsg_len < NLMSG_LENGTH(sizeof(*err))) + goto malformed_resp; + rc = err->error; + break; + case NLMSG_DONE: + rc = 0; + break; + default: goto malformed_resp;
No need for formatting change.
- - /* if allowRetry is true and the error was EINVAL, then - * silently return a failure so the caller can retry with a - * different MAC address - */ - if (err->error == -EINVAL && *allowRetry && - macaddr && !virMacAddrCmp(macaddr, &zeroMAC)) { - goto cleanup; - } else if (err->error) { - /* other errors are permanent */ - virReportSystemError(-err->error, - _("Cannot set interface MAC/vlanid to %s/%d " - "for ifname %s vf %d"), - (macaddr - ? virMacAddrFormat(macaddr, macstr) - : "(unchanged)"), - vlanid, - ifname ? ifname : "(unspecified)", - vf); - *allowRetry = false; /* no use retrying */ - goto cleanup; - } - break; - - case NLMSG_DONE: - break; - - default: - goto malformed_resp; }
- rc = 0; cleanup: - VIR_DEBUG("RTM_SETLINK %s vf %d MAC=%s vlanid=%d - %s", - ifname, vf, - macaddr ? virMacAddrFormat(macaddr, macstr) : "(unchanged)", - vlanid, rc < 0 ? "Fail" : "Success"); - nlmsg_free(nl_msg); return rc;
@@ -1638,6 +1588,100 @@ virNetDevSetVfConfig(const char *ifname, int vf, goto cleanup; }
+int +virNetDevSetVfVlan(const char *ifname, int vf, int vlanid) +{ + int rc = 0; + int requestError = 0; + + struct ifla_vf_vlan ifla_vf_vlan = { + .vf = vf, + .vlan = vlanid, + .qos = 0, + };
Alignment.
+ + /* VLAN ids 0 and 4095 are reserved per 802.1Q but are valid values. */ + if ((vlanid < 0 || vlanid > 4095)) { + return -ERANGE;
Here, we need to report an error because we do so in the other error path. The rule of thumb is that either all or none error paths report an error.
+ } + + requestError = virNetDevSendVfSetLinkRequest(ifname, IFLA_VF_VLAN, + &ifla_vf_vlan, sizeof(ifla_vf_vlan)); + + if (requestError) { + virReportSystemError(-requestError, + _("Cannot set interface vlanid to %d " + "for ifname %s vf %d"),
Pre-existing, but error message are exempt from 80-chars long line rule so that they can be easily 'git-grep'-ed. But since you're touching this code it's worth fixing.
+ vlanid, ifname ? ifname : "(unspecified)", vf); + rc = requestError;
So there's no difference between @rc and @requestError. They both have the same value in the end.
+ } + VIR_DEBUG("RTM_SETLINK %s vf %d vlanid=%d - %s", + ifname, vf, + vlanid, rc < 0 ? "Fail" : "Success"); + return rc; +} + +int +virNetDevSetVfMac(const char *ifname, int vf, + const virMacAddr *macaddr, + bool *allowRetry) +{
Basically all comments from above apply here too.
+ int rc = 0; + int requestError = 0; + char macstr[VIR_MAC_STRING_BUFLEN]; + + struct ifla_vf_mac ifla_vf_mac = { + .vf = vf, + .mac = { 0, }, + }; + + if (macaddr == NULL || allowRetry == NULL) + return -EINVAL; + + virMacAddrGetRaw(macaddr, ifla_vf_mac.mac); + + requestError = virNetDevSendVfSetLinkRequest(ifname, IFLA_VF_MAC, + &ifla_vf_mac, sizeof(ifla_vf_mac)); + /* if allowRetry is true and the error was EINVAL, then + * silently return a failure so the caller can retry with a + * different MAC address. */ + if (requestError == -EINVAL && *allowRetry && !virMacAddrCmp(macaddr, &zeroMAC)) { + rc = requestError; + } else if (requestError) { + /* other errors are permanent */ + virReportSystemError(-requestError, + _("Cannot set interface MAC to %s " + "for ifname %s vf %d"), + (macaddr + ? virMacAddrFormat(macaddr, macstr) + : "(unchanged)"), + ifname ? ifname : "(unspecified)", + vf); + *allowRetry = false; /* no use retrying */ + rc = requestError; + } else { + rc = 0; + } + VIR_DEBUG("RTM_SETLINK %s vf %d MAC=%s - %s", + ifname, vf, + macaddr ? virMacAddrFormat(macaddr, macstr) : "(unchanged)", + rc < 0 ? "Fail" : "Success"); + return rc; +} + +int +virNetDevSetVfConfig(const char *ifname, int vf, + const virMacAddr *macaddr, int vlanid, + bool *allowRetry) +{ + int rc = 0; + if ((rc = virNetDevSetVfMac(ifname, vf, macaddr, allowRetry)) < 0) + return rc; + if ((rc = virNetDevSetVfVlan(ifname, vf, vlanid)) < 0) + return rc; + return rc; +} + /** * virNetDevParseVfInfo: * Get the VF interface information from kernel by netlink, To make netlink @@ -2391,6 +2435,44 @@ virNetDevVFInterfaceStats(virPCIDeviceAddress *vfAddr G_GNUC_UNUSED, return -1; }
+int +virNetDevSendVfSetLinkRequest(const char *ifname G_GNUC_UNUSED, int vfInfoType G_GNUC_UNUSED, + const void *payload G_GNUC_UNUSED, + const size_t payloadLen G_GNUC_UNUSED) +{ + virReportSystemError(ENOSYS, "%s", + _("Unable to send a VF SETLINK request on this platform")); + return -1;
I'm inclined to return -ENOSYS rather than -1 because the real implementation would return a real -errno. I know it doesn't matter really because neither of callers looks at errno, they merely check for <0 but consistency is paramount. Thus if one variant of function returns -errno then the other should too.
+} + +int +virNetDevSetVfVlan(const char *ifname G_GNUC_UNUSED, int vf G_GNUC_UNUSED, int vlanid G_GNUC_UNUSED) +{ + virReportSystemError(ENOSYS, "%s", + _("Unable to set a VF VLAN on this platform")); + return -1; +} + +int +virNetDevSetVfMac(const char *ifname G_GNUC_UNUSED, int vf G_GNUC_UNUSED, + const virMacAddr *macaddr G_GNUC_UNUSED, + bool *allowRetry G_GNUC_UNUSED) +{ + virReportSystemError(ENOSYS, "%s", + _("Unable to set a VF MAC on this platform")); + return -1; +} + +int +virNetDevSetVfConfig(const char *ifname G_GNUC_UNUSED, int vf G_GNUC_UNUSED, + const virMacAddr *macaddr G_GNUC_UNUSED, int vlanid G_GNUC_UNUSED, + bool *allowRetry G_GNUC_UNUSED) +{ + virReportSystemError(ENOSYS, "%s", + _("Unable to set a VF config on this platform")); + return -1; +} +
#endif /* defined(WITH_LIBNL) */
diff --git a/src/util/virnetdevpriv.h b/src/util/virnetdevpriv.h new file mode 100644 index 0000000000..7990bf5269 --- /dev/null +++ b/src/util/virnetdevpriv.h @@ -0,0 +1,44 @@ +/* + * virnetdevpriv.h: private virnetdev header for unit testing + * + * Copyright (C) 2021 Canonical Ltd. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; If not, see + * <http://www.gnu.org/licenses/>. + */ + +#ifndef LIBVIRT_VIRNETDEVPRIV_H_ALLOW +# error "virnetdevpriv.h may only be included by virnetdev.c or test suites" +#endif /* LIBVIRT_VIRNETDEVPRIV_H_ALLOW */ + +#pragma once + +#include "virnetdev.h" + +int +virNetDevSendVfSetLinkRequest(const char *ifname, int vfInfoType, + const void *payload, const size_t payloadLen); + +int +virNetDevSetVfVlan(const char *ifname, int vf, int vlanid); + +int +virNetDevSetVfMac(const char *ifname, int vf, + const virMacAddr *macaddr, + bool *allowRetry); + +int +virNetDevSetVfConfig(const char *ifname, int vf, + const virMacAddr *macaddr, int vlanid, + bool *allowRetry); diff --git a/tests/virnetdevtest.c b/tests/virnetdevtest.c index aadbeb1ef4..c8dba05c31 100644 --- a/tests/virnetdevtest.c +++ b/tests/virnetdevtest.c
Bonus points for extensive tests. Michal

There should be a way to show no intent in programming a VLAN at all (including clearing it). This allows handling error conditions differently when VLAN clearing is explicit (vlan id == 0) vs implicit (vlanid == NULL - try to clear it if possible). Signed-off-by: Dmitrii Shcherbakov <dmitrii.shcherbakov@canonical.com> --- src/hypervisor/virhostdev.c | 4 +++- src/util/virnetdev.c | 41 ++++++++++++++++++++++++------------- src/util/virnetdevpriv.h | 4 ++-- tests/virnetdevtest.c | 27 ++++++++++++++++++++---- 4 files changed, 55 insertions(+), 21 deletions(-) diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c index e44eb7f1d3..929b0a0805 100644 --- a/src/hypervisor/virhostdev.c +++ b/src/hypervisor/virhostdev.c @@ -463,6 +463,7 @@ virHostdevSetNetConfig(virDomainHostdevDef *hostdev, const virNetDevVPortProfile *virtPort; int vf = -1; bool port_profile_associate = true; + bool setVlan = false; if (!virHostdevIsPCINetDevice(hostdev)) return 0; @@ -471,6 +472,7 @@ virHostdevSetNetConfig(virDomainHostdevDef *hostdev, return -1; vlan = virDomainNetGetActualVlan(hostdev->parentnet); + setVlan = vlan != NULL; virtPort = virDomainNetGetActualVirtPortProfile(hostdev->parentnet); if (virtPort) { if (vlan) { @@ -486,7 +488,7 @@ virHostdevSetNetConfig(virDomainHostdevDef *hostdev, return -1; } else { if (virNetDevSetNetConfig(linkdev, vf, &hostdev->parentnet->mac, - vlan, NULL, true) < 0) + vlan, NULL, setVlan) < 0) return -1; } diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 043225d31f..eacdc3bf1f 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1589,20 +1589,25 @@ virNetDevSendVfSetLinkRequest(const char *ifname, int vfInfoType, } int -virNetDevSetVfVlan(const char *ifname, int vf, int vlanid) +virNetDevSetVfVlan(const char *ifname, int vf, const int *vlanid) { int rc = 0; int requestError = 0; struct ifla_vf_vlan ifla_vf_vlan = { .vf = vf, - .vlan = vlanid, .qos = 0, + /* If vlanid is NULL, assume it needs to be cleared. */ + .vlan = 0, }; - /* VLAN ids 0 and 4095 are reserved per 802.1Q but are valid values. */ - if ((vlanid < 0 || vlanid > 4095)) { - return -ERANGE; + if (vlanid != NULL) { + /* VLAN ids 0 and 4095 are reserved per 802.1Q but are valid values. */ + if ((*vlanid < 0 || *vlanid > 4095)) { + return -ERANGE; + } else { + ifla_vf_vlan.vlan = *vlanid; + } } requestError = virNetDevSendVfSetLinkRequest(ifname, IFLA_VF_VLAN, @@ -1612,12 +1617,12 @@ virNetDevSetVfVlan(const char *ifname, int vf, int vlanid) virReportSystemError(-requestError, _("Cannot set interface vlanid to %d " "for ifname %s vf %d"), - vlanid, ifname ? ifname : "(unspecified)", vf); + ifla_vf_vlan.vlan, ifname ? ifname : "(unspecified)", vf); rc = requestError; } VIR_DEBUG("RTM_SETLINK %s vf %d vlanid=%d - %s", ifname, vf, - vlanid, rc < 0 ? "Fail" : "Success"); + ifla_vf_vlan.vlan, rc < 0 ? "Fail" : "Success"); return rc; } @@ -1671,7 +1676,7 @@ virNetDevSetVfMac(const char *ifname, int vf, int virNetDevSetVfConfig(const char *ifname, int vf, - const virMacAddr *macaddr, int vlanid, + const virMacAddr *macaddr, const int *vlanid, bool *allowRetry) { int rc = 0; @@ -2207,7 +2212,7 @@ virNetDevSetNetConfig(const char *linkdev, int vf, const char *pfDevName = NULL; g_autofree char *pfDevOrig = NULL; g_autofree char *vfDevOrig = NULL; - int vlanTag = -1; + g_autofree int *vlanTag = NULL; g_autoptr(virPCIDevice) vfPCIDevice = NULL; if (vf >= 0) { @@ -2266,10 +2271,17 @@ virNetDevSetNetConfig(const char *linkdev, int vf, return -1; } - vlanTag = vlan->tag[0]; + vlanTag = g_new0(int, 1); + *vlanTag = vlan->tag[0]; } else if (setVlan) { - vlanTag = 0; /* assure any existing vlan tag is reset */ + vlanTag = g_new0(int, 1); + /* Assure any existing vlan tag is reset. */ + *vlanTag = 0; + } else { + /* Indicate that setting a VLAN has not been explicitly requested. + * This allows selected errors in clearing a VF VLAN to be ignored. */ + vlanTag = NULL; } } @@ -2351,7 +2363,7 @@ virNetDevSetNetConfig(const char *linkdev, int vf, } } - if (adminMAC || vlanTag >= 0) { + if (adminMAC) { /* Set vlanTag and admin MAC using an RTM_SETLINK request sent to * PFdevname+VF#, if mac != NULL this will set the "admin MAC" via * the PF, *not* the actual VF MAC - the admin MAC only takes @@ -2446,7 +2458,8 @@ virNetDevSendVfSetLinkRequest(const char *ifname G_GNUC_UNUSED, int vfInfoType G } int -virNetDevSetVfVlan(const char *ifname G_GNUC_UNUSED, int vf G_GNUC_UNUSED, int vlanid G_GNUC_UNUSED) +virNetDevSetVfVlan(const char *ifname G_GNUC_UNUSED, int vf G_GNUC_UNUSED, + const int *vlanid G_GNUC_UNUSED) { virReportSystemError(ENOSYS, "%s", _("Unable to set a VF VLAN on this platform")); @@ -2465,7 +2478,7 @@ virNetDevSetVfMac(const char *ifname G_GNUC_UNUSED, int vf G_GNUC_UNUSED, int virNetDevSetVfConfig(const char *ifname G_GNUC_UNUSED, int vf G_GNUC_UNUSED, - const virMacAddr *macaddr G_GNUC_UNUSED, int vlanid G_GNUC_UNUSED, + const virMacAddr *macaddr G_GNUC_UNUSED, const int *vlanid G_GNUC_UNUSED, bool *allowRetry G_GNUC_UNUSED) { virReportSystemError(ENOSYS, "%s", diff --git a/src/util/virnetdevpriv.h b/src/util/virnetdevpriv.h index 7990bf5269..84a42fb747 100644 --- a/src/util/virnetdevpriv.h +++ b/src/util/virnetdevpriv.h @@ -31,7 +31,7 @@ virNetDevSendVfSetLinkRequest(const char *ifname, int vfInfoType, const void *payload, const size_t payloadLen); int -virNetDevSetVfVlan(const char *ifname, int vf, int vlanid); +virNetDevSetVfVlan(const char *ifname, int vf, const int *vlanid); int virNetDevSetVfMac(const char *ifname, int vf, @@ -40,5 +40,5 @@ virNetDevSetVfMac(const char *ifname, int vf, int virNetDevSetVfConfig(const char *ifname, int vf, - const virMacAddr *macaddr, int vlanid, + const virMacAddr *macaddr, const int *vlanid, bool *allowRetry); diff --git a/tests/virnetdevtest.c b/tests/virnetdevtest.c index c8dba05c31..f5f54653bb 100644 --- a/tests/virnetdevtest.c +++ b/tests/virnetdevtest.c @@ -75,7 +75,7 @@ int (*real_virNetDevSetVfMac)(const char *ifname, int vf, const virMacAddr *macaddr, bool *allowRetry); int -(*real_virNetDevSetVfVlan)(const char *ifname, int vf, int vlanid); +(*real_virNetDevSetVfVlan)(const char *ifname, int vf, const int *vlanid); static void init_syms(void) @@ -109,7 +109,7 @@ virNetDevSetVfMac(const char *ifname, int vf, } int -virNetDevSetVfVlan(const char *ifname, int vf, int vlanid) +virNetDevSetVfVlan(const char *ifname, int vf, const int *vlanid) { init_syms(); @@ -210,6 +210,11 @@ testVirNetDevSetVfVlan(const void *opaque G_GNUC_UNUSED) const int vlan_id; const int rc; }; + struct nullVlanTestCase { + const char *ifname; + const int vf_num; + const int rc; + }; size_t i = 0; int rc = 0; const struct testCase testCases[] = { @@ -230,13 +235,27 @@ testVirNetDevSetVfVlan(const void *opaque G_GNUC_UNUSED) { .ifname = "fakeiface-ok", .vf_num = 1, .vlan_id = 42, .rc = 0 }, }; + const struct nullVlanTestCase nullVLANTestCases[] = { + { .ifname = "fakeiface-eperm", .vf_num = 1, .rc = -EPERM }, + { .ifname = "fakeiface-eagain", .vf_num = 1, .rc = -EAGAIN }, + /* Successful requests with vlan id 0 need to have a zero return code. */ + { .ifname = "fakeiface-ok", .vf_num = 1, .rc = 0 }, + }; + for (i = 0; i < sizeof(testCases) / sizeof(struct testCase); ++i) { - rc = virNetDevSetVfVlan(testCases[i].ifname, testCases[i].vf_num, testCases[i].vlan_id); + rc = virNetDevSetVfVlan(testCases[i].ifname, testCases[i].vf_num, &testCases[i].vlan_id); if (rc != testCases[i].rc) { return -1; } } + for (i = 0; i < sizeof(nullVLANTestCases) / sizeof(struct nullVlanTestCase); ++i) { + rc = virNetDevSetVfVlan(nullVLANTestCases[i].ifname, nullVLANTestCases[i].vf_num, NULL); + if (rc != nullVLANTestCases[i].rc) { + return -1; + } + } + return 0; } @@ -264,7 +283,7 @@ testVirNetDevSetVfConfig(const void *opaque G_GNUC_UNUSED) }; for (i = 0; i < sizeof(testCases) / sizeof(struct testCase); ++i) { - rc = virNetDevSetVfConfig(testCases[i].ifname, vfNum, &mac, vlanid, allowRetry); + rc = virNetDevSetVfConfig(testCases[i].ifname, vfNum, &mac, &vlanid, allowRetry); if (rc != testCases[i].rc) { return -1; } -- 2.32.0

On 2/1/22 09:28, Dmitrii Shcherbakov wrote:
There should be a way to show no intent in programming a VLAN at all (including clearing it). This allows handling error conditions differently when VLAN clearing is explicit (vlan id == 0) vs implicit (vlanid == NULL - try to clear it if possible).
Signed-off-by: Dmitrii Shcherbakov <dmitrii.shcherbakov@canonical.com> --- src/hypervisor/virhostdev.c | 4 +++- src/util/virnetdev.c | 41 ++++++++++++++++++++++++------------- src/util/virnetdevpriv.h | 4 ++-- tests/virnetdevtest.c | 27 ++++++++++++++++++++---- 4 files changed, 55 insertions(+), 21 deletions(-)
This patch looks okay, but I had to adapt it to meet changes I made to 1/3. Michal

SmartNIC DPUs may not expose some privileged eswitch operations to the hypervisor hosts. For example, this happens with Bluefield devices running in the ECPF (default) mode for security reasons. While VF MAC address programming is possible via an RTM_SETLINK operation, trying to set a VLAN ID in the same operation will fail with EPERM. The equivalent ip link commands below provide an illustration: 1. This works: sudo ip link set enp130s0f0 vf 2 mac de:ad:be:ef:ca:fe 2. Setting (or clearing) a VLAN fails with EPERM: sudo ip link set enp130s0f0 vf 2 vlan 0 RTNETLINK answers: Operation not permitted 3. This is what Libvirt attempts to do today (when trying to clear a VF VLAN at the same time as programming a VF MAC). sudo ip link set enp130s0f0 vf 2 vlan 0 mac de:ad:be:ef:ca:fe RTNETLINK answers: Operation not permitted If setting an explicit VLAN ID results in an EPERM, clearing a VLAN (setting a VLAN ID to 0) can be handled gracefully by ignoring the EPERM error with the rationale being that if we cannot set this state in the first place, we cannot clear it either. In order to keep explicit clearing of VLAN ID working as it used to be passing a NULL pointer for VLAN ID is used. Signed-off-by: Dmitrii Shcherbakov <dmitrii.shcherbakov@canonical.com> --- NEWS.rst | 14 ++++++++++++++ src/util/virnetdev.c | 11 ++++++++++- tests/virnetdevtest.c | 2 +- 3 files changed, 25 insertions(+), 2 deletions(-) diff --git a/NEWS.rst b/NEWS.rst index 666a593b58..f5453257ab 100644 --- a/NEWS.rst +++ b/NEWS.rst @@ -34,6 +34,20 @@ v8.1.0 (unreleased) to parse sysconfig files, in case they are created by the admin and filled with the desired key=value pairs. + * virnetdev: Ignore EPERM on implicit clearing of VF VLAN ID + + Libvirt will now ignore EPERM errors on attempts to implicitly clear a + VLAN ID (when a VLAN is not explicitly provided via an interface XML + using a 0 or a non-zero value) as SmartNIC DPUs do not expose VLAN + programming capabilities to the hypervisor host. This allows Libvirt + clients to avoid specifying a VLAN and expect VF configuration to work + since Libvirt tries to clear a VLAN in the same operation + as setting a MAC address for VIR_DOMAIN_NET_TYPE_HOSTDEV devices which + is now split into two distinct operations. EPERM errors received while + trying to program a non-zero VLAN ID or explicitly program a VLAN ID 0 + will still cause errors as before so there is no change in behavior + in those cases. + * **Bug fixes** diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index eacdc3bf1f..cf9056f1bd 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1613,12 +1613,21 @@ virNetDevSetVfVlan(const char *ifname, int vf, const int *vlanid) requestError = virNetDevSendVfSetLinkRequest(ifname, IFLA_VF_VLAN, &ifla_vf_vlan, sizeof(ifla_vf_vlan)); - if (requestError) { + /* If vlanid is NULL - we are attempting to implicitly clear an existing VLAN id. + * An EPERM received at this stage is an indicator that the embedded + * switch is not exposed to this host and the network driver is not + * able to set a VLAN for a VF, whereas the Libvirt client has not + * explicitly configured a VLAN or requested it to be cleared via vid 0. */ + if (requestError == -EPERM && vlanid == NULL) { + rc = 0; + } else if (requestError) { virReportSystemError(-requestError, _("Cannot set interface vlanid to %d " "for ifname %s vf %d"), ifla_vf_vlan.vlan, ifname ? ifname : "(unspecified)", vf); rc = requestError; + } else { + rc = 0; } VIR_DEBUG("RTM_SETLINK %s vf %d vlanid=%d - %s", ifname, vf, diff --git a/tests/virnetdevtest.c b/tests/virnetdevtest.c index f5f54653bb..d73eaec817 100644 --- a/tests/virnetdevtest.c +++ b/tests/virnetdevtest.c @@ -236,7 +236,7 @@ testVirNetDevSetVfVlan(const void *opaque G_GNUC_UNUSED) }; const struct nullVlanTestCase nullVLANTestCases[] = { - { .ifname = "fakeiface-eperm", .vf_num = 1, .rc = -EPERM }, + { .ifname = "fakeiface-eperm", .vf_num = 1, .rc = 0 }, { .ifname = "fakeiface-eagain", .vf_num = 1, .rc = -EAGAIN }, /* Successful requests with vlan id 0 need to have a zero return code. */ { .ifname = "fakeiface-ok", .vf_num = 1, .rc = 0 }, -- 2.32.0

On 2/1/22 09:28, Dmitrii Shcherbakov wrote:
SmartNIC DPUs may not expose some privileged eswitch operations to the hypervisor hosts. For example, this happens with Bluefield devices running in the ECPF (default) mode for security reasons. While VF MAC address programming is possible via an RTM_SETLINK operation, trying to set a VLAN ID in the same operation will fail with EPERM.
The equivalent ip link commands below provide an illustration:
1. This works:
sudo ip link set enp130s0f0 vf 2 mac de:ad:be:ef:ca:fe
2. Setting (or clearing) a VLAN fails with EPERM:
sudo ip link set enp130s0f0 vf 2 vlan 0 RTNETLINK answers: Operation not permitted
3. This is what Libvirt attempts to do today (when trying to clear a VF VLAN at the same time as programming a VF MAC).
sudo ip link set enp130s0f0 vf 2 vlan 0 mac de:ad:be:ef:ca:fe RTNETLINK answers: Operation not permitted
If setting an explicit VLAN ID results in an EPERM, clearing a VLAN (setting a VLAN ID to 0) can be handled gracefully by ignoring the EPERM error with the rationale being that if we cannot set this state in the first place, we cannot clear it either.
In order to keep explicit clearing of VLAN ID working as it used to be passing a NULL pointer for VLAN ID is used.
Signed-off-by: Dmitrii Shcherbakov <dmitrii.shcherbakov@canonical.com> --- NEWS.rst | 14 ++++++++++++++ src/util/virnetdev.c | 11 ++++++++++- tests/virnetdevtest.c | 2 +- 3 files changed, 25 insertions(+), 2 deletions(-)
diff --git a/NEWS.rst b/NEWS.rst index 666a593b58..f5453257ab 100644 --- a/NEWS.rst +++ b/NEWS.rst @@ -34,6 +34,20 @@ v8.1.0 (unreleased) to parse sysconfig files, in case they are created by the admin and filled with the desired key=value pairs.
+ * virnetdev: Ignore EPERM on implicit clearing of VF VLAN ID + + Libvirt will now ignore EPERM errors on attempts to implicitly clear a + VLAN ID (when a VLAN is not explicitly provided via an interface XML + using a 0 or a non-zero value) as SmartNIC DPUs do not expose VLAN + programming capabilities to the hypervisor host. This allows Libvirt + clients to avoid specifying a VLAN and expect VF configuration to work + since Libvirt tries to clear a VLAN in the same operation + as setting a MAC address for VIR_DOMAIN_NET_TYPE_HOSTDEV devices which + is now split into two distinct operations. EPERM errors received while + trying to program a non-zero VLAN ID or explicitly program a VLAN ID 0 + will still cause errors as before so there is no change in behavior + in those cases. + * **Bug fixes**
Any NEWS.rst change MUST (in sense of RFC2119) be done in a separate patch. The reason is that when somebody backports these patches onto one of previous releases then they would get needless conflict only because of this file.
diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index eacdc3bf1f..cf9056f1bd 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1613,12 +1613,21 @@ virNetDevSetVfVlan(const char *ifname, int vf, const int *vlanid) requestError = virNetDevSendVfSetLinkRequest(ifname, IFLA_VF_VLAN, &ifla_vf_vlan, sizeof(ifla_vf_vlan));
Again, nothing technically wrong here, I had to adopt it to the changes I made to previous patches. Michal

The reason is that when somebody backports these patches onto one of previous releases then they would get needless conflict only because of this file.
Ack, I'll make a note of that for the future changes, thanks guiding me with this! Best Regards, Dmitrii Shcherbakov LP/MM/oftc: dmitriis On Wed, Feb 2, 2022 at 9:04 PM Michal Prívozník <mprivozn@redhat.com> wrote:
On 2/1/22 09:28, Dmitrii Shcherbakov wrote:
SmartNIC DPUs may not expose some privileged eswitch operations to the hypervisor hosts. For example, this happens with Bluefield devices running in the ECPF (default) mode for security reasons. While VF MAC address programming is possible via an RTM_SETLINK operation, trying to set a VLAN ID in the same operation will fail with EPERM.
The equivalent ip link commands below provide an illustration:
1. This works:
sudo ip link set enp130s0f0 vf 2 mac de:ad:be:ef:ca:fe
2. Setting (or clearing) a VLAN fails with EPERM:
sudo ip link set enp130s0f0 vf 2 vlan 0 RTNETLINK answers: Operation not permitted
3. This is what Libvirt attempts to do today (when trying to clear a VF VLAN at the same time as programming a VF MAC).
sudo ip link set enp130s0f0 vf 2 vlan 0 mac de:ad:be:ef:ca:fe RTNETLINK answers: Operation not permitted
If setting an explicit VLAN ID results in an EPERM, clearing a VLAN (setting a VLAN ID to 0) can be handled gracefully by ignoring the EPERM error with the rationale being that if we cannot set this state in the first place, we cannot clear it either.
In order to keep explicit clearing of VLAN ID working as it used to be passing a NULL pointer for VLAN ID is used.
Signed-off-by: Dmitrii Shcherbakov <dmitrii.shcherbakov@canonical.com> --- NEWS.rst | 14 ++++++++++++++ src/util/virnetdev.c | 11 ++++++++++- tests/virnetdevtest.c | 2 +- 3 files changed, 25 insertions(+), 2 deletions(-)
diff --git a/NEWS.rst b/NEWS.rst index 666a593b58..f5453257ab 100644 --- a/NEWS.rst +++ b/NEWS.rst @@ -34,6 +34,20 @@ v8.1.0 (unreleased) to parse sysconfig files, in case they are created by the admin and filled with the desired key=value pairs.
+ * virnetdev: Ignore EPERM on implicit clearing of VF VLAN ID + + Libvirt will now ignore EPERM errors on attempts to implicitly clear a + VLAN ID (when a VLAN is not explicitly provided via an interface XML + using a 0 or a non-zero value) as SmartNIC DPUs do not expose VLAN + programming capabilities to the hypervisor host. This allows Libvirt + clients to avoid specifying a VLAN and expect VF configuration to work + since Libvirt tries to clear a VLAN in the same operation + as setting a MAC address for VIR_DOMAIN_NET_TYPE_HOSTDEV devices which + is now split into two distinct operations. EPERM errors received while + trying to program a non-zero VLAN ID or explicitly program a VLAN ID 0 + will still cause errors as before so there is no change in behavior + in those cases. + * **Bug fixes**
Any NEWS.rst change MUST (in sense of RFC2119) be done in a separate patch. The reason is that when somebody backports these patches onto one of previous releases then they would get needless conflict only because of this file.
diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index eacdc3bf1f..cf9056f1bd 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1613,12 +1613,21 @@ virNetDevSetVfVlan(const char *ifname, int vf,
const int *vlanid)
requestError = virNetDevSendVfSetLinkRequest(ifname, IFLA_VF_VLAN, &ifla_vf_vlan,
sizeof(ifla_vf_vlan));
Again, nothing technically wrong here, I had to adopt it to the changes I made to previous patches.
Michal

On 2/1/22 09:28, Dmitrii Shcherbakov wrote:
SmartNIC DPUs may not expose some privileged eswitch operations to the hypervisor hosts. For example, this happens with Bluefield devices running in the ECPF (default) mode [1] for security reasons. While VF MAC address programming is possible via an RTM_SETLINK operation, trying to set a VLAN ID in the same operation may fail with EPERM.
Specifically for the mlx5 driver this behavior was altered in the Linux kernel upstream [2] to avoid getting EPERM when trying to program VLAN 0. This may also get backported to older downstream kernels (e.g. [3]). However, Libvirt could potentially handle this case gracefully without needing a specific kernel version or depending on a specific driver fix.
In the kernel a relevant call chain may look like
do_setlink -> do_setvfinfo -> dev->netdev_ops->set_vf_vlan
which calls a driver-specific function like [4] eventually.
The equivalent ip link commands below provide an illustration:
1. This will work without an error:
sudo ip link set enp130s0f0 vf 2 mac de:ad:be:ef:ca:fe
2. Setting (or clearing) a VLAN will fail with EPERM:
sudo ip link set enp130s0f0 vf 2 vlan 0 RTNETLINK answers: Operation not permitted
3. This is what Libvirt attempts to do today when trying to clear a VF VLAN at the same time as programming a VF MAC:
sudo ip link set enp130s0f0 vf 2 vlan 0 mac de:ad:be:ef:ca:fe RTNETLINK answers: Operation not permitted
If setting an explicit VLAN ID results in an EPERM, clearing a VLAN (setting a VLAN ID to 0) can be handled gracefully by ignoring the EPERM error with the rationale being that if we cannot set this state in the first place, we cannot clear it either.
Thus, virNetDevSetVfConfig is split into two distinct functions. If clearing a VLAN ID fails with EPERM when clearing is implicit, the error is simply ignored. For explicit clearing EPERM is still a fatal error.
Both new functions rely virNetDevSendVfSetLinkRequest that implements common functionality related to formatting a request, sending it and handling error conditions and returns 0 or an error since in both cases the payload is either NLMSG_DONE (no error) or NLMSG_ERROR where an error message is needed by the caller to handle known cases appropriately. This function allows the conditional code to be unit tested.
An alternative to this could be providing a higher level control plane mechanism that would provide metadata about a device being remotely managed in which case Libvirt would avoid trying to set or clear a VLAN ID. This would be more complicated since other software (like Nova in the OpenStack case) would have to annotate every guest device with an attribute indicating whether a device is remotely managed or not based on operator provided configuration so that Libvirt can act on this and avoid VLAN programming.
https://gitlab.com/dmitriis/libvirt/-/pipelines/460293963
v8 change:
* Rebased on top of the latest changes to Libvirt; * Added relevant upstream Linux and downstream kernel references to this cover letter.
[1] https://docs.mellanox.com/display/BlueFieldSWv35111601/Modes+of+Operation#Mo... [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... [3] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1957753 [4] https://github.com/torvalds/linux/blob/v5.15/drivers/net/ethernet/mellanox/m...
Dmitrii Shcherbakov (3): Set VF MAC and VLAN ID in two different operations Allow VF vlanid to be passed as a pointer Ignore EPERM on implicit clearing of VF VLAN ID
NEWS.rst | 14 ++ src/hypervisor/virhostdev.c | 4 +- src/libvirt_private.syms | 7 + src/util/virnetdev.c | 256 +++++++++++++++++++++++++----------- src/util/virnetdevpriv.h | 44 +++++++ tests/virnetdevtest.c | 249 ++++++++++++++++++++++++++++++++++- 6 files changed, 496 insertions(+), 78 deletions(-) create mode 100644 src/util/virnetdevpriv.h
Sorry for allowing this go to v8 without proper review. To make it up to you, here's my suggestion: problems I've raised in my review are easy to solve. I've uploaded the 'fixup' commits onto my gilab here: https://gitlab.com/MichalPrivoznik/libvirt/-/commits/vf_vlan_id/ and if you agree, I'd squash them in respective commits and merge. Laine, any thoughts? Michal

Hi Michal, No problem, thanks for coming back to re-review it, I also acknowledge that it was late in the year and during the holiday season so things got slower.
and if you agree, I'd squash them in respective commits and merge.
I looked at the fixup commits - I agree with the changes and with squashing them so it's a +1 from me, thanks a lot for doing this! Best Regards, Dmitrii Shcherbakov LP/MM/oftc: dmitriis On Wed, Feb 2, 2022 at 9:04 PM Michal Prívozník <mprivozn@redhat.com> wrote:
On 2/1/22 09:28, Dmitrii Shcherbakov wrote:
SmartNIC DPUs may not expose some privileged eswitch operations to the hypervisor hosts. For example, this happens with Bluefield devices running in the ECPF (default) mode [1] for security reasons. While VF MAC address programming is possible via an RTM_SETLINK operation, trying to set a VLAN ID in the same operation may fail with EPERM.
Specifically for the mlx5 driver this behavior was altered in the Linux kernel upstream [2] to avoid getting EPERM when trying to program VLAN 0. This may also get backported to older downstream kernels (e.g. [3]). However, Libvirt could potentially handle this case gracefully without needing a specific kernel version or depending on a specific driver fix.
In the kernel a relevant call chain may look like
do_setlink -> do_setvfinfo -> dev->netdev_ops->set_vf_vlan
which calls a driver-specific function like [4] eventually.
The equivalent ip link commands below provide an illustration:
1. This will work without an error:
sudo ip link set enp130s0f0 vf 2 mac de:ad:be:ef:ca:fe
2. Setting (or clearing) a VLAN will fail with EPERM:
sudo ip link set enp130s0f0 vf 2 vlan 0 RTNETLINK answers: Operation not permitted
3. This is what Libvirt attempts to do today when trying to clear a VF VLAN at the same time as programming a VF MAC:
sudo ip link set enp130s0f0 vf 2 vlan 0 mac de:ad:be:ef:ca:fe RTNETLINK answers: Operation not permitted
If setting an explicit VLAN ID results in an EPERM, clearing a VLAN (setting a VLAN ID to 0) can be handled gracefully by ignoring the EPERM error with the rationale being that if we cannot set this state in the first place, we cannot clear it either.
Thus, virNetDevSetVfConfig is split into two distinct functions. If clearing a VLAN ID fails with EPERM when clearing is implicit, the error is simply ignored. For explicit clearing EPERM is still a fatal error.
Both new functions rely virNetDevSendVfSetLinkRequest that implements common functionality related to formatting a request, sending it and handling error conditions and returns 0 or an error since in both cases the payload is either NLMSG_DONE (no error) or NLMSG_ERROR where an error message is needed by the caller to handle known cases appropriately. This function allows the conditional code to be unit tested.
An alternative to this could be providing a higher level control plane mechanism that would provide metadata about a device being remotely managed in which case Libvirt would avoid trying to set or clear a VLAN ID. This would be more complicated since other software (like Nova in the OpenStack case) would have to annotate every guest device with an attribute indicating whether a device is remotely managed or not based on operator provided configuration so that Libvirt can act on this and avoid VLAN programming.
https://gitlab.com/dmitriis/libvirt/-/pipelines/460293963
v8 change:
* Rebased on top of the latest changes to Libvirt; * Added relevant upstream Linux and downstream kernel references to this cover letter.
[1] https://docs.mellanox.com/display/BlueFieldSWv35111601/Modes+of+Operation#Mo... [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... [3] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1957753 [4] https://github.com/torvalds/linux/blob/v5.15/drivers/net/ethernet/mellanox/m...
Dmitrii Shcherbakov (3): Set VF MAC and VLAN ID in two different operations Allow VF vlanid to be passed as a pointer Ignore EPERM on implicit clearing of VF VLAN ID
NEWS.rst | 14 ++ src/hypervisor/virhostdev.c | 4 +- src/libvirt_private.syms | 7 + src/util/virnetdev.c | 256 +++++++++++++++++++++++++----------- src/util/virnetdevpriv.h | 44 +++++++ tests/virnetdevtest.c | 249 ++++++++++++++++++++++++++++++++++- 6 files changed, 496 insertions(+), 78 deletions(-) create mode 100644 src/util/virnetdevpriv.h
Sorry for allowing this go to v8 without proper review. To make it up to you, here's my suggestion: problems I've raised in my review are easy to solve. I've uploaded the 'fixup' commits onto my gilab here:
https://gitlab.com/MichalPrivoznik/libvirt/-/commits/vf_vlan_id/
and if you agree, I'd squash them in respective commits and merge.
Laine, any thoughts?
Michal

Squashed & auto-tested here: https://gitlab.com/dmitriis/libvirt/-/commits/2021-ignore-eperm-for-vid0 https://gitlab.com/dmitriis/libvirt/-/pipelines/462532823 Best Regards, Dmitrii Shcherbakov LP/MM/oftc: dmitriis On Wed, Feb 2, 2022 at 11:10 PM Dmitrii Shcherbakov <dmitrii.shcherbakov@canonical.com> wrote:
Hi Michal,
No problem, thanks for coming back to re-review it, I also acknowledge that it was late in the year and during the holiday season so things got slower.
and if you agree, I'd squash them in respective commits and merge.
I looked at the fixup commits - I agree with the changes and with squashing them so it's a +1 from me, thanks a lot for doing this!
Best Regards, Dmitrii Shcherbakov LP/MM/oftc: dmitriis
On Wed, Feb 2, 2022 at 9:04 PM Michal Prívozník <mprivozn@redhat.com> wrote:
On 2/1/22 09:28, Dmitrii Shcherbakov wrote:
SmartNIC DPUs may not expose some privileged eswitch operations to the hypervisor hosts. For example, this happens with Bluefield devices running in the ECPF (default) mode [1] for security reasons. While VF MAC address programming is possible via an RTM_SETLINK operation, trying to set a VLAN ID in the same operation may fail with EPERM.
Specifically for the mlx5 driver this behavior was altered in the Linux kernel upstream [2] to avoid getting EPERM when trying to program VLAN 0. This may also get backported to older downstream kernels (e.g. [3]). However, Libvirt could potentially handle this case gracefully without needing a specific kernel version or depending on a specific driver fix.
In the kernel a relevant call chain may look like
do_setlink -> do_setvfinfo -> dev->netdev_ops->set_vf_vlan
which calls a driver-specific function like [4] eventually.
The equivalent ip link commands below provide an illustration:
1. This will work without an error:
sudo ip link set enp130s0f0 vf 2 mac de:ad:be:ef:ca:fe
2. Setting (or clearing) a VLAN will fail with EPERM:
sudo ip link set enp130s0f0 vf 2 vlan 0 RTNETLINK answers: Operation not permitted
3. This is what Libvirt attempts to do today when trying to clear a VF VLAN at the same time as programming a VF MAC:
sudo ip link set enp130s0f0 vf 2 vlan 0 mac de:ad:be:ef:ca:fe RTNETLINK answers: Operation not permitted
If setting an explicit VLAN ID results in an EPERM, clearing a VLAN (setting a VLAN ID to 0) can be handled gracefully by ignoring the EPERM error with the rationale being that if we cannot set this state in the first place, we cannot clear it either.
Thus, virNetDevSetVfConfig is split into two distinct functions. If clearing a VLAN ID fails with EPERM when clearing is implicit, the error is simply ignored. For explicit clearing EPERM is still a fatal error.
Both new functions rely virNetDevSendVfSetLinkRequest that implements common functionality related to formatting a request, sending it and handling error conditions and returns 0 or an error since in both cases the payload is either NLMSG_DONE (no error) or NLMSG_ERROR where an error message is needed by the caller to handle known cases appropriately. This function allows the conditional code to be unit tested.
An alternative to this could be providing a higher level control plane mechanism that would provide metadata about a device being remotely managed in which case Libvirt would avoid trying to set or clear a VLAN ID. This would be more complicated since other software (like Nova in the OpenStack case) would have to annotate every guest device with an attribute indicating whether a device is remotely managed or not based on operator provided configuration so that Libvirt can act on this and avoid VLAN programming.
https://gitlab.com/dmitriis/libvirt/-/pipelines/460293963
v8 change:
* Rebased on top of the latest changes to Libvirt; * Added relevant upstream Linux and downstream kernel references to this cover letter.
[1] https://docs.mellanox.com/display/BlueFieldSWv35111601/Modes+of+Operation#Mo... [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... [3] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1957753 [4] https://github.com/torvalds/linux/blob/v5.15/drivers/net/ethernet/mellanox/m...
Dmitrii Shcherbakov (3): Set VF MAC and VLAN ID in two different operations Allow VF vlanid to be passed as a pointer Ignore EPERM on implicit clearing of VF VLAN ID
NEWS.rst | 14 ++ src/hypervisor/virhostdev.c | 4 +- src/libvirt_private.syms | 7 + src/util/virnetdev.c | 256 +++++++++++++++++++++++++----------- src/util/virnetdevpriv.h | 44 +++++++ tests/virnetdevtest.c | 249 ++++++++++++++++++++++++++++++++++- 6 files changed, 496 insertions(+), 78 deletions(-) create mode 100644 src/util/virnetdevpriv.h
Sorry for allowing this go to v8 without proper review. To make it up to you, here's my suggestion: problems I've raised in my review are easy to solve. I've uploaded the 'fixup' commits onto my gilab here:
https://gitlab.com/MichalPrivoznik/libvirt/-/commits/vf_vlan_id/
and if you agree, I'd squash them in respective commits and merge.
Laine, any thoughts?
Michal

On 2/2/22 1:04 PM, Michal Prívozník wrote:
Laine, any thoughts?
I somehow thought this had already been pushed, so I was surprised when it showed up again :-/ I think the only issue I had before was that I thought the new unit tests were more testing the test setup than the actual code, but Dan convinced me otherwise. So I'm fine with this change. A newly discovered (but pre-existing, so non-consequential to these patches) problem associated with vlans is that we don't differentiate between "set vlan 0" and "don't set any vlan", which I hadn't even considered before, until this BZ came up: https://bugzilla.redhat.com/2035726 (The reporter is trying to use <tag id='-1'/> to "unset" the vlan tag for an interface)

On 2/3/22 15:45, Laine Stump wrote:
On 2/2/22 1:04 PM, Michal Prívozník wrote:
Laine, any thoughts?
I somehow thought this had already been pushed, so I was surprised when it showed up again :-/
I think the only issue I had before was that I thought the new unit tests were more testing the test setup than the actual code, but Dan convinced me otherwise.
So I'm fine with this change.
Cool, pushed now.
A newly discovered (but pre-existing, so non-consequential to these patches) problem associated with vlans is that we don't differentiate between "set vlan 0" and "don't set any vlan", which I hadn't even considered before, until this BZ came up:
https://bugzilla.redhat.com/2035726
(The reporter is trying to use <tag id='-1'/> to "unset" the vlan tag for an interface)
Ah, but the bug report is for openvswitch port profile, so that's doubly unrelated :-) Michal

Thanks again for the feedback - much appreciated! Best Regards, Dmitrii Shcherbakov LP/MM/oftc: dmitriis On Thu, Feb 3, 2022 at 6:52 PM Michal Prívozník <mprivozn@redhat.com> wrote:
On 2/3/22 15:45, Laine Stump wrote:
On 2/2/22 1:04 PM, Michal Prívozník wrote:
Laine, any thoughts?
I somehow thought this had already been pushed, so I was surprised when it showed up again :-/
I think the only issue I had before was that I thought the new unit tests were more testing the test setup than the actual code, but Dan convinced me otherwise.
So I'm fine with this change.
Cool, pushed now.
A newly discovered (but pre-existing, so non-consequential to these patches) problem associated with vlans is that we don't differentiate between "set vlan 0" and "don't set any vlan", which I hadn't even considered before, until this BZ came up:
https://bugzilla.redhat.com/2035726
(The reporter is trying to use <tag id='-1'/> to "unset" the vlan tag for an interface)
Ah, but the bug report is for openvswitch port profile, so that's doubly unrelated :-)
Michal

On 2/3/22 10:52 AM, Michal Prívozník wrote:
On 2/3/22 15:45, Laine Stump wrote:
On 2/2/22 1:04 PM, Michal Prívozník wrote:
Laine, any thoughts?
I somehow thought this had already been pushed, so I was surprised when it showed up again :-/
I think the only issue I had before was that I thought the new unit tests were more testing the test setup than the actual code, but Dan convinced me otherwise.
So I'm fine with this change.
Cool, pushed now.
A newly discovered (but pre-existing, so non-consequential to these patches) problem associated with vlans is that we don't differentiate between "set vlan 0" and "don't set any vlan", which I hadn't even considered before, until this BZ came up:
https://bugzilla.redhat.com/2035726
(The reporter is trying to use <tag id='-1'/> to "unset" the vlan tag for an interface)
Ah, but the bug report is for openvswitch port profile, so that's doubly unrelated :-)
Yes and now. The original report was about openvswitch ports, but then the reporter also tried things with hostdev SRIOV devices and macvtap passthrough devices (with varying results). I'm sure that whatever is done to fix it will be deeply intertwined and complicated by the current topic (since the whole point of these patches is that smartnics want/need to just leave the entire vlan tag thing alone) :-) Additionally, Dmitrii's change might make it possible to easily/simply fix the case of updating vlan tag for existing macvtap passthrough and hostdev SRIOV VFs (since we can now set the vlan tag without doing anything else). So for once there is an unintended *good* side effect!
participants (3)
-
Dmitrii Shcherbakov
-
Laine Stump
-
Michal Prívozník