[libvirt] [PATCH 00/19] Fix saving/setting/restoring SR-IOV VF MAC address

There are multiple bugs filed against both libvirt and the kernel related to incorrect restoration of MAC addresses for SR-IOV VFs, both when used via VFIO device assignment and when used for macvtap passthrough mode https://bugzilla.redhat.com/1415609 - libvirt https://bugzilla.redhat.com/1341248 - kernel (igb) https://bugzilla.redhat.com/1415609 - kernel (ixgbe) https://bugzilla.redhat.com/1302166 - kernel (mlx, fixed) Beyond that, there are other problems with incorrect setting and restoration of VF MAC addresses that don't have their own bug reports (although they may be partially described in the comments of the BZes mentioned above). This patch series attempts (and I hope succeeds!) to fix all of these problems, which can be summarized in these three points: * The VF's own MAC address was not being set prior to using it for macvtap passthrough mode. Instead, due to serious misconception on my part, the "admin MAC" for that VF was set in the PF. The problem is that the admin MAC isn't put into use on the VF immediately; rather it is unused until the next time the VF driver is initialized (on either the host or in a guest) so setting it had no practical effect, meaning that the macvtap device's MAC didn't match the MAC of the VF device it was attached to - this meant that traffic wouldn't pass unless the device was in promiscuous mode (and apparently it was back when the code was changed to behave this way?). * The VF's own MAC address was never restored after it had been used (for both VFIO device assignment and for macvtap passthrough mode). Only the "admin MAC" address (which, again, won't take effect until/unless the VF driver is reloaded/reinitialized) was restored. * If the original admin MAC was 00:00:00:00:00:00, libvirt would save that, then attempt to restore it when the guest was finished with the device, but for some/many SRIOV-capable net drivers, an all 0 MAC is not allowed to be set (even though that is its initial value). This would result in the MAC not being changed (of course, since this is the admin MAC, that didn't actually matter, but the combination of the error message and the VF's own MAC still being set to the value used by the guest, users mistakenly believed this was the source of networking problems (e.g. when the guest moved to another host, but the old host still had an interface showing its MAC address). There are lots of details in the patches. 01 - 07 just clean up and slightly modify some existing utility functions. 08 - 11 define some functions to save/restore netdev MAC/vlan settings, and 12-14 change the existing setup/teardown code for macvtap and hostdev to use the new functions, then 15 and 17-19 modify their behavior further, while 16 just removes functions that are no longer used. In the end, the VF's own MAC *and* admin MAC are saved for all SRIOV VFs when we begin using a VF, and the VF's MAC is restored when finished. Since the admin MAC doesn't actually have any immediate practical effect, we don't concern ourselves with assuring it is restored in all cases (in particular, when use use the VF for VFIO assignment, we only restore the VF's MAC, but not the admin MAC during teardown, and when using the VF for macvtap passthrough we avoid restoring the admin MAC because that would unnecessarily turn on the "administratively set" flag in the PF driver (described in the commit log for one of these patches). I might decide to fix the former later, but for now it just unnecessarily complicates the code for no real gain). Laine Stump (19): util: permit querying a VF MAC address or VLAN tag by itself util: remove unused args from virNetDevSetVfConfig() util: use cleanup label consistently in virHostdevNetConfigReplace() util: eliminate useless local variable util: make virMacAddrParse more versatile util: change virPCIGetNetName() to not return error if device has no net name util: make virPCIGetDeviceAddressFromSysfsLink() public util: new function virPCIDeviceRebind() util: new internal function to permit silent failure of virNetDevSetMAC() util: new function virNetDevPFGetVF() util: new functions virNetDev(Save|Read|Set)NetConfig() util: use new virNetDev*NetConfig() functions for macvtap setup/teardown util: use new virNetDev*NetConfig() functions for hostdev setup/teardown util: replace virHostdevNetConfigReplace with ...(Save|Set)NetConfig() util: save hostdev network device config before unbinding from host driver util: after hostdev assignment, restore VF MAC address via setting admin MAC util: remove unused functions from virnetdev.c util: if setting admin MAC to 00:00:00:00:00:00 fails, try 02:00:00:00:00:00 util: try *really* hard to set the MAC address of an SRIOV VF src/libvirt_private.syms | 10 +- src/util/virhostdev.c | 171 ++++++-- src/util/virmacaddr.c | 2 +- src/util/virnetdev.c | 937 +++++++++++++++++++++++++++++++------------- src/util/virnetdev.h | 25 ++ src/util/virnetdevmacvlan.c | 45 ++- src/util/virpci.c | 65 ++- src/util/virpci.h | 4 + 8 files changed, 933 insertions(+), 326 deletions(-) -- 2.9.3

virNetDevParseVfConfig() assumed that both the MAC address and VLAN tag pointers were valid, so even if you only wanted one or the other, you would need a variable to hold the returned value for both. This patch checks each for a NULL pointer before filling it in. --- src/util/virnetdev.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index d123248..75f969d 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1534,7 +1534,7 @@ virNetDevParseVfConfig(struct nlattr **tb, int32_t vf, virMacAddrPtr mac, goto cleanup; } - if (tb[IFLA_VF_MAC]) { + if (mac && tb[IFLA_VF_MAC]) { vf_mac = RTA_DATA(tb_vf[IFLA_VF_MAC]); if (vf_mac && vf_mac->vf == vf) { virMacAddrSetRaw(mac, vf_mac->mac); @@ -1542,7 +1542,7 @@ virNetDevParseVfConfig(struct nlattr **tb, int32_t vf, virMacAddrPtr mac, } } - if (tb[IFLA_VF_VLAN]) { + if (vlanid && tb[IFLA_VF_VLAN]) { vf_vlan = RTA_DATA(tb_vf[IFLA_VF_VLAN]); if (vf_vlan && vf_vlan->vf == vf) { *vlanid = vf_vlan->vlan; -- 2.9.3

This function is only called in two places, and the ifindex, nltarget_kernel, and getPidFunc args are never used (and never will be). ifindex - we always know the name of the device, and never know the ifindex - if we really did need the ifindex we would have to get it from the name using virNetDevGetIndex(). In practice, we just send -1 to virNetDevSetVfConfig(), which doesn't bother to learn the real ifindex (you only need a name *or* an ifindex for the netlink command to succeed, not both). nltarget_kernel - messages to set the config of an SRIOV VF will always go to netlink in the kernel, not to another user process, so this arg is always true (there are other uses of netlink messages where the message might need to go to another user process, but never in the case of RTM_SETLINK for SRIOV). getPidFunc - this arg is only used if nltarget_kernel is false, and it never is. None of this has any functional effect, it just makes it easier to follow what's happening when virNetDevSetVfConfig() is called. --- src/util/virnetdev.c | 30 ++++++++---------------------- 1 file changed, 8 insertions(+), 22 deletions(-) diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 75f969d..30a4a01 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1378,20 +1378,18 @@ static struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = { static int -virNetDevSetVfConfig(const char *ifname, int ifindex, int vf, - bool nltarget_kernel, const virMacAddr *macaddr, - int vlanid, uint32_t (*getPidFunc)(void)) +virNetDevSetVfConfig(const char *ifname, int vf, + const virMacAddr *macaddr, int vlanid) { int rc = -1; struct nlmsghdr *resp = NULL; struct nlmsgerr *err; unsigned int recvbuflen = 0; - uint32_t pid = 0; struct nl_msg *nl_msg; struct nlattr *vfinfolist, *vfinfo; struct ifinfomsg ifinfo = { .ifi_family = AF_UNSPEC, - .ifi_index = ifindex + .ifi_index = -1, }; if (!macaddr && vlanid < 0) @@ -1445,15 +1443,7 @@ virNetDevSetVfConfig(const char *ifname, int ifindex, int vf, nla_nest_end(nl_msg, vfinfo); nla_nest_end(nl_msg, vfinfolist); - if (!nltarget_kernel) { - pid = getPidFunc(); - if (pid == 0) { - rc = -1; - goto cleanup; - } - } - - if (virNetlinkCommand(nl_msg, &resp, &recvbuflen, 0, pid, + if (virNetlinkCommand(nl_msg, &resp, &recvbuflen, 0, 0, NETLINK_ROUTE, 0) < 0) goto cleanup; @@ -1471,13 +1461,13 @@ virNetDevSetVfConfig(const char *ifname, int ifindex, int vf, virReportSystemError(-err->error, _("Cannot set interface MAC/vlanid to %s/%d " - "for ifname %s ifindex %d vf %d"), + "for ifname %s vf %d"), (macaddr ? virMacAddrFormat(macaddr, macstr) : "(unchanged)"), vlanid, ifname ? ifname : "(unspecified)", - ifindex, vf); + vf); goto cleanup; } break; @@ -1593,7 +1583,6 @@ virNetDevReplaceVfConfig(const char *pflinkdev, int vf, char *path = NULL; char macstr[VIR_MAC_STRING_BUFLEN]; char *fileData = NULL; - int ifindex = -1; bool pfIsOnline; /* Assure that PF is online prior to twiddling with the VF. It @@ -1633,8 +1622,7 @@ virNetDevReplaceVfConfig(const char *pflinkdev, int vf, goto cleanup; } - ret = virNetDevSetVfConfig(pflinkdev, ifindex, vf, true, - macaddress, vlanid, NULL); + ret = virNetDevSetVfConfig(pflinkdev, vf, macaddress, vlanid); cleanup: VIR_FREE(path); @@ -1653,7 +1641,6 @@ virNetDevRestoreVfConfig(const char *pflinkdev, char *vlan = NULL; virMacAddr oldmac; int vlanid = -1; - int ifindex = -1; if (virAsprintf(&path, "%s/%s_vf%d", stateDir, pflinkdev, vf) < 0) @@ -1696,8 +1683,7 @@ virNetDevRestoreVfConfig(const char *pflinkdev, } /*reset mac and remove file-ignore results*/ - rc = virNetDevSetVfConfig(pflinkdev, ifindex, vf, true, - &oldmac, vlanid, NULL); + rc = virNetDevSetVfConfig(pflinkdev, vf, &oldmac, vlanid); ignore_value(unlink(path)); cleanup: -- 2.9.3

This will make an upcoming functional change more straightforward. --- src/util/virhostdev.c | 25 +++++++++++++++---------- 1 file changed, 15 insertions(+), 10 deletions(-) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 86ca8e0..a967182 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -411,15 +411,14 @@ virHostdevNetConfigReplace(virDomainHostdevDefPtr hostdev, virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Interface type hostdev is currently supported on" " SR-IOV Virtual Functions only")); - return ret; + goto cleanup; } if (virHostdevNetDevice(hostdev, &linkdev, &vf) < 0) - return ret; + goto cleanup; vlan = virDomainNetGetActualVlan(hostdev->parent.data.net); - virtPort = virDomainNetGetActualVirtPortProfile( - hostdev->parent.data.net); + virtPort = virDomainNetGetActualVirtPortProfile(hostdev->parent.data.net); if (virtPort) { if (vlan) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -428,15 +427,21 @@ virHostdevNetConfigReplace(virDomainHostdevDefPtr hostdev, virNetDevVPortTypeToString(virtPort->virtPortType)); goto cleanup; } - ret = virHostdevNetConfigVirtPortProfile(linkdev, vf, - virtPort, &hostdev->parent.data.net->mac, uuid, - port_profile_associate); + if (virHostdevNetConfigVirtPortProfile(linkdev, vf, virtPort, + &hostdev->parent.data.net->mac, + uuid, port_profile_associate) < 0) { + goto cleanup; + } } else { /* Set only mac and vlan */ - ret = virNetDevReplaceNetConfig(linkdev, vf, - &hostdev->parent.data.net->mac, - vlan, stateDir); + if (virNetDevReplaceNetConfig(linkdev, vf, + &hostdev->parent.data.net->mac, + vlan, stateDir) < 0) { + goto cleanup; + } } + + ret = 0; cleanup: VIR_FREE(linkdev); return ret; -- 2.9.3

vf in virNetDevMacVLanDeleteWithVPortProfile() is initialized to -1 and never set. It's not set for a good reason - because it doesn't make sense during macvtap device setup to refer to a VF device as "PF:VF#". This patch replaces the two uses of "vf" with "-1", and removes the local variable, so that it's more clear we are always calling the utility functions with vf set to -1. --- src/util/virnetdevmacvlan.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/src/util/virnetdevmacvlan.c b/src/util/virnetdevmacvlan.c index b106ca1..be9dff6 100644 --- a/src/util/virnetdevmacvlan.c +++ b/src/util/virnetdevmacvlan.c @@ -1,5 +1,5 @@ /* - * Copyright (C) 2010-2016 Red Hat, Inc. + * Copyright (C) 2010-2017 Red Hat, Inc. * Copyright (C) 2010-2012 IBM Corporation * * This library is free software; you can redistribute it and/or @@ -1179,14 +1179,13 @@ int virNetDevMacVLanDeleteWithVPortProfile(const char *ifname, char *stateDir) { int ret = 0; - int vf = -1; if (ifname) { if (virNetDevVPortProfileDisassociate(ifname, virtPortProfile, macaddr, linkdev, - vf, + -1, VIR_NETDEV_VPORT_PROFILE_OP_DESTROY) < 0) ret = -1; if (virNetDevMacVLanDelete(ifname) < 0) @@ -1199,7 +1198,7 @@ int virNetDevMacVLanDeleteWithVPortProfile(const char *ifname, virtPortProfile->virtPortType == VIR_NETDEV_VPORT_PROFILE_8021QBH) ignore_value(virNetDevRestoreMacAddress(linkdev, stateDir)); else - ignore_value(virNetDevRestoreNetConfig(linkdev, vf, stateDir)); + ignore_value(virNetDevRestoreNetConfig(linkdev, -1, stateDir)); } virNetlinkEventRemoveClient(0, macaddr, NETLINK_ROUTE); -- 2.9.3

Previously the MAC address text was required to be terminated with a NULL. After this, it can be terminated with a space or any control character. --- src/util/virmacaddr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/util/virmacaddr.c b/src/util/virmacaddr.c index 612a409..7afe032 100644 --- a/src/util/virmacaddr.c +++ b/src/util/virmacaddr.c @@ -169,7 +169,7 @@ virMacAddrParse(const char* str, virMacAddrPtr addr) addr->addr[i] = (unsigned char) result; - if ((i == 5) && (*end_ptr == '\0')) + if ((i == 5) && (*end_ptr <= ' ')) return 0; if (*end_ptr != ':') break; -- 2.9.3

...and cleanup the callers to report it when it *is* an error. In many cases It's useful for virPCIGetNetName() to not log an error and simply return a NULL pointer when the given device isn't bound to a net driver (e.g. we're looking at a VF that is permanently bound to vfio-pci). The existing code would silently return an error in this case, which could eventually lead to the dreaded "An error occurred but the cause is unknown" log message. This patch changes virPCIGetNetName() to still return success if the device simply isn't bound to a net driver, and adjusts all the callers that require a non-null netname to check for that condition and log an error when it happens. --- src/util/virhostdev.c | 13 +++++++++++++ src/util/virnetdev.c | 19 +++++++++++++++++-- src/util/virpci.c | 30 ++++++++++++++++++++++-------- 3 files changed, 52 insertions(+), 10 deletions(-) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index a967182..7292cf4 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -317,8 +317,21 @@ virHostdevNetDevice(virDomainHostdevDefPtr hostdev, char **linkdev, vf) < 0) goto cleanup; } else { + /* In practice this should never happen, since we currently + * only support assigning SRIOV VFs via <interface + * type='hostdev'>, and it is only those devices that should + * end up calling this function. + */ if (virPCIGetNetName(sysfs_path, linkdev) < 0) goto cleanup; + + if (!linkdev) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("The device at %s has no network device name"), + sysfs_path); + goto cleanup; + } + *vf = -1; } diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 30a4a01..766638d 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1162,6 +1162,9 @@ virNetDevGetVirtualFunctions(const char *pfname, } if (virPCIGetNetName(pci_sysfs_device_link, &((*vfname)[i])) < 0) + goto cleanup; + + if (!(*vfname)[i]) VIR_INFO("VF does not have an interface name"); } @@ -1258,10 +1261,22 @@ virNetDevGetPhysicalFunction(const char *ifname, char **pfname) if (virNetDevSysfsDeviceFile(&physfn_sysfs_path, ifname, "physfn") < 0) return ret; - ret = virPCIGetNetName(physfn_sysfs_path, pfname); + if (virPCIGetNetName(physfn_sysfs_path, pfname) < 0) + goto cleanup; - VIR_FREE(physfn_sysfs_path); + if (!*pfname) { + /* this shouldn't be possible. A VF can't exist unless its + * PF device is bound to a network driver + */ + virReportError(VIR_ERR_INTERNAL_ERROR, + _("The PF device for VF %s has no network device name"), + ifname); + goto cleanup; + } + ret = 0; + cleanup: + VIR_FREE(physfn_sysfs_path); return ret; } diff --git a/src/util/virpci.c b/src/util/virpci.c index 3c1e13b..337afda 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -2854,8 +2854,11 @@ virPCIGetNetName(char *device_link_sysfs_path, char **netname) return -1; } - if (virDirOpenQuiet(&dir, pcidev_sysfs_net_path) < 0) + if (virDirOpenQuiet(&dir, pcidev_sysfs_net_path) < 0) { + /* this *isn't* an error - caller needs to check for netname == NULL */ + ret = 0; goto out; + } while (virDirRead(dir, &entry, pcidev_sysfs_net_path) > 0) { /* Assume a single directory entry */ @@ -2881,24 +2884,35 @@ virPCIGetVirtualFunctionInfo(const char *vf_sysfs_device_path, int ret = -1; if (virPCIGetPhysicalFunction(vf_sysfs_device_path, &pf_config_address) < 0) - return ret; + goto cleanup; if (!pf_config_address) - return ret; + goto cleanup; if (virPCIDeviceAddressGetSysfsFile(pf_config_address, &pf_sysfs_device_path) < 0) { + goto cleanup; + } - VIR_FREE(pf_config_address); - return ret; + if (virPCIGetVirtualFunctionIndex(pf_sysfs_device_path, + vf_sysfs_device_path, vf_index) < 0) { + goto cleanup; } - if (virPCIGetVirtualFunctionIndex(pf_sysfs_device_path, vf_sysfs_device_path, - vf_index) < 0) + if (virPCIGetNetName(pf_sysfs_device_path, pfname) < 0) goto cleanup; - ret = virPCIGetNetName(pf_sysfs_device_path, pfname); + if (!*pfname) { + /* this shouldn't be possible. A VF can't exist unless its + * PF device is bound to a network driver + */ + virReportError(VIR_ERR_INTERNAL_ERROR, + _("The PF device for VF %s has no network device name"), + vf_sysfs_device_path); + goto cleanup; + } + ret = 0; cleanup: VIR_FREE(pf_config_address); VIR_FREE(pf_sysfs_device_path); -- 2.9.3

This function will be useful in virnetdev.c, so promote it from static. --- src/libvirt_private.syms | 1 + src/util/virpci.c | 10 +++++++++- src/util/virpci.h | 3 +++ 3 files changed, 13 insertions(+), 1 deletion(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 6c89d44..b44a6ee 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2227,6 +2227,7 @@ virPCIDeviceSetUsedBy; virPCIDeviceUnbind; virPCIDeviceWaitForCleanup; virPCIEDeviceInfoFree; +virPCIGetDeviceAddressFromSysfsLink; virPCIGetHeaderType; virPCIGetNetName; virPCIGetPhysicalFunction; diff --git a/src/util/virpci.c b/src/util/virpci.c index 337afda..9878398 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -2594,7 +2594,7 @@ virPCIDeviceAddressIsEqual(virPCIDeviceAddressPtr bdf1, (bdf1->function == bdf2->function)); } -static virPCIDeviceAddressPtr +virPCIDeviceAddressPtr virPCIGetDeviceAddressFromSysfsLink(const char *device_link) { virPCIDeviceAddressPtr bdf = NULL; @@ -2923,6 +2923,14 @@ virPCIGetVirtualFunctionInfo(const char *vf_sysfs_device_path, #else static const char *unsupported = N_("not supported on non-linux platforms"); +virPCIDeviceAddressPtr +virPCIGetDeviceAddressFromSysfsLink(const char *device_link ATTRIBUTE_UNUSED) +{ + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _(unsupported)); + return -1; +} + + int virPCIGetPhysicalFunction(const char *vf_sysfs_path ATTRIBUTE_UNUSED, virPCIDeviceAddressPtr *pf ATTRIBUTE_UNUSED) diff --git a/src/util/virpci.h b/src/util/virpci.h index a5e8d00..4be9cc0 100644 --- a/src/util/virpci.h +++ b/src/util/virpci.h @@ -186,6 +186,9 @@ int virPCIDeviceIsAssignable(virPCIDevicePtr dev, int strict_acs_check); int virPCIDeviceWaitForCleanup(virPCIDevicePtr dev, const char *matcher); +virPCIDeviceAddressPtr +virPCIGetDeviceAddressFromSysfsLink(const char *device_link); + int virPCIGetPhysicalFunction(const char *vf_sysfs_path, virPCIDeviceAddressPtr *pf); -- 2.9.3

This function unbinds a device from its driver, then immediately rebinds it to its driver again. --- src/libvirt_private.syms | 1 + src/util/virpci.c | 25 +++++++++++++++++++++++++ src/util/virpci.h | 1 + 3 files changed, 27 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index b44a6ee..ef027cc 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2217,6 +2217,7 @@ virPCIDeviceListSteal; virPCIDeviceListStealIndex; virPCIDeviceNew; virPCIDeviceReattach; +virPCIDeviceRebind; virPCIDeviceReset; virPCIDeviceSetManaged; virPCIDeviceSetRemoveSlot; diff --git a/src/util/virpci.c b/src/util/virpci.c index 9878398..a007eea 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -1101,6 +1101,31 @@ virPCIDeviceUnbind(virPCIDevicePtr dev) return ret; } + +/** + * virPCIDeviceRebind: + * @dev: virPCIDevice object describing the device to rebind + * + * unbind a device from its driver, then immediately rebind it. + * + * Returns 0 on success, -1 on failure + */ +int virPCIDeviceRebind(virPCIDevicePtr dev) +{ + if (virPCIDeviceUnbind(dev) < 0) + return -1; + + if (virFileWriteStr(PCI_SYSFS "drivers_probe", dev->name, 0) < 0) { + virReportSystemError(errno, + _("Failed to trigger a probe for PCI device '%s'"), + dev->name); + return -1; + } + + return 0; +} + + /* * Bind a PCI device to a driver using driver_override sysfs interface. * E.g. diff --git a/src/util/virpci.h b/src/util/virpci.h index 4be9cc0..8637c2c 100644 --- a/src/util/virpci.h +++ b/src/util/virpci.h @@ -225,6 +225,7 @@ int virPCIGetVirtualFunctionInfo(const char *vf_sysfs_device_path, char **pfname, int *vf_index); int virPCIDeviceUnbind(virPCIDevicePtr dev); +int virPCIDeviceRebind(virPCIDevicePtr dev); int virPCIDeviceGetDriverPathAndName(virPCIDevicePtr dev, char **path, char **name); -- 2.9.3

On 03/10/2017 09:35 PM, Laine Stump wrote:
This function unbinds a device from its driver, then immediately rebinds it to its driver again. --- src/libvirt_private.syms | 1 + src/util/virpci.c | 25 +++++++++++++++++++++++++ src/util/virpci.h | 1 + 3 files changed, 27 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index b44a6ee..ef027cc 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2217,6 +2217,7 @@ virPCIDeviceListSteal; virPCIDeviceListStealIndex; virPCIDeviceNew; virPCIDeviceReattach; +virPCIDeviceRebind; virPCIDeviceReset; virPCIDeviceSetManaged; virPCIDeviceSetRemoveSlot; diff --git a/src/util/virpci.c b/src/util/virpci.c index 9878398..a007eea 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -1101,6 +1101,31 @@ virPCIDeviceUnbind(virPCIDevicePtr dev) return ret; }
+ +/** + * virPCIDeviceRebind: + * @dev: virPCIDevice object describing the device to rebind + * + * unbind a device from its driver, then immediately rebind it. + * + * Returns 0 on success, -1 on failure + */ +int virPCIDeviceRebind(virPCIDevicePtr dev) +{ + if (virPCIDeviceUnbind(dev) < 0) + return -1; + + if (virFileWriteStr(PCI_SYSFS "drivers_probe", dev->name, 0) < 0) { + virReportSystemError(errno, + _("Failed to trigger a probe for PCI device '%s'"), + dev->name); + return -1; + } + + return 0; +} +
This is the same code as in virPCIDeviceBindWithDriverOverride(). ACK if you replace it with call to this function. Michal

On 03/17/2017 09:32 AM, Michal Privoznik wrote:
On 03/10/2017 09:35 PM, Laine Stump wrote:
This function unbinds a device from its driver, then immediately rebinds it to its driver again. --- src/libvirt_private.syms | 1 + src/util/virpci.c | 25 +++++++++++++++++++++++++ src/util/virpci.h | 1 + 3 files changed, 27 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index b44a6ee..ef027cc 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2217,6 +2217,7 @@ virPCIDeviceListSteal; virPCIDeviceListStealIndex; virPCIDeviceNew; virPCIDeviceReattach; +virPCIDeviceRebind; virPCIDeviceReset; virPCIDeviceSetManaged; virPCIDeviceSetRemoveSlot; diff --git a/src/util/virpci.c b/src/util/virpci.c index 9878398..a007eea 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -1101,6 +1101,31 @@ virPCIDeviceUnbind(virPCIDevicePtr dev) return ret; }
+ +/** + * virPCIDeviceRebind: + * @dev: virPCIDevice object describing the device to rebind + * + * unbind a device from its driver, then immediately rebind it. + * + * Returns 0 on success, -1 on failure + */ +int virPCIDeviceRebind(virPCIDevicePtr dev) +{ + if (virPCIDeviceUnbind(dev) < 0) + return -1; + + if (virFileWriteStr(PCI_SYSFS "drivers_probe", dev->name, 0) < 0) { + virReportSystemError(errno, + _("Failed to trigger a probe for PCI device '%s'"), + dev->name); + return -1; + } + + return 0; +} +
This is the same code as in virPCIDeviceBindWithDriverOverride(). ACK if you replace it with call to this function.
Ah, okay. It took me a second to parse your request - at first I was thinking "No, it's not the same as virPCIDeviceBindWithDriverOverride() - there's extra stuff in that function!". Then I realized you meant I should replace those lines in ....WithDriverOverride() with a call to this new function. Sure, that makes sense.
Michal

We will want to allow silent failure of virNetDevSetMAC() in the case that the SIOSIFHWADDR ioctl fails with errno == EADDRNOTAVAIL. (Yes, that is very specific, but we really *do* want a logged failure in all other circumstances, and don't want to duplicate code in the caller for the other possibilities). This patch renames the 3 different virNetDevSetMAC() functions to virNetDevSetMACInternal(), adding a 3rd arg called "quiet" and making them static (because this extra control will only be needed within virnetdev.c). A new global virNetDevSetMAC() is defined that calls whichever of the three *Internal() functions gets compiled with quiet = false. Callers in virnetdev.c that want to notice a failure with errno == EADDRNOTAVAIL and retry with a different strategy rather than immediately failing, can call virNetDevSetMACInternal(..., true). --- src/util/virnetdev.c | 51 +++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 41 insertions(+), 10 deletions(-) diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 766638d..ffc2fb4 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -222,17 +222,20 @@ int virNetDevExists(const char *ifname) #if defined(SIOCGIFHWADDR) && defined(SIOCSIFHWADDR) && \ defined(HAVE_STRUCT_IFREQ) /** - * virNetDevSetMAC: + * virNetDevSetMACInternal: * @ifname: interface name to set MTU for * @macaddr: MAC address + * @quiet: true if a failure to set MAC address with errno == EADDRNOTAVAIL + * should be silent (still returns error, but without log) * - * This function sets the @macaddr for a given interface @ifname. This - * gets rid of the kernel's automatically assigned random MAC. + * This function sets the @macaddr for a given interface @ifname. * * Returns 0 in case of success or -1 on failure */ -int virNetDevSetMAC(const char *ifname, - const virMacAddr *macaddr) +static int +virNetDevSetMACInternal(const char *ifname, + const virMacAddr *macaddr, + bool quiet) { int fd = -1; int ret = -1; @@ -254,6 +257,9 @@ int virNetDevSetMAC(const char *ifname, if (ioctl(fd, SIOCSIFHWADDR, &ifr) < 0) { char macstr[VIR_MAC_STRING_BUFLEN]; + if (quiet && errno == EADDRNOTAVAIL) + goto cleanup; + virReportSystemError(errno, _("Cannot set interface MAC to %s on '%s'"), virMacAddrFormat(macaddr, macstr), ifname); @@ -266,10 +272,16 @@ int virNetDevSetMAC(const char *ifname, VIR_FORCE_CLOSE(fd); return ret; } -#elif defined(SIOCSIFLLADDR) && defined(HAVE_STRUCT_IFREQ) && \ + + +#elif defined(SIOCSIFLLADDR) && defined(HAVE_STRUCT_IFREQ) && \ HAVE_DECL_LINK_ADDR -int virNetDevSetMAC(const char *ifname, - const virMacAddr *macaddr) + + +static int +virNetDevSetMACInternal(const char *ifname, + const virMacAddr *macaddr, + bool quiet) { struct ifreq ifr; struct sockaddr_dl sdl; @@ -288,6 +300,9 @@ int virNetDevSetMAC(const char *ifname, ifr.ifr_addr.sa_len = VIR_MAC_BUFLEN; if (ioctl(s, SIOCSIFLLADDR, &ifr) < 0) { + if (quiet && errno == EADDRNOTAVAIL) + goto cleanup; + virReportSystemError(errno, _("Cannot set interface MAC to %s on '%s'"), mac + 1, ifname); @@ -300,18 +315,34 @@ int virNetDevSetMAC(const char *ifname, return ret; } + + #else -int virNetDevSetMAC(const char *ifname, - const virMacAddr *macaddr ATTRIBUTE_UNUSED) + + +static int +virNetDevSetMACInternal(const char *ifname, + const virMacAddr *macaddr ATTRIBUTE_UNUSED, + bool quiet ATTRIBUTE_UNUSED) { virReportSystemError(ENOSYS, _("Cannot set interface MAC on '%s'"), ifname); return -1; } + + #endif +int +virNetDevSetMAC(const char *ifname, + const virMacAddr *macaddr) +{ + return virNetDevSetMACInternal(ifname, macaddr, false); +} + + #if defined(SIOCGIFHWADDR) && defined(HAVE_STRUCT_IFREQ) /** * virNetDevGetMAC: -- 2.9.3

On 03/10/2017 09:35 PM, Laine Stump wrote:
We will want to allow silent failure of virNetDevSetMAC() in the case that the SIOSIFHWADDR ioctl fails with errno == EADDRNOTAVAIL. (Yes, that is very specific, but we really *do* want a logged failure in all other circumstances, and don't want to duplicate code in the caller for the other possibilities).
This patch renames the 3 different virNetDevSetMAC() functions to virNetDevSetMACInternal(), adding a 3rd arg called "quiet" and making them static (because this extra control will only be needed within virnetdev.c). A new global virNetDevSetMAC() is defined that calls whichever of the three *Internal() functions gets compiled with quiet = false. Callers in virnetdev.c that want to notice a failure with errno == EADDRNOTAVAIL and retry with a different strategy rather than immediately failing, can call virNetDevSetMACInternal(..., true). --- src/util/virnetdev.c | 51 +++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 41 insertions(+), 10 deletions(-)
diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 766638d..ffc2fb4 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -222,17 +222,20 @@ int virNetDevExists(const char *ifname) #if defined(SIOCGIFHWADDR) && defined(SIOCSIFHWADDR) && \ defined(HAVE_STRUCT_IFREQ) /** - * virNetDevSetMAC: + * virNetDevSetMACInternal: * @ifname: interface name to set MTU for * @macaddr: MAC address + * @quiet: true if a failure to set MAC address with errno == EADDRNOTAVAIL + * should be silent (still returns error, but without log) * - * This function sets the @macaddr for a given interface @ifname. This - * gets rid of the kernel's automatically assigned random MAC. + * This function sets the @macaddr for a given interface @ifname. * * Returns 0 in case of success or -1 on failure */ -int virNetDevSetMAC(const char *ifname, - const virMacAddr *macaddr) +static int +virNetDevSetMACInternal(const char *ifname, + const virMacAddr *macaddr, + bool quiet) { int fd = -1; int ret = -1; @@ -254,6 +257,9 @@ int virNetDevSetMAC(const char *ifname, if (ioctl(fd, SIOCSIFHWADDR, &ifr) < 0) { char macstr[VIR_MAC_STRING_BUFLEN];
+ if (quiet && errno == EADDRNOTAVAIL) + goto cleanup; + virReportSystemError(errno, _("Cannot set interface MAC to %s on '%s'"), virMacAddrFormat(macaddr, macstr), ifname);
Frankly, I like functions with quiet = true to be really silent. But I don't care that much. ACK Michal

Given an SRIOV PF netdev name (e.g. "enp2s0f0") and VF#, this new function returns the netdev name of the referenced VF device (e.g. "enp2s11f6"), or NULL if the device isn't bound to a net driver. --- src/libvirt_private.syms | 1 + src/util/virnetdev.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virnetdev.h | 3 +++ 3 files changed, 62 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index ef027cc..e9705ae 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1997,6 +1997,7 @@ virNetDevGetVLanID; virNetDevIfStateTypeFromString; virNetDevIfStateTypeToString; virNetDevIsVirtualFunction; +virNetDevPFGetVF; virNetDevReplaceMacAddress; virNetDevReplaceNetConfig; virNetDevRestoreMacAddress; diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index ffc2fb4..49a11f3 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1311,6 +1311,54 @@ virNetDevGetPhysicalFunction(const char *ifname, char **pfname) return ret; } + +/** + * virNetDevPFGetVF: + * + * @pfname: netdev name of the physical function (PF) + * @vf: virtual function (VF) number for the device of interest + * @vfname: name of the physical function interface name + * + * Finds the netdev name of VF# @vf of SRIOV PF @pfname, and puts it + * in @vfname. The caller must free @vfname when it's finished with + * it. + * + * Returns 0 on success, -1 on failure + * + */ +int +virNetDevPFGetVF(const char *pfname, int vf, char **vfname) +{ + char *virtfnName = NULL; + char *virtfnSysfsPath = NULL; + int ret = -1; + + if (virAsprintf(&virtfnName, "virtfn%d", vf) < 0) + goto cleanup; + + /* this provides the path to the VF's directory in sysfs, + * e.g. "/sys/class/net/enp2s0f0/virtfn3" + */ + if (virNetDevSysfsDeviceFile(&virtfnSysfsPath, pfname, virtfnName) < 0) + goto cleanup; + + /* and this gets the netdev name associated with it, which is a + * directory entry in [virtfnSysfsPath]/net, + * e.g. "/sys/class/net/enp2s0f0/virtfn3/net/enp2s11f4" - in this + * example the VF for enp2s0f0 vf#3 is "enp2s11f4". (If the VF + * isn't bound to a netdev driver, it won't have a netdev name, + * and vfname will be NULL). + */ + ret = virPCIGetNetName(virtfnSysfsPath, vfname); + + cleanup: + VIR_FREE(virtfnName); + VIR_FREE(virtfnSysfsPath); + + return ret; +} + + /** * virNetDevGetVirtualFunctionInfo: * @vfname: name of the virtual function interface @@ -1391,6 +1439,16 @@ virNetDevGetPhysicalFunction(const char *ifname ATTRIBUTE_UNUSED, } int +virNetDevPFGetVF(const char *pfname ATTRIBUTE_UNUSED, + int vf ATTRIBUTE_UNUSED, + char **vfname ATTRUBUTE_UNUSED) +{ + virReportSystemError(ENOSYS, "%s", + _("Unable to get virtual function name on this platform")); + return -1; +} + +int virNetDevGetVirtualFunctionInfo(const char *vfname ATTRIBUTE_UNUSED, char **pfname ATTRIBUTE_UNUSED, int *vf ATTRIBUTE_UNUSED) diff --git a/src/util/virnetdev.h b/src/util/virnetdev.h index 236cf83..ecc28c8 100644 --- a/src/util/virnetdev.h +++ b/src/util/virnetdev.h @@ -178,6 +178,9 @@ int virNetDevGetVirtualFunctionIndex(const char *pfname, const char *vfname, int virNetDevGetPhysicalFunction(const char *ifname, char **pfname) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_RETURN_CHECK; +int virNetDevPFGetVF(const char *pfname, int vf, char **vfname) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK; + int virNetDevGetVirtualFunctions(const char *pfname, char ***vfname, virPCIDeviceAddressPtr **virt_fns, -- 2.9.3

On 03/10/2017 09:35 PM, Laine Stump wrote:
Given an SRIOV PF netdev name (e.g. "enp2s0f0") and VF#, this new function returns the netdev name of the referenced VF device (e.g. "enp2s11f6"), or NULL if the device isn't bound to a net driver. --- src/libvirt_private.syms | 1 + src/util/virnetdev.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++ src/util/virnetdev.h | 3 +++ 3 files changed, 62 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index ef027cc..e9705ae 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1997,6 +1997,7 @@ virNetDevGetVLanID; virNetDevIfStateTypeFromString; virNetDevIfStateTypeToString; virNetDevIsVirtualFunction; +virNetDevPFGetVF; virNetDevReplaceMacAddress; virNetDevReplaceNetConfig; virNetDevRestoreMacAddress; diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index ffc2fb4..49a11f3 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1311,6 +1311,54 @@ virNetDevGetPhysicalFunction(const char *ifname, char **pfname) return ret; }
+ +/** + * virNetDevPFGetVF: + * + * @pfname: netdev name of the physical function (PF) + * @vf: virtual function (VF) number for the device of interest + * @vfname: name of the physical function interface name + * + * Finds the netdev name of VF# @vf of SRIOV PF @pfname, and puts it + * in @vfname. The caller must free @vfname when it's finished with + * it. + * + * Returns 0 on success, -1 on failure
Not entirely true. After your patch of 06/19 virPCIGetNetName() can return 0 (which in turn is the return value of virNetDevPFGetVF()), and still not fetch vfname - if virDirOpenQuiet() fails. I'd recommend to at least note this in the comment.
+ * + */ +int +virNetDevPFGetVF(const char *pfname, int vf, char **vfname) +{ + char *virtfnName = NULL; + char *virtfnSysfsPath = NULL; + int ret = -1; + + if (virAsprintf(&virtfnName, "virtfn%d", vf) < 0) + goto cleanup; + + /* this provides the path to the VF's directory in sysfs, + * e.g. "/sys/class/net/enp2s0f0/virtfn3" + */ + if (virNetDevSysfsDeviceFile(&virtfnSysfsPath, pfname, virtfnName) < 0) + goto cleanup; + + /* and this gets the netdev name associated with it, which is a + * directory entry in [virtfnSysfsPath]/net, + * e.g. "/sys/class/net/enp2s0f0/virtfn3/net/enp2s11f4" - in this + * example the VF for enp2s0f0 vf#3 is "enp2s11f4". (If the VF + * isn't bound to a netdev driver, it won't have a netdev name, + * and vfname will be NULL). + */ + ret = virPCIGetNetName(virtfnSysfsPath, vfname); + + cleanup: + VIR_FREE(virtfnName); + VIR_FREE(virtfnSysfsPath); + + return ret; +} + +
ACK with that fixed. Michal

These three functions are destined to replace virNetDev(Replace|Restore)NetConfig() and virNetDev(Replace|Restore)MacAddress(), which both do the save and set together as a single step. We need to separate the save, read, and set steps because there will be situations where we need to do something else in between (in particular, we will need to rebind a VF's driver after save but before set). This patch creates the new functions, but doesn't call them - that will come in a subsequent patch. --- src/libvirt_private.syms | 3 + src/util/virnetdev.c | 531 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virnetdev.h | 22 ++ 3 files changed, 556 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index e9705ae..c983438 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1998,6 +1998,7 @@ virNetDevIfStateTypeFromString; virNetDevIfStateTypeToString; virNetDevIsVirtualFunction; virNetDevPFGetVF; +virNetDevReadNetConfig; virNetDevReplaceMacAddress; virNetDevReplaceNetConfig; virNetDevRestoreMacAddress; @@ -2007,11 +2008,13 @@ virNetDevRxFilterFree; virNetDevRxFilterModeTypeFromString; virNetDevRxFilterModeTypeToString; virNetDevRxFilterNew; +virNetDevSaveNetConfig; virNetDevSetMAC; virNetDevSetMTU; virNetDevSetMTUFromDevice; virNetDevSetName; virNetDevSetNamespace; +virNetDevSetNetConfig; virNetDevSetOnline; virNetDevSetPromiscuous; virNetDevSetRcvAllMulti; diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 49a11f3..feb5ba7 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1899,6 +1899,496 @@ virNetDevRestoreNetConfig(const char *linkdev, int vf, const char *stateDir) return ret; } + +/** + * virNetDevSaveNetConfig: + * @linkdev: name of the interface + * @vf: vf index if linkdev is a pf + * @stateDir: directory to store old net config + * @saveVlan: false if we shouldn't attempt to save vlan tag info + * (eg for interfaces using 802.1Qbg, since it handles + * vlan tags internally) + * + * Save current MAC address and (if linkdev itself is a VF, or if @vf + * >= 0) the "admin MAC address" and vlan tag the device described by + * @linkdev:@vf to @stateDir. (the "admin MAC address" is stored in + * the PF, and is what the VF MAC will be initialized to the next time + * its driver is reloaded (either on host or guest). + * + * File Name and Format: + * + * If the device is a VF and we're allowed to save vlan tag info, the + * file will be named ${pfDevName_vf#{vf} (e.g. "enp2s0f0_vf5") and + * will contain 2 or 3 lines of text: + * + * line 1 - admin MAC address + * line 2 - vlan tag + * line 3 - VF MAC address (or missing if VF has no host net driver) + * + * If the device isn't a VF, or we're not allowed to save vlan tag + * info, the file will be named ${linkdev} (e.g. "enp3s0f0") and will + * contain a single line of text containing linkdev's MAC address. + * + * Returns 0 on success, -1 on failure + * + */ +int +virNetDevSaveNetConfig(const char *linkdev, int vf, + const char *stateDir, + bool saveVlan) +{ + int ret = -1; + const char *pfDevName = NULL; + char *pfDevOrig = NULL; + char *vfDevOrig = NULL; + virMacAddr oldMAC = { 0 }; + char MACStr[VIR_MAC_STRING_BUFLEN]; + int oldVlanTag = -1; + char *filePath = NULL; + char *fileStr = NULL; + virBuffer buf = VIR_BUFFER_INITIALIZER; + + if (vf >= 0) { + /* linkdev is the PF */ + pfDevName = linkdev; + + /* linkdev should get the VF's netdev name (or NULL if none) */ + if (virNetDevPFGetVF(pfDevName, vf, &vfDevOrig) < 0) + goto cleanup; + + linkdev = vfDevOrig; + + } else if (saveVlan && virNetDevIsVirtualFunction(linkdev) == 1) { + /* when vf is -1, linkdev might be a standard netdevice (not + * SRIOV), or it might be an SRIOV VF. If it's a VF, normalize + * it to PF + VFname + */ + + if (virNetDevGetPhysicalFunction(linkdev, &pfDevOrig) < 0) + goto cleanup; + + pfDevName = pfDevOrig; + + if (virNetDevGetVirtualFunctionIndex(pfDevName, linkdev, &vf) < 0) + goto cleanup; + } + + /* if there is a PF, it's now in pfDevName, and linkdev is either + * the VF's name, or NULL (if the VF isn't bound to a net driver + * on the host) + */ + + if (pfDevName) { + /* get admin MAC and vlan tag */ + if (virNetDevGetVfConfig(pfDevName, vf, &oldMAC, + saveVlan ? &oldVlanTag : NULL) < 0) { + goto cleanup; + } + + virBufferAsprintf(&buf, "%s\n%d\n", + virMacAddrFormat(&oldMAC, MACStr), oldVlanTag); + + if (virAsprintf(&filePath, "%s/%s_vf%d", stateDir, pfDevName, vf) < 0) + goto cleanup; + + } else { + if (virAsprintf(&filePath, "%s/%s", stateDir, linkdev) < 0) + goto cleanup; + } + + if (linkdev) { + if (virNetDevGetMAC(linkdev, &oldMAC) < 0) + goto cleanup; + + /* for interfaces with no pfDevName, this will be the first + * line of the file. For interfaces that do have pfDevName, + * this will be the 3rd line of the file. + */ + virBufferAsprintf(&buf, "%s\n", virMacAddrFormat(&oldMAC, MACStr)); + } + + if (!(fileStr = virBufferContentAndReset(&buf))) + goto cleanup; + + if (virFileWriteStr(filePath, fileStr, O_CREAT|O_TRUNC|O_WRONLY) < 0) { + virReportSystemError(errno, _("Unable to preserve mac/vlan tag " + "for device = %s, vf = %d"), linkdev, vf); + goto cleanup; + } + + ret = 0; + cleanup: + VIR_FREE(pfDevOrig); + VIR_FREE(vfDevOrig); + VIR_FREE(filePath); + VIR_FREE(fileStr); + virBufferFreeAndReset(&buf); + return ret; +} + + +/** + * virNetDevReadNetConfig: + * @linkdev: name of the interface + * @vf: vf index if linkdev is a pf + * @stateDir: directory where net config is stored + * @adminMAC: returns admin MAC to store in the PF (if this is a VF) + * @MAC: returns MAC to set on device immediately + * + * Read saved MAC address and (if linkdev itself is a VF, or if @vf >= + * 0) "admin MAC address" and vlan tag of the device described by + * @linkdev:@vf from a file in @stateDir. (see virNetDevSaveNetConfig + * for details of file name and format). + * + * Returns 0 on success, -1 on failure. + * + * The caller MUST free adminMAC, vlan, and MAC when it is finished + * with them (they will be NULL if they weren't found in the file) + * + */ +int +virNetDevReadNetConfig(const char *linkdev, int vf, + const char *stateDir, + virMacAddrPtr *adminMAC, + virNetDevVlanPtr *vlan, + virMacAddrPtr *MAC) +{ + int ret = -1; + const char *pfDevName = NULL; + char *pfDevOrig = NULL; + char *vfDevOrig = NULL; + char *filePath = NULL; + char *fileStr = NULL; + /* the following two do *not* point to strings that need to be freed! */ + char *vlanStr = NULL; + char *MACStr = NULL; + + *adminMAC = NULL; + *vlan = NULL; + *MAC = NULL; + + if (vf >= 0) { + /* linkdev is the PF */ + pfDevName = linkdev; + + /* linkdev should get the VF's netdev name (or NULL if none) */ + if (virNetDevPFGetVF(pfDevName, vf, &vfDevOrig) < 0) + goto cleanup; + + linkdev = vfDevOrig; + + } else if (virNetDevIsVirtualFunction(linkdev) == 1) { + /* when vf is -1, linkdev might be a standard netdevice (not + * SRIOV), or it might be an SRIOV VF. If it's a VF, normalize + * it to PF + VFname + */ + + if (virNetDevGetPhysicalFunction(linkdev, &pfDevOrig) < 0) + goto cleanup; + + pfDevName = pfDevOrig; + + if (virNetDevGetVirtualFunctionIndex(pfDevName, linkdev, &vf) < 0) + goto cleanup; + } + + /* if there is a PF, it's now in pfDevName, and linkdev is either + * the VF's name, or NULL (if the VF isn't bound to a net driver + * on the host) + */ + + if (pfDevName) { + if (virAsprintf(&filePath, "%s/%s_vf%d", stateDir, pfDevName, vf) < 0) + goto cleanup; + + if (linkdev && !virFileExists(filePath)) { + /* the device may have been stored in a file named for the + * VF due to saveVlan == false (or an older version of + * libvirt), so reset filePath so we'll try the other + * filename before failing. + */ + VIR_FREE(filePath); + pfDevName = NULL; + } + } + + if (!pfDevName) { + if (virAsprintf(&filePath, "%s/%s", stateDir, linkdev) < 0) + goto cleanup; + } + + if (virFileReadAll(filePath, 128, &fileStr) < 0) + goto cleanup; + + /* find the (up to) 3 lines in the input file */ + if ((vlanStr = strchr(fileStr, '\n'))) { + vlanStr++; + if (*vlanStr) { + MACStr = strchr(vlanStr, '\n'); + if (MACStr) + MACStr++; + } + } + + if (VIR_ALLOC(*MAC) < 0) + goto cleanup; + + if (virMacAddrParse(fileStr, *MAC) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot parse MAC address from " + "line 1 of '%s': '%s'"), + filePath, fileStr); + goto cleanup; + } + + if (vlanStr) { + int vlanTag = -1; + char *endptr; + + if ((virStrToLong_i(vlanStr, &endptr, 10, &vlanTag) < 0) || + (endptr && *endptr != '\n' && *endptr != 0)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot parse vlan tag from " + "line 2 of '%s': '%s'"), + filePath, vlanStr); + goto cleanup; + } + + if (vlanTag != -1) { + /* construct a simple virNetDevVlan object with a single + * tag + */ + if (VIR_ALLOC(*vlan) < 0) + goto cleanup; + if (VIR_ALLOC((*vlan)->tag) < 0) + goto cleanup; + (*vlan)->nTags = 1; + (*vlan)->tag[0] = vlanTag; + } + } + + if (MACStr) { + /* If there is a 3rd line, then the first MAC was adminMAC, + * and this line will be MAC + */ + *adminMAC = *MAC; + if (VIR_ALLOC(*MAC) < 0) + goto cleanup; + + if (virMacAddrParse(MACStr, *MAC) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot parse MAC address from " + "line 3 of '%s': '%s'"), + filePath, MACStr); + goto cleanup; + } + } + + /* we won't need the file again */ + ignore_value(unlink(filePath)); + + ret = 0; + cleanup: + if (ret < 0) { + VIR_FREE(*adminMAC); + VIR_FREE(*MAC); + VIR_FREE(*vlan); + } + + VIR_FREE(pfDevOrig); + VIR_FREE(vfDevOrig); + VIR_FREE(filePath); + VIR_FREE(fileStr); + return ret; +} + + +/** + * virNetDevSetNetConfig: + * @linkdev: name of the interface + * @vf: vf index if linkdev is a PF + * @adminMAC: new admin MAC address (will be stored in PF and + * used for next initialization of VF driver) + * @vlan: new vlan tag info (or NULL) + * @MAC: new MAC address to set on the device immediately + * @setVlan: true to enable setting vlan tag (even if @vlan is NULL, + * the interface vlan tag will be set to 0). + * + * + * Set new MAC address and (optionally) admin MAC and vlan tag of + * @linkdev VF# @vf. + * + * Returns 0 on success, -1 on failure + * + */ +int +virNetDevSetNetConfig(const char *linkdev, int vf, + const virMacAddr *adminMAC, + virNetDevVlanPtr vlan, + const virMacAddr *MAC, + bool setVlan) +{ + int ret = -1; + char MACStr[VIR_MAC_STRING_BUFLEN]; + const char *pfDevName = NULL; + char *pfDevOrig = NULL; + char *vfDevOrig = NULL; + int vlanTag = -1; + + if (vf >= 0) { + /* linkdev is the PF */ + pfDevName = linkdev; + + /* linkdev should get the VF's netdev name (or NULL if none) */ + if (virNetDevPFGetVF(pfDevName, vf, &vfDevOrig) < 0) + goto cleanup; + + linkdev = vfDevOrig; + + } else if (virNetDevIsVirtualFunction(linkdev) == 1) { + /* when vf is -1, linkdev might be a standard netdevice (not + * SRIOV), or it might be an SRIOV VF. If it's a VF, normalize + * it to PF + VFname + */ + + if (virNetDevGetPhysicalFunction(linkdev, &pfDevOrig) < 0) + goto cleanup; + + pfDevName = pfDevOrig; + + if (virNetDevGetVirtualFunctionIndex(pfDevName, linkdev, &vf) < 0) + goto cleanup; + } + + + if (!pfDevName) { + /* if it's not SRIOV, then we can't set the admin MAC address + * or vlan tag + */ + if (adminMAC) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("admin MAC can only be set for SR-IOV VFs, but " + "%s is not a VF"), linkdev); + goto cleanup; + } + + if (vlan) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("vlan can only be set for SR-IOV VFs, but " + "%s is not a VF"), linkdev); + goto cleanup; + } + + } else { + bool pfIsOnline; + + /* Assure that PF is online before trying to use it to set + * anything up for this VF. It *should* be online already, + * but if it isn't online the changes made to the VF via the + * PF won't take effect, yet there will be no error + * reported. In the case that the PF isn't online, we need to + * fail and report the error, rather than automatically + * setting it online, since setting an unconfigured interface + * online automatically turns on IPv6 autoconfig, which may + * not be what the admin expects, so we require them to + * explicitly enable the PF in the host system network config. + */ + if (virNetDevGetOnline(pfDevName, &pfIsOnline) < 0) + goto cleanup; + + if (!pfIsOnline) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to configure VF %d of PF '%s' " + "because the PF is not online. Please " + "change host network config to put the " + "PF online."), + vf, pfDevName); + goto cleanup; + } + + if (vlan) { + if (vlan->nTags != 1 || vlan->trunk) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("vlan trunking is not supported " + "by SR-IOV network devices")); + goto cleanup; + } + + if (!setVlan) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("vlan tag set for interface %s but " + "caller requested it not be set")); + goto cleanup; + } + + vlanTag = vlan->tag[0]; + + } else if (setVlan) { + vlanTag = 0; /* assure any existing vlan tag is reset */ + } + } + + if (MAC) { + if (!linkdev) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("VF %d of PF '%s' is not bound to a net driver, " + "so its MAC address cannot be set to %s"), + vf, pfDevName, virMacAddrFormat(MAC, MACStr)); + goto cleanup; + } + + if (virNetDevSetMAC(linkdev, MAC) < 0) { + /* This may have failed due to the "administratively + * set" flag being set in the PF for this VF. For now + * we will just fail, but in the future we should + * attempt to set the VF MAC via the PF. + */ + goto cleanup; + } + if (pfDevOrig) { + /* if pfDevOrig is set, it means that the caller was + * *really* only interested in setting the MAC of the VF + * itself, *not* the admin MAC via the PF. In those cases, + * the adminMAC was only provided in case we need to set + * the VF's MAC by temporarily unbinding/rebinding the + * VF's net driver with the admin MAC set to the desired + * MAC, and then want to restore the admin MAC to its + * original setting when we're finished. We would only + * need to do that if the virNetDevSetMAC() above had + * failed; since it didn't, we don't need to set the + * adminMAC, so we are NULLing it out here to avoid that + * below. + + * (NB: since setting the admin MAC sets the + * "administratively set" flag for the VF in the PF's + * driver, which prevents any future changes to the VF's + * MAC address, we want to avoid setting the admin MAC as + * much as possible.) + */ + adminMAC = NULL; + } + } + + if (adminMAC || vlanTag >= 0) { + /* Set vlanTag and admin MAC using an RTM_SETLINK request sent to + * PFdevname+VF#, if mac != NULL this will set the "admin MAC" via + * the PF, *not* the actual VF MAC - the admin MAC only takes + * effect the next time the VF's driver is initialized (either in + * guest or host). if there is a vlanTag to set, it will take + * effect immediately though. + */ + if (virNetDevSetVfConfig(pfDevName, vf, adminMAC, vlanTag) < 0) + goto cleanup; + } + + ret = 0; + cleanup: + VIR_FREE(pfDevOrig); + VIR_FREE(vfDevOrig); + return ret; +} + + #else /* defined(__linux__) && defined(HAVE_LIBNL) */ int @@ -1924,6 +2414,47 @@ virNetDevRestoreNetConfig(const char *linkdev ATTRIBUTE_UNUSED, return -1; } + +int +virNetDevSaveNetConfig(const char *linkdev ATTRIBUTE_UNUSED, + int vf ATTRIBUTE_UNUSED, + const char *stateDir ATTRIBUTE_UNUSED, + bool saveVlan ATTRIBUTE_UNUSED) +{ + virReportSystemError(ENOSYS, "%s", + _("Unable to save net device config on this platform")); + return -1; +} + + +int +virNetDevReadNetConfig(const char *linkdev ATTRIBUTE_UNUSED, + int vf ATTRIBUTE_UNUSED, + const char *stateDir ATTRIBUTE_UNUSED, + virMacAddrPtr *adminMAC ATTRIBUTE_UNUSED, + virNetDevVLanPtr *vlan ATTRIBUTE_UNUSED, + virMacAddrPtr *MAC ATTRIBUTE_UNUSED) +{ + virReportSystemError(ENOSYS, "%s", + _("Unable to read net device config on this platform")); + return -1; +} + + +int +virNetDevSetNetConfig(const char *linkdev ATTRIBUTE_UNUSED, + int vf ATTRIBUTE_UNUSED, + const virMacAddr *adminMAC ATTRIBUTE_UNUSED, + virNetDevVlanPtr vlan ATTRIBUTE_UNUSED, + const virMacAddr *MAC ATTRIBUTE_UNUSED, + bool setVlan ATTRIBUTE_UNUSED) +{ + virReportSystemError(ENOSYS, "%s", + _("Unable to set net device config on this platform")); + return -1; +} + + #endif /* defined(__linux__) && defined(HAVE_LIBNL) */ VIR_ENUM_IMPL(virNetDevIfState, diff --git a/src/util/virnetdev.h b/src/util/virnetdev.h index ecc28c8..e153d71 100644 --- a/src/util/virnetdev.h +++ b/src/util/virnetdev.h @@ -189,6 +189,28 @@ int virNetDevGetVirtualFunctions(const char *pfname, ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) ATTRIBUTE_NONNULL(4) ATTRIBUTE_NONNULL(5) ATTRIBUTE_RETURN_CHECK; +int virNetDevSaveNetConfig(const char *linkdev, int vf, + const char *stateDir, + bool saveVlan) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3) ATTRIBUTE_RETURN_CHECK; + +int +virNetDevReadNetConfig(const char *linkdev, int vf, + const char *stateDir, + virMacAddrPtr *adminMAC, + virNetDevVlanPtr *vlan, + virMacAddrPtr *MAC) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(3) ATTRIBUTE_NONNULL(4) + ATTRIBUTE_NONNULL(5) ATTRIBUTE_NONNULL(6) ATTRIBUTE_RETURN_CHECK; + +int +virNetDevSetNetConfig(const char *linkdev, int vf, + const virMacAddr *adminMAC, + virNetDevVlanPtr vlan, + const virMacAddr *MAC, + bool setVLan) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK; + int virNetDevReplaceNetConfig(const char *linkdev, int vf, const virMacAddr *macaddress, virNetDevVlanPtr vlan, -- 2.9.3

On 03/10/2017 09:35 PM, Laine Stump wrote:
These three functions are destined to replace virNetDev(Replace|Restore)NetConfig() and virNetDev(Replace|Restore)MacAddress(), which both do the save and set together as a single step. We need to separate the save, read, and set steps because there will be situations where we need to do something else in between (in particular, we will need to rebind a VF's driver after save but before set).
This patch creates the new functions, but doesn't call them - that will come in a subsequent patch. --- src/libvirt_private.syms | 3 + src/util/virnetdev.c | 531 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virnetdev.h | 22 ++ 3 files changed, 556 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index e9705ae..c983438 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1998,6 +1998,7 @@ virNetDevIfStateTypeFromString; virNetDevIfStateTypeToString; virNetDevIsVirtualFunction; virNetDevPFGetVF; +virNetDevReadNetConfig; virNetDevReplaceMacAddress; virNetDevReplaceNetConfig; virNetDevRestoreMacAddress; @@ -2007,11 +2008,13 @@ virNetDevRxFilterFree; virNetDevRxFilterModeTypeFromString; virNetDevRxFilterModeTypeToString; virNetDevRxFilterNew; +virNetDevSaveNetConfig; virNetDevSetMAC; virNetDevSetMTU; virNetDevSetMTUFromDevice; virNetDevSetName; virNetDevSetNamespace; +virNetDevSetNetConfig; virNetDevSetOnline; virNetDevSetPromiscuous; virNetDevSetRcvAllMulti; diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 49a11f3..feb5ba7 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1899,6 +1899,496 @@ virNetDevRestoreNetConfig(const char *linkdev, int vf, const char *stateDir) return ret; }
+ +/** + * virNetDevSaveNetConfig: + * @linkdev: name of the interface + * @vf: vf index if linkdev is a pf + * @stateDir: directory to store old net config + * @saveVlan: false if we shouldn't attempt to save vlan tag info + * (eg for interfaces using 802.1Qbg, since it handles + * vlan tags internally) + * + * Save current MAC address and (if linkdev itself is a VF, or if @vf + * >= 0) the "admin MAC address" and vlan tag the device described by + * @linkdev:@vf to @stateDir. (the "admin MAC address" is stored in + * the PF, and is what the VF MAC will be initialized to the next time + * its driver is reloaded (either on host or guest). + * + * File Name and Format: + * + * If the device is a VF and we're allowed to save vlan tag info, the + * file will be named ${pfDevName_vf#{vf} (e.g. "enp2s0f0_vf5") and + * will contain 2 or 3 lines of text: + * + * line 1 - admin MAC address + * line 2 - vlan tag + * line 3 - VF MAC address (or missing if VF has no host net driver)
I'd rather see a JSON format or something. This can bite us in the future. What are your thoughts?
+ * + * If the device isn't a VF, or we're not allowed to save vlan tag + * info, the file will be named ${linkdev} (e.g. "enp3s0f0") and will + * contain a single line of text containing linkdev's MAC address. + * + * Returns 0 on success, -1 on failure + * + */ +int +virNetDevSaveNetConfig(const char *linkdev, int vf, + const char *stateDir, + bool saveVlan) +{ + int ret = -1; + const char *pfDevName = NULL; + char *pfDevOrig = NULL; + char *vfDevOrig = NULL; + virMacAddr oldMAC = { 0 };
This causes a compile error for me. calling memset(&oldMAC, 0, sizeof(oldMAC)); solved it for me. Moreover, this variable will always be set before use. So your call wether to leave the initialization in or out.
+ char MACStr[VIR_MAC_STRING_BUFLEN]; + int oldVlanTag = -1; + char *filePath = NULL; + char *fileStr = NULL; + virBuffer buf = VIR_BUFFER_INITIALIZER;
Michal

On 03/17/2017 09:32 AM, Michal Privoznik wrote:
On 03/10/2017 09:35 PM, Laine Stump wrote:
These three functions are destined to replace virNetDev(Replace|Restore)NetConfig() and virNetDev(Replace|Restore)MacAddress(), which both do the save and set together as a single step. We need to separate the save, read, and set steps because there will be situations where we need to do something else in between (in particular, we will need to rebind a VF's driver after save but before set).
This patch creates the new functions, but doesn't call them - that will come in a subsequent patch. --- src/libvirt_private.syms | 3 + src/util/virnetdev.c | 531 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virnetdev.h | 22 ++ 3 files changed, 556 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index e9705ae..c983438 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1998,6 +1998,7 @@ virNetDevIfStateTypeFromString; virNetDevIfStateTypeToString; virNetDevIsVirtualFunction; virNetDevPFGetVF; +virNetDevReadNetConfig; virNetDevReplaceMacAddress; virNetDevReplaceNetConfig; virNetDevRestoreMacAddress; @@ -2007,11 +2008,13 @@ virNetDevRxFilterFree; virNetDevRxFilterModeTypeFromString; virNetDevRxFilterModeTypeToString; virNetDevRxFilterNew; +virNetDevSaveNetConfig; virNetDevSetMAC; virNetDevSetMTU; virNetDevSetMTUFromDevice; virNetDevSetName; virNetDevSetNamespace; +virNetDevSetNetConfig; virNetDevSetOnline; virNetDevSetPromiscuous; virNetDevSetRcvAllMulti; diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 49a11f3..feb5ba7 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1899,6 +1899,496 @@ virNetDevRestoreNetConfig(const char *linkdev, int vf, const char *stateDir) return ret; }
+ +/** + * virNetDevSaveNetConfig: + * @linkdev: name of the interface + * @vf: vf index if linkdev is a pf + * @stateDir: directory to store old net config + * @saveVlan: false if we shouldn't attempt to save vlan tag info + * (eg for interfaces using 802.1Qbg, since it handles + * vlan tags internally) + * + * Save current MAC address and (if linkdev itself is a VF, or if @vf + * >= 0) the "admin MAC address" and vlan tag the device described by + * @linkdev:@vf to @stateDir. (the "admin MAC address" is stored in + * the PF, and is what the VF MAC will be initialized to the next time + * its driver is reloaded (either on host or guest). + * + * File Name and Format: + * + * If the device is a VF and we're allowed to save vlan tag info, the + * file will be named ${pfDevName_vf#{vf} (e.g. "enp2s0f0_vf5") and + * will contain 2 or 3 lines of text: + * + * line 1 - admin MAC address + * line 2 - vlan tag + * line 3 - VF MAC address (or missing if VF has no host net driver)
I'd rather see a JSON format or something. This can bite us in the future. What are your thoughts?
Every time I touch the code that uses these files I have the same thought - "%$&*($% this is ugly, and it's bound to cause a headache someday. We should change it." But that's always a secondary issue about an existing condition, and I'm chasing something "more important" (and I figure that when the day comes that we really *need* to switch to a better format, it will be no more difficult to detect which is being used than it would be today; that makes it morally easier to procrastinate) I suppose I can spend some time and write a function that parses something more expandable, with a fallback to the "old" method based on what's seen at the beginning of the file. Why JSON rather than XML though? I don't have a real preference over one or the other, but libvirt lore is *steeped* in XML, and all the other libvirt config files are XML...
+ * + * If the device isn't a VF, or we're not allowed to save vlan tag + * info, the file will be named ${linkdev} (e.g. "enp3s0f0") and will + * contain a single line of text containing linkdev's MAC address. + * + * Returns 0 on success, -1 on failure + * + */ +int +virNetDevSaveNetConfig(const char *linkdev, int vf, + const char *stateDir, + bool saveVlan) +{ + int ret = -1; + const char *pfDevName = NULL; + char *pfDevOrig = NULL; + char *vfDevOrig = NULL; + virMacAddr oldMAC = { 0 };
This causes a compile error for me. calling memset(&oldMAC, 0, sizeof(oldMAC));
Strange. You must be running a newer compiler than me (I'm doing everything on F25 + updates-testing). I also wonder why I thought I needed to initialize it, and why I did it that way, since in other cases I use "= { .addr = { blah } };". (I think most likely at some point early on I had thought that I might end up saving it even if it hadn't been set, so I wanted it to be consistent. But it ended up that I only save it if it was set, so as you say, the initialization is pointless.)
solved it for me. Moreover, this variable will always be set before use. So your call wether to leave the initialization in or out.
+ char MACStr[VIR_MAC_STRING_BUFLEN]; + int oldVlanTag = -1; + char *filePath = NULL; + char *fileStr = NULL; + virBuffer buf = VIR_BUFFER_INITIALIZER;
Michal

On 03/17/2017 02:58 PM, Laine Stump wrote:
On 03/17/2017 09:32 AM, Michal Privoznik wrote:
On 03/10/2017 09:35 PM, Laine Stump wrote:
These three functions are destined to replace virNetDev(Replace|Restore)NetConfig() and virNetDev(Replace|Restore)MacAddress(), which both do the save and set together as a single step. We need to separate the save, read, and set steps because there will be situations where we need to do something else in between (in particular, we will need to rebind a VF's driver after save but before set).
This patch creates the new functions, but doesn't call them - that will come in a subsequent patch. --- src/libvirt_private.syms | 3 + src/util/virnetdev.c | 531 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virnetdev.h | 22 ++ 3 files changed, 556 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index e9705ae..c983438 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1998,6 +1998,7 @@ virNetDevIfStateTypeFromString; virNetDevIfStateTypeToString; virNetDevIsVirtualFunction; virNetDevPFGetVF; +virNetDevReadNetConfig; virNetDevReplaceMacAddress; virNetDevReplaceNetConfig; virNetDevRestoreMacAddress; @@ -2007,11 +2008,13 @@ virNetDevRxFilterFree; virNetDevRxFilterModeTypeFromString; virNetDevRxFilterModeTypeToString; virNetDevRxFilterNew; +virNetDevSaveNetConfig; virNetDevSetMAC; virNetDevSetMTU; virNetDevSetMTUFromDevice; virNetDevSetName; virNetDevSetNamespace; +virNetDevSetNetConfig; virNetDevSetOnline; virNetDevSetPromiscuous; virNetDevSetRcvAllMulti; diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 49a11f3..feb5ba7 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1899,6 +1899,496 @@ virNetDevRestoreNetConfig(const char *linkdev, int vf, const char *stateDir) return ret; }
+ +/** + * virNetDevSaveNetConfig: + * @linkdev: name of the interface + * @vf: vf index if linkdev is a pf + * @stateDir: directory to store old net config + * @saveVlan: false if we shouldn't attempt to save vlan tag info + * (eg for interfaces using 802.1Qbg, since it handles + * vlan tags internally) + * + * Save current MAC address and (if linkdev itself is a VF, or if @vf + * >= 0) the "admin MAC address" and vlan tag the device described by + * @linkdev:@vf to @stateDir. (the "admin MAC address" is stored in + * the PF, and is what the VF MAC will be initialized to the next time + * its driver is reloaded (either on host or guest). + * + * File Name and Format: + * + * If the device is a VF and we're allowed to save vlan tag info, the + * file will be named ${pfDevName_vf#{vf} (e.g. "enp2s0f0_vf5") and + * will contain 2 or 3 lines of text: + * + * line 1 - admin MAC address + * line 2 - vlan tag + * line 3 - VF MAC address (or missing if VF has no host net driver)
I'd rather see a JSON format or something. This can bite us in the future. What are your thoughts?
Every time I touch the code that uses these files I have the same thought - "%$&*($% this is ugly, and it's bound to cause a headache someday. We should change it." But that's always a secondary issue about an existing condition, and I'm chasing something "more important" (and I figure that when the day comes that we really *need* to switch to a better format, it will be no more difficult to detect which is being used than it would be today; that makes it morally easier to procrastinate)
I suppose I can spend some time and write a function that parses something more expandable, with a fallback to the "old" method based on what's seen at the beginning of the file.
Why JSON rather than XML though? I don't have a real preference over one or the other, but libvirt lore is *steeped* in XML, and all the other libvirt config files are XML...
As discussed on IRC, I can write the code to save/parse the JSON here. You've done your part. However, I'm not sure I will manage to make it happen today, but maybe beginning of the next week if that is okay with you. And for the future reference: I prefer JSON over XML because I find it producing smaller files in terms of size. And also easier to read by us humans at a first glance. These are the reasons I've gone with JSON in NSS modules. Unfortunately, we have to stick with XML for out public APIs, but for storing some pieces of internal information, we can use other formats too.
+ * + * If the device isn't a VF, or we're not allowed to save vlan tag + * info, the file will be named ${linkdev} (e.g. "enp3s0f0") and will + * contain a single line of text containing linkdev's MAC address. + * + * Returns 0 on success, -1 on failure + * + */ +int +virNetDevSaveNetConfig(const char *linkdev, int vf, + const char *stateDir, + bool saveVlan) +{ + int ret = -1; + const char *pfDevName = NULL; + char *pfDevOrig = NULL; + char *vfDevOrig = NULL; + virMacAddr oldMAC = { 0 };
This causes a compile error for me. calling memset(&oldMAC, 0, sizeof(oldMAC));
Strange. You must be running a newer compiler than me (I'm doing everything on F25 + updates-testing). I also wonder why I thought I needed to initialize it, and why I did it that way, since in other cases I use "= { .addr = { blah } };". (I think most likely at some point early on I had thought that I might end up saving it even if it hadn't been set, so I wanted it to be consistent. But it ended up that I only save it if it was set, so as you say, the initialization is pointless.)
Don't get me wrong. I like initialized variables. But in this specific case it broke the compilation for me. Oh, and also I probably did not point it out - the same is happening in 18/19 with @zeroMAC global variable. The correct form is: static virMacAddr zeroMAC = { .addr = { 0 } }; Michal

On Fri, Mar 17, 2017 at 03:40:33PM +0100, Michal Privoznik wrote:
On 03/17/2017 02:58 PM, Laine Stump wrote:
Why JSON rather than XML though? I don't have a real preference over one or the other, but libvirt lore is *steeped* in XML, and all the other libvirt config files are XML...
As discussed on IRC, I can write the code to save/parse the JSON here. You've done your part. However, I'm not sure I will manage to make it happen today, but maybe beginning of the next week if that is okay with you.
And for the future reference: I prefer JSON over XML because I find it producing smaller files in terms of size. And also easier to read by us humans at a first glance. These are the reasons I've gone with JSON in NSS modules. Unfortunately, we have to stick with XML for out public APIs, but for storing some pieces of internal information, we can use other formats too.
Agreed, if I were starting libvirt from scratch I don't think we'd use XML, but rather JSON / YAML in APIs. Given where we are today though, our normal practice should be to use XML in any public APIs. For stuff not related to public APIs, we should choose the virConf format or JSON as appropriate. virConf if a simple flat set of data, JSON for anything needing structure Definitely don't invent any new text formats, unless it is needed for interoperability with a pre-existing apps we need to integrate with. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|

On 03/17/2017 03:40 PM, Michal Privoznik wrote:
As discussed on IRC, I can write the code to save/parse the JSON here. You've done your part. However, I'm not sure I will manage to make it happen today, but maybe beginning of the next week if that is okay with you.
And as promised, here's my code: https://github.com/zippy2/libvirt/commit/4e0538298f91ce62153a9e11f9eb6b198f3... Caution: I haven't tested it and frankly found it quite difficult to get right. The parsing that was there too big for my small brain. So perhaps in the parser I've exchanged adminMAC and MAC? You'll see when you test it. Michal

On 03/21/2017 07:21 AM, Michal Privoznik wrote:
On 03/17/2017 03:40 PM, Michal Privoznik wrote:
As discussed on IRC, I can write the code to save/parse the JSON here. You've done your part. However, I'm not sure I will manage to make it happen today, but maybe beginning of the next week if that is okay with you.
And as promised, here's my code:
https://github.com/zippy2/libvirt/commit/4e0538298f91ce62153a9e11f9eb6b198f3...
Thanks for doing that. Made it *much* easier to pick out which were the useful functions and how to use them. I modified what you did a bit and squashed it into this patch. (I made the file reader able to read both the old and new version of the file, for those cases where a libvirt upgrade is done while a domain is active). Since you had ACKed everything in the series except this one, I pushed 1 - 10 (after making the changes you suggested) and am reposting the remainder of the series starting with this one (only for completeness in case you want to test it yourself - I've only made the minor changes you suggested to the other patches (and I have tested the scenarios I can think of and it's doing the right thing).
Caution: I haven't tested it and frankly found it quite difficult to get right. The parsing that was there too big for my small brain. So perhaps in the parser I've exchanged adminMAC and MAC? You'll see when you test it.
Yep, you did swap them, but that's fine. It's easy to understand why you would swap them. There are some cases where line 1 of the file would be the adminMAC, and some cases where it would be MAC (for backward compatibility with hostdev and macvtap vf and non-vf). Now that it's JSON, that confusion is a thing of the past. I'm glad you insisted on fixing this now rather than continuing to propagate it :-)

This patch modifies the macvtap passthrough setup to use virNetDevSaveNetConfig()+virNetDevSetConfig() instead of virNetDevReplaceNetConfig() or virNetDevReplaceMacAddress(), and the teardown to use virNetDevReadNetConfig()+virNetDevSetConfig() instead of virNetDevRestoreNetConfig() or virNetDevRestoreMacAddress(). Since the older functions only saved/restored the admin MAC and vlan tag (which is incorrect) and the new functions save/restore the VF's own MAC address and vlan tag (correct), this actually fixes a bug (which was introduced by commit cb3fe38c7, which was itself supposed to be a fix for https://bugzilla.redhat.com/1113474 ). The downside to this patch is that it causes an *apparent* regression in that bug (because there will once again be an error reported if the interface had previously been used for VFIO device assignment), but in reality, the code hasn't been working for *any* case before this current patch (at least not with any recent kernel). Anyway, that "regression" will be fixed with an upcoming patch that fixes it the *right* way. --- src/util/virnetdevmacvlan.c | 40 +++++++++++++++++++++++++--------------- 1 file changed, 25 insertions(+), 15 deletions(-) diff --git a/src/util/virnetdevmacvlan.c b/src/util/virnetdevmacvlan.c index be9dff6..7222b0f 100644 --- a/src/util/virnetdevmacvlan.c +++ b/src/util/virnetdevmacvlan.c @@ -1010,20 +1010,22 @@ virNetDevMacVLanCreateWithVPortProfile(const char *ifnameRequested, */ if (mode == VIR_NETDEV_MACVLAN_MODE_PASSTHRU) { + bool setVlan = true; + if (virtPortProfile && virtPortProfile->virtPortType == VIR_NETDEV_VPORT_PROFILE_8021QBH) { - /* The Cisco enic driver (the only card that uses - * 802.1Qbh) doesn't support IFLA_VFINFO_LIST, which is - * required for virNetDevReplaceNetConfig(), so we must - * use this function (which uses ioctl(SIOCGIFHWADDR) - * instead or virNetDevReplaceNetConfig() + /* The Cisco enic driver (the only SRIOV-capable card that + * uses 802.1Qbh) doesn't support IFLA_VFINFO_LIST, which + * is required to get/set the vlan tag of a VF. */ - if (virNetDevReplaceMacAddress(linkdev, macaddress, stateDir) < 0) - return -1; - } else { - if (virNetDevReplaceNetConfig(linkdev, -1, macaddress, vlan, stateDir) < 0) - return -1; + setVlan = false; } + + if (virNetDevSaveNetConfig(linkdev, -1, stateDir, setVlan) < 0) + return -1; + + if (virNetDevSetNetConfig(linkdev, -1, NULL, vlan, macaddress, setVlan) < 0) + return -1; } if (ifnameRequested) { @@ -1194,11 +1196,19 @@ int virNetDevMacVLanDeleteWithVPortProfile(const char *ifname, } if (mode == VIR_NETDEV_MACVLAN_MODE_PASSTHRU) { - if (virtPortProfile && - virtPortProfile->virtPortType == VIR_NETDEV_VPORT_PROFILE_8021QBH) - ignore_value(virNetDevRestoreMacAddress(linkdev, stateDir)); - else - ignore_value(virNetDevRestoreNetConfig(linkdev, -1, stateDir)); + virMacAddrPtr MAC = NULL; + virMacAddrPtr adminMAC = NULL; + virNetDevVlanPtr vlan = NULL; + + if (virNetDevReadNetConfig(linkdev, -1, stateDir, + &adminMAC, &vlan, &MAC) == 0) { + + ignore_value(virNetDevSetNetConfig(linkdev, -1, + adminMAC, vlan, MAC, !!vlan)); + VIR_FREE(MAC); + VIR_FREE(adminMAC); + virNetDevVlanFree(vlan); + } } virNetlinkEventRemoveClient(0, macaddr, NETLINK_ROUTE); -- 2.9.3

virHostdevNetConfigReplace() and virHostdevNetConfigRestore() are modified to use the new virNetDev*NetConfig() functions. Note that due to the VF's original MAC addresses being saved after it has already been un-bound from the host net driver, the actual current VF MAC address won't be saved (because it no longer exists) - only the "admin MAC" will be saved. This reflects existing behavior that will be fixed in an upcoming patch. --- src/util/virhostdev.c | 31 ++++++++++++++++++++++++------- 1 file changed, 24 insertions(+), 7 deletions(-) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 7292cf4..0bad925 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -446,10 +446,13 @@ virHostdevNetConfigReplace(virDomainHostdevDefPtr hostdev, goto cleanup; } } else { - /* Set only mac and vlan */ - if (virNetDevReplaceNetConfig(linkdev, vf, - &hostdev->parent.data.net->mac, - vlan, stateDir) < 0) { + /* Save/Set only mac and vlan */ + + if (virNetDevSaveNetConfig(linkdev, vf, stateDir, true) < 0) + goto cleanup; + + if (virNetDevSetNetConfig(linkdev, vf, &hostdev->parent.data.net->mac, + vlan, NULL, true) < 0) { goto cleanup; } } @@ -502,9 +505,23 @@ virHostdevNetConfigRestore(virDomainHostdevDefPtr hostdev, NULL, port_profile_associate); } else { - ret = virNetDevRestoreNetConfig(linkdev, vf, stateDir); - if (ret < 0 && oldStateDir != NULL) - ret = virNetDevRestoreNetConfig(linkdev, vf, oldStateDir); + virMacAddrPtr MAC = NULL; + virMacAddrPtr adminMAC = NULL; + virNetDevVlanPtr vlan = NULL; + + ret = virNetDevReadNetConfig(linkdev, vf, stateDir, &adminMAC, &vlan, &MAC); + if (ret < 0 && oldStateDir) + ret = virNetDevReadNetConfig(linkdev, vf, oldStateDir, + &adminMAC, &vlan, &MAC); + + if (ret == 0) { + ignore_value(virNetDevSetNetConfig(linkdev, vf, + adminMAC, vlan, MAC, true)); + } + + VIR_FREE(MAC); + VIR_FREE(adminMAC); + virNetDevVlanFree(vlan); } VIR_FREE(linkdev); -- 2.9.3

These two operations will need to be separated so that saving of the original config is done before detaching the host net driver, and setting the new config is done after attaching vfio-pci. This patch splits the single function into two, but for now calls them together (to make bisecting easier if there is a regression). --- src/util/virhostdev.c | 89 +++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 68 insertions(+), 21 deletions(-) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 0bad925..3b5bad0 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -408,17 +408,30 @@ virHostdevNetConfigVirtPortProfile(const char *linkdev, int vf, } +/** + * virHostdevSaveNetConfig: + * @hostdev: config object describing a hostdev device + * @stateDir: directory to save device state into + * + * If the given hostdev device is an SRIOV network VF and *does not* + * have a <virtualport> element (ie, it isn't being configured via + * 802.11Qbh), determine its PF+VF#, and use that to save its current + * "admin" MAC address and VF tag (the ones saved in the PF + * driver). + * + * Returns 0 on success, -1 on failure. + */ static int -virHostdevNetConfigReplace(virDomainHostdevDefPtr hostdev, - const unsigned char *uuid, - const char *stateDir) +virHostdevSaveNetConfig(virDomainHostdevDefPtr hostdev, + const char *stateDir) { - char *linkdev = NULL; - virNetDevVlanPtr vlan; - virNetDevVPortProfilePtr virtPort; int ret = -1; + char *linkdev = NULL; int vf = -1; - bool port_profile_associate = true; + + if (!virHostdevIsPCINetDevice(hostdev) || + virDomainNetGetActualVirtPortProfile(hostdev->parent.data.net)) + return 0; if (virHostdevIsVirtualFunction(hostdev) != 1) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", @@ -430,6 +443,44 @@ virHostdevNetConfigReplace(virDomainHostdevDefPtr hostdev, if (virHostdevNetDevice(hostdev, &linkdev, &vf) < 0) goto cleanup; + if (virNetDevSaveNetConfig(linkdev, vf, stateDir, true) < 0) + goto cleanup; + + ret = 0; + cleanup: + VIR_FREE(linkdev); + return ret; +} + + +/** + * virHostdevSetNetConfig: + * @hostdev: config object describing a hostdev device + * @uuid: uuid of the domain + * + * If the given hostdev device is an SRIOV network VF, determine its + * PF+VF#, and use that to set the "admin" MAC address and VF tag (the + * ones saved in the PF driver).xs + * + * Returns 0 on success, -1 on failure. + */ +static int +virHostdevSetNetConfig(virDomainHostdevDefPtr hostdev, + const unsigned char *uuid) +{ + char *linkdev = NULL; + virNetDevVlanPtr vlan; + virNetDevVPortProfilePtr virtPort; + int ret = -1; + int vf = -1; + bool port_profile_associate = true; + + if (!virHostdevIsPCINetDevice(hostdev)) + return 0; + + if (virHostdevNetDevice(hostdev, &linkdev, &vf) < 0) + goto cleanup; + vlan = virDomainNetGetActualVlan(hostdev->parent.data.net); virtPort = virDomainNetGetActualVirtPortProfile(hostdev->parent.data.net); if (virtPort) { @@ -446,11 +497,6 @@ virHostdevNetConfigReplace(virDomainHostdevDefPtr hostdev, goto cleanup; } } else { - /* Save/Set only mac and vlan */ - - if (virNetDevSaveNetConfig(linkdev, vf, stateDir, true) < 0) - goto cleanup; - if (virNetDevSetNetConfig(linkdev, vf, &hostdev->parent.data.net->mac, vlan, NULL, true) < 0) { goto cleanup; @@ -463,6 +509,7 @@ virHostdevNetConfigReplace(virDomainHostdevDefPtr hostdev, return ret; } + /* @oldStateDir: * For upgrade purpose: * To an existing VM on QEMU, the hostdev netconfig file is originally stored @@ -689,16 +736,16 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, } /* Step 4: For SRIOV network devices, Now that we have detached the - * the network device, set the netdev config */ + * the network device, set the new netdev config */ for (i = 0; i < nhostdevs; i++) { - virDomainHostdevDefPtr hostdev = hostdevs[i]; - if (!virHostdevIsPCINetDevice(hostdev)) - continue; - if (virHostdevNetConfigReplace(hostdev, uuid, - mgr->stateDir) < 0) { - goto resetvfnetconfig; - } - last_processed_hostdev_vf = i; + + if (virHostdevSaveNetConfig(hostdevs[i], mgr->stateDir) < 0) + goto resetvfnetconfig; + + if (virHostdevSetNetConfig(hostdevs[i], uuid) < 0) + goto resetvfnetconfig; + + last_processed_hostdev_vf = i; } /* Step 5: Move devices from the inactive list to the active list */ -- 2.9.3

In order to properly restore the original state of an SRIOV VF when we're finished with it, we need to save the MAC address of the VF itself (not just the admin MAC address for the VF that is stored in the PF). But that can only be done when the VF is still bound to the host's netdev driver, and we have always done the saving of device config after the VF is already bound to vfio-pci. This patch prepares us for adding a save of the VF's MAC by calling the function that saves netconfig earlier in the device preparation, before we've unbound it from the host netdev driver. --- src/util/virhostdev.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index 3b5bad0..e9e2ab3 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -649,6 +649,14 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, } } + /* Step 1.5: For non-802.11Qbh SRIOV network devices, save the + * current device config + */ + for (i = 0; i < nhostdevs; i++) { + if (virHostdevSaveNetConfig(hostdevs[i], mgr->stateDir) < 0) + goto cleanup; + } + /* Step 2: detach managed devices and make sure unmanaged devices * have already been taken care of */ for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { @@ -739,9 +747,6 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, * the network device, set the new netdev config */ for (i = 0; i < nhostdevs; i++) { - if (virHostdevSaveNetConfig(hostdevs[i], mgr->stateDir) < 0) - goto resetvfnetconfig; - if (virHostdevSetNetConfig(hostdevs[i], uuid) < 0) goto resetvfnetconfig; -- 2.9.3

It takes longer to explain this than to fix it... In the past we weren't able to save the VF's own MAC address *at all* when using it for hostdev assignment, because we had already unbound the VF from the host net driver prior to saving its config. With the previous patch, that problem has been solved, so we now have the VF's MAC address saved and can move on to the *next* problem, which is twofold: 1) during teardown we restore the config before we've re-bound, so the VF doesn't have a net driver, and thus we can't set its MAC address directly. 2) even if we delay restoring the config until the VF is bound to a net driver, the request to set its MAC address would fail, since (during device setup) we had set the "admin MAC" for the VF via an RTM_SETLINK to the PF - once you've set the admin MAC for a VF, the VF driver (either on host or on guest) is not allowed to change the VF's MAC address "forever" (well, until you reload the PF driver, but that requires destroying and recreating every single VF, which isn't something you can require). The solution is to keep the restoration of config at the same place, but to set the *admin MAC* to the address you want the VF to have - when the VF net driver is later initialized (as a part of re-binding to the VF net driver) its MAC will be initialized to the current value of the admin MAC. --- src/util/virhostdev.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index e9e2ab3..a402c01 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -562,6 +562,30 @@ virHostdevNetConfigRestore(virDomainHostdevDefPtr hostdev, &adminMAC, &vlan, &MAC); if (ret == 0) { + /* if a MAC was stored for the VF, we should now restore + * that as the adminMAC. We have to do it this way because + * the VF is still not bound to the host's net driver, so + * we can't directly set its MAC (and even after it is + * re-bound to the host net driver, it will still have its + * "administratively set" flag on, and that prohibits the + * VF's net driver from directly setting the MAC + * anyway). But it we set the desired VF MAC as the "admin + * MAC" *now*, then when the VF is re-bound to the host + * net driver (which will happen soon after returning from + * this function), that adminMAC will be set (by the PF) + * as the VF's new initial MAC. + * + * If no MAC was stored for the VF, that means it wasn't + * bound to a net driver before we used it anyway, so the + * adminMAC is all we have, and we can just restore it + * directly. + */ + if (MAC) { + VIR_FREE(adminMAC); + adminMAC = MAC; + MAC = NULL; + } + ignore_value(virNetDevSetNetConfig(linkdev, vf, adminMAC, vlan, MAC, true)); } -- 2.9.3

The global functions virNetDevReplaceMacAddress(), virNetDevReplaceNetConfig(), virNetDevRestoreMacAddress(), and virNetDevRestoreNetConfig() are no longer used, as their functionality has been replaced by virNetDev(Save|Read|Set)NetConfig(). The static functions virNetDevReplaceVfConfig() and virNetDevRestoreVfConfig() were only used by the above-named global functions that were removed. --- src/libvirt_private.syms | 4 - src/util/virnetdev.c | 310 ----------------------------------------------- 2 files changed, 314 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index c983438..09c468f 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1999,10 +1999,6 @@ virNetDevIfStateTypeToString; virNetDevIsVirtualFunction; virNetDevPFGetVF; virNetDevReadNetConfig; -virNetDevReplaceMacAddress; -virNetDevReplaceNetConfig; -virNetDevRestoreMacAddress; -virNetDevRestoreNetConfig; virNetDevRunEthernetScript; virNetDevRxFilterFree; virNetDevRxFilterModeTypeFromString; diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index feb5ba7..2b1cebc 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -390,92 +390,6 @@ int virNetDevGetMAC(const char *ifname, #endif -/** - * virNetDevReplaceMacAddress: - * @macaddress: new MAC address for interface - * @linkdev: name of interface - * @stateDir: directory to store old MAC address - * - * Returns 0 on success, -1 on failure - * - */ -int -virNetDevReplaceMacAddress(const char *linkdev, - const virMacAddr *macaddress, - const char *stateDir) -{ - virMacAddr oldmac; - char *path = NULL; - char macstr[VIR_MAC_STRING_BUFLEN]; - int ret = -1; - - if (virNetDevGetMAC(linkdev, &oldmac) < 0) - return -1; - - if (virAsprintf(&path, "%s/%s", - stateDir, - linkdev) < 0) - return -1; - virMacAddrFormat(&oldmac, macstr); - if (virFileWriteStr(path, macstr, O_CREAT|O_TRUNC|O_WRONLY) < 0) { - virReportSystemError(errno, _("Unable to preserve mac for %s"), - linkdev); - goto cleanup; - } - - if (virNetDevSetMAC(linkdev, macaddress) < 0) - goto cleanup; - - ret = 0; - cleanup: - VIR_FREE(path); - return ret; -} - -/** - * virNetDevRestoreMacAddress: - * @linkdev: name of interface - * @stateDir: directory containing old MAC address - * - * Returns 0 on success, -errno on failure. - * - */ -int -virNetDevRestoreMacAddress(const char *linkdev, - const char *stateDir) -{ - int rc = -1; - char *oldmacname = NULL; - char *macstr = NULL; - char *path = NULL; - virMacAddr oldmac; - - if (virAsprintf(&path, "%s/%s", - stateDir, - linkdev) < 0) - return -1; - - if (virFileReadAll(path, VIR_MAC_STRING_BUFLEN, &macstr) < 0) - goto cleanup; - - if (virMacAddrParse(macstr, &oldmac) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Cannot parse MAC address from '%s'"), - oldmacname); - goto cleanup; - } - - /*reset mac and remove file-ignore results*/ - rc = virNetDevSetMAC(linkdev, &oldmac); - ignore_value(unlink(path)); - - cleanup: - VIR_FREE(macstr); - VIR_FREE(path); - return rc; -} - - #if defined(SIOCGIFMTU) && defined(HAVE_STRUCT_IFREQ) /** * virNetDevGetMTU: @@ -1675,230 +1589,6 @@ virNetDevGetVfConfig(const char *ifname, int vf, virMacAddrPtr mac, return rc; } -static int -virNetDevReplaceVfConfig(const char *pflinkdev, int vf, - const virMacAddr *macaddress, - int vlanid, - const char *stateDir) -{ - int ret = -1; - virMacAddr oldmac; - int oldvlanid = -1; - char *path = NULL; - char macstr[VIR_MAC_STRING_BUFLEN]; - char *fileData = NULL; - bool pfIsOnline; - - /* Assure that PF is online prior to twiddling with the VF. It - * *should* be, but if the PF isn't online the changes made to the - * VF via the PF won't take effect, yet there will be no error - * reported. In the case that it isn't online, fail and report the - * error, since setting an unconfigured interface online - * automatically turns on IPv6 autoconfig, which may not be what - * the admin expects, so we want them to explicitly enable the PF - * in the host system network config. - */ - if (virNetDevGetOnline(pflinkdev, &pfIsOnline) < 0) - goto cleanup; - if (!pfIsOnline) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to configure VF %d of PF '%s' " - "because the PF is not online. Please " - "change host network config to put the " - "PF online."), - vf, pflinkdev); - goto cleanup; - } - - if (virNetDevGetVfConfig(pflinkdev, vf, &oldmac, &oldvlanid) < 0) - goto cleanup; - - if (virAsprintf(&path, "%s/%s_vf%d", - stateDir, pflinkdev, vf) < 0) - goto cleanup; - - if (virAsprintf(&fileData, "%s\n%d\n", - virMacAddrFormat(&oldmac, macstr), oldvlanid) < 0) - goto cleanup; - if (virFileWriteStr(path, fileData, O_CREAT|O_TRUNC|O_WRONLY) < 0) { - virReportSystemError(errno, _("Unable to preserve mac/vlan tag " - "for pf = %s, vf = %d"), pflinkdev, vf); - goto cleanup; - } - - ret = virNetDevSetVfConfig(pflinkdev, vf, macaddress, vlanid); - - cleanup: - VIR_FREE(path); - VIR_FREE(fileData); - return ret; -} - -static int -virNetDevRestoreVfConfig(const char *pflinkdev, - int vf, const char *vflinkdev, - const char *stateDir) -{ - int rc = -1; - char *path = NULL; - char *fileData = NULL; - char *vlan = NULL; - virMacAddr oldmac; - int vlanid = -1; - - if (virAsprintf(&path, "%s/%s_vf%d", - stateDir, pflinkdev, vf) < 0) - return rc; - - if (vflinkdev && !virFileExists(path)) { - /* this VF's config may have been stored with - * virNetDevReplaceMacAddress while running an older version - * of libvirt. If so, the ${pf}_vf${id} file won't exist. In - * that case, try to restore using the older method with the - * VF's name directly. - */ - rc = virNetDevRestoreMacAddress(vflinkdev, stateDir); - goto cleanup; - } - - if (virFileReadAll(path, 128, &fileData) < 0) - goto cleanup; - - if ((vlan = strchr(fileData, '\n'))) { - char *endptr; - - *vlan++ = 0; /* NULL terminate the mac address */ - if (*vlan) { - if ((virStrToLong_i(vlan, &endptr, 10, &vlanid) < 0) || - (endptr && *endptr != '\n' && *endptr != 0)) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Cannot parse vlan tag from '%s'"), - vlan); - goto cleanup; - } - } - } - - if (virMacAddrParse(fileData, &oldmac) != 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Cannot parse MAC address from '%s'"), - fileData); - goto cleanup; - } - - /*reset mac and remove file-ignore results*/ - rc = virNetDevSetVfConfig(pflinkdev, vf, &oldmac, vlanid); - ignore_value(unlink(path)); - - cleanup: - VIR_FREE(path); - VIR_FREE(fileData); - - return rc; -} - -/** - * virNetDevReplaceNetConfig: - * @linkdev: name of the interface - * @vf: vf index if linkdev is a pf - * @macaddress: new MAC address for interface - * @vlanid: new vlanid - * @stateDir: directory to store old net config - * - * Returns 0 on success, -1 on failure - * - */ -int -virNetDevReplaceNetConfig(const char *linkdev, int vf, - const virMacAddr *macaddress, - virNetDevVlanPtr vlan, - const char *stateDir) -{ - int ret = -1; - char *pfdevname = NULL; - - if (vf == -1 && virNetDevIsVirtualFunction(linkdev) == 1) { - /* If this really *is* a VF and the caller just didn't know - * it, we should set the MAC address via PF+vf# instead of - * setting directly via VF, because the latter will be - * rejected any time after the former has been done. - */ - if (virNetDevGetPhysicalFunction(linkdev, &pfdevname) < 0) - goto cleanup; - if (virNetDevGetVirtualFunctionIndex(pfdevname, linkdev, &vf) < 0) - goto cleanup; - linkdev = pfdevname; - } - - if (vf == -1) { - if (vlan) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("vlan can only be set for SR-IOV VFs, but " - "%s is not a VF"), linkdev); - goto cleanup; - } - ret = virNetDevReplaceMacAddress(linkdev, macaddress, stateDir); - } else { - int vlanid = 0; /* assure any current vlan tag is reset */ - - if (vlan) { - if (vlan->nTags != 1 || vlan->trunk) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("vlan trunking is not supported " - "by SR-IOV network devices")); - goto cleanup; - } - vlanid = vlan->tag[0]; - } - ret = virNetDevReplaceVfConfig(linkdev, vf, macaddress, vlanid, - stateDir); - } - - cleanup: - VIR_FREE(pfdevname); - return ret; -} - -/** - * virNetDevRestoreNetConfig: - * @linkdev: name of the interface - * @vf: vf index if linkdev is a pf - * @stateDir: directory containing old net config - * - * Returns 0 on success, -errno on failure. - * - */ -int -virNetDevRestoreNetConfig(const char *linkdev, int vf, const char *stateDir) -{ - int ret = -1; - char *pfdevname = NULL; - const char *vfdevname = NULL; - - if (vf == -1 && virNetDevIsVirtualFunction(linkdev) == 1) { - /* If this really *is* a VF and the caller just didn't know - * it, we should set the MAC address via PF+vf# instead of - * setting directly via VF, because the latter will be - * rejected any time after the former has been done. - */ - if (virNetDevGetPhysicalFunction(linkdev, &pfdevname) < 0) - goto cleanup; - if (virNetDevGetVirtualFunctionIndex(pfdevname, linkdev, &vf) < 0) - goto cleanup; - vfdevname = linkdev; - linkdev = pfdevname; - } - - if (vf == -1) - ret = virNetDevRestoreMacAddress(linkdev, stateDir); - else - ret = virNetDevRestoreVfConfig(linkdev, vf, vfdevname, stateDir); - - cleanup: - VIR_FREE(pfdevname); - return ret; -} - /** * virNetDevSaveNetConfig: -- 2.9.3

Some PF drivers allow setting the admin MAC (that is the MAC address that the VF will be initialized to the next time the VF's driver is loaded) to 00:00:00:00:00:00, and some don't. Multiple drivers initialize the admin MACs to all 0, but don't allow setting it to that very same value. It has been an uphill battle convincing the driver people that it's reasonable to expect The argument that's used is that an all 0 device MAC address on a device is invalid; however, from an outsider's point of view, when the admin MAC is set to 0 at the time the VF driver is loaded, the VF's MAC is *not* set to 0, but to a random non-0 value. But that's beside the point - even if I could convince one or two SRIOV driver maintainers to permit setting the admin MAC to 0, there are still several other drivers. So rather than fighting that losing battle, this patch checks for a failure to set the admin MAC due to an all 0 value, and retries it with 02:00:00:00:00:00. That won't result in a random value being set in the VF MAC at next VF driver init, but that's okay, because we always want to set a specific value anyway. Rather, the "almost 0" setting makes it easy to visually detect from the output of "ip link show" which VFs are currently in use and which are free. --- src/util/virnetdev.c | 42 ++++++++++++++++++++++++++++++++++++++---- 1 file changed, 38 insertions(+), 4 deletions(-) diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 2b1cebc..6cf0463 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1387,6 +1387,14 @@ virNetDevSysfsFile(char **pf_sysfs_device_link ATTRIBUTE_UNUSED, #if defined(__linux__) && defined(HAVE_LIBNL) && defined(IFLA_VF_MAX) +static virMacAddr zeroMAC = { 0 }; + +/* if a net driver doesn't allow setting MAC to all 0, try setting + * to this (the only bit that is set is the "locally administered" bit") + */ +static virMacAddr altZeroMAC = { .addr = { 0x02, 0, 0, 0, 0, 0, } }; + + static struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = { [IFLA_VF_MAC] = { .type = NLA_UNSPEC, .maxlen = sizeof(struct ifla_vf_mac) }, @@ -1397,7 +1405,8 @@ static struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = { static int virNetDevSetVfConfig(const char *ifname, int vf, - const virMacAddr *macaddr, int vlanid) + const virMacAddr *macaddr, int vlanid, + bool *allowRetry) { int rc = -1; struct nlmsghdr *resp = NULL; @@ -1474,7 +1483,15 @@ virNetDevSetVfConfig(const char *ifname, int vf, if (resp->nlmsg_len < NLMSG_LENGTH(sizeof(*err))) goto malformed_resp; - if (err->error) { + /* if allowRetry is true and the error was EINVAL, then + * silently return a failure so the caller can retry with a + * different MAC address + */ + if (-err->error == EINVAL && *allowRetry && + macaddr && !virMacAddrCmp(macaddr, &zeroMAC)) { + goto cleanup; + } else if (err->error) { + /* other errors are permanent */ char macstr[VIR_MAC_STRING_BUFLEN]; virReportSystemError(-err->error, @@ -1486,6 +1503,7 @@ virNetDevSetVfConfig(const char *ifname, int vf, vlanid, ifname ? ifname : "(unspecified)", vf); + *allowRetry = false; /* no use retrying */ goto cleanup; } break; @@ -2067,8 +2085,24 @@ virNetDevSetNetConfig(const char *linkdev, int vf, * guest or host). if there is a vlanTag to set, it will take * effect immediately though. */ - if (virNetDevSetVfConfig(pfDevName, vf, adminMAC, vlanTag) < 0) - goto cleanup; + bool allowRetry = true; + + if (virNetDevSetVfConfig(pfDevName, vf, + adminMAC, vlanTag, &allowRetry) < 0) { + /* allowRetry will still be true if the failure was due to + * trying to set the MAC address to all 0. In that case, + * we can retry with "altZeroMAC", which is just an all-0 MAC + * with the "locally administered" bit set. + */ + if (!allowRetry) + goto cleanup; + + allowRetry = false; + if (virNetDevSetVfConfig(pfDevName, vf, + &altZeroMAC, vlanTag, &allowRetry) < 0) { + goto cleanup; + } + } } ret = 0; -- 2.9.3

On 03/10/2017 09:35 PM, Laine Stump wrote:
Some PF drivers allow setting the admin MAC (that is the MAC address that the VF will be initialized to the next time the VF's driver is loaded) to 00:00:00:00:00:00, and some don't. Multiple drivers initialize the admin MACs to all 0, but don't allow setting it to that very same value. It has been an uphill battle convincing the driver people that it's reasonable to expect The argument that's used is that an all 0 device MAC address on a device is invalid; however, from an outsider's point of view, when the admin MAC is set to 0 at the time the VF driver is loaded, the VF's MAC is *not* set to 0, but to a random non-0 value. But that's beside the point - even if I could convince one or two SRIOV driver maintainers to permit setting the admin MAC to 0, there are still several other drivers.
So rather than fighting that losing battle, this patch checks for a failure to set the admin MAC due to an all 0 value, and retries it with 02:00:00:00:00:00. That won't result in a random value being set in the VF MAC at next VF driver init, but that's okay, because we always want to set a specific value anyway. Rather, the "almost 0" setting makes it easy to visually detect from the output of "ip link show" which VFs are currently in use and which are free. --- src/util/virnetdev.c | 42 ++++++++++++++++++++++++++++++++++++++---- 1 file changed, 38 insertions(+), 4 deletions(-)
diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 2b1cebc..6cf0463 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1387,6 +1387,14 @@ virNetDevSysfsFile(char **pf_sysfs_device_link ATTRIBUTE_UNUSED, #if defined(__linux__) && defined(HAVE_LIBNL) && defined(IFLA_VF_MAX)
+static virMacAddr zeroMAC = { 0 }; + +/* if a net driver doesn't allow setting MAC to all 0, try setting + * to this (the only bit that is set is the "locally administered" bit") + */ +static virMacAddr altZeroMAC = { .addr = { 0x02, 0, 0, 0, 0, 0, } }; + + static struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = { [IFLA_VF_MAC] = { .type = NLA_UNSPEC, .maxlen = sizeof(struct ifla_vf_mac) }, @@ -1397,7 +1405,8 @@ static struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = {
static int virNetDevSetVfConfig(const char *ifname, int vf, - const virMacAddr *macaddr, int vlanid) + const virMacAddr *macaddr, int vlanid, + bool *allowRetry) { int rc = -1; struct nlmsghdr *resp = NULL; @@ -1474,7 +1483,15 @@ virNetDevSetVfConfig(const char *ifname, int vf, if (resp->nlmsg_len < NLMSG_LENGTH(sizeof(*err))) goto malformed_resp;
- if (err->error) { + /* if allowRetry is true and the error was EINVAL, then + * silently return a failure so the caller can retry with a + * different MAC address + */ + if (-err->error == EINVAL && *allowRetry &&
No, please no. if (err->error == -EINVAL ...) is way better.
+ macaddr && !virMacAddrCmp(macaddr, &zeroMAC)) { + goto cleanup; + } else if (err->error) { + /* other errors are permanent */ char macstr[VIR_MAC_STRING_BUFLEN];
virReportSystemError(-err->error,
ACK with that fixed. Michal

On 03/17/2017 09:32 AM, Michal Privoznik wrote:
On 03/10/2017 09:35 PM, Laine Stump wrote:
Some PF drivers allow setting the admin MAC (that is the MAC address that the VF will be initialized to the next time the VF's driver is loaded) to 00:00:00:00:00:00, and some don't. Multiple drivers initialize the admin MACs to all 0, but don't allow setting it to that very same value. It has been an uphill battle convincing the driver people that it's reasonable to expect The argument that's used is that an all 0 device MAC address on a device is invalid; however, from an outsider's point of view, when the admin MAC is set to 0 at the time the VF driver is loaded, the VF's MAC is *not* set to 0, but to a random non-0 value. But that's beside the point - even if I could convince one or two SRIOV driver maintainers to permit setting the admin MAC to 0, there are still several other drivers.
So rather than fighting that losing battle, this patch checks for a failure to set the admin MAC due to an all 0 value, and retries it with 02:00:00:00:00:00. That won't result in a random value being set in the VF MAC at next VF driver init, but that's okay, because we always want to set a specific value anyway. Rather, the "almost 0" setting makes it easy to visually detect from the output of "ip link show" which VFs are currently in use and which are free. --- src/util/virnetdev.c | 42 ++++++++++++++++++++++++++++++++++++++---- 1 file changed, 38 insertions(+), 4 deletions(-)
diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 2b1cebc..6cf0463 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1387,6 +1387,14 @@ virNetDevSysfsFile(char **pf_sysfs_device_link ATTRIBUTE_UNUSED, #if defined(__linux__) && defined(HAVE_LIBNL) && defined(IFLA_VF_MAX)
+static virMacAddr zeroMAC = { 0 }; + +/* if a net driver doesn't allow setting MAC to all 0, try setting + * to this (the only bit that is set is the "locally administered" bit") + */ +static virMacAddr altZeroMAC = { .addr = { 0x02, 0, 0, 0, 0, 0, } }; + + static struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = { [IFLA_VF_MAC] = { .type = NLA_UNSPEC, .maxlen = sizeof(struct ifla_vf_mac) }, @@ -1397,7 +1405,8 @@ static struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = {
static int virNetDevSetVfConfig(const char *ifname, int vf, - const virMacAddr *macaddr, int vlanid) + const virMacAddr *macaddr, int vlanid, + bool *allowRetry) { int rc = -1; struct nlmsghdr *resp = NULL; @@ -1474,7 +1483,15 @@ virNetDevSetVfConfig(const char *ifname, int vf, if (resp->nlmsg_len < NLMSG_LENGTH(sizeof(*err))) goto malformed_resp;
- if (err->error) { + /* if allowRetry is true and the error was EINVAL, then + * silently return a failure so the caller can retry with a + * different MAC address + */ + if (-err->error == EINVAL && *allowRetry &&
No, please no. if (err->error == -EINVAL ...) is way better.
Yeah, sure. I just copy-pasta'd that from somewhere else. Consider it done.
+ macaddr && !virMacAddrCmp(macaddr, &zeroMAC)) { + goto cleanup; + } else if (err->error) { + /* other errors are permanent */ char macstr[VIR_MAC_STRING_BUFLEN];
virReportSystemError(-err->error,
ACK with that fixed.
Michal

If an SRIOV VF has previously been used for VFIO device assignment, the "admin MAC" that is stored in the PF driver's table of VF info will have been set to the MAC address that the virtual machine wanted the device to have. Setting the admin MAC for a VF also sets a flag in the PF that is loosely called the "administratively set" flag. Once that flag is set, it is no longer possible for the net driver of the VF (either on the host or in a virtual machine) to directly set the VF's MAC again; this flag isn't reset until the *PF* driver is restarted, and that requires taking *all* VFs offline, so it's not really feasible to do. If the same SRIOV VF is later used for macvtap passthrough mode, the VF's MAC address must be set, but normally we don't unbind the VF from its host net driver (since we actually need the host net driver in this case). Since setting the VF MAC directly will fail, in the past "we" ("I") had tried to fix the problem by simply setting the admin MAC (via the PF) instead. This *appeared* to work (and might have at one time, due to promiscuous mode being turned on somewhere or something), but it currently creates a non-working interface because only the value for admin MAC is set to the desired value, *not* the actual MAC that the VF is using. Earlier patches in this series reverted that behavior, so that we once again set the MAC of the VF itself for macvtap passthrough operation, not the admin MAC. But that brings back the original bug - if the interface has been used for VFIO device assignment, you can no longer use it for macvtap passthrough. This patch solves that problem by noticing when virNetDevSetMAC() fails for a VF, and in that case it sets the desired MAC to the admin MAC via the PF, then "bounces" the VF driver (by unbinding and the immediately rebinding it to the VF). This causes the VF's MAC to be reinitialized from the admin MAC, and everybody is happy (until the *next* time someone wants to set the VF's MAC address, since the "administratively set" bit is still turned on). --- src/util/virnetdev.c | 102 +++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 83 insertions(+), 19 deletions(-) diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 6cf0463..861d725 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1053,6 +1053,32 @@ virNetDevSysfsDeviceFile(char **pf_sysfs_device_link, const char *ifname, return 0; } + +static virPCIDevicePtr +virNetDevGetPCIDevice(const char *devName) +{ + char *vfSysfsDevicePath = NULL; + virPCIDeviceAddressPtr vfPCIAddr = NULL; + virPCIDevicePtr vfPCIDevice = NULL; + + if (virNetDevSysfsFile(&vfSysfsDevicePath, devName, "device") < 0) + goto cleanup; + + vfPCIAddr = virPCIGetDeviceAddressFromSysfsLink(vfSysfsDevicePath); + if (!vfPCIAddr) + goto cleanup; + + vfPCIDevice = virPCIDeviceNew(vfPCIAddr->domain, vfPCIAddr->bus, + vfPCIAddr->slot, vfPCIAddr->function); + + cleanup: + VIR_FREE(vfSysfsDevicePath); + VIR_FREE(vfPCIAddr); + + return vfPCIDevice; +} + + /** * virNetDevGetVirtualFunctions: * @@ -1942,6 +1968,7 @@ virNetDevSetNetConfig(const char *linkdev, int vf, char *pfDevOrig = NULL; char *vfDevOrig = NULL; int vlanTag = -1; + virPCIDevicePtr vfPCIDevice = NULL; if (vf >= 0) { /* linkdev is the PF */ @@ -2037,6 +2064,8 @@ virNetDevSetNetConfig(const char *linkdev, int vf, } if (MAC) { + int setMACrc; + if (!linkdev) { virReportError(VIR_ERR_INTERNAL_ERROR, _("VF %d of PF '%s' is not bound to a net driver, " @@ -2045,27 +2074,61 @@ virNetDevSetNetConfig(const char *linkdev, int vf, goto cleanup; } - if (virNetDevSetMAC(linkdev, MAC) < 0) { - /* This may have failed due to the "administratively - * set" flag being set in the PF for this VF. For now - * we will just fail, but in the future we should - * attempt to set the VF MAC via the PF. + setMACrc = virNetDevSetMACInternal(linkdev, MAC, !!pfDevOrig); + if (setMACrc < 0) { + bool allowRetry = false; + int retries = 100; + + /* if pfDevOrig == NULL, this isn't a VF, so we've failed */ + if (!pfDevOrig || errno != EADDRNOTAVAIL) + goto cleanup; + + /* Otherwise this is a VF, and virNetDevSetMAC failed with + * EADDRNOTAVAIL, which could be due to the + * "administratively set" flag being set in the PF for + * this VF. When this happens, we can attempt to use an + * alternate method to set the VF MAC: first set it into + * the admin MAC for this VF in the PF, then unbind/rebind + * the VF from its net driver. This causes the VF's MAC to + * be initialized to whatever was stored in the admin MAC. */ - goto cleanup; + + if (virNetDevSetVfConfig(pfDevName, vf, + MAC, vlanTag, &allowRetry) < 0) { + goto cleanup; + } + + /* admin MAC is set, now we need to construct a virPCIDevice + * object so we can call virPCIDeviceRebind() + */ + if (!(vfPCIDevice = virNetDevGetPCIDevice(linkdev))) + goto cleanup; + + /* Rebind the device. This should set the proper MAC address */ + if (virPCIDeviceRebind(vfPCIDevice) < 0) + goto cleanup; + + /* Wait until virNetDevGetIndex for the VF netdev returns success. + * This indicates that the device is ready to be used. If we don't + * wait, then upcoming operations on the VF may fail. + */ + while (retries-- > 0 && !virNetDevExists(linkdev)) + usleep(1000); } - if (pfDevOrig) { - /* if pfDevOrig is set, it means that the caller was - * *really* only interested in setting the MAC of the VF - * itself, *not* the admin MAC via the PF. In those cases, - * the adminMAC was only provided in case we need to set - * the VF's MAC by temporarily unbinding/rebinding the - * VF's net driver with the admin MAC set to the desired - * MAC, and then want to restore the admin MAC to its - * original setting when we're finished. We would only - * need to do that if the virNetDevSetMAC() above had - * failed; since it didn't, we don't need to set the - * adminMAC, so we are NULLing it out here to avoid that - * below. + + if (pfDevOrig && setMACrc == 0) { + /* if pfDevOrig is set, it that the caller was *really* + * only interested in setting the MAC of the VF itself, + * *not* the admin MAC via the PF. In those cases, the + * adminMAC was only provided in case we need to set the + * VF's MAC by temporarily unbinding/rebinding the VF's + * net driver with the admin MAC set to the desired MAC, + * and then want to restore the admin MAC to its original + * setting when we're finished. We would only need to do + * that if the virNetDevSetMAC() above had failed; since + * setMACrc == 0, we know it didn't fail and we don't need + * to set the adminMAC, so we are NULLing it out here to + * avoid that below. * (NB: since setting the admin MAC sets the * "administratively set" flag for the VF in the PF's @@ -2109,6 +2172,7 @@ virNetDevSetNetConfig(const char *linkdev, int vf, cleanup: VIR_FREE(pfDevOrig); VIR_FREE(vfDevOrig); + VIR_FREE(vfPCIDevice); return ret; } -- 2.9.3

On 03/10/2017 09:35 PM, Laine Stump wrote:
If an SRIOV VF has previously been used for VFIO device assignment, the "admin MAC" that is stored in the PF driver's table of VF info will have been set to the MAC address that the virtual machine wanted the device to have. Setting the admin MAC for a VF also sets a flag in the PF that is loosely called the "administratively set" flag. Once that flag is set, it is no longer possible for the net driver of the VF (either on the host or in a virtual machine) to directly set the VF's MAC again; this flag isn't reset until the *PF* driver is restarted, and that requires taking *all* VFs offline, so it's not really feasible to do.
If the same SRIOV VF is later used for macvtap passthrough mode, the VF's MAC address must be set, but normally we don't unbind the VF from its host net driver (since we actually need the host net driver in this case). Since setting the VF MAC directly will fail, in the past "we" ("I") had tried to fix the problem by simply setting the admin MAC (via the PF) instead. This *appeared* to work (and might have at one time, due to promiscuous mode being turned on somewhere or something), but it currently creates a non-working interface because only the value for admin MAC is set to the desired value, *not* the actual MAC that the VF is using.
Earlier patches in this series reverted that behavior, so that we once again set the MAC of the VF itself for macvtap passthrough operation, not the admin MAC. But that brings back the original bug - if the interface has been used for VFIO device assignment, you can no longer use it for macvtap passthrough.
This patch solves that problem by noticing when virNetDevSetMAC() fails for a VF, and in that case it sets the desired MAC to the admin MAC via the PF, then "bounces" the VF driver (by unbinding and the immediately rebinding it to the VF). This causes the VF's MAC to be reinitialized from the admin MAC, and everybody is happy (until the *next* time someone wants to set the VF's MAC address, since the "administratively set" bit is still turned on). --- src/util/virnetdev.c | 102 +++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 83 insertions(+), 19 deletions(-)
diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 6cf0463..861d725 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -1053,6 +1053,32 @@ virNetDevSysfsDeviceFile(char **pf_sysfs_device_link, const char *ifname, return 0; }
+ +static virPCIDevicePtr +virNetDevGetPCIDevice(const char *devName) +{ + char *vfSysfsDevicePath = NULL; + virPCIDeviceAddressPtr vfPCIAddr = NULL; + virPCIDevicePtr vfPCIDevice = NULL; + + if (virNetDevSysfsFile(&vfSysfsDevicePath, devName, "device") < 0) + goto cleanup; + + vfPCIAddr = virPCIGetDeviceAddressFromSysfsLink(vfSysfsDevicePath); + if (!vfPCIAddr) + goto cleanup; + + vfPCIDevice = virPCIDeviceNew(vfPCIAddr->domain, vfPCIAddr->bus, + vfPCIAddr->slot, vfPCIAddr->function); + + cleanup: + VIR_FREE(vfSysfsDevicePath); + VIR_FREE(vfPCIAddr); + + return vfPCIDevice; +} + + /** * virNetDevGetVirtualFunctions: * @@ -1942,6 +1968,7 @@ virNetDevSetNetConfig(const char *linkdev, int vf, char *pfDevOrig = NULL; char *vfDevOrig = NULL; int vlanTag = -1; + virPCIDevicePtr vfPCIDevice = NULL;
if (vf >= 0) { /* linkdev is the PF */ @@ -2037,6 +2064,8 @@ virNetDevSetNetConfig(const char *linkdev, int vf, }
if (MAC) { + int setMACrc; + if (!linkdev) { virReportError(VIR_ERR_INTERNAL_ERROR, _("VF %d of PF '%s' is not bound to a net driver, " @@ -2045,27 +2074,61 @@ virNetDevSetNetConfig(const char *linkdev, int vf, goto cleanup; }
- if (virNetDevSetMAC(linkdev, MAC) < 0) { - /* This may have failed due to the "administratively - * set" flag being set in the PF for this VF. For now - * we will just fail, but in the future we should - * attempt to set the VF MAC via the PF. + setMACrc = virNetDevSetMACInternal(linkdev, MAC, !!pfDevOrig); + if (setMACrc < 0) { + bool allowRetry = false; + int retries = 100; + + /* if pfDevOrig == NULL, this isn't a VF, so we've failed */ + if (!pfDevOrig || errno != EADDRNOTAVAIL) + goto cleanup; + + /* Otherwise this is a VF, and virNetDevSetMAC failed with + * EADDRNOTAVAIL, which could be due to the + * "administratively set" flag being set in the PF for + * this VF. When this happens, we can attempt to use an + * alternate method to set the VF MAC: first set it into + * the admin MAC for this VF in the PF, then unbind/rebind + * the VF from its net driver. This causes the VF's MAC to + * be initialized to whatever was stored in the admin MAC. */ - goto cleanup; + + if (virNetDevSetVfConfig(pfDevName, vf, + MAC, vlanTag, &allowRetry) < 0) { + goto cleanup; + } + + /* admin MAC is set, now we need to construct a virPCIDevice + * object so we can call virPCIDeviceRebind() + */ + if (!(vfPCIDevice = virNetDevGetPCIDevice(linkdev))) + goto cleanup; + + /* Rebind the device. This should set the proper MAC address */ + if (virPCIDeviceRebind(vfPCIDevice) < 0) + goto cleanup; + + /* Wait until virNetDevGetIndex for the VF netdev returns success. + * This indicates that the device is ready to be used. If we don't + * wait, then upcoming operations on the VF may fail. + */ + while (retries-- > 0 && !virNetDevExists(linkdev)) + usleep(1000); } - if (pfDevOrig) { - /* if pfDevOrig is set, it means that the caller was - * *really* only interested in setting the MAC of the VF - * itself, *not* the admin MAC via the PF. In those cases, - * the adminMAC was only provided in case we need to set - * the VF's MAC by temporarily unbinding/rebinding the - * VF's net driver with the admin MAC set to the desired - * MAC, and then want to restore the admin MAC to its - * original setting when we're finished. We would only - * need to do that if the virNetDevSetMAC() above had - * failed; since it didn't, we don't need to set the - * adminMAC, so we are NULLing it out here to avoid that - * below. + + if (pfDevOrig && setMACrc == 0) { + /* if pfDevOrig is set, it that the caller was *really* + * only interested in setting the MAC of the VF itself, + * *not* the admin MAC via the PF. In those cases, the + * adminMAC was only provided in case we need to set the + * VF's MAC by temporarily unbinding/rebinding the VF's + * net driver with the admin MAC set to the desired MAC, + * and then want to restore the admin MAC to its original + * setting when we're finished. We would only need to do + * that if the virNetDevSetMAC() above had failed; since + * setMACrc == 0, we know it didn't fail and we don't need + * to set the adminMAC, so we are NULLing it out here to + * avoid that below.
* (NB: since setting the admin MAC sets the * "administratively set" flag for the VF in the PF's @@ -2109,6 +2172,7 @@ virNetDevSetNetConfig(const char *linkdev, int vf, cleanup: VIR_FREE(pfDevOrig); VIR_FREE(vfDevOrig); + VIR_FREE(vfPCIDevice);
No. this needs to be virPCIDeviceFree(). ACK with that fixed. Michal

On 03/17/2017 09:32 AM, Michal Privoznik wrote:
On 03/10/2017 09:35 PM, Laine Stump wrote:
* "administratively set" flag for the VF in the PF's @@ -2109,6 +2172,7 @@ virNetDevSetNetConfig(const char *linkdev, int vf, cleanup: VIR_FREE(pfDevOrig); VIR_FREE(vfDevOrig); + VIR_FREE(vfPCIDevice);
No. this needs to be virPCIDeviceFree().
Oops! *hides face in shame*
ACK with that fixed.

For consistency with the new virHostdevSaveNetConfig() and virHostdevSetNetConfig(). --- src/util/virhostdev.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c index a402c01..dce0bbe 100644 --- a/src/util/virhostdev.c +++ b/src/util/virhostdev.c @@ -518,7 +518,7 @@ virHostdevSetNetConfig(virDomainHostdevDefPtr hostdev, * case, try to find in the old state dir. */ static int -virHostdevNetConfigRestore(virDomainHostdevDefPtr hostdev, +virHostdevRestoreNetConfig(virDomainHostdevDefPtr hostdev, const char *stateDir, const char *oldStateDir) { @@ -868,7 +868,7 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, resetvfnetconfig: if (last_processed_hostdev_vf >= 0) { for (i = 0; i <= last_processed_hostdev_vf; i++) - virHostdevNetConfigRestore(hostdevs[i], mgr->stateDir, NULL); + virHostdevRestoreNetConfig(hostdevs[i], mgr->stateDir, NULL); } reattachdevs: @@ -929,7 +929,7 @@ virHostdevReattachPCIDevice(virHostdevManagerPtr mgr, } /* @oldStateDir: - * For upgrade purpose: see virHostdevNetConfigRestore + * For upgrade purpose: see virHostdevRestoreNetConfig */ void virHostdevReAttachPCIDevices(virHostdevManagerPtr mgr, @@ -1030,7 +1030,7 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr mgr, if (actual) { VIR_DEBUG("Restoring network configuration of PCI device %s", virPCIDeviceGetName(actual)); - virHostdevNetConfigRestore(hostdev, mgr->stateDir, + virHostdevRestoreNetConfig(hostdev, mgr->stateDir, oldStateDir); } } -- 2.9.3

Having this information available will make it easier to determine the culprit when MAC or vlan tag appear to not be set, eg.: https://bugzilla.redhat.com/1364073 (This patch doesn't fix that bug, just makes it easier to diagnose) --- src/util/virnetdev.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index 664a75d..86ae0d1 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -240,6 +240,7 @@ virNetDevSetMACInternal(const char *ifname, int fd = -1; int ret = -1; struct ifreq ifr; + char macstr[VIR_MAC_STRING_BUFLEN]; if ((fd = virNetDevSetupControl(ifname, &ifr)) < 0) return -1; @@ -255,7 +256,6 @@ virNetDevSetMACInternal(const char *ifname, virMacAddrGetRaw(macaddr, (unsigned char *)ifr.ifr_hwaddr.sa_data); if (ioctl(fd, SIOCSIFHWADDR, &ifr) < 0) { - char macstr[VIR_MAC_STRING_BUFLEN]; if (quiet && errno == EADDRNOTAVAIL) goto cleanup; @@ -269,6 +269,10 @@ virNetDevSetMACInternal(const char *ifname, ret = 0; cleanup: + VIR_DEBUG("SIOCSIFHWADDR %s MAC=%s - %s", + ifname, virMacAddrFormat(macaddr, macstr), + ret < 0 ? "Fail" : "Success"); + VIR_FORCE_CLOSE(fd); return ret; } @@ -311,6 +315,9 @@ virNetDevSetMACInternal(const char *ifname, ret = 0; cleanup: + VIR_DEBUG("SIOCSIFLLADDR %s MAC=%s - %s", ifname, mac + 1), + ret < 0 ? "Fail" : "Success"); + VIR_FORCE_CLOSE(s); return ret; @@ -1435,6 +1442,7 @@ virNetDevSetVfConfig(const char *ifname, int vf, bool *allowRetry) { int rc = -1; + char macstr[VIR_MAC_STRING_BUFLEN]; struct nlmsghdr *resp = NULL; struct nlmsgerr *err; unsigned int recvbuflen = 0; @@ -1518,8 +1526,6 @@ virNetDevSetVfConfig(const char *ifname, int vf, goto cleanup; } else if (err->error) { /* other errors are permanent */ - char macstr[VIR_MAC_STRING_BUFLEN]; - virReportSystemError(-err->error, _("Cannot set interface MAC/vlanid to %s/%d " "for ifname %s vf %d"), @@ -1543,6 +1549,11 @@ virNetDevSetVfConfig(const char *ifname, int vf, rc = 0; cleanup: + VIR_DEBUG("RTM_SETLINK %s vf %d MAC=%s vlanid=%d - %s", + ifname, vf, + macaddr ? virMacAddrFormat(macaddr, macstr) : "(unchanged)", + vlanid, rc < 0 ? "Fail" : "Success"); + nlmsg_free(nl_msg); VIR_FREE(resp); return rc; -- 2.9.3

On 03/10/2017 09:34 PM, Laine Stump wrote:
There are multiple bugs filed against both libvirt and the kernel related to incorrect restoration of MAC addresses for SR-IOV VFs, both when used via VFIO device assignment and when used for macvtap passthrough mode
https://bugzilla.redhat.com/1415609 - libvirt https://bugzilla.redhat.com/1341248 - kernel (igb) https://bugzilla.redhat.com/1415609 - kernel (ixgbe)
https://bugzilla.redhat.com/1302166 - kernel (mlx, fixed)
Beyond that, there are other problems with incorrect setting and restoration of VF MAC addresses that don't have their own bug reports (although they may be partially described in the comments of the BZes mentioned above). This patch series attempts (and I hope succeeds!) to fix all of these problems, which can be summarized in these three points:
* The VF's own MAC address was not being set prior to using it for macvtap passthrough mode. Instead, due to serious misconception on my part, the "admin MAC" for that VF was set in the PF. The problem is that the admin MAC isn't put into use on the VF immediately; rather it is unused until the next time the VF driver is initialized (on either the host or in a guest) so setting it had no practical effect, meaning that the macvtap device's MAC didn't match the MAC of the VF device it was attached to - this meant that traffic wouldn't pass unless the device was in promiscuous mode (and apparently it was back when the code was changed to behave this way?).
* The VF's own MAC address was never restored after it had been used (for both VFIO device assignment and for macvtap passthrough mode). Only the "admin MAC" address (which, again, won't take effect until/unless the VF driver is reloaded/reinitialized) was restored.
* If the original admin MAC was 00:00:00:00:00:00, libvirt would save that, then attempt to restore it when the guest was finished with the device, but for some/many SRIOV-capable net drivers, an all 0 MAC is not allowed to be set (even though that is its initial value). This would result in the MAC not being changed (of course, since this is the admin MAC, that didn't actually matter, but the combination of the error message and the VF's own MAC still being set to the value used by the guest, users mistakenly believed this was the source of networking problems (e.g. when the guest moved to another host, but the old host still had an interface showing its MAC address).
There are lots of details in the patches. 01 - 07 just clean up and slightly modify some existing utility functions. 08 - 11 define some functions to save/restore netdev MAC/vlan settings, and 12-14 change the existing setup/teardown code for macvtap and hostdev to use the new functions, then 15 and 17-19 modify their behavior further, while 16 just removes functions that are no longer used.
In the end, the VF's own MAC *and* admin MAC are saved for all SRIOV VFs when we begin using a VF, and the VF's MAC is restored when finished. Since the admin MAC doesn't actually have any immediate practical effect, we don't concern ourselves with assuring it is restored in all cases (in particular, when use use the VF for VFIO assignment, we only restore the VF's MAC, but not the admin MAC during teardown, and when using the VF for macvtap passthrough we avoid restoring the admin MAC because that would unnecessarily turn on the "administratively set" flag in the PF driver (described in the commit log for one of these patches). I might decide to fix the former later, but for now it just unnecessarily complicates the code for no real gain).
Laine Stump (19): util: permit querying a VF MAC address or VLAN tag by itself util: remove unused args from virNetDevSetVfConfig() util: use cleanup label consistently in virHostdevNetConfigReplace() util: eliminate useless local variable util: make virMacAddrParse more versatile util: change virPCIGetNetName() to not return error if device has no net name util: make virPCIGetDeviceAddressFromSysfsLink() public util: new function virPCIDeviceRebind() util: new internal function to permit silent failure of virNetDevSetMAC() util: new function virNetDevPFGetVF() util: new functions virNetDev(Save|Read|Set)NetConfig() util: use new virNetDev*NetConfig() functions for macvtap setup/teardown util: use new virNetDev*NetConfig() functions for hostdev setup/teardown util: replace virHostdevNetConfigReplace with ...(Save|Set)NetConfig() util: save hostdev network device config before unbinding from host driver util: after hostdev assignment, restore VF MAC address via setting admin MAC util: remove unused functions from virnetdev.c util: if setting admin MAC to 00:00:00:00:00:00 fails, try 02:00:00:00:00:00 util: try *really* hard to set the MAC address of an SRIOV VF
src/libvirt_private.syms | 10 +- src/util/virhostdev.c | 171 ++++++-- src/util/virmacaddr.c | 2 +- src/util/virnetdev.c | 937 +++++++++++++++++++++++++++++++------------- src/util/virnetdev.h | 25 ++ src/util/virnetdevmacvlan.c | 45 ++- src/util/virpci.c | 65 ++- src/util/virpci.h | 4 + 8 files changed, 933 insertions(+), 326 deletions(-)
Okay, I've gone through the patches and they look okay. ACK to all except for 11/19 where I'd like to see the file having some format (e.g. JSON). Now let me grab a six pack and clean my head :-). Michal
participants (3)
-
Daniel P. Berrange
-
Laine Stump
-
Michal Privoznik