On 9/22/20 9:48 AM, Laine Stump wrote:
(Please don't Cc individual developers in patch submissions
unless
they've specifically asked you to do so)
On 9/21/20 8:25 AM, zhenwei pi wrote:
> Collect PCI passthrough net device stats from kernel by netlink
> API.
>
> Currently, libvirt can not get PCI passthrough net device stats,
> run command:
> #virsh domifstat instance --interface=52:54:00:2d:b2:35
> error: Failed to get interface stats instance 52:54:00:2d:b2:35
> error: internal error: Interface name not provided
>
> The PCI device(usually SR-IOV virtual function device) is detached
> while it's used in PCI passthrough mode. And we can not parse this
> device from /proc/net/dev any more.
>
> In this patch, libvirt check net device is VF of not firstly, then
> query virNetDevVFInterfaceStats(new API).
> virNetDevVFInterfaceStats parses VFs info of all PFs, compares MAC
> address until the two MAC addresses match.
> '#ip -s link show' can get the same result. Instead of parsing the
> output result, implement this feature by libnl API.
>
> Notice that this feature deponds on driver of PF.
> Test on Mellanox ConnectX-4 Lx, it works well.
> Also test on Intel Corporation 82599ES, it works, but only get 0.
> (ip-link command get the same result).
>
> Signed-off-by: zhenwei pi <pizhenwei(a)bytedance.com>
> ---
> meson.build | 4 ++
> src/libvirt_private.syms | 1 +
> src/qemu/qemu_driver.c | 3 +
> src/util/virnetdev.c | 158
> +++++++++++++++++++++++++++++++++++++++++++++++
> src/util/virnetdev.h | 5 ++
> 5 files changed, 171 insertions(+)
>
> diff --git a/meson.build b/meson.build
> index 24535a403c..e17da9e2b9 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1392,6 +1392,10 @@ if not get_option('virtualport').disabled()
> endif
> endif
> +if cc.has_header_symbol('linux/if_link.h', 'IFLA_VF_STATS')
> + conf.set('WITH_VF_STATS', 1)
> +endif
> +
(a bit of digression here...)
I understand why you put this chunk in, but I'm fairly certain that it
isn't needed.
In the case of the few other things from if_link.h that have their own
"WITH_BLAH" compile-time flag (e.g. WITH_VIRTUALPORT, which checks for
existence of IFLA_PORT_MAX in if_link.h), they were added in the before
before time, when the feature in question had been recently added to the
kernel, and so there were some supported distro versions that had the
new feature, and some that didn't yet. In order to compile properly on
all supported platforms (though lacking these new-at-the-time features
on some) we had to gate them on the presence of some sentinel in
if_link.h. Those WITH_BLAH flags were then forgotten about, even though
the list of supported platform/versions has changed over the years, and
they will now work on *any* supported Linux platform.
Now that this patch has pointed them out, I think it's time for some
house cleaning. IFLA_PORT_MAX, for example, has been in the upstream
kernel since 2.6.35, and was even backported to the RHEL*6* 2.6.32
kernel. Even router firmware these days has a newer kernel than that, so
it's a safe bet that any system with __linux_ and WITH_LIBNL can enable
all of that code, i.e. we can just get rid of the extra
WITH_VIRTUALPORT. I just added this to my personal todo list; let's see
how long it takes until I actually accomplish it (betting starts now!)
Back to the topic of *this* patch: I looked up IFLA_VF_STATS for the
upstream kernel, and it first appeared in kernel 4.2. RHEL7 (which is
the oldest RHEL now supported by libvirt) is based on kernel 3.10, but
has *a lot* of backports, and so when I checked I found that it also
supports IFLA_VF_STATS. I don't have the details handy for the oldest
version of other Linuxes we support, but I tried eliminating the extra
check for IFLA_VF_STATS, and the full CI suite on gitlab builds without
error (there is an unrelated error in ubuntu-18.04 in libxl, and
different errors in the freebsd, mingw, and macos builds that *are*
caused by different aspects of this patch, but IFLA_VF_STATS does exist
on all Linux platforms included in the libvirt upstream CI testing).
So support for this feature can be gated on "defined(__linux__) &&
WITH_LIBNL (as should just about anything else using netlink)
> if host_machine.system() == 'windows'
> ole32_dep = cc.find_library('ole32')
> oleaut32_dep = cc.find_library('oleaut32')
> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
> index bdbe3431b8..bcc40b8d69 100644
> --- a/src/libvirt_private.syms
> +++ b/src/libvirt_private.syms
> @@ -2585,6 +2585,7 @@ virNetDevSetRcvMulti;
> virNetDevSetupControl;
> virNetDevSysfsFile;
> virNetDevValidateConfig;
> +virNetDevVFInterfaceStats;
> # util/virnetdevbandwidth.h
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index ae715c01d7..f554010c40 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -10196,6 +10196,9 @@ qemuDomainInterfaceStats(virDomainPtr dom,
> if (virDomainNetGetActualType(net) ==
> VIR_DOMAIN_NET_TYPE_VHOSTUSER) {
> if (virNetDevOpenvswitchInterfaceStats(net->ifname, stats) < 0)
> goto cleanup;
> + } else if (virDomainNetGetActualType(net) ==
> VIR_DOMAIN_NET_TYPE_HOSTDEV) {
> + if (virNetDevVFInterfaceStats(&net->mac, stats) < 0)
> + goto cleanup;
> } else {
> if (virNetDevTapInterfaceStats(net->ifname, stats,
>
> !virDomainNetTypeSharesHostView(net)) < 0)
> diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c
> index e1a4cc2bef..be9b8ce4a9 100644
> --- a/src/util/virnetdev.c
> +++ b/src/util/virnetdev.c
> @@ -1489,6 +1489,9 @@ static struct nla_policy
> ifla_vf_policy[IFLA_VF_MAX+1] = {
> .maxlen = sizeof(struct ifla_vf_mac) },
> [IFLA_VF_VLAN] = { .type = NLA_UNSPEC,
> .maxlen = sizeof(struct ifla_vf_vlan) },
> +#if defined(WITH_VF_STATS)
Due to being a nested #if, the above should have been "# if defined...."
(note the space after #). Of course that #if will be removed, but to
catch things like this in the future, you should install the cppi
package (which is still optional for libvirt, since it is (was?)
unavailable on some supported platforms) and run "ninja -C build test",
which will run all the tests, including syntax / code style checks.
> + [IFLA_VF_STATS] = { .type = NLA_NESTED },
> +#endif
> };
> @@ -2265,6 +2268,151 @@ virNetDevSetNetConfig(const char *linkdev, int
> vf,
> return 0;
> }
> +#if defined(WITH_VF_STATS)
> +static struct nla_policy ifla_vfstats_policy[IFLA_VF_STATS_MAX+1] = {
> + [IFLA_VF_STATS_RX_PACKETS] = { .type = NLA_U64 },
> + [IFLA_VF_STATS_TX_PACKETS] = { .type = NLA_U64 },
> + [IFLA_VF_STATS_RX_BYTES] = { .type = NLA_U64 },
> + [IFLA_VF_STATS_TX_BYTES] = { .type = NLA_U64 },
> + [IFLA_VF_STATS_BROADCAST] = { .type = NLA_U64 },
> + [IFLA_VF_STATS_MULTICAST] = { .type = NLA_U64 },
> +};
> +
> +static int
> +virNetDevParseVfStats(struct nlattr **tb, virMacAddrPtr mac,
> + virDomainInterfaceStatsPtr stats)
> +{
> + int ret = -1, len;
> + struct ifla_vf_mac *vf_lladdr;
> + struct nlattr *nla, *t[IFLA_VF_MAX+1];
> + struct nlattr *stb[IFLA_VF_STATS_MAX+1];
> +
> + if (tb == NULL || mac == NULL || stats == NULL) {
> + return -1;
> + }
> +
> + if (!tb[IFLA_VFINFO_LIST])
> + return -1;
> +
> + len = nla_len(tb[IFLA_VFINFO_LIST]);
> +
> + for (nla = nla_data(tb[IFLA_VFINFO_LIST]); nla_ok(nla, len);
> + nla = nla_next(nla, &len)) {
> + ret = nla_parse(t, IFLA_VF_MAX, nla_data(nla), nla_len(nla),
> + ifla_vf_policy);
> + if (ret < 0)
> + return -1;
> +
> + if (t[IFLA_VF_MAC] == NULL) {
> + continue;
> + }
> +
> + vf_lladdr = nla_data(t[IFLA_VF_MAC]);
> + if(virMacAddrCmpRaw(mac, vf_lladdr->mac)) {
The above line failed syntax checks because there must be a space after
"if" before "("
> + continue;
> + }
> +
> + if (t[IFLA_VF_STATS]) {
> + ret = nla_parse_nested(stb, IFLA_VF_STATS_MAX,
> + t[IFLA_VF_STATS],
> + ifla_vfstats_policy);
> + if (ret < 0)
> + return -1;
> +
> + stats->rx_bytes = nla_get_u64(stb[IFLA_VF_STATS_RX_BYTES]);
> + stats->tx_bytes = nla_get_u64(stb[IFLA_VF_STATS_TX_BYTES]);
> + stats->rx_packets =
> nla_get_u64(stb[IFLA_VF_STATS_RX_PACKETS]);
> + stats->tx_packets =
> nla_get_u64(stb[IFLA_VF_STATS_TX_PACKETS]);
> + }
> + return 0;
> + }
> +
> + return ret;
> +}
> +
> +/**
> + * virNetDevVFInterfaceStats:
> + * @mac: MAC address of the VF interface
> + * @stats: returns stats of the VF interface
> + *
> + * Get the VF interface from kernel by netlink.
> + * Returns 0 on success, -1 on failure.
> + */
> +int
> +virNetDevVFInterfaceStats(virMacAddrPtr mac,
> + virDomainInterfaceStatsPtr stats)
> +{
> + FILE *fp;
> + char line[256], *colon, *ifname;
> + int rc = -1;
> + void *nlData = NULL;
> + struct nlattr *tb[IFLA_MAX + 1] = {NULL, };
> + char *sysfsDevicePath = NULL;
> +
> + fp = fopen("/proc/net/dev", "r");
> + if (!fp) {
> + virReportSystemError(errno, "%s",
> + _("Could not open /proc/net/dev"));
> + return -1;
> + }
> +
> + /* get all PCI net devices, and parse VFs list from netlink API.
> + * compare MAC address, collect device stats if matching.
> + */
> + while (fgets(line, sizeof(line), fp)) {
> + /* The line looks like:
> + * " eth0:..."
> + * Split it at the colon. and strip blank from head.
> + */
> + colon = strchr(line, ':');
> + if (!colon)
> + continue;
> + *colon = '\0';
> + ifname = line;
> + while((*ifname == ' ') && (ifname < colon))
This line also failed syntax check - needs a space after "while"
> + ifname++;
> +
> + if (virNetDevSysfsFile(&sysfsDevicePath, ifname, "device")
< 0)
> + break;
> +
> + if (virNetDevIsPCIDevice(sysfsDevicePath)) {
> + rc = virNetlinkDumpLink(ifname, -1, &nlData, tb, 0, 0);
> + if (rc < 0) {
> + rc = -1;
> + goto cleanup;
> + }
> +
> + rc = virNetDevParseVfStats(tb, mac, stats);
> + VIR_FREE(nlData);
> + if (rc == 0)
> + goto cleanup;
> + }
> + VIR_FREE(sysfsDevicePath);
> + }
> +
> +cleanup:
This also failed syntax checks - labels must start in column 2, not
column 1 (apparently this makes some simple-minded text editors happier
or something...)
> + VIR_FREE(sysfsDevicePath);
> + VIR_FORCE_FCLOSE(fp);
> + return rc;
> +}
> +
> +#else /* #if defined(WITH_VF_STATS) */
> +
> +int
> +virNetDevVFInterfaceStats(virMacAddrPtr mac,
> + virDomainInterfaceStatsPtr stats)
> +{
> + virReportSystemError(ENOSYS, "%s",
> + _("Unable to get VF net device stats on this
> kernel, please upgrade kernel at lease linux-4.2"));
> +
> + /* no need to do anything, just fix compling error here */
> + if (mac == NULL || stats == NULL) {
> + return -1;
> + }
Follow the pattern of other platforms to eliminate build error - declare
the parameters with G_GNUC_UNUSED.
> +
> + return -1;
> +}
> +#endif /* #if defined(WITH_VF_STATS) */
> #else /* defined(__linux__) && defined(WITH_LIBNL) &&
> defined(IFLA_VF_MAX) */
> @@ -2309,6 +2457,16 @@ virNetDevSetNetConfig(const char *linkdev
> G_GNUC_UNUSED,
> }
> +int
> +virNetDevVFInterfaceStats(virMacAddrPtr mac,
> + virDomainInterfaceStatsPtr stats)
> +{
> + virReportSystemError(ENOSYS, "%s",
> + _("Unable to get VF net device stats on this
> platform"));
> + return -1;
Again, you need G_GNUC_UNUSED on the parameters. (This failed the CI
tests for FreeBSD and MacOS).
> +}
> +
> +
> #endif /* defined(__linux__) && defined(WITH_LIBNL) &&
> defined(IFLA_VF_MAX) */
> VIR_ENUM_IMPL(virNetDevIfState,
> diff --git a/src/util/virnetdev.h b/src/util/virnetdev.h
> index 5f581323ed..ff59d9d341 100644
> --- a/src/util/virnetdev.h
> +++ b/src/util/virnetdev.h
> @@ -312,4 +312,9 @@ int virNetDevSysfsFile(char **pf_sysfs_device_link,
> int virNetDevRunEthernetScript(const char *ifname, const char *script)
> G_GNUC_NO_INLINE;
> +int virNetDevVFInterfaceStats(virMacAddrPtr mac,
> + virDomainInterfaceStatsPtr stats)
> +ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2);
> +
> +
> G_DEFINE_AUTOPTR_CLEANUP_FUNC(virNetDevRxFilter,
> virNetDevRxFilterFree);
I didn't have time to look at the code itself yet. In the meantime,
fixing it up so that this is just another one of the "#if
defined(__linux__) && WITH_LIBNL" functions, and eliminating the CI
build/test errors would be useful; possibly that's all that will be
needed, and even if not you'll have a head start :-)
Thanks a lot, I'll fix all the above problems and push a v2 version.
--
zhenwei pi