[PATCH 0/2] virnetdevbandwidth: Only clear qdisc for defined directions
The first part of this description is taken from the bug report: https://gitlab.com/libvirt/libvirt/-/work_items/875 By default, a macvtap device will have the qdisc for the tap set to `noqueue` to prevent lock contention limiting the available bandwidth for the VM (see [BZ 1329644](https://bugzilla.redhat.com/show_bug.cgi?id=1329644)): ```xml <interface type='direct'> <mac address='52:54:00:f5:7e:97'/> <source dev='ens2f0np0' mode='passthrough'/> <target dev='macvtap0'/> <model type='virtio'/> <alias name='net1'/> <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/> </interface> ``` ``` 12: macvtap0@ens2f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 500 ``` However, defining an interface with only an `inbound` bandwidth limit sets the iface's root qdisc to the system default: ```xml <interface type='direct'> <mac address='52:54:00:05:8f:18'/> <source dev='ens2f0np0' mode='passthrough'/> <bandwidth> <inbound average='3125000' peak='3125000'/> </bandwidth> <target dev='macvtap5'/> <model type='virtio'/> <alias name='net0'/> <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/> </interface> ``` ``` 16: macvtap5@ens2f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 500 ``` If I understand correctly, the `root` and `ingress` directions can be configured independently. So the above bandwidth definition should correspond to this (configured manually): ``` $ sudo tc qdisc show dev macvtap5 qdisc noqueue 8007: root refcnt 2 qdisc ingress ffff: parent ffff:fff1 ---------------- ``` Rather than: ``` $ sudo tc qdisc show dev macvtap6 qdisc fq_codel 0: root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 qdisc ingress ffff: parent ffff:fff1 ---------------- ``` When libvirt sets up a macvtap device, qemuDomainInterfaceSetDefaultQDisc configures qdisc noqueue for the interface. However, if bandwidth limits are defined, it also calls virNetDevBandwidthSet, which will call virNetDevBandwidthClear before reconfiguring the qdisc for the tx/rx. If only rx is defined, the root qdisc will be the system default. d53b0923539 prevents the noqueue configuration from being lost when reconfiguring bandwidth limits. All usages of virNetDevBandwidthSet are either covered by a call to qemuDomainInterfaceSetDefaultQDisc or qdisc noqueue does not apply there. This change doesn't change the semantics of virNetDevBandwidthClear. Alternatively, virNetDevBandwidthSet/virNetDevBandwidthClear could take a default qdisc which would be used to call virNetDevBandwidthSetRootQDisc in virNetDevBandwidthClear, removing the need for the two changes in d53b0923539. I've confirmed that the fix works when creating a domain with an ingress-only bandwidth limit. Please let me know what you think. Thanks! Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com> --- Wesley Hershberger (2): virnetdevbandwidth: Split virNetDevBandwidthClear into two helpers virnetdevbandwidth: Only clear qdisc for defined directions src/util/virnetdevbandwidth.c | 70 +++++++++++++++++++++++++++++-------------- src/util/virnetdevbandwidth.h | 2 ++ 2 files changed, 49 insertions(+), 23 deletions(-) --- base-commit: 543280d8c8fe06526ce9871e7d25f46f6ee89017 change-id: 20260506-qdisc-clear-11a1f86407a2 Best regards, -- Wesley Hershberger <wesley.hershberger@canonical.com>
This allows virNetDevBandwidthSet to clear only the interface's qdisc for directions where bandwidth is defined (see the next patch) Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com> --- src/util/virnetdevbandwidth.c | 52 +++++++++++++++++++++++++++++-------------- src/util/virnetdevbandwidth.h | 2 ++ 2 files changed, 37 insertions(+), 17 deletions(-) diff --git a/src/util/virnetdevbandwidth.c b/src/util/virnetdevbandwidth.c index b141a38b10..8de0b57943 100644 --- a/src/util/virnetdevbandwidth.c +++ b/src/util/virnetdevbandwidth.c @@ -427,6 +427,38 @@ virNetDevBandwidthSet(const char *ifname, return ret; } +int +virNetDevBandwidthClearRoot(const char *ifname) +{ + int ret = 0; + int dummy; /* for ignoring the exit status */ + g_autoptr(virCommand) cmd = NULL; + + cmd = virCommandNew("tc"); + virCommandAddArgList(cmd, "qdisc", "del", "dev", ifname, "root", NULL); + + if (virCommandRun(cmd, &dummy) < 0) + ret = -1; + + return ret; +} + +int +virNetDevBandwidthClearIngress(const char *ifname) +{ + int ret = 0; + int dummy; /* for ignoring the exit status */ + g_autoptr(virCommand) cmd = NULL; + + cmd = virCommandNew("tc"); + virCommandAddArgList(cmd, "qdisc", "del", "dev", ifname, "ingress", NULL); + + if (virCommandRun(cmd, &dummy) < 0) + ret = -1; + + return ret; +} + /** * virNetDevBandwidthClear: * @ifname: on which interface @@ -440,27 +472,13 @@ virNetDevBandwidthSet(const char *ifname, int virNetDevBandwidthClear(const char *ifname) { - int ret = 0; - int dummy; /* for ignoring the exit status */ - g_autoptr(virCommand) rootcmd = NULL; - g_autoptr(virCommand) ingresscmd = NULL; - if (!ifname) return 0; - rootcmd = virCommandNew("tc"); - virCommandAddArgList(rootcmd, "qdisc", "del", "dev", ifname, "root", NULL); + if (virNetDevBandwidthClearRoot(ifname) < 0) + return -1; - if (virCommandRun(rootcmd, &dummy) < 0) - ret = -1; - - ingresscmd = virCommandNew("tc"); - virCommandAddArgList(ingresscmd, "qdisc", "del", "dev", ifname, "ingress", NULL); - - if (virCommandRun(ingresscmd, &dummy) < 0) - ret = -1; - - return ret; + return virNetDevBandwidthClearIngress(ifname); } /* diff --git a/src/util/virnetdevbandwidth.h b/src/util/virnetdevbandwidth.h index 9f271c5283..eaddb4f3d7 100644 --- a/src/util/virnetdevbandwidth.h +++ b/src/util/virnetdevbandwidth.h @@ -50,6 +50,8 @@ int virNetDevBandwidthSet(const char *ifname, unsigned int flags) G_GNUC_WARN_UNUSED_RESULT; +int virNetDevBandwidthClearRoot(const char *ifname); +int virNetDevBandwidthClearIngress(const char *ifname); int virNetDevBandwidthClear(const char *ifname); int virNetDevBandwidthCopy(virNetDevBandwidth **dest, const virNetDevBandwidth *src) -- 2.53.0
On 5/7/26 19:03, Wesley Hershberger via Devel wrote:
This allows virNetDevBandwidthSet to clear only the interface's qdisc for directions where bandwidth is defined (see the next patch)
Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com> --- src/util/virnetdevbandwidth.c | 52 +++++++++++++++++++++++++++++-------------- src/util/virnetdevbandwidth.h | 2 ++ 2 files changed, 37 insertions(+), 17 deletions(-)
diff --git a/src/util/virnetdevbandwidth.h b/src/util/virnetdevbandwidth.h index 9f271c5283..eaddb4f3d7 100644 --- a/src/util/virnetdevbandwidth.h +++ b/src/util/virnetdevbandwidth.h @@ -50,6 +50,8 @@ int virNetDevBandwidthSet(const char *ifname, unsigned int flags) G_GNUC_WARN_UNUSED_RESULT;
+int virNetDevBandwidthClearRoot(const char *ifname); +int virNetDevBandwidthClearIngress(const char *ifname); int virNetDevBandwidthClear(const char *ifname); int virNetDevBandwidthCopy(virNetDevBandwidth **dest, const virNetDevBandwidth *src)
No need to expose these in header file - nothing outside virnetbandwidth.c is calling (or going to call) these. In fact, if something needs to call these then these need to be listed in libvirt_private.syms. Michal
When virNetDevBandwidthSet is called for a macvtap device in qemu_command and qemu_hotplug, qemuDomainInterfaceSetDefaultQDisc has been called already, setting the iface qdisc to 'noqueue'. If the interface has an inbound-only bandwidth limit, the outgoing qdisc on the device will be reset to the system default. <interface type="direct"> ... <bandwidth> <inbound average='3125000' peak='3125000'/> </bandwidth> </interface> This only clears the qdisc on an interface before a bandwidth limit is actually set. Closes: https://gitlab.com/libvirt/libvirt/-/work_items/875 Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com> --- src/util/virnetdevbandwidth.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/src/util/virnetdevbandwidth.c b/src/util/virnetdevbandwidth.c index 8de0b57943..9c14d84807 100644 --- a/src/util/virnetdevbandwidth.c +++ b/src/util/virnetdevbandwidth.c @@ -253,13 +253,13 @@ virNetDevBandwidthSet(const char *ifname, tx = bandwidth->out; } - /* Only if the caller requests, clear everything including root - * qdisc and all filters before adding everything. - */ - if (flags & VIR_NETDEV_BANDWIDTH_SET_CLEAR_ALL) - virNetDevBandwidthClear(ifname); - if (tx && tx->average) { + /* Only if the caller requests, clear the root qdisc and all filters + * before adding everything. + */ + if (flags & VIR_NETDEV_BANDWIDTH_SET_CLEAR_ALL) + virNetDevBandwidthClearRoot(ifname); + average = g_strdup_printf("%llukbps", tx->average); if (tx->peak) peak = g_strdup_printf("%llukbps", tx->peak); @@ -383,6 +383,12 @@ virNetDevBandwidthSet(const char *ifname, } if (rx) { + /* Only if the caller requests, clear the ingress qdisc and all + * filters before adding everything. + */ + if (flags & VIR_NETDEV_BANDWIDTH_SET_CLEAR_ALL) + virNetDevBandwidthClearIngress(ifname); + average = g_strdup_printf("%llukbps", rx->average); if (rx->burst) { -- 2.53.0
On 5/7/26 19:03, Wesley Hershberger via Devel wrote:
When virNetDevBandwidthSet is called for a macvtap device in qemu_command and qemu_hotplug, qemuDomainInterfaceSetDefaultQDisc has been called already, setting the iface qdisc to 'noqueue'. If the interface has an inbound-only bandwidth limit, the outgoing qdisc on the device will be reset to the system default.
<interface type="direct"> ... <bandwidth> <inbound average='3125000' peak='3125000'/> </bandwidth> </interface>
This only clears the qdisc on an interface before a bandwidth limit is actually set.
Closes: https://gitlab.com/libvirt/libvirt/-/work_items/875 Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com> --- src/util/virnetdevbandwidth.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/src/util/virnetdevbandwidth.c b/src/util/virnetdevbandwidth.c index 8de0b57943..9c14d84807 100644 --- a/src/util/virnetdevbandwidth.c +++ b/src/util/virnetdevbandwidth.c @@ -253,13 +253,13 @@ virNetDevBandwidthSet(const char *ifname, tx = bandwidth->out; }
- /* Only if the caller requests, clear everything including root - * qdisc and all filters before adding everything. - */ - if (flags & VIR_NETDEV_BANDWIDTH_SET_CLEAR_ALL) - virNetDevBandwidthClear(ifname); - if (tx && tx->average) { + /* Only if the caller requests, clear the root qdisc and all filters + * before adding everything. + */ + if (flags & VIR_NETDEV_BANDWIDTH_SET_CLEAR_ALL) + virNetDevBandwidthClearRoot(ifname); + average = g_strdup_printf("%llukbps", tx->average); if (tx->peak) peak = g_strdup_printf("%llukbps", tx->peak); @@ -383,6 +383,12 @@ virNetDevBandwidthSet(const char *ifname, }
if (rx) { + /* Only if the caller requests, clear the ingress qdisc and all + * filters before adding everything. + */ + if (flags & VIR_NETDEV_BANDWIDTH_SET_CLEAR_ALL) + virNetDevBandwidthClearIngress(ifname); + average = g_strdup_printf("%llukbps", rx->average);
if (rx->burst) {
After this, virnetdevbandwidthtest.c must be updated to reflect that: a) in some cases root/ingress qdisc is not deleted, b) ingress qdisc is now removed later in the process. This is actually a good thing - it proves the patch works as expected. Michal
On 5/7/26 19:03, Wesley Hershberger via Devel wrote:
The first part of this description is taken from the bug report: https://gitlab.com/libvirt/libvirt/-/work_items/875
Signed-off-by: Wesley Hershberger <wesley.hershberger@canonical.com> --- Wesley Hershberger (2): virnetdevbandwidth: Split virNetDevBandwidthClear into two helpers virnetdevbandwidth: Only clear qdisc for defined directions
src/util/virnetdevbandwidth.c | 70 +++++++++++++++++++++++++++++-------------- src/util/virnetdevbandwidth.h | 2 ++ 2 files changed, 49 insertions(+), 23 deletions(-) --- base-commit: 543280d8c8fe06526ce9871e7d25f46f6ee89017 change-id: 20260506-qdisc-clear-11a1f86407a2
Best regards,
I'm changing 1/2 so that functions are static and moving them a bit earlier so that virNetDevBandwidthSet from 2/2 can call them. And in 2/2 I'm fixing the virnetdevbandwidthtest. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> and merged. Michal
participants (2)
-
Michal Prívozník -
Wesley Hershberger