[libvirt] [PATCH 0/3] support setting MTU of libvirt networks

This has been requested several times in #virt on IRC, and on libvirt-users, and a comment posted to https://bugzilla.redhat.com/1224348 reminded me that there's even an open BZ for it, so I finally sat down and wrote the patches for it. The idea of the patches is that an *empty* bridge device can't have its MTU directly set to anything higher than 1500, but once there are any devices on the bridge, its MTU is limited to the smallest of the MTUs of any attached devices. However, when the very first device is attached to the bridge, the bridge will *increase* its own MTU to match if needed. So the trick to setting a high MTU is to set the MTU of every device to the desired value before it is added (especially the first). libvirt already sets the MTU of every device to the MTU of the bridge, so all we need to do is make it so the first device is set from the network config's "mtu" attribute instead - the bridge will get the larger MTU, and all future devices added will get the higher MTU as well. This doesn't take care of setting the MTU on the guest side of the tap device; for now that must be done manually, but qemu has recently added a commandline parameter to send the desired MTU to the guest, and the virtio-net guest driver at least honors this. An upcoming patch to libvirt will automatically set this qemu commandline option so that no intervention will be required (beyond setting the MTU once in the network config). Laine Stump (3): util: add MTU arg to virNetDevTapCreateInBridgePort() conf: support mtu attribute in a network's <bridge> element network: honor mtu setting when creating network docs/formatnetwork.html.in | 8 +++++++- docs/news.xml | 6 +++++- docs/schemas/network.rng | 6 ++++++ src/bhyve/bhyve_command.c | 2 +- src/conf/network_conf.c | 20 ++++++++++++++++++-- src/conf/network_conf.h | 1 + src/network/bridge_driver.c | 1 + src/qemu/qemu_interface.c | 2 +- src/uml/uml_conf.c | 2 +- src/util/virnetdevtap.c | 20 +++++++++++++++----- src/util/virnetdevtap.h | 1 + tests/bhyvexml2argvmock.c | 1 + tests/networkxml2xmlin/set-mtu.xml | 11 +++++++++++ tests/networkxml2xmlout/set-mtu.xml | 11 +++++++++++ tests/networkxml2xmltest.c | 1 + 15 files changed, 81 insertions(+), 12 deletions(-) create mode 100644 tests/networkxml2xmlin/set-mtu.xml create mode 100644 tests/networkxml2xmlout/set-mtu.xml -- 2.9.3

virNetDevTapCreateInBridgePort() has always set the new tap device to the current MTU of the bridge it's being attached to. There is one case where we will want to set the new tap device to a different (usually larger) MTU - if that's done with the very first device added to the bridge, the bridge's MTU will be set to the device's MTU. This patch allows for that possibility by adding "int mtu" to the arg list for virNetDevTapCreateInBridgePort(), but all callers are sending -1, so it doesn't yet have any effect. --- src/bhyve/bhyve_command.c | 2 +- src/network/bridge_driver.c | 2 +- src/qemu/qemu_interface.c | 2 +- src/uml/uml_conf.c | 2 +- src/util/virnetdevtap.c | 20 +++++++++++++++----- src/util/virnetdevtap.h | 1 + tests/bhyvexml2argvmock.c | 1 + 7 files changed, 21 insertions(+), 9 deletions(-) diff --git a/src/bhyve/bhyve_command.c b/src/bhyve/bhyve_command.c index 8a29977..0e5d5db 100644 --- a/src/bhyve/bhyve_command.c +++ b/src/bhyve/bhyve_command.c @@ -77,7 +77,7 @@ bhyveBuildNetArgStr(const virDomainDef *def, if (virNetDevTapCreateInBridgePort(brname, &net->ifname, &net->mac, def->uuid, NULL, NULL, 0, virDomainNetGetActualVirtPortProfile(net), - virDomainNetGetActualVlan(net), + virDomainNetGetActualVlan(net), -1, VIR_NETDEV_TAP_CREATE_IFUP | VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { goto cleanup; } diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 7d340ef..7a7bdaf 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -2300,7 +2300,7 @@ networkStartNetworkVirtual(virNetworkDriverStatePtr driver, /* Keep tun fd open and interface up to allow for IPv6 DAD to happen */ if (virNetDevTapCreateInBridgePort(network->def->bridge, &macTapIfName, &network->def->mac, - NULL, NULL, &tapfd, 1, NULL, NULL, + NULL, NULL, &tapfd, 1, NULL, NULL, -1, VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE | VIR_NETDEV_TAP_CREATE_IFUP | VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { diff --git a/src/qemu/qemu_interface.c b/src/qemu/qemu_interface.c index 455c2d0..2313de4 100644 --- a/src/qemu/qemu_interface.c +++ b/src/qemu/qemu_interface.c @@ -543,7 +543,7 @@ qemuInterfaceBridgeConnect(virDomainDefPtr def, if (virNetDevTapCreateInBridgePort(brname, &net->ifname, &net->mac, def->uuid, tunpath, tapfd, *tapfdSize, virDomainNetGetActualVirtPortProfile(net), - virDomainNetGetActualVlan(net), + virDomainNetGetActualVlan(net), -1, tap_create_flags) < 0) { virDomainAuditNetDevice(def, net, tunpath, false); goto cleanup; diff --git a/src/uml/uml_conf.c b/src/uml/uml_conf.c index 6754d3c..e98973e 100644 --- a/src/uml/uml_conf.c +++ b/src/uml/uml_conf.c @@ -125,7 +125,7 @@ umlConnectTapDevice(virDomainDefPtr vm, if (virNetDevTapCreateInBridgePort(bridge, &net->ifname, &net->mac, vm->uuid, net->backend.tap, &tapfd, 1, virDomainNetGetActualVirtPortProfile(net), - virDomainNetGetActualVlan(net), + virDomainNetGetActualVlan(net), -1, VIR_NETDEV_TAP_CREATE_IFUP | VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { if (template_ifname) diff --git a/src/util/virnetdevtap.c b/src/util/virnetdevtap.c index 85c0045..2827cde 100644 --- a/src/util/virnetdevtap.c +++ b/src/util/virnetdevtap.c @@ -543,6 +543,7 @@ int virNetDevTapCreateInBridgePort(const char *brname, size_t tapfdSize, virNetDevVPortProfilePtr virtPortProfile, virNetDevVlanPtr virtVlan, + int mtu, unsigned int flags) { virMacAddr tapmac; @@ -578,12 +579,21 @@ int virNetDevTapCreateInBridgePort(const char *brname, if (virNetDevSetMAC(*ifname, &tapmac) < 0) goto error; - /* We need to set the interface MTU before adding it - * to the bridge, because the bridge will have its - * MTU adjusted automatically when we add the new interface. + /* If an MTU is specified for the new device, set it before + * attaching the device to the bridge, as it may affect the MTU of + * the bridge (in particular if it is the first device attached to + * the bridge, or if it is smaller than the current MTU of the + * bridgeg). If MTU isn't specified for the new device, we need to + * set the interface MTU to the current MTU of the bridge (to + * avoid inadvertantly changing the bridge's MTU. */ - if (virNetDevSetMTUFromDevice(*ifname, brname) < 0) - goto error; + if (mtu > 0) { + if (virNetDevSetMTU(*ifname, mtu) < 0) + goto error; + } else { + if (virNetDevSetMTUFromDevice(*ifname, brname) < 0) + goto error; + } if (virtPortProfile) { if (virtPortProfile->virtPortType == VIR_NETDEV_VPORT_PROFILE_MIDONET) { diff --git a/src/util/virnetdevtap.h b/src/util/virnetdevtap.h index 259e7c9..8fe7fa1 100644 --- a/src/util/virnetdevtap.h +++ b/src/util/virnetdevtap.h @@ -71,6 +71,7 @@ int virNetDevTapCreateInBridgePort(const char *brname, size_t tapfdSize, virNetDevVPortProfilePtr virtPortProfile, virNetDevVlanPtr virtVlan, + int mtu, unsigned int flags) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) ATTRIBUTE_RETURN_CHECK; diff --git a/tests/bhyvexml2argvmock.c b/tests/bhyvexml2argvmock.c index a851632..a1080c0 100644 --- a/tests/bhyvexml2argvmock.c +++ b/tests/bhyvexml2argvmock.c @@ -28,6 +28,7 @@ int virNetDevTapCreateInBridgePort(const char *brname ATTRIBUTE_UNUSED, size_t tapfdSize ATTRIBUTE_UNUSED, virNetDevVPortProfilePtr virtPortProfile ATTRIBUTE_UNUSED, virNetDevVlanPtr virtVlan ATTRIBUTE_UNUSED, + int mtu ATTRIBUTE_UNUSED, unsigned int fakeflags ATTRIBUTE_UNUSED) { VIR_FREE(*ifname); -- 2.9.3

Example: <network> ... <bridge name='virbr2' mtu='9000'/> ... If mtu is unset, it's assumed that we want the default for whatever is the underlying transport (usually this is 1500). This setting isn't yet wired in, so it will have no effect. This partially resolves: https://bugzilla.redhat.com/1224348 --- docs/formatnetwork.html.in | 8 +++++++- docs/news.xml | 6 +++++- docs/schemas/network.rng | 6 ++++++ src/conf/network_conf.c | 20 ++++++++++++++++++-- src/conf/network_conf.h | 1 + tests/networkxml2xmlin/set-mtu.xml | 11 +++++++++++ tests/networkxml2xmlout/set-mtu.xml | 11 +++++++++++ tests/networkxml2xmltest.c | 1 + 8 files changed, 60 insertions(+), 4 deletions(-) create mode 100644 tests/networkxml2xmlin/set-mtu.xml create mode 100644 tests/networkxml2xmlout/set-mtu.xml diff --git a/docs/formatnetwork.html.in b/docs/formatnetwork.html.in index 291dcea..19245cf 100644 --- a/docs/formatnetwork.html.in +++ b/docs/formatnetwork.html.in @@ -92,7 +92,7 @@ <pre> ... -<bridge name="virbr0" stp="on" delay="5" macTableManager="libvirt"/> +<bridge name="virbr0" stp="on" delay="5" mtu='9000' macTableManager="libvirt"/> <domain name="example.com" localOnly="no"/> <forward mode="nat" dev="eth0"/> ...</pre> @@ -121,6 +121,12 @@ delay value in seconds (default is 0). <span class="since">Since 0.3.0</span> + The optional attribute. <code>mtu</code> specifies the Maximum + Transmission Unit (MTU) for the + bridge. <span class="since">Since 3.1.0</span>. If mtu is + unspecified, the default setting for the type of bridge being + used is assumed (usually 1500). + <p> The <code>macTableManager</code> attribute of the bridge element is used to tell libvirt how the bridge's MAC address diff --git a/docs/news.xml b/docs/news.xml index ef7e2db..ca2546a 100644 --- a/docs/news.xml +++ b/docs/news.xml @@ -12,7 +12,11 @@ <libvirt> <release version="v3.1.0" date="unreleased"> <section title="New features"> - <change/> + <change> + <summary> + Support specifying MTU in a network's <bridge> element + </summary> + </change> </section> <section title="Improvements"> <change> diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng index 8f0a61b..4f52fdb 100644 --- a/docs/schemas/network.rng +++ b/docs/schemas/network.rng @@ -71,6 +71,12 @@ </optional> <optional> + <attribute name="mtu"> + <data type="unsignedInt"/> + </attribute> + </optional> + + <optional> <attribute name="macTableManager"> <ref name="macTableManager"/> </attribute> diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index 86ce311..681dffb 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -2212,6 +2212,17 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) } VIR_FREE(tmp); + tmp = virXPathString("string(./bridge[1]/@mtu)", ctxt); + if (tmp) { + if (virStrToLong_ui(tmp, NULL, 10, &def->mtu) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid mtu value '%s' in network '%s'"), + tmp, def->name); + goto error; + } + } + VIR_FREE(tmp); + tmp = virXPathString("string(./bridge[1]/@macTableManager)", ctxt); if (tmp) { if ((def->macTableManager @@ -2433,9 +2444,11 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) } /* fall through to next case */ case VIR_NETWORK_FORWARD_BRIDGE: - if (def->delay || stp) { + if (def->delay || stp || def->mtu) { virReportError(VIR_ERR_XML_ERROR, - _("bridge delay/stp options only allowed in route, nat, and isolated mode, not in %s (network '%s')"), + _("bridge delay/stp/mtu options only allowed in " + "route, nat, and isolated mode, not in %s " + "(network '%s')"), virNetworkForwardTypeToString(def->forward.type), def->name); goto error; @@ -2957,6 +2970,9 @@ virNetworkDefFormatBuf(virBufferPtr buf, virBufferAsprintf(buf, " stp='%s' delay='%ld'", def->stp ? "on" : "off", def->delay); } + if (def->mtu) + virBufferAsprintf(buf, " mtu='%u'", def->mtu); + if (def->macTableManager) { virBufferAsprintf(buf, " macTableManager='%s'", virNetworkBridgeMACTableManagerTypeToString(def->macTableManager)); diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h index b5c9ea2..9e4ae71 100644 --- a/src/conf/network_conf.h +++ b/src/conf/network_conf.h @@ -240,6 +240,7 @@ struct _virNetworkDef { int domainLocalOnly; /* enum virTristateBool: yes disables dns forwarding */ unsigned long delay; /* Bridge forward delay (ms) */ bool stp; /* Spanning tree protocol */ + unsigned int mtu; /* MTU for bridge, 0 means "default" i.e. unset in config */ virMacAddr mac; /* mac address of bridge device */ bool mac_specified; diff --git a/tests/networkxml2xmlin/set-mtu.xml b/tests/networkxml2xmlin/set-mtu.xml new file mode 100644 index 0000000..86c1729 --- /dev/null +++ b/tests/networkxml2xmlin/set-mtu.xml @@ -0,0 +1,11 @@ +<network ipv6='no'> + <name>private</name> + <uuid>81ff0d90-c91e-6742-64da-4a736edb9a9b</uuid> + <bridge name='virbr2' mtu='9000'/> + <mac address='52:54:00:17:3F:37'/> + <ip address="192.168.152.1" netmask="255.255.255.0"> + <dhcp> + <range start="192.168.152.2" end="192.168.152.254"/> + </dhcp> + </ip> +</network> diff --git a/tests/networkxml2xmlout/set-mtu.xml b/tests/networkxml2xmlout/set-mtu.xml new file mode 100644 index 0000000..df038ae --- /dev/null +++ b/tests/networkxml2xmlout/set-mtu.xml @@ -0,0 +1,11 @@ +<network> + <name>private</name> + <uuid>81ff0d90-c91e-6742-64da-4a736edb9a9b</uuid> + <bridge name='virbr2' stp='on' delay='0' mtu='9000'/> + <mac address='52:54:00:17:3f:37'/> + <ip address='192.168.152.1' netmask='255.255.255.0'> + <dhcp> + <range start='192.168.152.2' end='192.168.152.254'/> + </dhcp> + </ip> +</network> diff --git a/tests/networkxml2xmltest.c b/tests/networkxml2xmltest.c index 01cd6f7..cfaf718 100644 --- a/tests/networkxml2xmltest.c +++ b/tests/networkxml2xmltest.c @@ -158,6 +158,7 @@ mymain(void) DO_TEST_PARSE_ERROR("hostdev-duplicate"); DO_TEST_PARSE_ERROR("passthrough-duplicate"); DO_TEST("metadata"); + DO_TEST("set-mtu"); return ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE; } -- 2.9.3

On 01/23/2017 04:35 PM, Laine Stump wrote:
Example:
<network> ... <bridge name='virbr2' mtu='9000'/> ...
BTW: what if we want to set MTU over some different network types? For instance, if we have a network with a pool of phys NICs. Is it worth to set MTU there too? If so, then we might need to move the attribute out to a separate element: <network> <mtu size='9000'/> </network> Michal

On 01/24/2017 05:34 AM, Michal Privoznik wrote:
On 01/23/2017 04:35 PM, Laine Stump wrote:
Example:
<network> ... <bridge name='virbr2' mtu='9000'/> ...
BTW: what if we want to set MTU over some different network types? For instance, if we have a network with a pool of phys NICs. Is it worth to set MTU there too? If so, then we might need to move the attribute out to a separate element:
<network> <mtu size='9000'/> </network>
That's an *excellent* point, especially since I was just last night thinking about extending the MTU setting to macvtap passthrough devices from a pool. Fortunately I haven't pushed the patches yet. I'll change everything accordingly and repost.

On 01/24/2017 04:24 PM, Laine Stump wrote:
On 01/24/2017 05:34 AM, Michal Privoznik wrote:
On 01/23/2017 04:35 PM, Laine Stump wrote:
Example:
<network> ... <bridge name='virbr2' mtu='9000'/> ...
BTW: what if we want to set MTU over some different network types? For instance, if we have a network with a pool of phys NICs. Is it worth to set MTU there too? If so, then we might need to move the attribute out to a separate element:
<network> <mtu size='9000'/> </network>
That's an *excellent* point, especially since I was just last night thinking about extending the MTU setting to macvtap passthrough devices from a pool. Fortunately I haven't pushed the patches yet. I'll change everything accordingly and repost.
Well, we are quite far away from next release, so no harm in pushing the patches. BTW: macvtap? Can you set a different MTU on a macvtap device than on the physical device that is backing it? Michal

This resolves: https://bugzilla.redhat.com/1224348 --- src/network/bridge_driver.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 7a7bdaf..f383779 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -2300,7 +2300,8 @@ networkStartNetworkVirtual(virNetworkDriverStatePtr driver, /* Keep tun fd open and interface up to allow for IPv6 DAD to happen */ if (virNetDevTapCreateInBridgePort(network->def->bridge, &macTapIfName, &network->def->mac, - NULL, NULL, &tapfd, 1, NULL, NULL, -1, + NULL, NULL, &tapfd, 1, NULL, NULL, + network->def->mtu, VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE | VIR_NETDEV_TAP_CREATE_IFUP | VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { -- 2.9.3

On 23.01.2017 16:35, Laine Stump wrote:
This has been requested several times in #virt on IRC, and on libvirt-users, and a comment posted to https://bugzilla.redhat.com/1224348 reminded me that there's even an open BZ for it, so I finally sat down and wrote the patches for it.
The idea of the patches is that an *empty* bridge device can't have its MTU directly set to anything higher than 1500, but once there are any devices on the bridge, its MTU is limited to the smallest of the MTUs of any attached devices. However, when the very first device is attached to the bridge, the bridge will *increase* its own MTU to match if needed. So the trick to setting a high MTU is to set the MTU of every device to the desired value before it is added (especially the first). libvirt already sets the MTU of every device to the MTU of the bridge, so all we need to do is make it so the first device is set from the network config's "mtu" attribute instead - the bridge will get the larger MTU, and all future devices added will get the higher MTU as well.
This doesn't take care of setting the MTU on the guest side of the tap device; for now that must be done manually, but qemu has recently added a commandline parameter to send the desired MTU to the guest, and the virtio-net guest driver at least honors this. An upcoming patch to libvirt will automatically set this qemu commandline option so that no intervention will be required (beyond setting the MTU once in the network config).
Laine Stump (3): util: add MTU arg to virNetDevTapCreateInBridgePort() conf: support mtu attribute in a network's <bridge> element network: honor mtu setting when creating network
docs/formatnetwork.html.in | 8 +++++++- docs/news.xml | 6 +++++- docs/schemas/network.rng | 6 ++++++ src/bhyve/bhyve_command.c | 2 +- src/conf/network_conf.c | 20 ++++++++++++++++++-- src/conf/network_conf.h | 1 + src/network/bridge_driver.c | 1 + src/qemu/qemu_interface.c | 2 +- src/uml/uml_conf.c | 2 +- src/util/virnetdevtap.c | 20 +++++++++++++++----- src/util/virnetdevtap.h | 1 + tests/bhyvexml2argvmock.c | 1 + tests/networkxml2xmlin/set-mtu.xml | 11 +++++++++++ tests/networkxml2xmlout/set-mtu.xml | 11 +++++++++++ tests/networkxml2xmltest.c | 1 + 15 files changed, 81 insertions(+), 12 deletions(-) create mode 100644 tests/networkxml2xmlin/set-mtu.xml create mode 100644 tests/networkxml2xmlout/set-mtu.xml
ACK series. I've given it some basic testing and it worked just fine. Incidentally, I've started working on this very same feature. But from the other side - from domain's POV. Michal
participants (2)
-
Laine Stump
-
Michal Privoznik