[libvirt] [PATCH v2 0/4] Support setting MTU in libvirt networks

Similar to V1, but changing the XML to match Michal's patches, and adding a patch that plumbs network MTU back through to the qemu host_mtu option. Laine Stump (4): util: add MTU arg to virNetDevTapCreateInBridgePort() conf: support configuring mtu size in a virtual network network: honor mtu setting when creating network qemu: propagate bridge MTU into qemu "host_mtu" option docs/formatnetwork.html.in | 19 ++++++++++++++++++- docs/news.xml | 5 +++-- docs/schemas/network.rng | 5 +++++ src/bhyve/bhyve_command.c | 1 + src/conf/network_conf.c | 31 ++++++++++++++++++++++++++++++- src/conf/network_conf.h | 1 + src/network/bridge_driver.c | 1 + src/qemu/qemu_command.c | 32 ++++++++++++++++++++++---------- src/qemu/qemu_command.h | 3 ++- src/qemu/qemu_hotplug.c | 5 +++-- src/qemu/qemu_interface.c | 4 +++- src/qemu/qemu_interface.h | 3 ++- src/uml/uml_conf.c | 1 + src/util/virnetdevtap.c | 32 +++++++++++++++++++++++++++----- src/util/virnetdevtap.h | 2 ++ tests/bhyvexml2argvmock.c | 2 ++ tests/networkxml2xmlin/set-mtu.xml | 12 ++++++++++++ tests/networkxml2xmlout/set-mtu.xml | 12 ++++++++++++ tests/networkxml2xmltest.c | 1 + 19 files changed, 148 insertions(+), 24 deletions(-) create mode 100644 tests/networkxml2xmlin/set-mtu.xml create mode 100644 tests/networkxml2xmlout/set-mtu.xml -- 2.9.3

virNetDevTapCreateInBridgePort() has always set the new tap device to the current MTU of the bridge it's being attached to. There is one case where we will want to set the new tap device to a different (usually larger) MTU - if that's done with the very first device added to the bridge, the bridge's MTU will be set to the device's MTU. This patch allows for that possibility by adding "int mtu" to the arg list for virNetDevTapCreateInBridgePort(), but all callers are sending -1, so it doesn't yet have any effect. Since the requested MTU isn't necessarily what is used in the end (for example, if there is no MTU requested, the tap device will be set to the current MTU of the bridge), and the hypervisor may want to know the actual MTU used, we also return the actual MTU to the caller (if actualMTU is non-NULL). --- Change from V1: * in addition to sending in a requested MTU, return the MTU that was actually set on the device (so that it can be sent to qemu with host_mtu option) src/bhyve/bhyve_command.c | 1 + src/network/bridge_driver.c | 1 + src/qemu/qemu_interface.c | 1 + src/uml/uml_conf.c | 1 + src/util/virnetdevtap.c | 32 +++++++++++++++++++++++++++----- src/util/virnetdevtap.h | 2 ++ tests/bhyvexml2argvmock.c | 2 ++ 7 files changed, 35 insertions(+), 5 deletions(-) diff --git a/src/bhyve/bhyve_command.c b/src/bhyve/bhyve_command.c index 8a29977..4c076c2 100644 --- a/src/bhyve/bhyve_command.c +++ b/src/bhyve/bhyve_command.c @@ -78,6 +78,7 @@ bhyveBuildNetArgStr(const virDomainDef *def, def->uuid, NULL, NULL, 0, virDomainNetGetActualVirtPortProfile(net), virDomainNetGetActualVlan(net), + 0, NULL, VIR_NETDEV_TAP_CREATE_IFUP | VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { goto cleanup; } diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 7d340ef..f88c261 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -2301,6 +2301,7 @@ networkStartNetworkVirtual(virNetworkDriverStatePtr driver, if (virNetDevTapCreateInBridgePort(network->def->bridge, &macTapIfName, &network->def->mac, NULL, NULL, &tapfd, 1, NULL, NULL, + 0, NULL, VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE | VIR_NETDEV_TAP_CREATE_IFUP | VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { diff --git a/src/qemu/qemu_interface.c b/src/qemu/qemu_interface.c index 455c2d0..ce448d2 100644 --- a/src/qemu/qemu_interface.c +++ b/src/qemu/qemu_interface.c @@ -544,6 +544,7 @@ qemuInterfaceBridgeConnect(virDomainDefPtr def, def->uuid, tunpath, tapfd, *tapfdSize, virDomainNetGetActualVirtPortProfile(net), virDomainNetGetActualVlan(net), + 0, NULL, tap_create_flags) < 0) { virDomainAuditNetDevice(def, net, tunpath, false); goto cleanup; diff --git a/src/uml/uml_conf.c b/src/uml/uml_conf.c index 6754d3c..4663c7d 100644 --- a/src/uml/uml_conf.c +++ b/src/uml/uml_conf.c @@ -126,6 +126,7 @@ umlConnectTapDevice(virDomainDefPtr vm, vm->uuid, net->backend.tap, &tapfd, 1, virDomainNetGetActualVirtPortProfile(net), virDomainNetGetActualVlan(net), + 0, NULL, VIR_NETDEV_TAP_CREATE_IFUP | VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { if (template_ifname) diff --git a/src/util/virnetdevtap.c b/src/util/virnetdevtap.c index 85c0045..41e68f4 100644 --- a/src/util/virnetdevtap.c +++ b/src/util/virnetdevtap.c @@ -513,6 +513,8 @@ int virNetDevTapDelete(const char *ifname ATTRIBUTE_UNUSED, * @tapfd: array of file descriptor return value for the new tap device * @tapfdSize: number of file descriptors in @tapfd * @virtPortProfile: bridge/port specific configuration + * @mtu: requested MTU for port (or 0 for "default") + * @actualMTU: MTU actually set for port (after accounting for bridge's MTU) * @flags: OR of virNetDevTapCreateFlags: * VIR_NETDEV_TAP_CREATE_IFUP @@ -543,6 +545,8 @@ int virNetDevTapCreateInBridgePort(const char *brname, size_t tapfdSize, virNetDevVPortProfilePtr virtPortProfile, virNetDevVlanPtr virtVlan, + unsigned int mtu, + unsigned int *actualMTU, unsigned int flags) { virMacAddr tapmac; @@ -578,12 +582,30 @@ int virNetDevTapCreateInBridgePort(const char *brname, if (virNetDevSetMAC(*ifname, &tapmac) < 0) goto error; - /* We need to set the interface MTU before adding it - * to the bridge, because the bridge will have its - * MTU adjusted automatically when we add the new interface. + /* If an MTU is specified for the new device, set it before + * attaching the device to the bridge, as it may affect the MTU of + * the bridge (in particular if it is the first device attached to + * the bridge, or if it is smaller than the current MTU of the + * bridge). If MTU isn't specified for the new device (i.e. 0), + * we need to set the interface MTU to the current MTU of the + * bridge (to avoid inadvertantly changing the bridge's MTU). */ - if (virNetDevSetMTUFromDevice(*ifname, brname) < 0) - goto error; + if (mtu > 0) { + if (virNetDevSetMTU(*ifname, mtu) < 0) + goto error; + } else { + if (virNetDevSetMTUFromDevice(*ifname, brname) < 0) + goto error; + } + if (actualMTU) { + int retMTU = virNetDevGetMTU(*ifname); + + if (retMTU < 0) + goto error; + + *actualMTU = retMTU; + } + if (virtPortProfile) { if (virtPortProfile->virtPortType == VIR_NETDEV_VPORT_PROFILE_MIDONET) { diff --git a/src/util/virnetdevtap.h b/src/util/virnetdevtap.h index 259e7c9..6441df4 100644 --- a/src/util/virnetdevtap.h +++ b/src/util/virnetdevtap.h @@ -71,6 +71,8 @@ int virNetDevTapCreateInBridgePort(const char *brname, size_t tapfdSize, virNetDevVPortProfilePtr virtPortProfile, virNetDevVlanPtr virtVlan, + unsigned int mtu, + unsigned int *actualMTU, unsigned int flags) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3) ATTRIBUTE_RETURN_CHECK; diff --git a/tests/bhyvexml2argvmock.c b/tests/bhyvexml2argvmock.c index a851632..fd71469 100644 --- a/tests/bhyvexml2argvmock.c +++ b/tests/bhyvexml2argvmock.c @@ -28,6 +28,8 @@ int virNetDevTapCreateInBridgePort(const char *brname ATTRIBUTE_UNUSED, size_t tapfdSize ATTRIBUTE_UNUSED, virNetDevVPortProfilePtr virtPortProfile ATTRIBUTE_UNUSED, virNetDevVlanPtr virtVlan ATTRIBUTE_UNUSED, + unsigned int mtu ATTRIBUTE_UNUSED, + unsigned int *actualMTU ATTRIBUTE_UNUSED, unsigned int fakeflags ATTRIBUTE_UNUSED) { VIR_FREE(*ifname); -- 2.9.3

Example: <network> ... <mtu size='9000'/> ... If mtu is unset, it's assumed that we want the default for whatever is the underlying transport (usually this is 1500). This setting isn't yet wired in, so it will have no effect. This partially resolves: https://bugzilla.redhat.com/1224348 --- Change from V1: * switch from <bridge .... mtu='9000'/> to <mtu size='9000'/> for consistency and to allow for potentially setting an MTU on a network that doesn't have a <bridge> element. * use Michal's mtu element definition from networkcommon.rng rather than rolling my own. docs/formatnetwork.html.in | 19 ++++++++++++++++++- docs/news.xml | 5 +++-- docs/schemas/network.rng | 5 +++++ src/conf/network_conf.c | 31 ++++++++++++++++++++++++++++++- src/conf/network_conf.h | 1 + tests/networkxml2xmlin/set-mtu.xml | 12 ++++++++++++ tests/networkxml2xmlout/set-mtu.xml | 12 ++++++++++++ tests/networkxml2xmltest.c | 1 + 8 files changed, 82 insertions(+), 4 deletions(-) create mode 100644 tests/networkxml2xmlin/set-mtu.xml create mode 100644 tests/networkxml2xmlout/set-mtu.xml diff --git a/docs/formatnetwork.html.in b/docs/formatnetwork.html.in index 8638df9..777c341 100644 --- a/docs/formatnetwork.html.in +++ b/docs/formatnetwork.html.in @@ -93,6 +93,7 @@ <pre> ... <bridge name="virbr0" stp="on" delay="5" macTableManager="libvirt"/> +<mtu size="9000"/> <domain name="example.com" localOnly="no"/> <forward mode="nat" dev="eth0"/> ...</pre> @@ -151,9 +152,25 @@ <span class="since">Since 1.2.11, requires kernel 3.17 or newer</span> </p> + </dd> - + <dt><code>mtu</code></dt> + <dd> + The <code>size</code> attribute of the <code>mtu></code> + element specifies the Maximum Transmission Unit (MTU) for the + network. <span class="since">Since 3.1.0</span>. In the case + of a libvirt-managed network (one with forward mode + of <code>nat</code>, <code>route</code>, <code>open</code>, or + no <code>forward</code> element (i.e. an isolated network), + this will be the MTU assigned to the bridge device when + libvirt creates it, and thereafter also assigned to all tap + devices created to connect guest interfaces. Network types not + specifically mentioned here don't support having an MTU set in + the libvirt network config. If mtu size is unspecified, the + default setting for the type of device being used is assumed + (usually 1500). </dd> + <dt><code>domain</code></dt> <dd> The <code>name</code> attribute on the <code>domain</code> diff --git a/docs/news.xml b/docs/news.xml index f408293..8f0d3d6 100644 --- a/docs/news.xml +++ b/docs/news.xml @@ -36,10 +36,11 @@ </change> <change> <summary> - Introduce MTU to domain <interface/> + Introduce MTU to domain <interface/> and <network> </summary> <description> - Allow setting MTU size for some types of domain interface. + Allow setting MTU size for some types of domain interface + and network. </description> </change> </section> diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng index 8f0a61b..1048dab 100644 --- a/docs/schemas/network.rng +++ b/docs/schemas/network.rng @@ -79,6 +79,11 @@ </element> </optional> + <!-- <mtu> element --> + <optional> + <ref name="mtu"/> + </optional> + <!-- <mac> element --> <optional> <element name="mac"> diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index 86ce311..0e20dac 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -2242,6 +2242,17 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) def->mac_specified = true; } + tmp = virXPathString("string(./mtu/@size)", ctxt); + if (tmp) { + if (virStrToLong_ui(tmp, NULL, 10, &def->mtu) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid mtu size '%s' in network '%s'"), + tmp, def->name); + goto error; + } + } + VIR_FREE(tmp); + dnsNode = virXPathNode("./dns", ctxt); if (dnsNode != NULL && virNetworkDNSDefParseXML(def->name, dnsNode, ctxt, &def->dns) < 0) { @@ -2435,7 +2446,9 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) case VIR_NETWORK_FORWARD_BRIDGE: if (def->delay || stp) { virReportError(VIR_ERR_XML_ERROR, - _("bridge delay/stp options only allowed in route, nat, and isolated mode, not in %s (network '%s')"), + _("bridge delay/stp options only allowed in " + "route, nat, and isolated mode, not in %s " + "(network '%s')"), virNetworkForwardTypeToString(def->forward.type), def->name); goto error; @@ -2454,6 +2467,19 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) VIR_FREE(stp); + if (def->mtu && + (def->forward.type != VIR_NETWORK_FORWARD_NONE && + def->forward.type != VIR_NETWORK_FORWARD_NAT && + def->forward.type != VIR_NETWORK_FORWARD_ROUTE && + def->forward.type != VIR_NETWORK_FORWARD_OPEN)) { + virReportError(VIR_ERR_XML_ERROR, + _("mtu size only allowed in open, route, nat, " + "and isolated mode, not in %s (network '%s')"), + virNetworkForwardTypeToString(def->forward.type), + def->name); + goto error; + } + /* Extract custom metadata */ if ((metadataNode = virXPathNode("./metadata[1]", ctxt)) != NULL) { def->metadata = xmlCopyNode(metadataNode, 1); @@ -2964,6 +2990,9 @@ virNetworkDefFormatBuf(virBufferPtr buf, virBufferAddLit(buf, "/>\n"); } + if (def->mtu) + virBufferAsprintf(buf, "<mtu size='%u'/>\n", def->mtu); + if (def->mac_specified) { char macaddr[VIR_MAC_STRING_BUFLEN]; virMacAddrFormat(&def->mac, macaddr); diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h index b5c9ea2..9e4ae71 100644 --- a/src/conf/network_conf.h +++ b/src/conf/network_conf.h @@ -240,6 +240,7 @@ struct _virNetworkDef { int domainLocalOnly; /* enum virTristateBool: yes disables dns forwarding */ unsigned long delay; /* Bridge forward delay (ms) */ bool stp; /* Spanning tree protocol */ + unsigned int mtu; /* MTU for bridge, 0 means "default" i.e. unset in config */ virMacAddr mac; /* mac address of bridge device */ bool mac_specified; diff --git a/tests/networkxml2xmlin/set-mtu.xml b/tests/networkxml2xmlin/set-mtu.xml new file mode 100644 index 0000000..d1ef32a --- /dev/null +++ b/tests/networkxml2xmlin/set-mtu.xml @@ -0,0 +1,12 @@ +<network ipv6='no'> + <name>private</name> + <uuid>81ff0d90-c91e-6742-64da-4a736edb9a9b</uuid> + <bridge name='virbr2'/> + <mtu size='9000'/> + <mac address='52:54:00:17:3F:37'/> + <ip address="192.168.152.1" netmask="255.255.255.0"> + <dhcp> + <range start="192.168.152.2" end="192.168.152.254"/> + </dhcp> + </ip> +</network> diff --git a/tests/networkxml2xmlout/set-mtu.xml b/tests/networkxml2xmlout/set-mtu.xml new file mode 100644 index 0000000..10e8155 --- /dev/null +++ b/tests/networkxml2xmlout/set-mtu.xml @@ -0,0 +1,12 @@ +<network> + <name>private</name> + <uuid>81ff0d90-c91e-6742-64da-4a736edb9a9b</uuid> + <bridge name='virbr2' stp='on' delay='0'/> + <mtu size='9000'/> + <mac address='52:54:00:17:3f:37'/> + <ip address='192.168.152.1' netmask='255.255.255.0'> + <dhcp> + <range start='192.168.152.2' end='192.168.152.254'/> + </dhcp> + </ip> +</network> diff --git a/tests/networkxml2xmltest.c b/tests/networkxml2xmltest.c index 01cd6f7..cfaf718 100644 --- a/tests/networkxml2xmltest.c +++ b/tests/networkxml2xmltest.c @@ -158,6 +158,7 @@ mymain(void) DO_TEST_PARSE_ERROR("hostdev-duplicate"); DO_TEST_PARSE_ERROR("passthrough-duplicate"); DO_TEST("metadata"); + DO_TEST("set-mtu"); return ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE; } -- 2.9.3

This resolves: https://bugzilla.redhat.com/1224348 --- No change from V1. src/network/bridge_driver.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index f88c261..06759c6 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -2301,7 +2301,7 @@ networkStartNetworkVirtual(virNetworkDriverStatePtr driver, if (virNetDevTapCreateInBridgePort(network->def->bridge, &macTapIfName, &network->def->mac, NULL, NULL, &tapfd, 1, NULL, NULL, - 0, NULL, + network->def->mtu, NULL, VIR_NETDEV_TAP_CREATE_USE_MAC_FOR_BRIDGE | VIR_NETDEV_TAP_CREATE_IFUP | VIR_NETDEV_TAP_CREATE_PERSIST) < 0) { -- 2.9.3

libvirt was able to set the host_mtu option when an MTU was explicitly given in the interface config (with <mtu size='n'/>), set the MTU of a libvirt network in the network config (with the same named subelement), and would automatically set the MTU of any tap device to the MTU of the network. This patch ties that all together (for networks based on tap devices and either Linux host bridges or OVS bridges) by learning the MTU of the network (i.e. the bridge) during qemuInterfaceBridgeConnect(), and returning that value so that it can be passed to qemuBuildNicDevStr(); qemuBuildNicDevStr() then sets host_mtu in the interface's commandline options. The result is that a higher MTU for all guests connecting to a particular network will be plumbed top to bottom by simply changing the MTU of the network (in libvirt's config for libvirt-managed networks, or directly on the bridge device for simple host bridges or OVS bridges managed outside of libvirt). One question I have about this - it occurred to me that in the case of migrating a guest from a host with an older libvirt to one with a newer libvirt, the guest may have *not* had the host_mtu option on the older machine, but *will* have it on the newer machine. I'm curious if this could lead to incompatibilities between source and destination (I guess it all depends on whether or not the setting of host_mtu has a practical effect on a guest that is already running - Maxime?) (I hope we don't have to add a "<mtu auto='yes'/>" and only set host_mtu when that is present :-/) Likewise, we could run into problems when migrating from a newer libvirt to older libvirt - The guest would have been told of the higher MTU on the newer libvirt, then migrated to a host that didn't understand <mtu size='blah'/>. (If this really is a problem, it would be a problem with or without the current patch). --- New in V2. src/qemu/qemu_command.c | 32 ++++++++++++++++++++++---------- src/qemu/qemu_command.h | 3 ++- src/qemu/qemu_hotplug.c | 5 +++-- src/qemu/qemu_interface.c | 5 +++-- src/qemu/qemu_interface.h | 3 ++- 5 files changed, 32 insertions(+), 16 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 6d65872..522152d 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3555,7 +3555,8 @@ qemuBuildNicDevStr(virDomainDefPtr def, int vlan, unsigned int bootindex, size_t vhostfdSize, - virQEMUCapsPtr qemuCaps) + virQEMUCapsPtr qemuCaps, + unsigned int mtu) { virBuffer buf = VIR_BUFFER_INITIALIZER; const char *nic = net->model; @@ -3679,13 +3680,23 @@ qemuBuildNicDevStr(virDomainDefPtr def, virBufferAsprintf(&buf, ",rx_queue_size=%u", net->driver.virtio.rx_queue_size); } - if (usingVirtio && net->mtu) { - if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_VIRTIO_NET_HOST_MTU)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("setting MTU is not supported with this QEMU binary")); - goto error; + if (usingVirtio && mtu) { + if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_VIRTIO_NET_HOST_MTU)) { + + virBufferAsprintf(&buf, ",host_mtu=%u", mtu); + + } else { + /* log an error if mtu was requested specifically for this + * interface, otherwise, if it's just what was reported by + * the attached network, ignore it. + */ + if (net->mtu) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("setting MTU is not supported with " + "this QEMU binary")); + goto error; + } } - virBufferAsprintf(&buf, ",host_mtu=%u", net->mtu); } if (vlan == -1) @@ -8042,7 +8053,7 @@ qemuBuildVhostuserCommandLine(virQEMUDriverPtr driver, VIR_FREE(netdev); if (!(nic = qemuBuildNicDevStr(def, net, -1, bootindex, - queues, qemuCaps))) { + queues, qemuCaps, net->mtu))) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Error generating NIC -device string")); goto error; @@ -8088,6 +8099,7 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver, virDomainNetType actualType = virDomainNetGetActualType(net); virNetDevBandwidthPtr actualBandwidth; size_t i; + unsigned int mtu = net->mtu; if (!bootindex) @@ -8142,7 +8154,7 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver, memset(tapfd, -1, tapfdSize * sizeof(tapfd[0])); if (qemuInterfaceBridgeConnect(def, driver, net, - tapfd, &tapfdSize) < 0) + tapfd, &tapfdSize, &mtu) < 0) goto cleanup; break; @@ -8322,7 +8334,7 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver, } if (qemuDomainSupportsNicdev(def, net)) { if (!(nic = qemuBuildNicDevStr(def, net, vlan, bootindex, - vhostfdSize, qemuCaps))) + vhostfdSize, qemuCaps, mtu))) goto cleanup; virCommandAddArgList(cmd, "-device", nic, NULL); } else { diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index 3bcfdc6..69fe846 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -101,7 +101,8 @@ char *qemuBuildNicDevStr(virDomainDefPtr def, int vlan, unsigned int bootindex, size_t vhostfdSize, - virQEMUCapsPtr qemuCaps); + virQEMUCapsPtr qemuCaps, + unsigned int mtu); char *qemuDeviceDriveHostAlias(virDomainDiskDefPtr disk); diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index a6de254..c0d1ab9 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -969,6 +969,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, bool charDevPlugged = false; bool netdevPlugged = false; bool hostPlugged = false; + unsigned int mtu = net->mtu; /* preallocate new slot for device */ if (VIR_REALLOC_N(vm->def->nets, vm->def->nnets + 1) < 0) @@ -1025,7 +1026,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, goto cleanup; memset(vhostfd, -1, sizeof(*vhostfd) * vhostfdSize); if (qemuInterfaceBridgeConnect(vm->def, driver, net, - tapfd, &tapfdSize) < 0) + tapfd, &tapfdSize, &mtu) < 0) goto cleanup; iface_connected = true; if (qemuInterfaceOpenVhostNet(vm->def, net, priv->qemuCaps, @@ -1236,7 +1237,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, VIR_FORCE_CLOSE(vhostfd[i]); if (!(nicstr = qemuBuildNicDevStr(vm->def, net, vlan, 0, - queueSize, priv->qemuCaps))) + queueSize, priv->qemuCaps, mtu))) goto try_remove; qemuDomainObjEnterMonitor(driver, vm); diff --git a/src/qemu/qemu_interface.c b/src/qemu/qemu_interface.c index ce448d2..c5dca60 100644 --- a/src/qemu/qemu_interface.c +++ b/src/qemu/qemu_interface.c @@ -503,7 +503,8 @@ qemuInterfaceBridgeConnect(virDomainDefPtr def, virQEMUDriverPtr driver, virDomainNetDefPtr net, int *tapfd, - size_t *tapfdSize) + size_t *tapfdSize, + unsigned int *mtu) { const char *brname; int ret = -1; @@ -544,7 +545,7 @@ qemuInterfaceBridgeConnect(virDomainDefPtr def, def->uuid, tunpath, tapfd, *tapfdSize, virDomainNetGetActualVirtPortProfile(net), virDomainNetGetActualVlan(net), - 0, NULL, + net->mtu, mtu, tap_create_flags) < 0) { virDomainAuditNetDevice(def, net, tunpath, false); goto cleanup; diff --git a/src/qemu/qemu_interface.h b/src/qemu/qemu_interface.h index 66ead2a..0ec873f 100644 --- a/src/qemu/qemu_interface.h +++ b/src/qemu/qemu_interface.h @@ -50,7 +50,8 @@ int qemuInterfaceBridgeConnect(virDomainDefPtr def, virQEMUDriverPtr driver, virDomainNetDefPtr net, int *tapfd, - size_t *tapfdSize) + size_t *tapfdSize, + unsigned int *mtu) ATTRIBUTE_NONNULL(2); int qemuInterfaceOpenVhostNet(virDomainDefPtr def, -- 2.9.3

* Laine Stump (laine@laine.org) wrote:
libvirt was able to set the host_mtu option when an MTU was explicitly given in the interface config (with <mtu size='n'/>), set the MTU of a libvirt network in the network config (with the same named subelement), and would automatically set the MTU of any tap device to the MTU of the network.
This patch ties that all together (for networks based on tap devices and either Linux host bridges or OVS bridges) by learning the MTU of the network (i.e. the bridge) during qemuInterfaceBridgeConnect(), and returning that value so that it can be passed to qemuBuildNicDevStr(); qemuBuildNicDevStr() then sets host_mtu in the interface's commandline options.
The result is that a higher MTU for all guests connecting to a particular network will be plumbed top to bottom by simply changing the MTU of the network (in libvirt's config for libvirt-managed networks, or directly on the bridge device for simple host bridges or OVS bridges managed outside of libvirt).
One question I have about this - it occurred to me that in the case of migrating a guest from a host with an older libvirt to one with a newer libvirt, the guest may have *not* had the host_mtu option on the older machine, but *will* have it on the newer machine. I'm curious if this could lead to incompatibilities between source and destination (I guess it all depends on whether or not the setting of host_mtu has a practical effect on a guest that is already running - Maxime?) (I hope we don't have to add a "<mtu auto='yes'/>" and only set host_mtu when that is present :-/)
Or what happens if I migrate between hosts where the host network hardware has different MTU? Dave
Likewise, we could run into problems when migrating from a newer libvirt to older libvirt - The guest would have been told of the higher MTU on the newer libvirt, then migrated to a host that didn't understand <mtu size='blah'/>. (If this really is a problem, it would be a problem with or without the current patch). ---
New in V2.
src/qemu/qemu_command.c | 32 ++++++++++++++++++++++---------- src/qemu/qemu_command.h | 3 ++- src/qemu/qemu_hotplug.c | 5 +++-- src/qemu/qemu_interface.c | 5 +++-- src/qemu/qemu_interface.h | 3 ++- 5 files changed, 32 insertions(+), 16 deletions(-)
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 6d65872..522152d 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3555,7 +3555,8 @@ qemuBuildNicDevStr(virDomainDefPtr def, int vlan, unsigned int bootindex, size_t vhostfdSize, - virQEMUCapsPtr qemuCaps) + virQEMUCapsPtr qemuCaps, + unsigned int mtu) { virBuffer buf = VIR_BUFFER_INITIALIZER; const char *nic = net->model; @@ -3679,13 +3680,23 @@ qemuBuildNicDevStr(virDomainDefPtr def, virBufferAsprintf(&buf, ",rx_queue_size=%u", net->driver.virtio.rx_queue_size); }
- if (usingVirtio && net->mtu) { - if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_VIRTIO_NET_HOST_MTU)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("setting MTU is not supported with this QEMU binary")); - goto error; + if (usingVirtio && mtu) { + if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_VIRTIO_NET_HOST_MTU)) { + + virBufferAsprintf(&buf, ",host_mtu=%u", mtu); + + } else { + /* log an error if mtu was requested specifically for this + * interface, otherwise, if it's just what was reported by + * the attached network, ignore it. + */ + if (net->mtu) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("setting MTU is not supported with " + "this QEMU binary")); + goto error; + } } - virBufferAsprintf(&buf, ",host_mtu=%u", net->mtu); }
if (vlan == -1) @@ -8042,7 +8053,7 @@ qemuBuildVhostuserCommandLine(virQEMUDriverPtr driver, VIR_FREE(netdev);
if (!(nic = qemuBuildNicDevStr(def, net, -1, bootindex, - queues, qemuCaps))) { + queues, qemuCaps, net->mtu))) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Error generating NIC -device string")); goto error; @@ -8088,6 +8099,7 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver, virDomainNetType actualType = virDomainNetGetActualType(net); virNetDevBandwidthPtr actualBandwidth; size_t i; + unsigned int mtu = net->mtu;
if (!bootindex) @@ -8142,7 +8154,7 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver, memset(tapfd, -1, tapfdSize * sizeof(tapfd[0]));
if (qemuInterfaceBridgeConnect(def, driver, net, - tapfd, &tapfdSize) < 0) + tapfd, &tapfdSize, &mtu) < 0) goto cleanup; break;
@@ -8322,7 +8334,7 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver, } if (qemuDomainSupportsNicdev(def, net)) { if (!(nic = qemuBuildNicDevStr(def, net, vlan, bootindex, - vhostfdSize, qemuCaps))) + vhostfdSize, qemuCaps, mtu))) goto cleanup; virCommandAddArgList(cmd, "-device", nic, NULL); } else { diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index 3bcfdc6..69fe846 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -101,7 +101,8 @@ char *qemuBuildNicDevStr(virDomainDefPtr def, int vlan, unsigned int bootindex, size_t vhostfdSize, - virQEMUCapsPtr qemuCaps); + virQEMUCapsPtr qemuCaps, + unsigned int mtu);
char *qemuDeviceDriveHostAlias(virDomainDiskDefPtr disk);
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index a6de254..c0d1ab9 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -969,6 +969,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, bool charDevPlugged = false; bool netdevPlugged = false; bool hostPlugged = false; + unsigned int mtu = net->mtu;
/* preallocate new slot for device */ if (VIR_REALLOC_N(vm->def->nets, vm->def->nnets + 1) < 0) @@ -1025,7 +1026,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, goto cleanup; memset(vhostfd, -1, sizeof(*vhostfd) * vhostfdSize); if (qemuInterfaceBridgeConnect(vm->def, driver, net, - tapfd, &tapfdSize) < 0) + tapfd, &tapfdSize, &mtu) < 0) goto cleanup; iface_connected = true; if (qemuInterfaceOpenVhostNet(vm->def, net, priv->qemuCaps, @@ -1236,7 +1237,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, VIR_FORCE_CLOSE(vhostfd[i]);
if (!(nicstr = qemuBuildNicDevStr(vm->def, net, vlan, 0, - queueSize, priv->qemuCaps))) + queueSize, priv->qemuCaps, mtu))) goto try_remove;
qemuDomainObjEnterMonitor(driver, vm); diff --git a/src/qemu/qemu_interface.c b/src/qemu/qemu_interface.c index ce448d2..c5dca60 100644 --- a/src/qemu/qemu_interface.c +++ b/src/qemu/qemu_interface.c @@ -503,7 +503,8 @@ qemuInterfaceBridgeConnect(virDomainDefPtr def, virQEMUDriverPtr driver, virDomainNetDefPtr net, int *tapfd, - size_t *tapfdSize) + size_t *tapfdSize, + unsigned int *mtu) { const char *brname; int ret = -1; @@ -544,7 +545,7 @@ qemuInterfaceBridgeConnect(virDomainDefPtr def, def->uuid, tunpath, tapfd, *tapfdSize, virDomainNetGetActualVirtPortProfile(net), virDomainNetGetActualVlan(net), - 0, NULL, + net->mtu, mtu, tap_create_flags) < 0) { virDomainAuditNetDevice(def, net, tunpath, false); goto cleanup; diff --git a/src/qemu/qemu_interface.h b/src/qemu/qemu_interface.h index 66ead2a..0ec873f 100644 --- a/src/qemu/qemu_interface.h +++ b/src/qemu/qemu_interface.h @@ -50,7 +50,8 @@ int qemuInterfaceBridgeConnect(virDomainDefPtr def, virQEMUDriverPtr driver, virDomainNetDefPtr net, int *tapfd, - size_t *tapfdSize) + size_t *tapfdSize, + unsigned int *mtu) ATTRIBUTE_NONNULL(2);
int qemuInterfaceOpenVhostNet(virDomainDefPtr def, -- 2.9.3
-- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

On 02/03/2017 12:41 PM, Dr. David Alan Gilbert wrote:
* Laine Stump (laine@laine.org) wrote:
One question I have about this - it occurred to me that in the case of migrating a guest from a host with an older libvirt to one with a newer libvirt, the guest may have *not* had the host_mtu option on the older machine, but *will* have it on the newer machine. I'm curious if this could lead to incompatibilities between source and destination (I guess it all depends on whether or not the setting of host_mtu has a practical effect on a guest that is already running - Maxime?) (I hope we don't have to add a "<mtu auto='yes'/>" and only set host_mtu when that is present :-/) Or what happens if I migrate between hosts where the host network hardware has different MTU?
Right! I think that may be problematic even without any support for host_mtu (the guest will at least be the same, but the host side of the tap device will have a different MTU; I haven't even tried to think about what problems that might cause - e.g. guest migrates and tap device now has a 9000 MTU (as does everything else connected to the same bridge), but the guest interface still thinks the MTU is 1500. Does this make anything go screwy? Or does everything correct itself around the difference? Probably someone needs to try it... (/averts gaze, avoids eye contact)

On 03.02.2017 18:35, Laine Stump wrote:
libvirt was able to set the host_mtu option when an MTU was explicitly given in the interface config (with <mtu size='n'/>), set the MTU of a libvirt network in the network config (with the same named subelement), and would automatically set the MTU of any tap device to the MTU of the network.
This patch ties that all together (for networks based on tap devices and either Linux host bridges or OVS bridges) by learning the MTU of the network (i.e. the bridge) during qemuInterfaceBridgeConnect(), and returning that value so that it can be passed to qemuBuildNicDevStr(); qemuBuildNicDevStr() then sets host_mtu in the interface's commandline options.
The result is that a higher MTU for all guests connecting to a particular network will be plumbed top to bottom by simply changing the MTU of the network (in libvirt's config for libvirt-managed networks, or directly on the bridge device for simple host bridges or OVS bridges managed outside of libvirt).
One question I have about this - it occurred to me that in the case of migrating a guest from a host with an older libvirt to one with a newer libvirt, the guest may have *not* had the host_mtu option on the older machine, but *will* have it on the newer machine. I'm curious if this could lead to incompatibilities between source and destination (I guess it all depends on whether or not the setting of host_mtu has a practical effect on a guest that is already running - Maxime?) (I hope we don't have to add a "<mtu auto='yes'/>" and only set host_mtu when that is present :-/)
This is a question for qemu folks. I know nothing about qemu internals.
Likewise, we could run into problems when migrating from a newer libvirt to older libvirt - The guest would have been told of the higher MTU on the newer libvirt, then migrated to a host that didn't understand <mtu size='blah'/>. (If this really is a problem, it would be a problem with or without the current patch).
Well, if it turns out to be a problem there's yet another variation of it: users can supply new domain XML upon migration and thus change MTU. But that should be easy to check (not that we are doing that now).
---
New in V2.
src/qemu/qemu_command.c | 32 ++++++++++++++++++++++---------- src/qemu/qemu_command.h | 3 ++- src/qemu/qemu_hotplug.c | 5 +++-- src/qemu/qemu_interface.c | 5 +++-- src/qemu/qemu_interface.h | 3 ++- 5 files changed, 32 insertions(+), 16 deletions(-)
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 6d65872..522152d 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3555,7 +3555,8 @@ qemuBuildNicDevStr(virDomainDefPtr def, int vlan, unsigned int bootindex, size_t vhostfdSize, - virQEMUCapsPtr qemuCaps) + virQEMUCapsPtr qemuCaps, + unsigned int mtu) { virBuffer buf = VIR_BUFFER_INITIALIZER; const char *nic = net->model; @@ -3679,13 +3680,23 @@ qemuBuildNicDevStr(virDomainDefPtr def, virBufferAsprintf(&buf, ",rx_queue_size=%u", net->driver.virtio.rx_queue_size); }
- if (usingVirtio && net->mtu) { - if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_VIRTIO_NET_HOST_MTU)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("setting MTU is not supported with this QEMU binary")); - goto error; + if (usingVirtio && mtu) { + if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_VIRTIO_NET_HOST_MTU)) { + + virBufferAsprintf(&buf, ",host_mtu=%u", mtu); + + } else { + /* log an error if mtu was requested specifically for this + * interface, otherwise, if it's just what was reported by + * the attached network, ignore it. + */ + if (net->mtu) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("setting MTU is not supported with " + "this QEMU binary")); + goto error; + } }
This requires users to pass net->mtu even though net is already passed into this function. I wonder whether we should alter meaning of @mtu argument slightly. Something like you're going with in 1/4: - a nonzero value means caller is requesting that particular MTU size - a zero value means stick with net->mtu value. Although, now that I've tried and changed code accordingly the difference is just one line changed (apart from these line above): index 0767c6649..a556dc60a 100644 --- i/src/qemu/qemu_command.c +++ w/src/qemu/qemu_command.c @@ -3680,10 +3680,10 @@ qemuBuildNicDevStr(virDomainDefPtr def, virBufferAsprintf(&buf, ",rx_queue_size=%u", net->driver.virtio.rx_queue_size); } - if (usingVirtio && mtu) { + if (usingVirtio && (net->mtu || mtu)) { if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_VIRTIO_NET_HOST_MTU)) { - virBufferAsprintf(&buf, ",host_mtu=%u", mtu); + virBufferAsprintf(&buf, ",host_mtu=%u", net->mtu ? net->mtu : mtu); } else { /* log an error if mtu was requested specifically for this @@ -8053,7 +8053,7 @@ qemuBuildVhostuserCommandLine(virQEMUDriverPtr driver, VIR_FREE(netdev); if (!(nic = qemuBuildNicDevStr(def, net, -1, bootindex, - queues, qemuCaps, net->mtu))) { + queues, qemuCaps, 0))) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("Error generating NIC -device string")); goto error; This is because we need one bit as well: @@ -8273,8 +8273,8 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver, } } - if (net->mtu && - virNetDevSetMTU(net->ifname, net->mtu) < 0) + if (mtu && + virNetDevSetMTU(net->ifname, mtu) < 0) goto cleanup; if ((actualType == VIR_DOMAIN_NET_TYPE_NETWORK || diff --git i/src/qemu/qemu_hotplug.c w/src/qemu/qemu_hotplug.c index 7784dad3c..a083b2a3f 100644 --- i/src/qemu/qemu_hotplug.c +++ w/src/qemu/qemu_hotplug.c @@ -1116,6 +1116,9 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, goto cleanup; } + if (mtu && + virNetDevSetMTU(net->ifname, mtu) < 0) + /* Set device online immediately */ if (qemuInterfaceStartDevice(net) < 0) goto cleanup; It just feels better than ternary operator. So ACK to whatever version you choose. Michal

On 02/06/2017 01:57 PM, Michal Privoznik wrote:
On 03.02.2017 18:35, Laine Stump wrote:
libvirt was able to set the host_mtu option when an MTU was explicitly given in the interface config (with <mtu size='n'/>), set the MTU of a libvirt network in the network config (with the same named subelement), and would automatically set the MTU of any tap device to the MTU of the network.
This patch ties that all together (for networks based on tap devices and either Linux host bridges or OVS bridges) by learning the MTU of the network (i.e. the bridge) during qemuInterfaceBridgeConnect(), and returning that value so that it can be passed to qemuBuildNicDevStr(); qemuBuildNicDevStr() then sets host_mtu in the interface's commandline options.
The result is that a higher MTU for all guests connecting to a particular network will be plumbed top to bottom by simply changing the MTU of the network (in libvirt's config for libvirt-managed networks, or directly on the bridge device for simple host bridges or OVS bridges managed outside of libvirt).
One question I have about this - it occurred to me that in the case of migrating a guest from a host with an older libvirt to one with a newer libvirt, the guest may have *not* had the host_mtu option on the older machine, but *will* have it on the newer machine. I'm curious if this could lead to incompatibilities between source and destination (I guess it all depends on whether or not the setting of host_mtu has a practical effect on a guest that is already running - Maxime?) (I hope we don't have to add a "<mtu auto='yes'/>" and only set host_mtu when that is present :-/)
This is a question for qemu folks. I know nothing about qemu internals.
Sorry for the late reply. Setting host_mtu on a guest that is already running will have no effect. Indeed, the VIRTIO_NET_F_MTU feature flag will be set at device negotiation time if host_mtu is set. So, if guest started without this host_mtu parameter, this flag won't be set and the value won't be taken into account.

On 03.02.2017 18:35, Laine Stump wrote:
Similar to V1, but changing the XML to match Michal's patches, and adding a patch that plumbs network MTU back through to the qemu host_mtu option.
Laine Stump (4): util: add MTU arg to virNetDevTapCreateInBridgePort() conf: support configuring mtu size in a virtual network network: honor mtu setting when creating network qemu: propagate bridge MTU into qemu "host_mtu" option
docs/formatnetwork.html.in | 19 ++++++++++++++++++- docs/news.xml | 5 +++-- docs/schemas/network.rng | 5 +++++ src/bhyve/bhyve_command.c | 1 + src/conf/network_conf.c | 31 ++++++++++++++++++++++++++++++- src/conf/network_conf.h | 1 + src/network/bridge_driver.c | 1 + src/qemu/qemu_command.c | 32 ++++++++++++++++++++++---------- src/qemu/qemu_command.h | 3 ++- src/qemu/qemu_hotplug.c | 5 +++-- src/qemu/qemu_interface.c | 4 +++- src/qemu/qemu_interface.h | 3 ++- src/uml/uml_conf.c | 1 + src/util/virnetdevtap.c | 32 +++++++++++++++++++++++++++----- src/util/virnetdevtap.h | 2 ++ tests/bhyvexml2argvmock.c | 2 ++ tests/networkxml2xmlin/set-mtu.xml | 12 ++++++++++++ tests/networkxml2xmlout/set-mtu.xml | 12 ++++++++++++ tests/networkxml2xmltest.c | 1 + 19 files changed, 148 insertions(+), 24 deletions(-) create mode 100644 tests/networkxml2xmlin/set-mtu.xml create mode 100644 tests/networkxml2xmlout/set-mtu.xml
ACK series. Regarding the migration issue - that is pre-existing and thus does not hold back pushing of these patches. Michal
participants (4)
-
Dr. David Alan Gilbert
-
Laine Stump
-
Maxime Coquelin
-
Michal Privoznik