[libvirt] [RFC PATCH] Let network bridge MTU to be configured on XML

Hi, Below is an incomplete patch for letting the MTU of a libvirt virtual network to be configured on XML. This implementation has an issue: it works when trying to set the MTU below 1500, but the kernel bridge code doesn't let us to set the bridge MTU above 1500 if there is no interface attached to the bridge yet. To allow setting the bridge MTU higher than 1500, we need to save the configured value somewhere and set it only when attaching tap interface to it. To solve that, I propose changing the way the network driver interface expose things for the domain network code. Today we do the following (src/qemu_conf.c): virNetworkPtr network; virDomainNetDefPtr net; char *brname; int tapfd; brname = virNetworkGetBridgeName(network); brAddTap(brctl, brname, &net->ifname, &tapfd))) With this code, the only information that can flow between the network driver and the bridge code is the bridge name. I was going to suggest having something like this: virNetworkAddTap(network, &net->ifname, &tapfd); However, we wouldn't be able to return an FD when using the 'remote' network driver. But we can split the creation of the tap interface and attaching it to the network bridge, like this: brNewTap(brctl, &net->ifname, &tapfd); virNetworkAttachInterface(network, net->ifname); This way, the network driver code can do anything needed to attach the new tap interface to the virtual network, including but not limited to MTU setting. Another alternative could be creating a virNetworkGetBridgeMtu() function. That would solve our problem for the MTU settings, but in my opinion, attaching interfaces to the network is a task for the network driver. Having the qemu code calling brAddTap() directly was just a shortcut we won't be able to use anymore. What do you think? --- src/bridge.c | 19 +++++++++++++++++++ src/bridge.h | 4 ++++ src/libvirt_sym.version.in | 1 + src/network_conf.c | 5 +++++ src/network_conf.h | 2 ++ src/network_driver.c | 10 ++++++++++ 6 files changed, 41 insertions(+), 0 deletions(-) diff --git a/src/bridge.c b/src/bridge.c index 13d81bc..ed9023b 100644 --- a/src/bridge.c +++ b/src/bridge.c @@ -452,6 +452,25 @@ brAddTap(brControl *ctl, return errno; } + +/** + * brSetMtu + * @ctl: bridge control pointer + * @bridge: the bridge name + * @mtu: the MTU value to be set + * + * Sets the MTU of a bridge. + * + * Returns 0 in case of success or an errno code in case of failure. + */ +int +brSetMtu(brControl *ctl, + const char *bridge, + int mtu) +{ + return ifSetMtu(ctl, bridge, mtu); +} + /** * brSetInterfaceUp: * @ctl: bridge control pointer diff --git a/src/bridge.h b/src/bridge.h index 7eb6166..194e96f 100644 --- a/src/bridge.h +++ b/src/bridge.h @@ -58,6 +58,10 @@ int brDeleteInterface (brControl *ctl, const char *bridge, const char *iface); +int brSetMtu (brControl *ctl, + const char *bridge, + int mtu); + int brAddTap (brControl *ctl, const char *bridge, char **ifname, diff --git a/src/libvirt_sym.version.in b/src/libvirt_sym.version.in index ec9ff3d..668b902 100644 --- a/src/libvirt_sym.version.in +++ b/src/libvirt_sym.version.in @@ -274,6 +274,7 @@ LIBVIRT_PRIVATE_@VERSION@ { global: /* bridge.h */ brAddBridge; + brSetMtu; brAddInterface; brAddTap; brDeleteBridge; diff --git a/src/network_conf.c b/src/network_conf.c index 5add62e..b7ec87e 100644 --- a/src/network_conf.c +++ b/src/network_conf.c @@ -295,6 +295,7 @@ virNetworkDefParseXML(virConnectPtr conn, { virNetworkDefPtr def; char *tmp; + long mtu; if (VIR_ALLOC(def) < 0) { virNetworkReportError(conn, VIR_ERR_NO_MEMORY, NULL); @@ -332,6 +333,8 @@ virNetworkDefParseXML(virConnectPtr conn, /* Parse bridge information */ def->bridge = virXPathString(conn, "string(./bridge[1]/@name)", ctxt); + if (virXPathLong(conn, "number(./bridge[1]/@mtu)", ctxt, &mtu) == 0) + def->bridgeMtu = mtu; tmp = virXPathString(conn, "string(./bridge[1]/@stp)", ctxt); def->stp = (tmp && STREQ(tmp, "off")) ? 0 : 1; VIR_FREE(tmp); @@ -570,6 +573,8 @@ char *virNetworkDefFormat(virConnectPtr conn, virBufferAddLit(&buf, " <bridge"); if (def->bridge) virBufferEscapeString(&buf, " name='%s'", def->bridge); + if (def->bridgeMtu > 0) + virBufferVSprintf(&buf, " mtu='%d'", def->bridgeMtu); virBufferVSprintf(&buf, " stp='%s' forwardDelay='%ld' />\n", def->stp ? "on" : "off", def->delay); diff --git a/src/network_conf.h b/src/network_conf.h index 2614e47..a30cc8d 100644 --- a/src/network_conf.h +++ b/src/network_conf.h @@ -61,6 +61,8 @@ struct _virNetworkDef { char *name; char *bridge; /* Name of bridge device */ + int bridgeMtu; /* MTU of the bridge */ + char *domain; unsigned long delay; /* Bridge forward delay (ms) */ int stp : 1; /* Spanning tree protocol */ diff --git a/src/network_driver.c b/src/network_driver.c index 96d3ddf..10bdd10 100644 --- a/src/network_driver.c +++ b/src/network_driver.c @@ -684,6 +684,16 @@ static int networkStartNetworkDaemon(virConnectPtr conn, return -1; } + if (network->def->bridgeMtu> 0) { + if ((err = brSetMtu(driver->brctl, network->def->bridge, network->def->bridgeMtu))) { + networkReportError(conn, NULL, NULL, VIR_ERR_INTERNAL_ERROR, + _("cannot set MTU of bridge '%s' to %d: %s"), + network->def->bridge, network->def->bridgeMtu, + strerror(err)); + goto err_delbr; + } + } + if (brSetForwardDelay(driver->brctl, network->def->bridge, network->def->delay) < 0) goto err_delbr; -- 1.5.5.GIT -- Eduardo

On Thu, 2008-12-04 at 15:42 -0200, Eduardo Habkost wrote:
This implementation has an issue: it works when trying to set the MTU below 1500, but the kernel bridge code doesn't let us to set the bridge MTU above 1500 if there is no interface attached to the bridge yet. To allow setting the bridge MTU higher than 1500, we need to save the configured value somewhere and set it only when attaching tap interface to it.
Wait, that doesn't make much sense. 1) create bridge, can't set the MTU to greater than 1500 2) add tap device, set tap MTU to that of the bridge, i.e. 1500 3) now setting the bridge MTU to >1500 won't work because it's greater that the MTU of one of the constituents Sounds to me like it's just a kernel bug - br_change_mtu() should allow an MTU of >ETH_DATA_LEN if the list is empty. Cheers, Mark.

On Thu, Dec 04, 2008 at 05:54:24PM +0000, Mark McLoughlin wrote:
On Thu, 2008-12-04 at 15:42 -0200, Eduardo Habkost wrote:
This implementation has an issue: it works when trying to set the MTU below 1500, but the kernel bridge code doesn't let us to set the bridge MTU above 1500 if there is no interface attached to the bridge yet. To allow setting the bridge MTU higher than 1500, we need to save the configured value somewhere and set it only when attaching tap interface to it.
Wait, that doesn't make much sense.
1) create bridge, can't set the MTU to greater than 1500
2) add tap device, set tap MTU to that of the bridge, i.e. 1500
On this case, the code would need to set the tap MTU to the one specified on the configuration, not the one that the kernel has set on the bridge.
3) now setting the bridge MTU to >1500 won't work because it's greater that the MTU of one of the constituents
Sounds to me like it's just a kernel bug - br_change_mtu() should allow an MTU of >ETH_DATA_LEN if the list is empty.
I am seeing this as an feature of the kernel bridging code that we would need to handle. An annoying feature we could ask to be changed on the kernel code, but something that libvirt could handle gracefully in case the user is using an older kernel. -- Eduardo

On Thu, Dec 04, 2008 at 04:07:04PM -0200, Eduardo Habkost wrote:
On Thu, Dec 04, 2008 at 05:54:24PM +0000, Mark McLoughlin wrote:
On Thu, 2008-12-04 at 15:42 -0200, Eduardo Habkost wrote:
This implementation has an issue: it works when trying to set the MTU below 1500, but the kernel bridge code doesn't let us to set the bridge MTU above 1500 if there is no interface attached to the bridge yet. To allow setting the bridge MTU higher than 1500, we need to save the configured value somewhere and set it only when attaching tap interface to it.
Wait, that doesn't make much sense.
1) create bridge, can't set the MTU to greater than 1500
2) add tap device, set tap MTU to that of the bridge, i.e. 1500
On this case, the code would need to set the tap MTU to the one specified on the configuration, not the one that the kernel has set on the bridge.
3) now setting the bridge MTU to >1500 won't work because it's greater that the MTU of one of the constituents
Sounds to me like it's just a kernel bug - br_change_mtu() should allow an MTU of >ETH_DATA_LEN if the list is empty.
I am seeing this as an feature of the kernel bridging code that we would need to handle. An annoying feature we could ask to be changed on the kernel code, but something that libvirt could handle gracefully in case the user is using an older kernel.
IMHO, fix the kernel and anyone that cares enough about this on older kernels can backport the patch. The virtual network capability is primarily targetted to people using desktop/laptop virt / wifi users, rather than servers, so ability set large MTUs is not mission critical. Those users are much more likely to be on latest kernel anyway Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On Thu, Dec 04, 2008 at 03:42:12PM -0200, Eduardo Habkost wrote:
Hi,
Below is an incomplete patch for letting the MTU of a libvirt virtual network to be configured on XML.
This implementation has an issue: it works when trying to set the MTU below 1500, but the kernel bridge code doesn't let us to set the bridge MTU above 1500 if there is no interface attached to the bridge yet. To allow setting the bridge MTU higher than 1500, we need to save the configured value somewhere and set it only when attaching tap interface to it.
This is a kernel bug that needs fixing.
To solve that, I propose changing the way the network driver interface expose things for the domain network code. Today we do the following (src/qemu_conf.c):
virNetworkPtr network; virDomainNetDefPtr net; char *brname; int tapfd; brname = virNetworkGetBridgeName(network); brAddTap(brctl, brname, &net->ifname, &tapfd)))
With this code, the only information that can flow between the network driver and the bridge code is the bridge name.
I was going to suggest having something like this:
virNetworkAddTap(network, &net->ifname, &tapfd);
This isn't satisfactory. This only works for the QEMU, LXC & UML drivers but, not other libvirt drivers OpenVZ and Xen where libvirt isn't the one creating the TAP device. Really, we need to be able to set the MTU of the bridge when no devices are enslaved, so we can avoid the tight binding between the hypervisor drivers and the virtual bridge driver. The existing need to call virNetworkGetBridgeName() is already a bit of a pain point in our internal architecture which I don't really want to make worse
Another alternative could be creating a virNetworkGetBridgeMtu() function. That would solve our problem for the MTU settings, but in my opinion, attaching interfaces to the network is a task for the network driver. Having the qemu code calling brAddTap() directly was just a shortcut we won't be able to use anymore.
Nope, its the hypervisor driver's job, because we can't assume that libvirt is the one adding the tap devices. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On Thu, Dec 04, 2008 at 07:32:20PM +0000, Daniel P. Berrange wrote:
On Thu, Dec 04, 2008 at 03:42:12PM -0200, Eduardo Habkost wrote: <snip>
I was going to suggest having something like this:
virNetworkAddTap(network, &net->ifname, &tapfd);
This isn't satisfactory. This only works for the QEMU, LXC & UML drivers but, not other libvirt drivers OpenVZ and Xen where libvirt isn't the one creating the TAP device.
Really, we need to be able to set the MTU of the bridge when no devices are enslaved, so we can avoid the tight binding between the hypervisor drivers and the virtual bridge driver. The existing need to call virNetworkGetBridgeName() is already a bit of a pain point in our internal architecture which I don't really want to make worse
Another alternative could be creating a virNetworkGetBridgeMtu() function. That would solve our problem for the MTU settings, but in my opinion, attaching interfaces to the network is a task for the network driver. Having the qemu code calling brAddTap() directly was just a shortcut we won't be able to use anymore.
Nope, its the hypervisor driver's job, because we can't assume that libvirt is the one adding the tap devices.
I was assuming the tap device was always attached to the bridge by libvirt. If it is not, my proposal wouldn't work. Considering that, I will follow up with an updated version of the patch I've sent, probably the only additional change will be updating documentation with the new mtu option. Any comments on the current version of the patch? -- Eduardo

On Thu, Dec 04, 2008 at 06:15:14PM -0200, Eduardo Habkost wrote:
On Thu, Dec 04, 2008 at 07:32:20PM +0000, Daniel P. Berrange wrote:
On Thu, Dec 04, 2008 at 03:42:12PM -0200, Eduardo Habkost wrote: <snip>
I was going to suggest having something like this:
virNetworkAddTap(network, &net->ifname, &tapfd);
This isn't satisfactory. This only works for the QEMU, LXC & UML drivers but, not other libvirt drivers OpenVZ and Xen where libvirt isn't the one creating the TAP device.
Really, we need to be able to set the MTU of the bridge when no devices are enslaved, so we can avoid the tight binding between the hypervisor drivers and the virtual bridge driver. The existing need to call virNetworkGetBridgeName() is already a bit of a pain point in our internal architecture which I don't really want to make worse
Another alternative could be creating a virNetworkGetBridgeMtu() function. That would solve our problem for the MTU settings, but in my opinion, attaching interfaces to the network is a task for the network driver. Having the qemu code calling brAddTap() directly was just a shortcut we won't be able to use anymore.
Nope, its the hypervisor driver's job, because we can't assume that libvirt is the one adding the tap devices.
I was assuming the tap device was always attached to the bridge by libvirt. If it is not, my proposal wouldn't work.
Considering that, I will follow up with an updated version of the patch I've sent, probably the only additional change will be updating documentation with the new mtu option. Any comments on the current version of the patch?
Assuming the kernel guys actually make it possible to set a MTU > 1500 on the bridge, then I see no problem with the patch. Until then its rather academic so I'd prefer to leave this patch idle until the kernel guys decide whether they want to allow this or not. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
participants (3)
-
Daniel P. Berrange
-
Eduardo Habkost
-
Mark McLoughlin