On 01/27/2014 12:10 PM, Dan Kenigsberg wrote:
On Fri, Jan 24, 2014 at 02:18:27PM +0200, Laine Stump wrote:
>
https://bugzilla.redhat.com/show_bug.cgi?id=1057321 pointed out that
> we weren't honoring the <bandwidth> element in libvirt networks using
> <forward mode='bridge'/>. In fact, these networks are just a method of
> giving a libvirt network name to an existing Linux host bridge on the
> system, and even if it were technically possible for us to set
> network-wide bandwidth limits for all the taps on a bridge, it's
> probably not a polite thing to do since libvirt is just using a bridge
> that was created by someone else for other purposes.
Since QoS is not something that libvirt applies based on an explicit
request by the admin, I am not sure that this is a convincing argument:
if the admin does not want something to be done, they would not request
libvirt to do it.
libvirt doesn't have the full information about the network topology to
know to which interface it should apply the bandwidth limit. It only
knows that there is a bridge named "br0", but that bridge might have
multiple ethernets attached to it, or it might have a bond of multiple
ethernets, or a vlan interface, or maybe some combination of all those -
then there is no single choke point to apply bandwidth limits. I'm not
saying that it's an unsolvable problem (well, actually in many of those
cases I think it is), just that libvirt can't do it correctly with the
information it has, and my preference is to openly fail rather than
doing something that doesn't work correctly.
The only way I can see to make this work properly *only in the case of a
bridge with a single physical interface connected, and not counting
traffic destined for the host*, would be to obtain the name of the
physical interface (either by adding it to the <network> config somehow,
or by clever examining of the list of attached interfaces), and apply
the bandwidth limits there.
> So the proper
> thing is to just log an error when someone tries to put a <bandwidth>
> element in that type of network.
Would you explain why the QoS cannot be applied to the bridge interface
itself? The fact that it would limit in-host traffic too?
My understanding is that only traffic destined for the host itself (or
needing to be routed via the host rather than bridged directly out
another interface attached to the bridge) would traverse the "interface"
part of the bridge device. So for traffic that is getting bridged
directly out, e.g., eth0, a bandwidth limit on "br0" would do nothing.
(Please correct me if I'm wrong though - my knowledge of Linux host
bridges has been picked up informally through exposure, and not via
direct examining of the code, so I could very well be wrong (it happens
quite often! :-))
Would you update the docs, to expose the non-support of QoS on these
networks?
Yes, once we all agree on what is and isn't supported/supportable, that
is a *very* good idea.
> While looking through the network XML documentation and comparing
it
> to the networkValidate function, I noticed that we also ignore the
> presence of a mac address in the config, even though we do nothing
> with it in this case either.
>
> This patch updates networkValidate() (which is called any time a
> persistent network is defined, or a transient network created) to log
> an error and fail if it finds either a <bandwidth> or <mac> element
> and the network forward mode is anything except 'route'. 'nat', or
> nothing. (Yes, neither of those elements is acceptable for any macvtap
> mode, nor for a hostdev network).
>
> NB: This does *not* cause failure to start any existing network that
> contains one of those elements, so someone might have erroneously
> defined such a network in the past, and that network will continue to
> function unmodified. I considered it too disruptive to suddenly break
> working configs on the next reboot after a libvirt upgrade.
BTW, is there other means to re-use libvirt's handling of tc in order to
apply QoS on the physical-facing leg of the forwarded bridge (in case
that we end up going that way).
I *think* you're asking for a simple way to translate a <bandwidth>
element into a list of tc commands. I'm not aware of any such thing.
Possibly we could modify libvirt to recognize the "dev" attribute of
<forward mode='bridge'> as the designated physical interface for a
bridge (this attribute is currently unused in the case of <forward
mode='bridge'> when <bridge name='xxx'/> is specified). Then you
could
do this:
<network>
<name>xyzzy</name>
<forward mode='bridge' dev='eth0'/>
<bridge name='br0'/>
<bandwidth .....>
</network>
In this case libvirt would know that the bridge br0 was already attached
to eth0, which was the proper place to apply bandwidth limits. However,
this would start to get confusing wrt other network types. For example,
a network that uses macvtap in bridge mode to connect to eth0 would be
bery similar:
<network>
<name>yyz</name>
<forward mode='bridge' dev='eth0'/>
<bandwidth .....>
</network>
In this case, it *still* may be reasonable to apply the bandwidth to
eth0, but I don't know if traffic from macvtap devices connected to eth0
are counted against eht0's limits or not.
And then what should we do in the case of multiple physical interfaces
(not in a bond)?
<network>
<name>xyzzy</name>
<forward mode='bridge' dev='eth0'/>
<interface dev='eth0'/>
<interface dev='eth1'/>
<interface dev='eth2'/>
<bridge name='br0'/>
<bandwidth .....>
</network>
Do we apply the same bandwidth to all the physical interfaces? That in
effect gives the guests 3x the maximum bandwidth (this is why I say it
can really only be solved in some cases).
It would be *really nice* if a Linux host bridge could be attached to a
port of a Linux host bridge - in that case we could just solve the
problem by creating a new bridge that sent everything out that one port
connected to the actual host bridge, and put the limit on the interface
of the new bridge. Open vSwitch is looking better and better... (I'm
assuming it *does* support a bridge connected to a bridge, and I have no
reason to believe otherwise).