[libvirt] [PATCH v3 0/8] Allow minimal bandwidth guarantee

This is a rework of my one year old patchset [1]. If you are brave enough and start reviewing it, you may want to read the last patch first. The whole magic hidden behind the scene is documented there. For now, the only supported combination where this works is virtual interface connected to virtual network. This network MUST have an inbound average set, at least. This tells how much bandwidth do we have, and how much can we guarantee. If we would miss this info, then we cannot guarantee anything. On the other hand, this is just a very small part of whole problem. Users MUST configure all other devices on traffic path to follow up their setting in libvirt. But that's out of our scope. 1: http://www.redhat.com/archives/libvir-list/2011-December/msg00486.html diff to v1: -Laine's suggestions worked in diff to v2: -again Laine's suggestions worked in (drop libvirt_network.syms, don't check for floor twice, etc.) Michal Privoznik (8): bandwidth: Attach sfq to leaf node bandwidth: add new 'floor' attribute bandwidth: Create hierarchical shaping classes bandwidth: Create (un)plug functions bandwidth: Create rate update function bandwidth: Create network bandwidth (un)plug functions network: Create real network status files domain: Keep assigned class_id in domstatus XML docs/formatdomain.html.in | 23 +++- docs/schemas/networkcommon.rng | 5 + po/POTFILES.in | 1 + src/conf/domain_conf.c | 19 ++- src/conf/domain_conf.h | 1 + src/conf/netdev_bandwidth_conf.c | 67 +++++++-- src/conf/netdev_bandwidth_conf.h | 3 +- src/conf/network_conf.c | 229 +++++++++++++++++++++----- src/conf/network_conf.h | 6 + src/libvirt_private.syms | 4 + src/lxc/lxc_process.c | 3 +- src/network/bridge_driver.c | 242 +++++++++++++++++++++++++++- src/qemu/qemu_command.c | 3 +- src/qemu/qemu_driver.c | 2 +- src/util/virnetdevbandwidth.c | 331 +++++++++++++++++++++++++++++++++++++- src/util/virnetdevbandwidth.h | 25 +++- src/util/virnetdevmacvlan.c | 2 +- 17 files changed, 886 insertions(+), 80 deletions(-) -- 1.7.8.6

SFQ is qdisc which doesn't really shape any traffic but 'just' re-arrange packets in sending buffer so no stream starve. The goal is to ensure fairness. There is basically only one configuration parameter (perturb) which is set to advised value of 10. --- src/util/virnetdevbandwidth.c | 9 +++++++++ 1 files changed, 9 insertions(+), 0 deletions(-) diff --git a/src/util/virnetdevbandwidth.c b/src/util/virnetdevbandwidth.c index bddb788..49fc425 100644 --- a/src/util/virnetdevbandwidth.c +++ b/src/util/virnetdevbandwidth.c @@ -103,6 +103,15 @@ virNetDevBandwidthSet(const char *ifname, virCommandFree(cmd); cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "qdisc", "add", "dev", ifname, "parent", + "1:1", "handle", "2:", "sfq", "perturb", + "10", NULL); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + virCommandFree(cmd); + cmd = virCommandNew(TC); virCommandAddArgList(cmd,"filter", "add", "dev", ifname, "parent", "1:0", "protocol", "ip", "handle", "1", "fw", "flowid", "1", NULL); -- 1.7.8.6

On 12/11/2012 11:09 AM, Michal Privoznik wrote:
SFQ is qdisc which doesn't really shape any traffic but 'just' re-arrange packets in sending buffer so no stream starve. The goal is to ensure fairness. There is basically only one configuration parameter (perturb) which is set to advised value of 10. ---
ACK, but please include the meaning of "SFQ and "qdisc" in the commit message (just what the letters stand for / full word is enough).
src/util/virnetdevbandwidth.c | 9 +++++++++ 1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/src/util/virnetdevbandwidth.c b/src/util/virnetdevbandwidth.c index bddb788..49fc425 100644 --- a/src/util/virnetdevbandwidth.c +++ b/src/util/virnetdevbandwidth.c @@ -103,6 +103,15 @@ virNetDevBandwidthSet(const char *ifname,
virCommandFree(cmd); cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "qdisc", "add", "dev", ifname, "parent", + "1:1", "handle", "2:", "sfq", "perturb", + "10", NULL); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + virCommandFree(cmd); + cmd = virCommandNew(TC); virCommandAddArgList(cmd,"filter", "add", "dev", ifname, "parent", "1:0", "protocol", "ip", "handle", "1", "fw", "flowid", "1", NULL);

This is however supported only on domain interfaces with type='network'. Moreover, target network needs to have at least inbound QoS set. This is required by hierarchical traffic shaping.
From now on, the required attribute for <inbound/> is either 'average' (old) or 'floor' (new). This new attribute can be used just for interfaces type of network (<interface type='network'/>) currently.
docs/formatdomain.html.in | 23 +++++++++++-- docs/schemas/networkcommon.rng | 5 +++ src/conf/domain_conf.c | 6 ++- src/conf/netdev_bandwidth_conf.c | 67 ++++++++++++++++++++++++++++++++------ src/conf/netdev_bandwidth_conf.h | 3 +- src/conf/network_conf.c | 4 +- src/util/virnetdevbandwidth.h | 1 + 7 files changed, 90 insertions(+), 19 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index c81af8a..8e234fd 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3080,7 +3080,7 @@ qemu-kvm -net nic,model=? /dev/null <source network='default'/> <target dev='vnet0'/> <b><bandwidth> - <inbound average='1000' peak='5000' burst='1024'/> + <inbound average='1000' peak='5000' floor='200' burst='1024'/> <outbound average='128' peak='256' burst='256'/> </bandwidth></b> </interface> @@ -3095,14 +3095,29 @@ qemu-kvm -net nic,model=? /dev/null children element out result in no QoS applied on that traffic direction. So, when you want to shape only domain's incoming traffic, use <code>inbound</code> only, and vice versa. Each of these elements have one - mandatory attribute <code>average</code>. It specifies average bit rate on - interface being shaped. Then there are two optional attributes: + mandatory attribute <code>average</code> (or <code>floor</code> as + described below). <code>average</code> specifies average bit rate on + the interface being shaped. Then there are two optional attributes: <code>peak</code>, which specifies maximum rate at which interface can send data, and <code>burst</code>, amount of bytes that can be burst at <code>peak</code> speed. Accepted values for attributes are integer numbers. The units for <code>average</code> and <code>peak</code> attributes are kilobytes per second, and for the <code>burst</code> just kilobytes. - <span class="since">Since 0.9.4</span> + <span class="since">Since 0.9.4</span> The <code>inbound</code> can + optionally have <code>floor</code> attribute. This is there for + guaranteeing minimal throughput for shaped interfaces. This, however, + requires that all traffic goes through one point where QoS decisions can + take place. That's why this attribute works only for virtual networks for + now (that is <code><interface type='network'/></code> with a + forward type of route, nat, or no forward at all). Moreover, the + virtual network the interface is connected to is required to have at least + inbound QoS set (<code>average</code> at least). Moreover, with + <code>floor<code> attribute users don't need to specify + <code>average</code>. However, <code>peak</code> and <code>burst</code> + attributes still require <code>average</code>. Currently, linux kernel + doesn't allow ingress qdiscs to have any classes therefore + <code>floor</code> can be applied only on <code>inbound</code> and not + </code>outbound</code>. <span class="since">Since 1.0.1</span> </p> <h5><a name="elementVlanTag">Setting VLAN tag (on supported network types only)</a></h5> diff --git a/docs/schemas/networkcommon.rng b/docs/schemas/networkcommon.rng index c7749e7..51ff759 100644 --- a/docs/schemas/networkcommon.rng +++ b/docs/schemas/networkcommon.rng @@ -149,6 +149,11 @@ </attribute> </optional> <optional> + <attribute name="floor"> + <ref name="speed"/> + </attribute> + </optional> + <optional> <attribute name='burst'> <ref name="BurstSize"/> </attribute> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 6aa5f79..18e65ca 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4823,7 +4823,8 @@ virDomainActualNetDefParseXML(xmlNodePtr node, bandwidth_node = virXPathNode("./bandwidth", ctxt); if (bandwidth_node && - !(actual->bandwidth = virNetDevBandwidthParse(bandwidth_node))) + !(actual->bandwidth = virNetDevBandwidthParse(bandwidth_node, + actual->type))) goto error; vlanNode = virXPathNode("./vlan", ctxt); @@ -5011,7 +5012,8 @@ virDomainNetDefParseXML(virCapsPtr caps, goto error; } } else if (xmlStrEqual(cur->name, BAD_CAST "bandwidth")) { - if (!(def->bandwidth = virNetDevBandwidthParse(cur))) + if (!(def->bandwidth = virNetDevBandwidthParse(cur, + def->type))) goto error; } else if (xmlStrEqual(cur->name, BAD_CAST "vlan")) { if (virNetDevVlanParse(cur, ctxt, &def->vlan) < 0) diff --git a/src/conf/netdev_bandwidth_conf.c b/src/conf/netdev_bandwidth_conf.c index 5802eba..15840b9 100644 --- a/src/conf/netdev_bandwidth_conf.c +++ b/src/conf/netdev_bandwidth_conf.c @@ -26,6 +26,7 @@ #include "virterror_internal.h" #include "util.h" #include "memory.h" +#include "domain_conf.h" #define VIR_FROM_THIS VIR_FROM_NONE @@ -36,6 +37,7 @@ virNetDevBandwidthParseRate(xmlNodePtr node, virNetDevBandwidthRatePtr rate) char *average = NULL; char *peak = NULL; char *burst = NULL; + char *floor = NULL; if (!node || !rate) { virReportError(VIR_ERR_INVALID_ARG, "%s", @@ -46,40 +48,55 @@ virNetDevBandwidthParseRate(xmlNodePtr node, virNetDevBandwidthRatePtr rate) average = virXMLPropString(node, "average"); peak = virXMLPropString(node, "peak"); burst = virXMLPropString(node, "burst"); + floor = virXMLPropString(node, "floor"); if (average) { if (virStrToLong_ull(average, NULL, 10, &rate->average) < 0) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("could not convert %s"), + _("could not convert bandwidth average value '%s'"), average); goto cleanup; } - } else { + } else if (!floor) { virReportError(VIR_ERR_XML_DETAIL, "%s", - _("Missing mandatory average attribute")); + _("Missing mandatory average or floor attributes")); + goto cleanup; + } + + if ((peak || burst) && !average) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("'peak' and 'burst' require 'average' attribute")); goto cleanup; } if (peak && virStrToLong_ull(peak, NULL, 10, &rate->peak) < 0) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("could not convert %s"), + _("could not convert bandwidth peak value '%s'"), peak); goto cleanup; } if (burst && virStrToLong_ull(burst, NULL, 10, &rate->burst) < 0) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("could not convert %s"), + _("could not convert bandwidth burst value '%s'"), burst); goto cleanup; } + if (floor && virStrToLong_ull(floor, NULL, 10, &rate->floor) < 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("could not convert bandwidth floor value '%s'"), + floor); + goto cleanup; + } + ret = 0; cleanup: VIR_FREE(average); VIR_FREE(peak); VIR_FREE(burst); + VIR_FREE(floor); return ret; } @@ -87,13 +104,17 @@ cleanup: /** * virNetDevBandwidthParse: * @node: XML node + * @net_type: one of virDomainNetType * - * Parse bandwidth XML and return pointer to structure + * Parse bandwidth XML and return pointer to structure. + * @net_type tell to which type will/is interface connected to. + * Pass -1 if this is not called on interface. * * Returns !NULL on success, NULL on error. */ virNetDevBandwidthPtr -virNetDevBandwidthParse(xmlNodePtr node) +virNetDevBandwidthParse(xmlNodePtr node, + int net_type) { virNetDevBandwidthPtr def = NULL; xmlNodePtr cur; @@ -146,6 +167,20 @@ virNetDevBandwidthParse(xmlNodePtr node) /* helper reported error for us */ goto error; } + + if (def->in->floor && net_type != VIR_DOMAIN_NET_TYPE_NETWORK) { + if (net_type == -1) { + /* 'floor' on network isn't supported */ + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("floor attribute isn't supported for " + "network's bandwidth yet")); + } else { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("floor attribute is supported only for " + "interfaces of type network")); + } + goto error; + } } if (out) { @@ -158,6 +193,13 @@ virNetDevBandwidthParse(xmlNodePtr node) /* helper reported error for us */ goto error; } + + if (def->out->floor) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("'floor' attribute allowed " + "only in <inbound> element")); + goto error; + } } return def; @@ -177,13 +219,18 @@ virNetDevBandwidthRateFormat(virNetDevBandwidthRatePtr def, if (!def) return 0; - if (def->average) { - virBufferAsprintf(buf, " <%s average='%llu'", elem_name, - def->average); + if (def->average || def->floor) { + virBufferAsprintf(buf, " <%s", elem_name); + + if (def->average) + virBufferAsprintf(buf, " average='%llu'", def->average); if (def->peak) virBufferAsprintf(buf, " peak='%llu'", def->peak); + if (def->floor) + virBufferAsprintf(buf, " floor='%llu'", def->floor); + if (def->burst) virBufferAsprintf(buf, " burst='%llu'", def->burst); virBufferAddLit(buf, "/>\n"); diff --git a/src/conf/netdev_bandwidth_conf.h b/src/conf/netdev_bandwidth_conf.h index bca5c50..0080165 100644 --- a/src/conf/netdev_bandwidth_conf.h +++ b/src/conf/netdev_bandwidth_conf.h @@ -28,7 +28,8 @@ # include "buf.h" # include "xml.h" -virNetDevBandwidthPtr virNetDevBandwidthParse(xmlNodePtr node) +virNetDevBandwidthPtr virNetDevBandwidthParse(xmlNodePtr node, + int net_type) ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK; int virNetDevBandwidthFormat(virNetDevBandwidthPtr def, virBufferPtr buf) diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index 7220659..3093418 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -1192,7 +1192,7 @@ virNetworkPortGroupParseXML(virPortGroupDefPtr def, bandwidth_node = virXPathNode("./bandwidth", ctxt); if (bandwidth_node && - !(def->bandwidth = virNetDevBandwidthParse(bandwidth_node))) { + !(def->bandwidth = virNetDevBandwidthParse(bandwidth_node, -1))) { goto error; } @@ -1285,7 +1285,7 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) def->domain = virXPathString("string(./domain[1]/@name)", ctxt); if ((bandwidthNode = virXPathNode("./bandwidth", ctxt)) != NULL && - (def->bandwidth = virNetDevBandwidthParse(bandwidthNode)) == NULL) + (def->bandwidth = virNetDevBandwidthParse(bandwidthNode, -1)) == NULL) goto error; vlanNode = virXPathNode("./vlan", ctxt); diff --git a/src/util/virnetdevbandwidth.h b/src/util/virnetdevbandwidth.h index e046230..35f8b89 100644 --- a/src/util/virnetdevbandwidth.h +++ b/src/util/virnetdevbandwidth.h @@ -30,6 +30,7 @@ typedef virNetDevBandwidthRate *virNetDevBandwidthRatePtr; struct _virNetDevBandwidthRate { unsigned long long average; /* kbytes/s */ unsigned long long peak; /* kbytes/s */ + unsigned long long floor; /* kbytes/s */ unsigned long long burst; /* kbytes */ }; -- 1.7.8.6

On 12/11/2012 11:09 AM, Michal Privoznik wrote:
This is however supported only on domain interfaces with type='network'. Moreover, target network needs to have at least inbound QoS set. This is required by hierarchical traffic shaping.
From now on, the required attribute for <inbound/> is either 'average' (old) or 'floor' (new). This new attribute can be used just for interfaces type of network (<interface type='network'/>) currently.
docs/formatdomain.html.in | 23 +++++++++++-- docs/schemas/networkcommon.rng | 5 +++ src/conf/domain_conf.c | 6 ++- src/conf/netdev_bandwidth_conf.c | 67 ++++++++++++++++++++++++++++++++------ src/conf/netdev_bandwidth_conf.h | 3 +- src/conf/network_conf.c | 4 +- src/util/virnetdevbandwidth.h | 1 + 7 files changed, 90 insertions(+), 19 deletions(-)
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index c81af8a..8e234fd 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3080,7 +3080,7 @@ qemu-kvm -net nic,model=? /dev/null <source network='default'/> <target dev='vnet0'/> <b><bandwidth> - <inbound average='1000' peak='5000' burst='1024'/> + <inbound average='1000' peak='5000' floor='200' burst='1024'/> <outbound average='128' peak='256' burst='256'/> </bandwidth></b> </interface> @@ -3095,14 +3095,29 @@ qemu-kvm -net nic,model=? /dev/null children element out result in no QoS applied on that traffic direction. So, when you want to shape only domain's incoming traffic, use <code>inbound</code> only, and vice versa. Each of these elements have one - mandatory attribute <code>average</code>. It specifies average bit rate on - interface being shaped. Then there are two optional attributes: + mandatory attribute <code>average</code> (or <code>floor</code> as + described below). <code>average</code> specifies average bit rate on + the interface being shaped. Then there are two optional attributes: <code>peak</code>, which specifies maximum rate at which interface can send data, and <code>burst</code>, amount of bytes that can be burst at <code>peak</code> speed. Accepted values for attributes are integer numbers. The units for <code>average</code> and <code>peak</code> attributes are kilobytes per second, and for the <code>burst</code> just kilobytes. - <span class="since">Since 0.9.4</span> + <span class="since">Since 0.9.4</span> The <code>inbound</code> can + optionally have <code>floor</code> attribute. This is there for + guaranteeing minimal throughput for shaped interfaces. This, however, + requires that all traffic goes through one point where QoS decisions can + take place. That's why this attribute works only for virtual networks for + now (that is <code><interface type='network'/></code> with a + forward type of route, nat, or no forward at all). Moreover, the + virtual network the interface is connected to is required to have at least + inbound QoS set (<code>average</code> at least). Moreover, with + <code>floor<code> attribute users don't need to specify + <code>average</code>. However, <code>peak</code> and <code>burst</code> + attributes still require <code>average</code>. Currently, linux kernel + doesn't allow ingress qdiscs to have any classes therefore + <code>floor</code> can be applied only on <code>inbound</code> and not + </code>outbound</code>. <span class="since">Since 1.0.1</span> </p>
<h5><a name="elementVlanTag">Setting VLAN tag (on supported network types only)</a></h5> diff --git a/docs/schemas/networkcommon.rng b/docs/schemas/networkcommon.rng index c7749e7..51ff759 100644 --- a/docs/schemas/networkcommon.rng +++ b/docs/schemas/networkcommon.rng @@ -149,6 +149,11 @@ </attribute> </optional> <optional> + <attribute name="floor"> + <ref name="speed"/> + </attribute> + </optional> + <optional> <attribute name='burst'> <ref name="BurstSize"/> </attribute> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 6aa5f79..18e65ca 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4823,7 +4823,8 @@ virDomainActualNetDefParseXML(xmlNodePtr node,
bandwidth_node = virXPathNode("./bandwidth", ctxt); if (bandwidth_node && - !(actual->bandwidth = virNetDevBandwidthParse(bandwidth_node))) + !(actual->bandwidth = virNetDevBandwidthParse(bandwidth_node, + actual->type))) goto error;
vlanNode = virXPathNode("./vlan", ctxt); @@ -5011,7 +5012,8 @@ virDomainNetDefParseXML(virCapsPtr caps, goto error; } } else if (xmlStrEqual(cur->name, BAD_CAST "bandwidth")) { - if (!(def->bandwidth = virNetDevBandwidthParse(cur))) + if (!(def->bandwidth = virNetDevBandwidthParse(cur, + def->type))) goto error; } else if (xmlStrEqual(cur->name, BAD_CAST "vlan")) { if (virNetDevVlanParse(cur, ctxt, &def->vlan) < 0) diff --git a/src/conf/netdev_bandwidth_conf.c b/src/conf/netdev_bandwidth_conf.c index 5802eba..15840b9 100644 --- a/src/conf/netdev_bandwidth_conf.c +++ b/src/conf/netdev_bandwidth_conf.c @@ -26,6 +26,7 @@ #include "virterror_internal.h" #include "util.h" #include "memory.h" +#include "domain_conf.h"
#define VIR_FROM_THIS VIR_FROM_NONE
@@ -36,6 +37,7 @@ virNetDevBandwidthParseRate(xmlNodePtr node, virNetDevBandwidthRatePtr rate) char *average = NULL; char *peak = NULL; char *burst = NULL; + char *floor = NULL;
if (!node || !rate) { virReportError(VIR_ERR_INVALID_ARG, "%s", @@ -46,40 +48,55 @@ virNetDevBandwidthParseRate(xmlNodePtr node, virNetDevBandwidthRatePtr rate) average = virXMLPropString(node, "average"); peak = virXMLPropString(node, "peak"); burst = virXMLPropString(node, "burst"); + floor = virXMLPropString(node, "floor");
if (average) { if (virStrToLong_ull(average, NULL, 10, &rate->average) < 0) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("could not convert %s"), + _("could not convert bandwidth average value '%s'"), average); goto cleanup; } - } else { + } else if (!floor) { virReportError(VIR_ERR_XML_DETAIL, "%s", - _("Missing mandatory average attribute")); + _("Missing mandatory average or floor attributes")); + goto cleanup; + } + + if ((peak || burst) && !average) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("'peak' and 'burst' require 'average' attribute")); goto cleanup; }
if (peak && virStrToLong_ull(peak, NULL, 10, &rate->peak) < 0) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("could not convert %s"), + _("could not convert bandwidth peak value '%s'"), peak); goto cleanup; }
if (burst && virStrToLong_ull(burst, NULL, 10, &rate->burst) < 0) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("could not convert %s"), + _("could not convert bandwidth burst value '%s'"), burst); goto cleanup; }
+ if (floor && virStrToLong_ull(floor, NULL, 10, &rate->floor) < 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("could not convert bandwidth floor value '%s'"), + floor); + goto cleanup; + } + ret = 0;
cleanup: VIR_FREE(average); VIR_FREE(peak); VIR_FREE(burst); + VIR_FREE(floor);
return ret; } @@ -87,13 +104,17 @@ cleanup: /** * virNetDevBandwidthParse: * @node: XML node + * @net_type: one of virDomainNetType * - * Parse bandwidth XML and return pointer to structure + * Parse bandwidth XML and return pointer to structure. + * @net_type tell to which type will/is interface connected to. + * Pass -1 if this is not called on interface. * * Returns !NULL on success, NULL on error. */ virNetDevBandwidthPtr -virNetDevBandwidthParse(xmlNodePtr node) +virNetDevBandwidthParse(xmlNodePtr node, + int net_type) { virNetDevBandwidthPtr def = NULL; xmlNodePtr cur; @@ -146,6 +167,20 @@ virNetDevBandwidthParse(xmlNodePtr node) /* helper reported error for us */ goto error; } + + if (def->in->floor && net_type != VIR_DOMAIN_NET_TYPE_NETWORK) { + if (net_type == -1) { + /* 'floor' on network isn't supported */ + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("floor attribute isn't supported for " + "network's bandwidth yet")); + } else { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("floor attribute is supported only for " + "interfaces of type network")); + } + goto error; + } }
if (out) { @@ -158,6 +193,13 @@ virNetDevBandwidthParse(xmlNodePtr node) /* helper reported error for us */ goto error; } + + if (def->out->floor) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("'floor' attribute allowed " + "only in <inbound> element")); + goto error; + } }
return def; @@ -177,13 +219,18 @@ virNetDevBandwidthRateFormat(virNetDevBandwidthRatePtr def, if (!def) return 0;
- if (def->average) { - virBufferAsprintf(buf, " <%s average='%llu'", elem_name, - def->average); + if (def->average || def->floor) { + virBufferAsprintf(buf, " <%s", elem_name); + + if (def->average) + virBufferAsprintf(buf, " average='%llu'", def->average);
if (def->peak) virBufferAsprintf(buf, " peak='%llu'", def->peak);
+ if (def->floor) + virBufferAsprintf(buf, " floor='%llu'", def->floor); + if (def->burst) virBufferAsprintf(buf, " burst='%llu'", def->burst); virBufferAddLit(buf, "/>\n"); diff --git a/src/conf/netdev_bandwidth_conf.h b/src/conf/netdev_bandwidth_conf.h index bca5c50..0080165 100644 --- a/src/conf/netdev_bandwidth_conf.h +++ b/src/conf/netdev_bandwidth_conf.h @@ -28,7 +28,8 @@ # include "buf.h" # include "xml.h"
-virNetDevBandwidthPtr virNetDevBandwidthParse(xmlNodePtr node) +virNetDevBandwidthPtr virNetDevBandwidthParse(xmlNodePtr node, + int net_type) ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK; int virNetDevBandwidthFormat(virNetDevBandwidthPtr def, virBufferPtr buf) diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index 7220659..3093418 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -1192,7 +1192,7 @@ virNetworkPortGroupParseXML(virPortGroupDefPtr def,
bandwidth_node = virXPathNode("./bandwidth", ctxt); if (bandwidth_node && - !(def->bandwidth = virNetDevBandwidthParse(bandwidth_node))) { + !(def->bandwidth = virNetDevBandwidthParse(bandwidth_node, -1))) { goto error; }
@@ -1285,7 +1285,7 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) def->domain = virXPathString("string(./domain[1]/@name)", ctxt);
if ((bandwidthNode = virXPathNode("./bandwidth", ctxt)) != NULL && - (def->bandwidth = virNetDevBandwidthParse(bandwidthNode)) == NULL) + (def->bandwidth = virNetDevBandwidthParse(bandwidthNode, -1)) == NULL) goto error;
vlanNode = virXPathNode("./vlan", ctxt); diff --git a/src/util/virnetdevbandwidth.h b/src/util/virnetdevbandwidth.h index e046230..35f8b89 100644 --- a/src/util/virnetdevbandwidth.h +++ b/src/util/virnetdevbandwidth.h @@ -30,6 +30,7 @@ typedef virNetDevBandwidthRate *virNetDevBandwidthRatePtr; struct _virNetDevBandwidthRate { unsigned long long average; /* kbytes/s */ unsigned long long peak; /* kbytes/s */ + unsigned long long floor; /* kbytes/s */ unsigned long long burst; /* kbytes */ };
ACK.

These classes can borrow unused bandwidth. Basically, only egress qdsics can have classes, therefore we can do this kind of traffic shaping only on host's outgoing, that is domain's incoming traffic. --- src/lxc/lxc_process.c | 3 +- src/network/bridge_driver.c | 3 +- src/qemu/qemu_command.c | 3 +- src/qemu/qemu_driver.c | 2 +- src/util/virnetdevbandwidth.c | 93 +++++++++++++++++++++++++++++++++++++--- src/util/virnetdevbandwidth.h | 4 +- src/util/virnetdevmacvlan.c | 2 +- 7 files changed, 97 insertions(+), 13 deletions(-) diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c index 50c61c5..3e7fcb8 100644 --- a/src/lxc/lxc_process.c +++ b/src/lxc/lxc_process.c @@ -341,7 +341,8 @@ static int virLXCProcessSetupInterfaceBridged(virConnectPtr conn, goto cleanup; if (virNetDevBandwidthSet(net->ifname, - virDomainNetGetActualBandwidth(net)) < 0) { + virDomainNetGetActualBandwidth(net), + false) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), net->ifname); diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 00cffee..58f1d2e 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -2284,7 +2284,8 @@ networkStartNetworkVirtual(struct network_driver *driver, VIR_FORCE_CLOSE(tapfd); } - if (virNetDevBandwidthSet(network->def->bridge, network->def->bandwidth) < 0) { + if (virNetDevBandwidthSet(network->def->bridge, + network->def->bandwidth, true) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), network->def->bridge); diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 9009bd2..e10eb09 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -292,7 +292,8 @@ qemuNetworkIfaceConnect(virDomainDefPtr def, if (tapfd >= 0 && virNetDevBandwidthSet(net->ifname, - virDomainNetGetActualBandwidth(net)) < 0) { + virDomainNetGetActualBandwidth(net), + false) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), net->ifname); diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d449579..e6ae3fd 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -9034,7 +9034,7 @@ qemuDomainSetInterfaceParameters(virDomainPtr dom, sizeof(*newBandwidth->out)); } - if (virNetDevBandwidthSet(net->ifname, newBandwidth) < 0) { + if (virNetDevBandwidthSet(net->ifname, newBandwidth, false) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), device); diff --git a/src/util/virnetdevbandwidth.c b/src/util/virnetdevbandwidth.c index 49fc425..71c272e 100644 --- a/src/util/virnetdevbandwidth.c +++ b/src/util/virnetdevbandwidth.c @@ -45,17 +45,21 @@ virNetDevBandwidthFree(virNetDevBandwidthPtr def) * virNetDevBandwidthSet: * @ifname: on which interface * @bandwidth: rates to set (may be NULL) + * @hierarchical_class: whether to create hierarchical class * * This function enables QoS on specified interface * and set given traffic limits for both, incoming * and outgoing traffic. Any previous setting get - * overwritten. + * overwritten. If @hierarchical_class is TRUE, create + * hierarchical class. It is used to guarantee minimal + * throughput ('floor' attribute in NIC). * * Return 0 on success, -1 otherwise. */ int virNetDevBandwidthSet(const char *ifname, - virNetDevBandwidthPtr bandwidth) + virNetDevBandwidthPtr bandwidth, + bool hierarchical_class) { int ret = -1; virCommandPtr cmd = NULL; @@ -71,7 +75,7 @@ virNetDevBandwidthSet(const char *ifname, virNetDevBandwidthClear(ifname); - if (bandwidth->in) { + if (bandwidth->in && bandwidth->in->average) { if (virAsprintf(&average, "%llukbps", bandwidth->in->average) < 0) goto cleanup; if (bandwidth->in->peak && @@ -83,15 +87,89 @@ virNetDevBandwidthSet(const char *ifname, cmd = virCommandNew(TC); virCommandAddArgList(cmd, "qdisc", "add", "dev", ifname, "root", - "handle", "1:", "htb", "default", "1", NULL); + "handle", "1:", "htb", "default", + hierarchical_class ? "2" : "1", NULL); if (virCommandRun(cmd, NULL) < 0) goto cleanup; + /* If we are creating a hierarchical class, all non guaranteed traffic + * goes to the 1:2 class which will adjust 'rate' dynamically as NICs + * with guaranteed throughput are plugged and unplugged. Class 1:1 + * exists so we don't exceed the maximum limit for the network. For each + * NIC with guaranteed throughput a separate classid will be created. + * NB '1:' is just a shorter notation of '1:0'. + * + * To get a picture how this works: + * + * +-----+ +---------+ +-----------+ +-----------+ +-----+ + * | | | qdisc | | class 1:1 | | class 1:2 | | | + * | NIC | | def 1:2 | | rate | | rate | | sfq | + * | | --> | | --> | peak | -+-> | peak | --> | | + * +-----+ +---------+ +-----------+ | +-----------+ +-----+ + * | + * | +-----------+ +-----+ + * | | class 1:3 | | | + * | | rate | | sfq | + * +-> | peak | --> | | + * | +-----------+ +-----+ + * ... + * | +-----------+ +-----+ + * | | class 1:n | | | + * | | rate | | sfq | + * +-> | peak | --> | | + * +-----------+ +-----+ + * + * After the routing decision, when is it clear a packet is to be sent + * via a particular NIC, it is sent to the root qdisc (queueing + * discipline). In this case HTB (Hierarchical Token Bucket). It has + * only one direct child class (with id 1:1) which shapes the overall + * rate that is sent through the NIC. This class has at least one child + * (1:2) which is meant for all non-privileged (non guaranteed) traffic + * from all domains. Then, for each interface with guaranteed + * throughput, a separate class (1:n) is created. Imagine a class is a + * box. Whenever a packet ends up in a class it is stored in this box + * until the kernel sends it, then it is removed from box. Packets are + * placed into boxes based on rules (filters) - e.g. depending on + * destination IP/MAC address. If there is no rule to be applied, the + * root qdisc has a default where such packets go (1:2 in this case). + * Packets come in over and over again and boxes get filled more and + * more. Imagine that kernel sends packets just once a second. So it + * starts to traverse through this tree. It starts with the root qdisc + * and through 1:1 it gets to 1:2. It sends packets up to 1:2's 'rate'. + * Then it moves to 1:3 and again sends packets up to 1:3's 'rate'. The + * whole process is repeated until 1:n is processed. So now we have + * ensured each class its guaranteed bandwidth. If the sum of sent data + * doesn't exceed the 'rate' in 1:1 class, we can go further and send + * more packets. The rest of available bandwidth is distributed to the + * 1:2,1:3...1:n classes by ratio of their 'rate'. As soon as the root + * 'rate' limit is reached or there are no more packets to send, we stop + * sending and wait another second. Each class has an SFQ qdisc which + * shuffles packets in boxes stochastically, so one sender cannot + * starve others. + * + * Therefore, whenever we want to plug in a new guaranteed interface, we + * need to create a new class and adjust the 'rate' of the 1:2 class. + * When unplugging we do the exact opposite - remove the associated + * class, and adjust the 'rate'. + * + * This description is rather long, but it is still a good idea to read + * it before you dig into the code. + */ + if (hierarchical_class) { + virCommandFree(cmd); + cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "class", "add", "dev", ifname, "parent", + "1:", "classid", "1:1", "htb", "rate", average, + "ceil", peak ? peak : average, NULL); + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + } virCommandFree(cmd); cmd = virCommandNew(TC); virCommandAddArgList(cmd,"class", "add", "dev", ifname, "parent", - "1:", "classid", "1:1", "htb", NULL); - virCommandAddArgList(cmd, "rate", average, NULL); + hierarchical_class ? "1:1" : "1:", "classid", + hierarchical_class ? "1:2" : "1:1", "htb", + "rate", average, NULL); if (peak) virCommandAddArgList(cmd, "ceil", peak, NULL); @@ -104,7 +182,8 @@ virNetDevBandwidthSet(const char *ifname, virCommandFree(cmd); cmd = virCommandNew(TC); virCommandAddArgList(cmd, "qdisc", "add", "dev", ifname, "parent", - "1:1", "handle", "2:", "sfq", "perturb", + hierarchical_class ? "1:2" : "1:1", + "handle", "2:", "sfq", "perturb", "10", NULL); if (virCommandRun(cmd, NULL) < 0) diff --git a/src/util/virnetdevbandwidth.h b/src/util/virnetdevbandwidth.h index 35f8b89..d308ab2 100644 --- a/src/util/virnetdevbandwidth.h +++ b/src/util/virnetdevbandwidth.h @@ -42,7 +42,9 @@ struct _virNetDevBandwidth { void virNetDevBandwidthFree(virNetDevBandwidthPtr def); -int virNetDevBandwidthSet(const char *ifname, virNetDevBandwidthPtr bandwidth) +int virNetDevBandwidthSet(const char *ifname, + virNetDevBandwidthPtr bandwidth, + bool hierarchical_class) ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK; int virNetDevBandwidthClear(const char *ifname) ATTRIBUTE_NONNULL(1); diff --git a/src/util/virnetdevmacvlan.c b/src/util/virnetdevmacvlan.c index d8e646a..657c484 100644 --- a/src/util/virnetdevmacvlan.c +++ b/src/util/virnetdevmacvlan.c @@ -925,7 +925,7 @@ create_name: rc = 0; } - if (virNetDevBandwidthSet(cr_ifname, bandwidth) < 0) { + if (virNetDevBandwidthSet(cr_ifname, bandwidth, false) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), cr_ifname); -- 1.7.8.6

On 12/11/2012 11:09 AM, Michal Privoznik wrote:
These classes can borrow unused bandwidth. Basically, only egress qdsics can have classes, therefore we can
s/qdsic/qdisc/
do this kind of traffic shaping only on host's outgoing, that is domain's incoming traffic. --- src/lxc/lxc_process.c | 3 +- src/network/bridge_driver.c | 3 +- src/qemu/qemu_command.c | 3 +- src/qemu/qemu_driver.c | 2 +- src/util/virnetdevbandwidth.c | 93 +++++++++++++++++++++++++++++++++++++--- src/util/virnetdevbandwidth.h | 4 +- src/util/virnetdevmacvlan.c | 2 +- 7 files changed, 97 insertions(+), 13 deletions(-)
diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c index 50c61c5..3e7fcb8 100644 --- a/src/lxc/lxc_process.c +++ b/src/lxc/lxc_process.c @@ -341,7 +341,8 @@ static int virLXCProcessSetupInterfaceBridged(virConnectPtr conn, goto cleanup;
if (virNetDevBandwidthSet(net->ifname, - virDomainNetGetActualBandwidth(net)) < 0) { + virDomainNetGetActualBandwidth(net), + false) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), net->ifname); diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 00cffee..58f1d2e 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -2284,7 +2284,8 @@ networkStartNetworkVirtual(struct network_driver *driver, VIR_FORCE_CLOSE(tapfd); }
- if (virNetDevBandwidthSet(network->def->bridge, network->def->bandwidth) < 0) { + if (virNetDevBandwidthSet(network->def->bridge, + network->def->bandwidth, true) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), network->def->bridge); diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 9009bd2..e10eb09 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -292,7 +292,8 @@ qemuNetworkIfaceConnect(virDomainDefPtr def,
if (tapfd >= 0 && virNetDevBandwidthSet(net->ifname, - virDomainNetGetActualBandwidth(net)) < 0) { + virDomainNetGetActualBandwidth(net), + false) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), net->ifname); diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index d449579..e6ae3fd 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -9034,7 +9034,7 @@ qemuDomainSetInterfaceParameters(virDomainPtr dom, sizeof(*newBandwidth->out)); }
- if (virNetDevBandwidthSet(net->ifname, newBandwidth) < 0) { + if (virNetDevBandwidthSet(net->ifname, newBandwidth, false) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), device); diff --git a/src/util/virnetdevbandwidth.c b/src/util/virnetdevbandwidth.c index 49fc425..71c272e 100644 --- a/src/util/virnetdevbandwidth.c +++ b/src/util/virnetdevbandwidth.c @@ -45,17 +45,21 @@ virNetDevBandwidthFree(virNetDevBandwidthPtr def) * virNetDevBandwidthSet: * @ifname: on which interface * @bandwidth: rates to set (may be NULL) + * @hierarchical_class: whether to create hierarchical class * * This function enables QoS on specified interface * and set given traffic limits for both, incoming * and outgoing traffic. Any previous setting get - * overwritten. + * overwritten. If @hierarchical_class is TRUE, create + * hierarchical class. It is used to guarantee minimal + * throughput ('floor' attribute in NIC). * * Return 0 on success, -1 otherwise. */ int virNetDevBandwidthSet(const char *ifname, - virNetDevBandwidthPtr bandwidth) + virNetDevBandwidthPtr bandwidth, + bool hierarchical_class) { int ret = -1; virCommandPtr cmd = NULL; @@ -71,7 +75,7 @@ virNetDevBandwidthSet(const char *ifname,
virNetDevBandwidthClear(ifname);
- if (bandwidth->in) { + if (bandwidth->in && bandwidth->in->average) { if (virAsprintf(&average, "%llukbps", bandwidth->in->average) < 0) goto cleanup; if (bandwidth->in->peak && @@ -83,15 +87,89 @@ virNetDevBandwidthSet(const char *ifname,
cmd = virCommandNew(TC); virCommandAddArgList(cmd, "qdisc", "add", "dev", ifname, "root", - "handle", "1:", "htb", "default", "1", NULL); + "handle", "1:", "htb", "default", + hierarchical_class ? "2" : "1", NULL); if (virCommandRun(cmd, NULL) < 0) goto cleanup;
+ /* If we are creating a hierarchical class, all non guaranteed traffic + * goes to the 1:2 class which will adjust 'rate' dynamically as NICs + * with guaranteed throughput are plugged and unplugged. Class 1:1 + * exists so we don't exceed the maximum limit for the network. For each + * NIC with guaranteed throughput a separate classid will be created. + * NB '1:' is just a shorter notation of '1:0'. + * + * To get a picture how this works: + * + * +-----+ +---------+ +-----------+ +-----------+ +-----+ + * | | | qdisc | | class 1:1 | | class 1:2 | | | + * | NIC | | def 1:2 | | rate | | rate | | sfq | + * | | --> | | --> | peak | -+-> | peak | --> | | + * +-----+ +---------+ +-----------+ | +-----------+ +-----+ + * | + * | +-----------+ +-----+ + * | | class 1:3 | | | + * | | rate | | sfq | + * +-> | peak | --> | | + * | +-----------+ +-----+ + * ... + * | +-----------+ +-----+ + * | | class 1:n | | | + * | | rate | | sfq | + * +-> | peak | --> | | + * +-----------+ +-----+ + * + * After the routing decision, when is it clear a packet is to be sent + * via a particular NIC, it is sent to the root qdisc (queueing + * discipline). In this case HTB (Hierarchical Token Bucket). It has + * only one direct child class (with id 1:1) which shapes the overall + * rate that is sent through the NIC. This class has at least one child + * (1:2) which is meant for all non-privileged (non guaranteed) traffic + * from all domains. Then, for each interface with guaranteed + * throughput, a separate class (1:n) is created. Imagine a class is a + * box. Whenever a packet ends up in a class it is stored in this box + * until the kernel sends it, then it is removed from box. Packets are + * placed into boxes based on rules (filters) - e.g. depending on + * destination IP/MAC address. If there is no rule to be applied, the + * root qdisc has a default where such packets go (1:2 in this case). + * Packets come in over and over again and boxes get filled more and + * more. Imagine that kernel sends packets just once a second. So it + * starts to traverse through this tree. It starts with the root qdisc + * and through 1:1 it gets to 1:2. It sends packets up to 1:2's 'rate'. + * Then it moves to 1:3 and again sends packets up to 1:3's 'rate'. The + * whole process is repeated until 1:n is processed. So now we have + * ensured each class its guaranteed bandwidth. If the sum of sent data + * doesn't exceed the 'rate' in 1:1 class, we can go further and send + * more packets. The rest of available bandwidth is distributed to the + * 1:2,1:3...1:n classes by ratio of their 'rate'. As soon as the root + * 'rate' limit is reached or there are no more packets to send, we stop + * sending and wait another second. Each class has an SFQ qdisc which + * shuffles packets in boxes stochastically, so one sender cannot + * starve others. + * + * Therefore, whenever we want to plug in a new guaranteed interface, we + * need to create a new class and adjust the 'rate' of the 1:2 class. + * When unplugging we do the exact opposite - remove the associated + * class, and adjust the 'rate'. + * + * This description is rather long, but it is still a good idea to read + * it before you dig into the code. + */ + if (hierarchical_class) { + virCommandFree(cmd); + cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "class", "add", "dev", ifname, "parent", + "1:", "classid", "1:1", "htb", "rate", average, + "ceil", peak ? peak : average, NULL); + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + } virCommandFree(cmd); cmd = virCommandNew(TC); virCommandAddArgList(cmd,"class", "add", "dev", ifname, "parent", - "1:", "classid", "1:1", "htb", NULL); - virCommandAddArgList(cmd, "rate", average, NULL); + hierarchical_class ? "1:1" : "1:", "classid", + hierarchical_class ? "1:2" : "1:1", "htb", + "rate", average, NULL);
if (peak) virCommandAddArgList(cmd, "ceil", peak, NULL); @@ -104,7 +182,8 @@ virNetDevBandwidthSet(const char *ifname, virCommandFree(cmd); cmd = virCommandNew(TC); virCommandAddArgList(cmd, "qdisc", "add", "dev", ifname, "parent", - "1:1", "handle", "2:", "sfq", "perturb", + hierarchical_class ? "1:2" : "1:1", + "handle", "2:", "sfq", "perturb", "10", NULL);
if (virCommandRun(cmd, NULL) < 0) diff --git a/src/util/virnetdevbandwidth.h b/src/util/virnetdevbandwidth.h index 35f8b89..d308ab2 100644 --- a/src/util/virnetdevbandwidth.h +++ b/src/util/virnetdevbandwidth.h @@ -42,7 +42,9 @@ struct _virNetDevBandwidth {
void virNetDevBandwidthFree(virNetDevBandwidthPtr def);
-int virNetDevBandwidthSet(const char *ifname, virNetDevBandwidthPtr bandwidth) +int virNetDevBandwidthSet(const char *ifname, + virNetDevBandwidthPtr bandwidth, + bool hierarchical_class) ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK; int virNetDevBandwidthClear(const char *ifname) ATTRIBUTE_NONNULL(1); diff --git a/src/util/virnetdevmacvlan.c b/src/util/virnetdevmacvlan.c index d8e646a..657c484 100644 --- a/src/util/virnetdevmacvlan.c +++ b/src/util/virnetdevmacvlan.c @@ -925,7 +925,7 @@ create_name: rc = 0; }
- if (virNetDevBandwidthSet(cr_ifname, bandwidth) < 0) { + if (virNetDevBandwidthSet(cr_ifname, bandwidth, false) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _("cannot set bandwidth limits on %s"), cr_ifname);
ACK.

These set bridge part of QoS when bringing domain's interface up. Long story short, if there's a 'floor' set, a new QoS class is created. ClassID MUST be unique within the bridge and should be kept for unplug phase. --- po/POTFILES.in | 1 + src/util/virnetdevbandwidth.c | 182 +++++++++++++++++++++++++++++++++++++++++ src/util/virnetdevbandwidth.h | 14 +++ 3 files changed, 197 insertions(+), 0 deletions(-) diff --git a/po/POTFILES.in b/po/POTFILES.in index 586aa2b..d47f445 100644 --- a/po/POTFILES.in +++ b/po/POTFILES.in @@ -158,6 +158,7 @@ src/util/virinitctl.c src/util/virkeyfile.c src/util/virlockspace.c src/util/virnetdev.c +src/util/virnetdevbandwidth.c src/util/virnetdevbridge.c src/util/virnetdevmacvlan.c src/util/virnetdevopenvswitch.c diff --git a/src/util/virnetdevbandwidth.c b/src/util/virnetdevbandwidth.c index 71c272e..f6f2a86 100644 --- a/src/util/virnetdevbandwidth.c +++ b/src/util/virnetdevbandwidth.c @@ -352,3 +352,185 @@ virNetDevBandwidthEqual(virNetDevBandwidthPtr a, return true; } + +/* + * virNetDevBandwidthPlug: + * @brname: name of the bridge + * @net_bandwidth: QoS settings on @brname + * @ifmac: MAC of interface + * @bandwidth: QoS settings for interface + * @id: unique ID (MUST be greater than 2) + * + * Set bridge part of interface QoS settings, e.g. guaranteed + * bandwidth. @id is an unique ID (among @brname) from which + * other identifiers for class, qdisc and filter are derived. + * However, two classes were already set up (by + * virNetDevBandwidthSet). That's why this @id MUST be greater + * than 2. You may want to keep passed @id, as it is used later + * by virNetDevBandwidthUnplug. + * + * Returns: + * 0 if QoS set successfully + * -1 otherwise. + */ +int +virNetDevBandwidthPlug(const char *brname, + virNetDevBandwidthPtr net_bandwidth, + const virMacAddrPtr ifmac_ptr, + virNetDevBandwidthPtr bandwidth, + unsigned int id) +{ + int ret = -1; + virCommandPtr cmd = NULL; + char *class_id = NULL; + char *qdisc_id = NULL; + char *filter_id = NULL; + char *floor = NULL; + char *ceil = NULL; + unsigned char ifmac[VIR_MAC_BUFLEN]; + char ifmacStr[VIR_MAC_STRING_BUFLEN]; + char *mac[2] = {NULL, NULL}; + + if (id <= 2) { + virReportError(VIR_ERR_INTERNAL_ERROR, _("Invalid class ID %d"), id); + return -1; + } + + virMacAddrGetRaw(ifmac_ptr, ifmac); + virMacAddrFormat(ifmac_ptr, ifmacStr); + + if (!net_bandwidth || !net_bandwidth->in) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Bridge '%s' has no QoS set, therefore " + "unable to set 'floor' on '%s'"), + brname, ifmacStr); + return -1; + } + + if (virAsprintf(&class_id, "1:%x", id) < 0 || + virAsprintf(&qdisc_id, "%x:", id) < 0 || + virAsprintf(&filter_id, "%u", id) < 0 || + virAsprintf(&mac[0], "0x%02x%02x%02x%02x", ifmac[2], + ifmac[3], ifmac[4], ifmac[5]) < 0 || + virAsprintf(&mac[1], "0x%02x%02x", ifmac[0], ifmac[1]) < 0 || + virAsprintf(&floor, "%llukbps", bandwidth->in->floor) < 0 || + virAsprintf(&ceil, "%llukbps", net_bandwidth->in->peak ? + net_bandwidth->in->peak : + net_bandwidth->in->average) < 0) { + virReportOOMError(); + goto cleanup; + } + + cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "class", "add", "dev", brname, "parent", "1:1", + "classid", class_id, "htb", "rate", floor, + "ceil", ceil, NULL); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + virCommandFree(cmd); + cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "qdisc", "add", "dev", brname, "parent", + class_id, "handle", qdisc_id, "sfq", "perturb", + "10", NULL); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + virCommandFree(cmd); + cmd = virCommandNew(TC); + /* Okay, this not nice. But since libvirt does not know anything about + * interface IP address(es), and tc fw filter simply refuse to use ebtables + * marks, we need to use u32 selector to match MAC address. + * If libvirt will ever know something, remove this FIXME + */ + virCommandAddArgList(cmd, "filter", "add", "dev", brname, "protocol", "ip", + "prio", filter_id, "u32", + "match", "u16", "0x0800", "0xffff", "at", "-2", + "match", "u32", mac[0], "0xffffffff", "at", "-12", + "match", "u16", mac[1], "0xffff", "at", "-14", + "flowid", class_id, NULL); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + ret = 0; + +cleanup: + VIR_FREE(mac[1]); + VIR_FREE(mac[0]); + VIR_FREE(ceil); + VIR_FREE(floor); + VIR_FREE(filter_id); + VIR_FREE(qdisc_id); + VIR_FREE(class_id); + virCommandFree(cmd); + return ret; +} + +/* + * virNetDevBandwidthUnplug: + * @brname: from which bridge are we unplugging + * @id: unique identifier (MUST be greater than 2) + * + * Remove QoS settings from bridge. + * + * Returns 0 on success, -1 otherwise. + */ +int +virNetDevBandwidthUnplug(const char *brname, + unsigned int id) +{ + int ret = -1; + int cmd_ret = 0; + virCommandPtr cmd = NULL; + char *class_id = NULL; + char *qdisc_id = NULL; + char *filter_id = NULL; + + if (id <= 2) { + virReportError(VIR_ERR_INTERNAL_ERROR, _("Invalid class ID %d"), id); + return -1; + } + + if (virAsprintf(&class_id, "1:%x", id) < 0 || + virAsprintf(&qdisc_id, "%x:", id) < 0 || + virAsprintf(&filter_id, "%u", id) < 0) { + virReportOOMError(); + goto cleanup; + } + + cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "qdisc", "del", "dev", brname, + "handle", qdisc_id, NULL); + + /* Don't threat tc errors as fatal, but + * try to remove as much as possible */ + if (virCommandRun(cmd, &cmd_ret) < 0) + goto cleanup; + + virCommandFree(cmd); + cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "filter", "del", "dev", brname, + "prio", filter_id, NULL); + + if (virCommandRun(cmd, &cmd_ret) < 0) + goto cleanup; + + cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "class", "del", "dev", brname, + "classid", class_id, NULL); + + if (virCommandRun(cmd, &cmd_ret) < 0) + goto cleanup; + + ret = 0; + +cleanup: + VIR_FREE(filter_id); + VIR_FREE(qdisc_id); + VIR_FREE(class_id); + virCommandFree(cmd); + return ret; +} diff --git a/src/util/virnetdevbandwidth.h b/src/util/virnetdevbandwidth.h index d308ab2..5c9bce3 100644 --- a/src/util/virnetdevbandwidth.h +++ b/src/util/virnetdevbandwidth.h @@ -24,6 +24,7 @@ # define __VIR_NETDEV_BANDWIDTH_H__ # include "internal.h" +# include "virmacaddr.h" typedef struct _virNetDevBandwidthRate virNetDevBandwidthRate; typedef virNetDevBandwidthRate *virNetDevBandwidthRatePtr; @@ -53,4 +54,17 @@ int virNetDevBandwidthCopy(virNetDevBandwidthPtr *dest, const virNetDevBandwidth bool virNetDevBandwidthEqual(virNetDevBandwidthPtr a, virNetDevBandwidthPtr b); +int virNetDevBandwidthPlug(const char *brname, + virNetDevBandwidthPtr net_bandwidth, + const virMacAddrPtr ifmac_ptr, + virNetDevBandwidthPtr bandwidth, + unsigned int id) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) + ATTRIBUTE_NONNULL(3) ATTRIBUTE_NONNULL(4) + ATTRIBUTE_RETURN_CHECK; + +int virNetDevBandwidthUnplug(const char *brname, + unsigned int id) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK; + #endif /* __VIR_NETDEV_BANDWIDTH_H__ */ -- 1.7.8.6

On 12/11/2012 11:09 AM, Michal Privoznik wrote:
These set bridge part of QoS when bringing domain's interface up. Long story short, if there's a 'floor' set, a new QoS class is created. ClassID MUST be unique within the bridge and should be kept for unplug phase. --- po/POTFILES.in | 1 + src/util/virnetdevbandwidth.c | 182 +++++++++++++++++++++++++++++++++++++++++ src/util/virnetdevbandwidth.h | 14 +++ 3 files changed, 197 insertions(+), 0 deletions(-)
diff --git a/po/POTFILES.in b/po/POTFILES.in index 586aa2b..d47f445 100644 --- a/po/POTFILES.in +++ b/po/POTFILES.in @@ -158,6 +158,7 @@ src/util/virinitctl.c src/util/virkeyfile.c src/util/virlockspace.c src/util/virnetdev.c +src/util/virnetdevbandwidth.c src/util/virnetdevbridge.c src/util/virnetdevmacvlan.c src/util/virnetdevopenvswitch.c diff --git a/src/util/virnetdevbandwidth.c b/src/util/virnetdevbandwidth.c index 71c272e..f6f2a86 100644 --- a/src/util/virnetdevbandwidth.c +++ b/src/util/virnetdevbandwidth.c @@ -352,3 +352,185 @@ virNetDevBandwidthEqual(virNetDevBandwidthPtr a,
return true; } + +/* + * virNetDevBandwidthPlug: + * @brname: name of the bridge + * @net_bandwidth: QoS settings on @brname + * @ifmac: MAC of interface + * @bandwidth: QoS settings for interface + * @id: unique ID (MUST be greater than 2) + * + * Set bridge part of interface QoS settings, e.g. guaranteed + * bandwidth. @id is an unique ID (among @brname) from which + * other identifiers for class, qdisc and filter are derived. + * However, two classes were already set up (by + * virNetDevBandwidthSet). That's why this @id MUST be greater + * than 2. You may want to keep passed @id, as it is used later + * by virNetDevBandwidthUnplug. + * + * Returns: + * 0 if QoS set successfully + * -1 otherwise. + */ +int +virNetDevBandwidthPlug(const char *brname, + virNetDevBandwidthPtr net_bandwidth, + const virMacAddrPtr ifmac_ptr, + virNetDevBandwidthPtr bandwidth, + unsigned int id) +{ + int ret = -1; + virCommandPtr cmd = NULL; + char *class_id = NULL; + char *qdisc_id = NULL; + char *filter_id = NULL; + char *floor = NULL; + char *ceil = NULL; + unsigned char ifmac[VIR_MAC_BUFLEN]; + char ifmacStr[VIR_MAC_STRING_BUFLEN]; + char *mac[2] = {NULL, NULL}; + + if (id <= 2) { + virReportError(VIR_ERR_INTERNAL_ERROR, _("Invalid class ID %d"), id); + return -1; + } + + virMacAddrGetRaw(ifmac_ptr, ifmac); + virMacAddrFormat(ifmac_ptr, ifmacStr); + + if (!net_bandwidth || !net_bandwidth->in) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Bridge '%s' has no QoS set, therefore " + "unable to set 'floor' on '%s'"), + brname, ifmacStr); + return -1; + } + + if (virAsprintf(&class_id, "1:%x", id) < 0 || + virAsprintf(&qdisc_id, "%x:", id) < 0 || + virAsprintf(&filter_id, "%u", id) < 0 || + virAsprintf(&mac[0], "0x%02x%02x%02x%02x", ifmac[2], + ifmac[3], ifmac[4], ifmac[5]) < 0 || + virAsprintf(&mac[1], "0x%02x%02x", ifmac[0], ifmac[1]) < 0 || + virAsprintf(&floor, "%llukbps", bandwidth->in->floor) < 0 || + virAsprintf(&ceil, "%llukbps", net_bandwidth->in->peak ? + net_bandwidth->in->peak : + net_bandwidth->in->average) < 0) { + virReportOOMError(); + goto cleanup; + } + + cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "class", "add", "dev", brname, "parent", "1:1", + "classid", class_id, "htb", "rate", floor, + "ceil", ceil, NULL); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + virCommandFree(cmd); + cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "qdisc", "add", "dev", brname, "parent", + class_id, "handle", qdisc_id, "sfq", "perturb", + "10", NULL); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + virCommandFree(cmd); + cmd = virCommandNew(TC); + /* Okay, this not nice. But since libvirt does not know anything about + * interface IP address(es), and tc fw filter simply refuse to use ebtables + * marks, we need to use u32 selector to match MAC address. + * If libvirt will ever know something, remove this FIXME + */ + virCommandAddArgList(cmd, "filter", "add", "dev", brname, "protocol", "ip", + "prio", filter_id, "u32", + "match", "u16", "0x0800", "0xffff", "at", "-2", + "match", "u32", mac[0], "0xffffffff", "at", "-12", + "match", "u16", mac[1], "0xffff", "at", "-14", + "flowid", class_id, NULL); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + ret = 0; + +cleanup: + VIR_FREE(mac[1]); + VIR_FREE(mac[0]); + VIR_FREE(ceil); + VIR_FREE(floor); + VIR_FREE(filter_id); + VIR_FREE(qdisc_id); + VIR_FREE(class_id); + virCommandFree(cmd); + return ret; +} + +/* + * virNetDevBandwidthUnplug: + * @brname: from which bridge are we unplugging + * @id: unique identifier (MUST be greater than 2) + * + * Remove QoS settings from bridge. + * + * Returns 0 on success, -1 otherwise. + */ +int +virNetDevBandwidthUnplug(const char *brname, + unsigned int id) +{ + int ret = -1; + int cmd_ret = 0; + virCommandPtr cmd = NULL; + char *class_id = NULL; + char *qdisc_id = NULL; + char *filter_id = NULL; + + if (id <= 2) { + virReportError(VIR_ERR_INTERNAL_ERROR, _("Invalid class ID %d"), id); + return -1; + } + + if (virAsprintf(&class_id, "1:%x", id) < 0 || + virAsprintf(&qdisc_id, "%x:", id) < 0 || + virAsprintf(&filter_id, "%u", id) < 0) { + virReportOOMError(); + goto cleanup; + } + + cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "qdisc", "del", "dev", brname, + "handle", qdisc_id, NULL); + + /* Don't threat tc errors as fatal, but + * try to remove as much as possible */ + if (virCommandRun(cmd, &cmd_ret) < 0) + goto cleanup; + + virCommandFree(cmd); + cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "filter", "del", "dev", brname, + "prio", filter_id, NULL); + + if (virCommandRun(cmd, &cmd_ret) < 0) + goto cleanup; + + cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "class", "del", "dev", brname, + "classid", class_id, NULL); + + if (virCommandRun(cmd, &cmd_ret) < 0) + goto cleanup; + + ret = 0; + +cleanup: + VIR_FREE(filter_id); + VIR_FREE(qdisc_id); + VIR_FREE(class_id); + virCommandFree(cmd); + return ret; +} diff --git a/src/util/virnetdevbandwidth.h b/src/util/virnetdevbandwidth.h index d308ab2..5c9bce3 100644 --- a/src/util/virnetdevbandwidth.h +++ b/src/util/virnetdevbandwidth.h @@ -24,6 +24,7 @@ # define __VIR_NETDEV_BANDWIDTH_H__
# include "internal.h" +# include "virmacaddr.h"
typedef struct _virNetDevBandwidthRate virNetDevBandwidthRate; typedef virNetDevBandwidthRate *virNetDevBandwidthRatePtr; @@ -53,4 +54,17 @@ int virNetDevBandwidthCopy(virNetDevBandwidthPtr *dest, const virNetDevBandwidth
bool virNetDevBandwidthEqual(virNetDevBandwidthPtr a, virNetDevBandwidthPtr b);
+int virNetDevBandwidthPlug(const char *brname, + virNetDevBandwidthPtr net_bandwidth, + const virMacAddrPtr ifmac_ptr, + virNetDevBandwidthPtr bandwidth, + unsigned int id) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) + ATTRIBUTE_NONNULL(3) ATTRIBUTE_NONNULL(4) + ATTRIBUTE_RETURN_CHECK; + +int virNetDevBandwidthUnplug(const char *brname, + unsigned int id) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK; + #endif /* __VIR_NETDEV_BANDWIDTH_H__ */
You should add virNetDevBandwidth plug to libvirt_private.syms during this patch, but since you're not actually using it yet, it doesn't cause a make failure, (and you *do* add it in on the patch where you use it) so I'll let it slide :-) ACK.

This will be used whenever a NIC with guaranteed throughput is to be plugged into a bridge. It will adjust the average throughput of non guaranteed NICs (classid 1:2) to meet new requirements. --- src/util/virnetdevbandwidth.c | 49 +++++++++++++++++++++++++++++++++++++++++ src/util/virnetdevbandwidth.h | 6 +++++ 2 files changed, 55 insertions(+), 0 deletions(-) diff --git a/src/util/virnetdevbandwidth.c b/src/util/virnetdevbandwidth.c index f6f2a86..19c00b6 100644 --- a/src/util/virnetdevbandwidth.c +++ b/src/util/virnetdevbandwidth.c @@ -534,3 +534,52 @@ cleanup: virCommandFree(cmd); return ret; } + +/** + * virNetDevBandwidthUpdateRate: + * @ifname: interface name + * @classid: ID of class to update + * @new_rate: new rate + * + * This function updates the 'rate' attribute of HTB class. + * It can be used whenever a new interface is plugged to a + * bridge to adjust average throughput of non guaranteed + * NICs. + * + * Returns 0 on success, -1 otherwise. + */ +int +virNetDevBandwidthUpdateRate(const char *ifname, + const char *class_id, + virNetDevBandwidthPtr bandwidth, + unsigned long long new_rate) +{ + int ret = -1; + virCommandPtr cmd = NULL; + char *rate = NULL; + char *ceil = NULL; + + if (virAsprintf(&rate, "%llukbps", new_rate) < 0 || + virAsprintf(&ceil, "%llukbps", bandwidth->in->peak ? + bandwidth->in->peak : + bandwidth->in->average) < 0) { + virReportOOMError(); + goto cleanup; + } + + cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "class", "change", "dev", ifname, + "classid", class_id, "htb", "rate", rate, + "ceil", ceil, NULL); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + ret = 0; + +cleanup: + virCommandFree(cmd); + VIR_FREE(rate); + VIR_FREE(ceil); + return ret; +} diff --git a/src/util/virnetdevbandwidth.h b/src/util/virnetdevbandwidth.h index 5c9bce3..01a2ba5 100644 --- a/src/util/virnetdevbandwidth.h +++ b/src/util/virnetdevbandwidth.h @@ -67,4 +67,10 @@ int virNetDevBandwidthUnplug(const char *brname, unsigned int id) ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK; +int virNetDevBandwidthUpdateRate(const char *ifname, + const char *class_id, + virNetDevBandwidthPtr bandwidth, + unsigned long long new_rate) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) + ATTRIBUTE_RETURN_CHECK; #endif /* __VIR_NETDEV_BANDWIDTH_H__ */ -- 1.7.8.6

On 12/11/2012 11:09 AM, Michal Privoznik wrote:
This will be used whenever a NIC with guaranteed throughput is to be plugged into a bridge. It will adjust the average throughput of non guaranteed NICs (classid 1:2) to meet new requirements. --- src/util/virnetdevbandwidth.c | 49 +++++++++++++++++++++++++++++++++++++++++ src/util/virnetdevbandwidth.h | 6 +++++ 2 files changed, 55 insertions(+), 0 deletions(-)
diff --git a/src/util/virnetdevbandwidth.c b/src/util/virnetdevbandwidth.c index f6f2a86..19c00b6 100644 --- a/src/util/virnetdevbandwidth.c +++ b/src/util/virnetdevbandwidth.c @@ -534,3 +534,52 @@ cleanup: virCommandFree(cmd); return ret; } + +/** + * virNetDevBandwidthUpdateRate: + * @ifname: interface name + * @classid: ID of class to update + * @new_rate: new rate + * + * This function updates the 'rate' attribute of HTB class. + * It can be used whenever a new interface is plugged to a + * bridge to adjust average throughput of non guaranteed + * NICs. + * + * Returns 0 on success, -1 otherwise. + */ +int +virNetDevBandwidthUpdateRate(const char *ifname, + const char *class_id, + virNetDevBandwidthPtr bandwidth, + unsigned long long new_rate) +{ + int ret = -1; + virCommandPtr cmd = NULL; + char *rate = NULL; + char *ceil = NULL; + + if (virAsprintf(&rate, "%llukbps", new_rate) < 0 || + virAsprintf(&ceil, "%llukbps", bandwidth->in->peak ? + bandwidth->in->peak : + bandwidth->in->average) < 0) { + virReportOOMError(); + goto cleanup; + } + + cmd = virCommandNew(TC); + virCommandAddArgList(cmd, "class", "change", "dev", ifname, + "classid", class_id, "htb", "rate", rate, + "ceil", ceil, NULL); + + if (virCommandRun(cmd, NULL) < 0) + goto cleanup; + + ret = 0; + +cleanup: + virCommandFree(cmd); + VIR_FREE(rate); + VIR_FREE(ceil); + return ret; +} diff --git a/src/util/virnetdevbandwidth.h b/src/util/virnetdevbandwidth.h index 5c9bce3..01a2ba5 100644 --- a/src/util/virnetdevbandwidth.h +++ b/src/util/virnetdevbandwidth.h @@ -67,4 +67,10 @@ int virNetDevBandwidthUnplug(const char *brname, unsigned int id) ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK;
+int virNetDevBandwidthUpdateRate(const char *ifname, + const char *class_id, + virNetDevBandwidthPtr bandwidth, + unsigned long long new_rate) + ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) + ATTRIBUTE_RETURN_CHECK; #endif /* __VIR_NETDEV_BANDWIDTH_H__ */
Same comment about adding the function to libvirt_private.syms in the same patch where it is introduced, but since it's not yet used anywhere, and thus won't cause a build failure, I'll allow it. ACK.

Network should be notified if we plug in or unplug an interface, so it can perform some action, e.g. set/unset network part of QoS. However, we are doing this in very early stage, so iface->ifname isn't filled in yet. So whenever we want to report an error, we must use a different identifier, e.g. the MAC address. --- src/conf/domain_conf.h | 1 + src/conf/network_conf.c | 18 ++++ src/conf/network_conf.h | 4 + src/libvirt_private.syms | 3 + src/network/bridge_driver.c | 212 ++++++++++++++++++++++++++++++++++++++++++- 5 files changed, 236 insertions(+), 2 deletions(-) diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 7ad5377..e6659cd 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -809,6 +809,7 @@ struct _virDomainActualNetDef { virNetDevVPortProfilePtr virtPortProfile; virNetDevBandwidthPtr bandwidth; virNetDevVlan vlan; + unsigned int class_id; /* class ID for bandwidth 'floor' */ }; /* Stores the virtual network interface configuration */ diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index 3093418..ac326e1 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -47,6 +47,9 @@ #define MAX_BRIDGE_ID 256 #define VIR_FROM_THIS VIR_FROM_NETWORK +#define NEXT_FREE_CLASS_ID 3 +/* currently, /sbin/tc implementation allows up to 16 bits for minor class size */ +#define CLASS_ID_BITMAP_SIZE (1<<16) VIR_ENUM_IMPL(virNetworkForward, VIR_NETWORK_FORWARD_LAST, @@ -319,10 +322,25 @@ virNetworkAssignDef(virNetworkObjListPtr nets, virNetworkObjLock(network); network->def = def; + if (!(network->class_id = virBitmapNew(CLASS_ID_BITMAP_SIZE))) { + virReportOOMError(); + goto error; + } + + /* The first three class IDs are already taken */ + ignore_value(virBitmapSetBit(network->class_id, 0)); + ignore_value(virBitmapSetBit(network->class_id, 1)); + ignore_value(virBitmapSetBit(network->class_id, 2)); + + network->def = def; nets->objs[nets->count] = network; nets->count++; return network; +error: + virNetworkObjUnlock(network); + virNetworkObjFree(network); + return NULL; } diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h index 949b3d2..364372d 100644 --- a/src/conf/network_conf.h +++ b/src/conf/network_conf.h @@ -38,6 +38,7 @@ # include "virnetdevvlan.h" # include "virmacaddr.h" # include "device_conf.h" +# include "bitmap.h" enum virNetworkForwardType { VIR_NETWORK_FORWARD_NONE = 0, @@ -226,6 +227,9 @@ struct _virNetworkObj { virNetworkDefPtr def; /* The current definition */ virNetworkDefPtr newDef; /* New definition to activate at shutdown */ + + virBitmapPtr class_id; /* bitmap of class IDs for QoS */ + unsigned long long floor_sum; /* sum of all 'floor'-s of attached NICs */ }; typedef struct _virNetworkObjList virNetworkObjList; diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 499ab3b..6564676 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1506,7 +1506,10 @@ virNetDevBandwidthClear; virNetDevBandwidthCopy; virNetDevBandwidthEqual; virNetDevBandwidthFree; +virNetDevBandwidthPlug; virNetDevBandwidthSet; +virNetDevBandwidthUnplug; +virNetDevBandwidthUpdateRate; # virnetdevbridge.h diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 58f1d2e..0bee453 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -121,6 +121,11 @@ static int networkShutdownNetworkExternal(struct network_driver *driver, static void networkReloadIptablesRules(struct network_driver *driver); static void networkRefreshDaemons(struct network_driver *driver); +static int networkPlugBandwidth(virNetworkObjPtr net, + virDomainNetDefPtr iface); +static int networkUnplugBandwidth(virNetworkObjPtr net, + virDomainNetDefPtr iface); + static struct network_driver *driverState = NULL; static char * @@ -3491,6 +3496,7 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) enum virDomainNetType actualType = iface->type; virNetworkObjPtr network = NULL; virNetworkDefPtr netdef = NULL; + virNetDevBandwidthPtr bandwidth = NULL; virPortGroupDefPtr portgroup = NULL; virNetDevVPortProfilePtr virtport = iface->virtPortProfile; virNetDevVlanPtr vlan = NULL; @@ -3529,7 +3535,13 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) * (already in NetDef). Otherwise, if there is bandwidth info in * the portgroup, fill that into the ActualDef. */ - if (portgroup && !iface->bandwidth) { + + if (iface->bandwidth) + bandwidth = iface->bandwidth; + else if (portgroup && portgroup->bandwidth) + bandwidth = portgroup->bandwidth; + + if (bandwidth) { if (!iface->data.network.actual && (VIR_ALLOC(iface->data.network.actual) < 0)) { virReportOOMError(); @@ -3537,7 +3549,7 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) } if (virNetDevBandwidthCopy(&iface->data.network.actual->bandwidth, - portgroup->bandwidth) < 0) + bandwidth) < 0) goto error; } @@ -3550,6 +3562,10 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) */ if (iface->data.network.actual) iface->data.network.actual->type = VIR_DOMAIN_NET_TYPE_NETWORK; + + if (networkPlugBandwidth(network, iface) < 0) + goto error; + } else if ((netdef->forwardType == VIR_NETWORK_FORWARD_BRIDGE) && netdef->bridge) { @@ -4061,6 +4077,12 @@ networkReleaseActualDevice(virDomainNetDefPtr iface) } netdef = network->def; + if ((netdef->forwardType == VIR_NETWORK_FORWARD_NONE || + netdef->forwardType == VIR_NETWORK_FORWARD_NAT || + netdef->forwardType == VIR_NETWORK_FORWARD_ROUTE) && + networkUnplugBandwidth(network, iface) < 0) + goto error; + if ((!iface->data.network.actual) || ((actualType != VIR_DOMAIN_NET_TYPE_DIRECT) && (actualType != VIR_DOMAIN_NET_TYPE_HOSTDEV))) { @@ -4264,3 +4286,189 @@ cleanup: error: goto cleanup; } + +/** + * networkCheckBandwidth: + * @net: network QoS + * @iface: interface QoS + * @new_rate: new rate for non guaranteed class + * + * Returns: -1 if plugging would overcommit network QoS + * 0 if plugging is safe (@new_rate updated) + * 1 if no QoS is set (@new_rate untouched) + */ +static int +networkCheckBandwidth(virNetworkObjPtr net, + virDomainNetDefPtr iface, + unsigned long long *new_rate) +{ + int ret = -1; + virNetDevBandwidthPtr netBand = net->def->bandwidth; + virNetDevBandwidthPtr ifaceBand = iface->bandwidth; + unsigned long long tmp_floor_sum = net->floor_sum; + unsigned long long tmp_new_rate = 0; + char ifmac[VIR_MAC_STRING_BUFLEN]; + + if (!ifaceBand || !ifaceBand->in || !ifaceBand->in->floor || + !netBand || !netBand->in) + return 1; + + virMacAddrFormat(&iface->mac, ifmac); + + tmp_new_rate = netBand->in->average; + tmp_floor_sum += ifaceBand->in->floor; + + /* check against peak */ + if (netBand->in->peak) { + tmp_new_rate = netBand->in->peak; + if (tmp_floor_sum > netBand->in->peak) { + virReportError(VIR_ERR_OPERATION_INVALID, + _("Cannot plug '%s' interface into '%s' because it " + "would overcommit 'peak' on network '%s'"), + ifmac, + net->def->bridge, + net->def->name); + goto cleanup; + } + } else if (tmp_floor_sum > netBand->in->average) { + /* tmp_floor_sum can be between 'average' and 'peak' iff 'peak' is set. + * Otherwise, tmp_floor_sum must be below 'average'. */ + virReportError(VIR_ERR_OPERATION_INVALID, + _("Cannot plug '%s' interface into '%s' because it " + "would overcommit 'average' on network '%s'"), + ifmac, + net->def->bridge, + net->def->name); + goto cleanup; + } + + *new_rate = tmp_new_rate; + ret = 0; + +cleanup: + return ret; +} + +/** + * networkNextClassID: + * @net: network object + * + * Find next free class ID. @net is supposed + * to be locked already. If there is a free ID, + * it is marked as used and returned. + * + * Returns next free class ID or -1 if none is available. + */ +static ssize_t +networkNextClassID(virNetworkObjPtr net) +{ + size_t ret = 0; + bool is_set = false; + + while (virBitmapGetBit(net->class_id, ret, &is_set) == 0 && is_set) + ret++; + + if (is_set || virBitmapSetBit(net->class_id, ret) < 0) + return -1; + + return ret; +} + +static int +networkPlugBandwidth(virNetworkObjPtr net, + virDomainNetDefPtr iface) +{ + int ret = -1; + int plug_ret; + unsigned long long new_rate = 0; + ssize_t class_id = 0; + char ifmac[VIR_MAC_STRING_BUFLEN]; + + if ((plug_ret = networkCheckBandwidth(net, iface, &new_rate)) < 0) { + /* helper reported error */ + goto cleanup; + } + + if (plug_ret > 0) { + /* no QoS needs to be set; claim success */ + ret = 0; + goto cleanup; + } + + virMacAddrFormat(&iface->mac, ifmac); + if (iface->type != VIR_DOMAIN_NET_TYPE_NETWORK || + !iface->data.network.actual) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot set bandwidth on interface '%s' of type %d"), + ifmac, iface->type); + goto cleanup; + } + + /* generate new class_id */ + if ((class_id = networkNextClassID(net)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not generate next class ID")); + goto cleanup; + } + + plug_ret = virNetDevBandwidthPlug(net->def->bridge, + net->def->bandwidth, + &iface->mac, + iface->bandwidth, + class_id); + if (plug_ret < 0) { + ignore_value(virNetDevBandwidthUnplug(net->def->bridge, class_id)); + goto cleanup; + } + + /* QoS was set, generate new class ID */ + iface->data.network.actual->class_id = class_id; + /* update sum of 'floor'-s of attached NICs */ + net->floor_sum += iface->bandwidth->in->floor; + /* update rate for non guaranteed NICs */ + new_rate -= net->floor_sum; + if (virNetDevBandwidthUpdateRate(net->def->bridge, "1:2", + net->def->bandwidth, new_rate) < 0) + VIR_WARN("Unable to update rate for 1:2 class on %s bridge", + net->def->bridge); + + ret = 0; + +cleanup: + return ret; +} + +static int +networkUnplugBandwidth(virNetworkObjPtr net, + virDomainNetDefPtr iface) +{ + int ret = 0; + unsigned long long new_rate; + + if (iface->data.network.actual && + iface->data.network.actual->class_id) { + /* we must remove class from bridge */ + new_rate = net->def->bandwidth->in->average; + + if (net->def->bandwidth->in->peak > 0) + new_rate = net->def->bandwidth->in->peak; + + ret = virNetDevBandwidthUnplug(net->def->bridge, + iface->data.network.actual->class_id); + if (ret < 0) + goto cleanup; + /* update sum of 'floor'-s of attached NICs */ + net->floor_sum -= iface->bandwidth->in->floor; + /* update rate for non guaranteed NICs */ + new_rate -= net->floor_sum; + if (virNetDevBandwidthUpdateRate(net->def->bridge, "1:2", + net->def->bandwidth, new_rate) < 0) + VIR_WARN("Unable to update rate for 1:2 class on %s bridge", + net->def->bridge); + /* no class is associated any longer */ + iface->data.network.actual->class_id = 0; + } + +cleanup: + return ret; +} -- 1.7.8.6

On 12/11/2012 11:09 AM, Michal Privoznik wrote:
Network should be notified if we plug in or unplug an interface, so it can perform some action, e.g. set/unset network part of QoS. However, we are doing this in very early stage, so iface->ifname isn't filled in yet. So whenever we want to report an error, we must use a different identifier, e.g. the MAC address. --- src/conf/domain_conf.h | 1 + src/conf/network_conf.c | 18 ++++ src/conf/network_conf.h | 4 + src/libvirt_private.syms | 3 + src/network/bridge_driver.c | 212 ++++++++++++++++++++++++++++++++++++++++++- 5 files changed, 236 insertions(+), 2 deletions(-)
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 7ad5377..e6659cd 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -809,6 +809,7 @@ struct _virDomainActualNetDef { virNetDevVPortProfilePtr virtPortProfile; virNetDevBandwidthPtr bandwidth; virNetDevVlan vlan; + unsigned int class_id; /* class ID for bandwidth 'floor' */ };
/* Stores the virtual network interface configuration */ diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index 3093418..ac326e1 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -47,6 +47,9 @@
#define MAX_BRIDGE_ID 256 #define VIR_FROM_THIS VIR_FROM_NETWORK +#define NEXT_FREE_CLASS_ID 3 +/* currently, /sbin/tc implementation allows up to 16 bits for minor class size */ +#define CLASS_ID_BITMAP_SIZE (1<<16)
VIR_ENUM_IMPL(virNetworkForward, VIR_NETWORK_FORWARD_LAST, @@ -319,10 +322,25 @@ virNetworkAssignDef(virNetworkObjListPtr nets, virNetworkObjLock(network); network->def = def;
+ if (!(network->class_id = virBitmapNew(CLASS_ID_BITMAP_SIZE))) { + virReportOOMError(); + goto error; + } + + /* The first three class IDs are already taken */ + ignore_value(virBitmapSetBit(network->class_id, 0)); + ignore_value(virBitmapSetBit(network->class_id, 1)); + ignore_value(virBitmapSetBit(network->class_id, 2)); + + network->def = def; nets->objs[nets->count] = network; nets->count++;
return network; +error: + virNetworkObjUnlock(network); + virNetworkObjFree(network); + return NULL;
}
diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h index 949b3d2..364372d 100644 --- a/src/conf/network_conf.h +++ b/src/conf/network_conf.h @@ -38,6 +38,7 @@ # include "virnetdevvlan.h" # include "virmacaddr.h" # include "device_conf.h" +# include "bitmap.h"
enum virNetworkForwardType { VIR_NETWORK_FORWARD_NONE = 0, @@ -226,6 +227,9 @@ struct _virNetworkObj {
virNetworkDefPtr def; /* The current definition */ virNetworkDefPtr newDef; /* New definition to activate at shutdown */ + + virBitmapPtr class_id; /* bitmap of class IDs for QoS */ + unsigned long long floor_sum; /* sum of all 'floor'-s of attached NICs */ };
typedef struct _virNetworkObjList virNetworkObjList; diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 499ab3b..6564676 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1506,7 +1506,10 @@ virNetDevBandwidthClear; virNetDevBandwidthCopy; virNetDevBandwidthEqual; virNetDevBandwidthFree; +virNetDevBandwidthPlug; virNetDevBandwidthSet; +virNetDevBandwidthUnplug; +virNetDevBandwidthUpdateRate;
Yeah, those are the changes I would have put with the earlier patch...
# virnetdevbridge.h diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 58f1d2e..0bee453 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -121,6 +121,11 @@ static int networkShutdownNetworkExternal(struct network_driver *driver, static void networkReloadIptablesRules(struct network_driver *driver); static void networkRefreshDaemons(struct network_driver *driver);
+static int networkPlugBandwidth(virNetworkObjPtr net, + virDomainNetDefPtr iface); +static int networkUnplugBandwidth(virNetworkObjPtr net, + virDomainNetDefPtr iface); + static struct network_driver *driverState = NULL;
static char * @@ -3491,6 +3496,7 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) enum virDomainNetType actualType = iface->type; virNetworkObjPtr network = NULL; virNetworkDefPtr netdef = NULL; + virNetDevBandwidthPtr bandwidth = NULL; virPortGroupDefPtr portgroup = NULL; virNetDevVPortProfilePtr virtport = iface->virtPortProfile; virNetDevVlanPtr vlan = NULL; @@ -3529,7 +3535,13 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) * (already in NetDef). Otherwise, if there is bandwidth info in * the portgroup, fill that into the ActualDef. */ - if (portgroup && !iface->bandwidth) { + + if (iface->bandwidth) + bandwidth = iface->bandwidth; + else if (portgroup && portgroup->bandwidth) + bandwidth = portgroup->bandwidth; + + if (bandwidth) { if (!iface->data.network.actual && (VIR_ALLOC(iface->data.network.actual) < 0)) { virReportOOMError(); @@ -3537,7 +3549,7 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) }
if (virNetDevBandwidthCopy(&iface->data.network.actual->bandwidth, - portgroup->bandwidth) < 0) + bandwidth) < 0) goto error; }
@@ -3550,6 +3562,10 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) */ if (iface->data.network.actual) iface->data.network.actual->type = VIR_DOMAIN_NET_TYPE_NETWORK; + + if (networkPlugBandwidth(network, iface) < 0) + goto error; + } else if ((netdef->forwardType == VIR_NETWORK_FORWARD_BRIDGE) && netdef->bridge) {
@@ -4061,6 +4077,12 @@ networkReleaseActualDevice(virDomainNetDefPtr iface) } netdef = network->def;
+ if ((netdef->forwardType == VIR_NETWORK_FORWARD_NONE || + netdef->forwardType == VIR_NETWORK_FORWARD_NAT || + netdef->forwardType == VIR_NETWORK_FORWARD_ROUTE) && + networkUnplugBandwidth(network, iface) < 0) + goto error; + if ((!iface->data.network.actual) || ((actualType != VIR_DOMAIN_NET_TYPE_DIRECT) && (actualType != VIR_DOMAIN_NET_TYPE_HOSTDEV))) { @@ -4264,3 +4286,189 @@ cleanup: error: goto cleanup; } + +/** + * networkCheckBandwidth: + * @net: network QoS + * @iface: interface QoS + * @new_rate: new rate for non guaranteed class + * + * Returns: -1 if plugging would overcommit network QoS + * 0 if plugging is safe (@new_rate updated) + * 1 if no QoS is set (@new_rate untouched) + */ +static int +networkCheckBandwidth(virNetworkObjPtr net, + virDomainNetDefPtr iface, + unsigned long long *new_rate) +{ + int ret = -1; + virNetDevBandwidthPtr netBand = net->def->bandwidth; + virNetDevBandwidthPtr ifaceBand = iface->bandwidth; + unsigned long long tmp_floor_sum = net->floor_sum; + unsigned long long tmp_new_rate = 0; + char ifmac[VIR_MAC_STRING_BUFLEN]; + + if (!ifaceBand || !ifaceBand->in || !ifaceBand->in->floor || + !netBand || !netBand->in) + return 1; + + virMacAddrFormat(&iface->mac, ifmac); + + tmp_new_rate = netBand->in->average; + tmp_floor_sum += ifaceBand->in->floor; + + /* check against peak */ + if (netBand->in->peak) { + tmp_new_rate = netBand->in->peak; + if (tmp_floor_sum > netBand->in->peak) { + virReportError(VIR_ERR_OPERATION_INVALID, + _("Cannot plug '%s' interface into '%s' because it " + "would overcommit 'peak' on network '%s'"), + ifmac, + net->def->bridge, + net->def->name); + goto cleanup; + } + } else if (tmp_floor_sum > netBand->in->average) { + /* tmp_floor_sum can be between 'average' and 'peak' iff 'peak' is set. + * Otherwise, tmp_floor_sum must be below 'average'. */ + virReportError(VIR_ERR_OPERATION_INVALID, + _("Cannot plug '%s' interface into '%s' because it " + "would overcommit 'average' on network '%s'"), + ifmac, + net->def->bridge, + net->def->name); + goto cleanup; + } + + *new_rate = tmp_new_rate; + ret = 0; + +cleanup: + return ret; +} + +/** + * networkNextClassID: + * @net: network object + * + * Find next free class ID. @net is supposed + * to be locked already. If there is a free ID, + * it is marked as used and returned. + * + * Returns next free class ID or -1 if none is available. + */ +static ssize_t +networkNextClassID(virNetworkObjPtr net) +{ + size_t ret = 0; + bool is_set = false; + + while (virBitmapGetBit(net->class_id, ret, &is_set) == 0 && is_set) + ret++; + + if (is_set || virBitmapSetBit(net->class_id, ret) < 0) + return -1; + + return ret; +} + +static int +networkPlugBandwidth(virNetworkObjPtr net, + virDomainNetDefPtr iface) +{ + int ret = -1; + int plug_ret; + unsigned long long new_rate = 0; + ssize_t class_id = 0; + char ifmac[VIR_MAC_STRING_BUFLEN]; + + if ((plug_ret = networkCheckBandwidth(net, iface, &new_rate)) < 0) { + /* helper reported error */ + goto cleanup; + } + + if (plug_ret > 0) { + /* no QoS needs to be set; claim success */ + ret = 0; + goto cleanup; + } + + virMacAddrFormat(&iface->mac, ifmac); + if (iface->type != VIR_DOMAIN_NET_TYPE_NETWORK || + !iface->data.network.actual) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Cannot set bandwidth on interface '%s' of type %d"), + ifmac, iface->type); + goto cleanup; + } + + /* generate new class_id */ + if ((class_id = networkNextClassID(net)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not generate next class ID")); + goto cleanup; + } + + plug_ret = virNetDevBandwidthPlug(net->def->bridge, + net->def->bandwidth, + &iface->mac, + iface->bandwidth, + class_id); + if (plug_ret < 0) { + ignore_value(virNetDevBandwidthUnplug(net->def->bridge, class_id)); + goto cleanup; + } + + /* QoS was set, generate new class ID */ + iface->data.network.actual->class_id = class_id; + /* update sum of 'floor'-s of attached NICs */ + net->floor_sum += iface->bandwidth->in->floor; + /* update rate for non guaranteed NICs */ + new_rate -= net->floor_sum; + if (virNetDevBandwidthUpdateRate(net->def->bridge, "1:2", + net->def->bandwidth, new_rate) < 0) + VIR_WARN("Unable to update rate for 1:2 class on %s bridge", + net->def->bridge); + + ret = 0; + +cleanup: + return ret; +} + +static int +networkUnplugBandwidth(virNetworkObjPtr net, + virDomainNetDefPtr iface) +{ + int ret = 0; + unsigned long long new_rate; + + if (iface->data.network.actual && + iface->data.network.actual->class_id) { + /* we must remove class from bridge */ + new_rate = net->def->bandwidth->in->average; + + if (net->def->bandwidth->in->peak > 0) + new_rate = net->def->bandwidth->in->peak; + + ret = virNetDevBandwidthUnplug(net->def->bridge, + iface->data.network.actual->class_id); + if (ret < 0) + goto cleanup; + /* update sum of 'floor'-s of attached NICs */ + net->floor_sum -= iface->bandwidth->in->floor; + /* update rate for non guaranteed NICs */ + new_rate -= net->floor_sum; + if (virNetDevBandwidthUpdateRate(net->def->bridge, "1:2", + net->def->bandwidth, new_rate) < 0) + VIR_WARN("Unable to update rate for 1:2 class on %s bridge", + net->def->bridge); + /* no class is associated any longer */ + iface->data.network.actual->class_id = 0; + } + +cleanup: + return ret; +}
ACK.

Currently, we are only keeping a inactive XML configuration in status dir. This is no longer enough as we need to keep this class_id attribute so we don't overwrite old entries when the daemon restarts. However, since there has already been release which has just <network/> as root element, and we want to keep things compatible, detect that loaded status file is older one, and don't scream about it. --- src/conf/network_conf.c | 207 ++++++++++++++++++++++++++++++++++--------- src/conf/network_conf.h | 2 + src/libvirt_private.syms | 1 + src/network/bridge_driver.c | 27 +++++-- 4 files changed, 190 insertions(+), 47 deletions(-) diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index ac326e1..29e3127 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -1695,6 +1695,79 @@ cleanup: return def; } +int +virNetworkObjUpdateParseFile(const char *filename, + virNetworkObjPtr net) +{ + int ret = -1; + xmlDocPtr xml = NULL; + xmlNodePtr node = NULL; + virNetworkDefPtr tmp = NULL; + xmlXPathContextPtr ctxt = NULL; + + xml = virXMLParse(filename, NULL, _("(network status)")); + if (!xml) + return -1; + + ctxt = xmlXPathNewContext(xml); + if (ctxt == NULL) { + virReportOOMError(); + goto cleanup; + } + + node = xmlDocGetRootElement(xml); + if (xmlStrEqual(node->name, BAD_CAST "networkstatus")) { + /* Newer network status file. Contains useful + * info which are not to be found in bare config XML */ + char *class_id = NULL; + char *floor_sum = NULL; + + ctxt->node = node; + class_id = virXPathString("string(./class_id[1]/@bitmap)", ctxt); + if (class_id && + virBitmapParse(class_id, ',', + &net->class_id, CLASS_ID_BITMAP_SIZE) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Malformed 'class_id' attribute: %s"), + class_id); + VIR_FREE(class_id); + goto cleanup; + } + VIR_FREE(class_id); + + floor_sum = virXPathString("string(./floor[1]/@sum)", ctxt); + if (floor_sum && + virStrToLong_ull(floor_sum, NULL, 10, &net->floor_sum) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Malformed 'floor_sum' attribute: %s"), + floor_sum); + VIR_FREE(floor_sum); + } + VIR_FREE(floor_sum); + } + + node = virXPathNode("//network", ctxt); + if (!node) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not find any 'network' element")); + goto cleanup; + } + + ctxt->node = node; + tmp = virNetworkDefParseXML(ctxt); + + if (tmp) { + net->newDef = net->def; + net->def = tmp; + } + + ret = 0; + +cleanup: + xmlXPathFreeContext(ctxt); + return ret; +} + static int virNetworkDNSDefFormat(virBufferPtr buf, virNetworkDNSDefPtr def) @@ -1873,26 +1946,28 @@ virPortGroupDefFormat(virBufferPtr buf, return 0; } -char *virNetworkDefFormat(const virNetworkDefPtr def, unsigned int flags) +static int +virNetworkDefFormatInternal(virBufferPtr buf, + const virNetworkDefPtr def, + unsigned int flags) { - virBuffer buf = VIR_BUFFER_INITIALIZER; unsigned char *uuid; char uuidstr[VIR_UUID_STRING_BUFLEN]; int ii; - virBufferAddLit(&buf, "<network"); + virBufferAddLit(buf, "<network"); if (!(flags & VIR_NETWORK_XML_INACTIVE) && (def->connections > 0)) { - virBufferAsprintf(&buf, " connections='%d'", def->connections); + virBufferAsprintf(buf, " connections='%d'", def->connections); } if (def->ipv6nogw) - virBufferAddLit(&buf, " ipv6='yes'"); - virBufferAddLit(&buf, ">\n"); - virBufferAdjustIndent(&buf, 2); - virBufferEscapeString(&buf, "<name>%s</name>\n", def->name); + virBufferAddLit(buf, " ipv6='yes'"); + virBufferAddLit(buf, ">\n"); + virBufferAdjustIndent(buf, 2); + virBufferEscapeString(buf, "<name>%s</name>\n", def->name); uuid = def->uuid; virUUIDFormat(uuid, uuidstr); - virBufferAsprintf(&buf, "<uuid>%s</uuid>\n", uuidstr); + virBufferAsprintf(buf, "<uuid>%s</uuid>\n", uuidstr); if (def->forwardType != VIR_NETWORK_FORWARD_NONE) { const char *dev = NULL; @@ -1906,40 +1981,40 @@ char *virNetworkDefFormat(const virNetworkDefPtr def, unsigned int flags) def->forwardType, def->name); goto error; } - virBufferAddLit(&buf, "<forward"); - virBufferEscapeString(&buf, " dev='%s'", dev); - virBufferAsprintf(&buf, " mode='%s'", mode); + virBufferAddLit(buf, "<forward"); + virBufferEscapeString(buf, " dev='%s'", dev); + virBufferAsprintf(buf, " mode='%s'", mode); if (def->forwardType == VIR_NETWORK_FORWARD_HOSTDEV) { if (def->managed == 1) - virBufferAddLit(&buf, " managed='yes'"); + virBufferAddLit(buf, " managed='yes'"); else - virBufferAddLit(&buf, " managed='no'"); + virBufferAddLit(buf, " managed='no'"); } - virBufferAsprintf(&buf, "%s>\n", + virBufferAsprintf(buf, "%s>\n", (def->nForwardIfs || def->nForwardPfs) ? "" : "/"); - virBufferAdjustIndent(&buf, 2); + virBufferAdjustIndent(buf, 2); /* For now, hard-coded to at most 1 forwardPfs */ if (def->nForwardPfs) - virBufferEscapeString(&buf, "<pf dev='%s'/>\n", + virBufferEscapeString(buf, "<pf dev='%s'/>\n", def->forwardPfs[0].dev); if (def->nForwardIfs && (!def->nForwardPfs || !(flags & VIR_NETWORK_XML_INACTIVE))) { for (ii = 0; ii < def->nForwardIfs; ii++) { if (def->forwardType != VIR_NETWORK_FORWARD_HOSTDEV) { - virBufferEscapeString(&buf, "<interface dev='%s'", + virBufferEscapeString(buf, "<interface dev='%s'", def->forwardIfs[ii].device.dev); if (!(flags & VIR_NETWORK_XML_INACTIVE) && (def->forwardIfs[ii].connections > 0)) { - virBufferAsprintf(&buf, " connections='%d'", + virBufferAsprintf(buf, " connections='%d'", def->forwardIfs[ii].connections); } - virBufferAddLit(&buf, "/>\n"); + virBufferAddLit(buf, "/>\n"); } else { if (def->forwardIfs[ii].type == VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_PCI) { - if (virDevicePCIAddressFormat(&buf, + if (virDevicePCIAddressFormat(buf, def->forwardIfs[ii].device.pci, true) < 0) goto error; @@ -1947,67 +2022,116 @@ char *virNetworkDefFormat(const virNetworkDefPtr def, unsigned int flags) } } } - virBufferAdjustIndent(&buf, -2); + virBufferAdjustIndent(buf, -2); if (def->nForwardPfs || def->nForwardIfs) - virBufferAddLit(&buf, "</forward>\n"); + virBufferAddLit(buf, "</forward>\n"); } if (def->forwardType == VIR_NETWORK_FORWARD_NONE || - def->forwardType == VIR_NETWORK_FORWARD_NAT || - def->forwardType == VIR_NETWORK_FORWARD_ROUTE) { + def->forwardType == VIR_NETWORK_FORWARD_NAT || + def->forwardType == VIR_NETWORK_FORWARD_ROUTE) { - virBufferAddLit(&buf, "<bridge"); + virBufferAddLit(buf, "<bridge"); if (def->bridge) - virBufferEscapeString(&buf, " name='%s'", def->bridge); - virBufferAsprintf(&buf, " stp='%s' delay='%ld' />\n", + virBufferEscapeString(buf, " name='%s'", def->bridge); + virBufferAsprintf(buf, " stp='%s' delay='%ld' />\n", def->stp ? "on" : "off", def->delay); } else if (def->forwardType == VIR_NETWORK_FORWARD_BRIDGE && def->bridge) { - virBufferEscapeString(&buf, "<bridge name='%s' />\n", def->bridge); + virBufferEscapeString(buf, "<bridge name='%s' />\n", def->bridge); } if (def->mac_specified) { char macaddr[VIR_MAC_STRING_BUFLEN]; virMacAddrFormat(&def->mac, macaddr); - virBufferAsprintf(&buf, "<mac address='%s'/>\n", macaddr); + virBufferAsprintf(buf, "<mac address='%s'/>\n", macaddr); } if (def->domain) - virBufferAsprintf(&buf, "<domain name='%s'/>\n", def->domain); + virBufferAsprintf(buf, "<domain name='%s'/>\n", def->domain); - if (virNetworkDNSDefFormat(&buf, def->dns) < 0) + if (virNetworkDNSDefFormat(buf, def->dns) < 0) goto error; - if (virNetDevVlanFormat(&def->vlan, &buf) < 0) + if (virNetDevVlanFormat(&def->vlan, buf) < 0) goto error; - if (virNetDevBandwidthFormat(def->bandwidth, &buf) < 0) + if (virNetDevBandwidthFormat(def->bandwidth, buf) < 0) goto error; for (ii = 0; ii < def->nips; ii++) { - if (virNetworkIpDefFormat(&buf, &def->ips[ii]) < 0) + if (virNetworkIpDefFormat(buf, &def->ips[ii]) < 0) goto error; } - if (virNetDevVPortProfileFormat(def->virtPortProfile, &buf) < 0) + if (virNetDevVPortProfileFormat(def->virtPortProfile, buf) < 0) goto error; for (ii = 0; ii < def->nPortGroups; ii++) - if (virPortGroupDefFormat(&buf, &def->portGroups[ii]) < 0) + if (virPortGroupDefFormat(buf, &def->portGroups[ii]) < 0) goto error; + virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</network>\n"); + + return 0; + +error: + return -1; +} + +char * +virNetworkDefFormat(virNetworkDefPtr def, + unsigned int flags) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + + if (virNetworkDefFormatInternal(&buf, def, flags) < 0) + goto error; + + if (virBufferError(&buf)) + goto no_memory; + + return virBufferContentAndReset(&buf); + +no_memory: + virReportOOMError(); +error: + virBufferFreeAndReset(&buf); + return NULL; +} + +static char * +virNetworkObjFormat(virNetworkObjPtr net, + unsigned int flags) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + char *class_id = virBitmapFormat(net->class_id); + + if (!class_id) + goto no_memory; + + virBufferAddLit(&buf, "<networkstatus>\n"); + virBufferAsprintf(&buf, " <class_id bitmap='%s'/>\n", class_id); + virBufferAsprintf(&buf, " <floor sum='%llu'/>\n", net->floor_sum); + VIR_FREE(class_id); + + virBufferAdjustIndent(&buf, 2); + if (virNetworkDefFormatInternal(&buf, net->def, flags) < 0) + goto error; + virBufferAdjustIndent(&buf, -2); - virBufferAddLit(&buf, "</network>\n"); + virBufferAddLit(&buf, "</networkstatus>"); if (virBufferError(&buf)) goto no_memory; return virBufferContentAndReset(&buf); - no_memory: +no_memory: virReportOOMError(); - error: +error: virBufferFreeAndReset(&buf); return NULL; } @@ -2079,9 +2203,10 @@ int virNetworkSaveStatus(const char *statusDir, virNetworkObjPtr network) { int ret = -1; + int flags = 0; char *xml; - if (!(xml = virNetworkDefFormat(network->def, 0))) + if (!(xml = virNetworkObjFormat(network, flags))) goto cleanup; if (virNetworkSaveXML(statusDir, network->def, xml)) diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h index 364372d..519f73d 100644 --- a/src/conf/network_conf.h +++ b/src/conf/network_conf.h @@ -276,6 +276,8 @@ virNetworkDefPtr virNetworkDefParseString(const char *xmlStr); virNetworkDefPtr virNetworkDefParseFile(const char *filename); virNetworkDefPtr virNetworkDefParseNode(xmlDocPtr xml, xmlNodePtr root); +int virNetworkObjUpdateParseFile(const char *filename, + virNetworkObjPtr net); char *virNetworkDefFormat(const virNetworkDefPtr def, unsigned int flags); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 6564676..ab44642 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -873,6 +873,7 @@ virNetworkObjSetDefTransient; virNetworkObjUnlock; virNetworkObjUnsetDefTransient; virNetworkObjUpdate; +virNetworkObjUpdateParseFile; virNetworkRemoveInactive; virNetworkSaveConfig; virNetworkSaveStatus; diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 0bee453..cad2192 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -242,7 +242,6 @@ networkFindActiveConfigs(struct network_driver *driver) { for (i = 0 ; i < driver->networks.count ; i++) { virNetworkObjPtr obj = driver->networks.objs[i]; - virNetworkDefPtr tmp; char *config; virNetworkObjLock(obj); @@ -260,12 +259,10 @@ networkFindActiveConfigs(struct network_driver *driver) { } /* Try and load the live config */ - tmp = virNetworkDefParseFile(config); + if (virNetworkObjUpdateParseFile(config, obj) < 0) + VIR_WARN("Unable to update config of '%s' network", + obj->def->name); VIR_FREE(config); - if (tmp) { - obj->newDef = obj->def; - obj->def = tmp; - } /* If bridge exists, then mark it active */ if (obj->def->bridge && @@ -4425,6 +4422,14 @@ networkPlugBandwidth(virNetworkObjPtr net, iface->data.network.actual->class_id = class_id; /* update sum of 'floor'-s of attached NICs */ net->floor_sum += iface->bandwidth->in->floor; + /* update status file */ + if (virNetworkSaveStatus(NETWORK_STATE_DIR, net) < 0) { + ignore_value(virBitmapClearBit(net->class_id, class_id)); + net->floor_sum -= iface->bandwidth->in->floor; + iface->data.network.actual->class_id = 0; + ignore_value(virNetDevBandwidthUnplug(net->def->bridge, class_id)); + goto cleanup; + } /* update rate for non guaranteed NICs */ new_rate -= net->floor_sum; if (virNetDevBandwidthUpdateRate(net->def->bridge, "1:2", @@ -4459,6 +4464,16 @@ networkUnplugBandwidth(virNetworkObjPtr net, goto cleanup; /* update sum of 'floor'-s of attached NICs */ net->floor_sum -= iface->bandwidth->in->floor; + /* return class ID */ + ignore_value(virBitmapClearBit(net->class_id, + iface->data.network.actual->class_id)); + /* update status file */ + if (virNetworkSaveStatus(NETWORK_STATE_DIR, net) < 0) { + net->floor_sum += iface->bandwidth->in->floor; + ignore_value(virBitmapSetBit(net->class_id, + iface->data.network.actual->class_id)); + goto cleanup; + } /* update rate for non guaranteed NICs */ new_rate -= net->floor_sum; if (virNetDevBandwidthUpdateRate(net->def->bridge, "1:2", -- 1.7.8.6

On 12/11/2012 11:09 AM, Michal Privoznik wrote:
Currently, we are only keeping a inactive XML configuration in status dir. This is no longer enough as we need to keep this class_id attribute so we don't overwrite old entries when the daemon restarts. However, since there has already been release which has just <network/> as root element, and we want to keep things compatible, detect that loaded status file is older one, and don't scream about it. --- src/conf/network_conf.c | 207 ++++++++++++++++++++++++++++++++++--------- src/conf/network_conf.h | 2 + src/libvirt_private.syms | 1 + src/network/bridge_driver.c | 27 +++++-- 4 files changed, 190 insertions(+), 47 deletions(-)
diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index ac326e1..29e3127 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -1695,6 +1695,79 @@ cleanup: return def; }
+int +virNetworkObjUpdateParseFile(const char *filename, + virNetworkObjPtr net) +{ + int ret = -1; + xmlDocPtr xml = NULL; + xmlNodePtr node = NULL; + virNetworkDefPtr tmp = NULL; + xmlXPathContextPtr ctxt = NULL; + + xml = virXMLParse(filename, NULL, _("(network status)")); + if (!xml) + return -1; + + ctxt = xmlXPathNewContext(xml); + if (ctxt == NULL) { + virReportOOMError(); + goto cleanup; + } + + node = xmlDocGetRootElement(xml); + if (xmlStrEqual(node->name, BAD_CAST "networkstatus")) { + /* Newer network status file. Contains useful + * info which are not to be found in bare config XML */ + char *class_id = NULL; + char *floor_sum = NULL; + + ctxt->node = node; + class_id = virXPathString("string(./class_id[1]/@bitmap)", ctxt); + if (class_id && + virBitmapParse(class_id, ',', + &net->class_id, CLASS_ID_BITMAP_SIZE) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Malformed 'class_id' attribute: %s"), + class_id); + VIR_FREE(class_id); + goto cleanup; + } + VIR_FREE(class_id); + + floor_sum = virXPathString("string(./floor[1]/@sum)", ctxt); + if (floor_sum && + virStrToLong_ull(floor_sum, NULL, 10, &net->floor_sum) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Malformed 'floor_sum' attribute: %s"), + floor_sum); + VIR_FREE(floor_sum); + } + VIR_FREE(floor_sum); + } + + node = virXPathNode("//network", ctxt); + if (!node) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not find any 'network' element")); + goto cleanup; + } + + ctxt->node = node; + tmp = virNetworkDefParseXML(ctxt); + + if (tmp) { + net->newDef = net->def; + net->def = tmp; + } + + ret = 0; + +cleanup: + xmlXPathFreeContext(ctxt); + return ret; +} + static int virNetworkDNSDefFormat(virBufferPtr buf, virNetworkDNSDefPtr def) @@ -1873,26 +1946,28 @@ virPortGroupDefFormat(virBufferPtr buf, return 0; }
-char *virNetworkDefFormat(const virNetworkDefPtr def, unsigned int flags) +static int +virNetworkDefFormatInternal(virBufferPtr buf, + const virNetworkDefPtr def, + unsigned int flags) { - virBuffer buf = VIR_BUFFER_INITIALIZER; unsigned char *uuid; char uuidstr[VIR_UUID_STRING_BUFLEN]; int ii;
- virBufferAddLit(&buf, "<network"); + virBufferAddLit(buf, "<network"); if (!(flags & VIR_NETWORK_XML_INACTIVE) && (def->connections > 0)) { - virBufferAsprintf(&buf, " connections='%d'", def->connections); + virBufferAsprintf(buf, " connections='%d'", def->connections); } if (def->ipv6nogw) - virBufferAddLit(&buf, " ipv6='yes'"); - virBufferAddLit(&buf, ">\n"); - virBufferAdjustIndent(&buf, 2); - virBufferEscapeString(&buf, "<name>%s</name>\n", def->name); + virBufferAddLit(buf, " ipv6='yes'"); + virBufferAddLit(buf, ">\n"); + virBufferAdjustIndent(buf, 2); + virBufferEscapeString(buf, "<name>%s</name>\n", def->name);
uuid = def->uuid; virUUIDFormat(uuid, uuidstr); - virBufferAsprintf(&buf, "<uuid>%s</uuid>\n", uuidstr); + virBufferAsprintf(buf, "<uuid>%s</uuid>\n", uuidstr);
if (def->forwardType != VIR_NETWORK_FORWARD_NONE) { const char *dev = NULL; @@ -1906,40 +1981,40 @@ char *virNetworkDefFormat(const virNetworkDefPtr def, unsigned int flags) def->forwardType, def->name); goto error; } - virBufferAddLit(&buf, "<forward"); - virBufferEscapeString(&buf, " dev='%s'", dev); - virBufferAsprintf(&buf, " mode='%s'", mode); + virBufferAddLit(buf, "<forward"); + virBufferEscapeString(buf, " dev='%s'", dev); + virBufferAsprintf(buf, " mode='%s'", mode); if (def->forwardType == VIR_NETWORK_FORWARD_HOSTDEV) { if (def->managed == 1) - virBufferAddLit(&buf, " managed='yes'"); + virBufferAddLit(buf, " managed='yes'"); else - virBufferAddLit(&buf, " managed='no'"); + virBufferAddLit(buf, " managed='no'"); } - virBufferAsprintf(&buf, "%s>\n", + virBufferAsprintf(buf, "%s>\n", (def->nForwardIfs || def->nForwardPfs) ? "" : "/"); - virBufferAdjustIndent(&buf, 2); + virBufferAdjustIndent(buf, 2);
/* For now, hard-coded to at most 1 forwardPfs */ if (def->nForwardPfs) - virBufferEscapeString(&buf, "<pf dev='%s'/>\n", + virBufferEscapeString(buf, "<pf dev='%s'/>\n", def->forwardPfs[0].dev);
if (def->nForwardIfs && (!def->nForwardPfs || !(flags & VIR_NETWORK_XML_INACTIVE))) { for (ii = 0; ii < def->nForwardIfs; ii++) { if (def->forwardType != VIR_NETWORK_FORWARD_HOSTDEV) { - virBufferEscapeString(&buf, "<interface dev='%s'", + virBufferEscapeString(buf, "<interface dev='%s'", def->forwardIfs[ii].device.dev); if (!(flags & VIR_NETWORK_XML_INACTIVE) && (def->forwardIfs[ii].connections > 0)) { - virBufferAsprintf(&buf, " connections='%d'", + virBufferAsprintf(buf, " connections='%d'", def->forwardIfs[ii].connections); } - virBufferAddLit(&buf, "/>\n"); + virBufferAddLit(buf, "/>\n"); } else { if (def->forwardIfs[ii].type == VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_PCI) { - if (virDevicePCIAddressFormat(&buf, + if (virDevicePCIAddressFormat(buf, def->forwardIfs[ii].device.pci, true) < 0) goto error; @@ -1947,67 +2022,116 @@ char *virNetworkDefFormat(const virNetworkDefPtr def, unsigned int flags) } } } - virBufferAdjustIndent(&buf, -2); + virBufferAdjustIndent(buf, -2); if (def->nForwardPfs || def->nForwardIfs) - virBufferAddLit(&buf, "</forward>\n"); + virBufferAddLit(buf, "</forward>\n"); }
if (def->forwardType == VIR_NETWORK_FORWARD_NONE || - def->forwardType == VIR_NETWORK_FORWARD_NAT || - def->forwardType == VIR_NETWORK_FORWARD_ROUTE) { + def->forwardType == VIR_NETWORK_FORWARD_NAT || + def->forwardType == VIR_NETWORK_FORWARD_ROUTE) {
- virBufferAddLit(&buf, "<bridge"); + virBufferAddLit(buf, "<bridge"); if (def->bridge) - virBufferEscapeString(&buf, " name='%s'", def->bridge); - virBufferAsprintf(&buf, " stp='%s' delay='%ld' />\n", + virBufferEscapeString(buf, " name='%s'", def->bridge); + virBufferAsprintf(buf, " stp='%s' delay='%ld' />\n", def->stp ? "on" : "off", def->delay); } else if (def->forwardType == VIR_NETWORK_FORWARD_BRIDGE && def->bridge) { - virBufferEscapeString(&buf, "<bridge name='%s' />\n", def->bridge); + virBufferEscapeString(buf, "<bridge name='%s' />\n", def->bridge); }
if (def->mac_specified) { char macaddr[VIR_MAC_STRING_BUFLEN]; virMacAddrFormat(&def->mac, macaddr); - virBufferAsprintf(&buf, "<mac address='%s'/>\n", macaddr); + virBufferAsprintf(buf, "<mac address='%s'/>\n", macaddr); }
if (def->domain) - virBufferAsprintf(&buf, "<domain name='%s'/>\n", def->domain); + virBufferAsprintf(buf, "<domain name='%s'/>\n", def->domain);
- if (virNetworkDNSDefFormat(&buf, def->dns) < 0) + if (virNetworkDNSDefFormat(buf, def->dns) < 0) goto error;
- if (virNetDevVlanFormat(&def->vlan, &buf) < 0) + if (virNetDevVlanFormat(&def->vlan, buf) < 0) goto error; - if (virNetDevBandwidthFormat(def->bandwidth, &buf) < 0) + if (virNetDevBandwidthFormat(def->bandwidth, buf) < 0) goto error;
for (ii = 0; ii < def->nips; ii++) { - if (virNetworkIpDefFormat(&buf, &def->ips[ii]) < 0) + if (virNetworkIpDefFormat(buf, &def->ips[ii]) < 0) goto error; }
- if (virNetDevVPortProfileFormat(def->virtPortProfile, &buf) < 0) + if (virNetDevVPortProfileFormat(def->virtPortProfile, buf) < 0) goto error;
for (ii = 0; ii < def->nPortGroups; ii++) - if (virPortGroupDefFormat(&buf, &def->portGroups[ii]) < 0) + if (virPortGroupDefFormat(buf, &def->portGroups[ii]) < 0) goto error;
+ virBufferAdjustIndent(buf, -2); + virBufferAddLit(buf, "</network>\n"); + + return 0; + +error: + return -1; +} + +char * +virNetworkDefFormat(virNetworkDefPtr def, + unsigned int flags) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + + if (virNetworkDefFormatInternal(&buf, def, flags) < 0) + goto error; + + if (virBufferError(&buf)) + goto no_memory; + + return virBufferContentAndReset(&buf); + +no_memory: + virReportOOMError(); +error: + virBufferFreeAndReset(&buf); + return NULL; +} + +static char * +virNetworkObjFormat(virNetworkObjPtr net, + unsigned int flags) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + char *class_id = virBitmapFormat(net->class_id); + + if (!class_id) + goto no_memory; + + virBufferAddLit(&buf, "<networkstatus>\n"); + virBufferAsprintf(&buf, " <class_id bitmap='%s'/>\n", class_id); + virBufferAsprintf(&buf, " <floor sum='%llu'/>\n", net->floor_sum); + VIR_FREE(class_id); + + virBufferAdjustIndent(&buf, 2); + if (virNetworkDefFormatInternal(&buf, net->def, flags) < 0) + goto error; + virBufferAdjustIndent(&buf, -2); - virBufferAddLit(&buf, "</network>\n"); + virBufferAddLit(&buf, "</networkstatus>");
if (virBufferError(&buf)) goto no_memory;
return virBufferContentAndReset(&buf);
- no_memory: +no_memory: virReportOOMError(); - error: +error: virBufferFreeAndReset(&buf); return NULL; } @@ -2079,9 +2203,10 @@ int virNetworkSaveStatus(const char *statusDir, virNetworkObjPtr network) { int ret = -1; + int flags = 0; char *xml;
- if (!(xml = virNetworkDefFormat(network->def, 0))) + if (!(xml = virNetworkObjFormat(network, flags))) goto cleanup;
if (virNetworkSaveXML(statusDir, network->def, xml)) diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h index 364372d..519f73d 100644 --- a/src/conf/network_conf.h +++ b/src/conf/network_conf.h @@ -276,6 +276,8 @@ virNetworkDefPtr virNetworkDefParseString(const char *xmlStr); virNetworkDefPtr virNetworkDefParseFile(const char *filename); virNetworkDefPtr virNetworkDefParseNode(xmlDocPtr xml, xmlNodePtr root); +int virNetworkObjUpdateParseFile(const char *filename, + virNetworkObjPtr net);
char *virNetworkDefFormat(const virNetworkDefPtr def, unsigned int flags);
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 6564676..ab44642 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -873,6 +873,7 @@ virNetworkObjSetDefTransient; virNetworkObjUnlock; virNetworkObjUnsetDefTransient; virNetworkObjUpdate; +virNetworkObjUpdateParseFile; virNetworkRemoveInactive; virNetworkSaveConfig; virNetworkSaveStatus; diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 0bee453..cad2192 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -242,7 +242,6 @@ networkFindActiveConfigs(struct network_driver *driver) {
for (i = 0 ; i < driver->networks.count ; i++) { virNetworkObjPtr obj = driver->networks.objs[i]; - virNetworkDefPtr tmp; char *config;
virNetworkObjLock(obj); @@ -260,12 +259,10 @@ networkFindActiveConfigs(struct network_driver *driver) { }
/* Try and load the live config */ - tmp = virNetworkDefParseFile(config); + if (virNetworkObjUpdateParseFile(config, obj) < 0) + VIR_WARN("Unable to update config of '%s' network", + obj->def->name); VIR_FREE(config); - if (tmp) { - obj->newDef = obj->def; - obj->def = tmp; - }
/* If bridge exists, then mark it active */ if (obj->def->bridge && @@ -4425,6 +4422,14 @@ networkPlugBandwidth(virNetworkObjPtr net, iface->data.network.actual->class_id = class_id; /* update sum of 'floor'-s of attached NICs */ net->floor_sum += iface->bandwidth->in->floor; + /* update status file */ + if (virNetworkSaveStatus(NETWORK_STATE_DIR, net) < 0) { + ignore_value(virBitmapClearBit(net->class_id, class_id)); + net->floor_sum -= iface->bandwidth->in->floor; + iface->data.network.actual->class_id = 0; + ignore_value(virNetDevBandwidthUnplug(net->def->bridge, class_id)); + goto cleanup; + } /* update rate for non guaranteed NICs */ new_rate -= net->floor_sum; if (virNetDevBandwidthUpdateRate(net->def->bridge, "1:2", @@ -4459,6 +4464,16 @@ networkUnplugBandwidth(virNetworkObjPtr net, goto cleanup; /* update sum of 'floor'-s of attached NICs */ net->floor_sum -= iface->bandwidth->in->floor; + /* return class ID */ + ignore_value(virBitmapClearBit(net->class_id, + iface->data.network.actual->class_id)); + /* update status file */ + if (virNetworkSaveStatus(NETWORK_STATE_DIR, net) < 0) { + net->floor_sum += iface->bandwidth->in->floor; + ignore_value(virBitmapSetBit(net->class_id, + iface->data.network.actual->class_id)); + goto cleanup; + } /* update rate for non guaranteed NICs */ new_rate -= net->floor_sum; if (virNetDevBandwidthUpdateRate(net->def->bridge, "1:2",
ACK.

Interfaces keeps a class_id, which is an ID from which bridge part of QoS settings is derived. We need to store class_id in domain status file, so we can later pass it to virNetDevBandwidthUnplug. --- src/conf/domain_conf.c | 13 +++++++++++++ 1 files changed, 13 insertions(+), 0 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 18e65ca..f5e2f71 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4819,6 +4819,17 @@ virDomainActualNetDefParseXML(xmlNodePtr node, hostdev, flags) < 0) { goto error; } + } else if (actual->type == VIR_DOMAIN_NET_TYPE_NETWORK) { + char *class_id = virXPathString("string(./class/@id)", ctxt); + if (class_id && + virStrToLong_ui(class_id, NULL, 10, &actual->class_id) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to parse class id '%s'"), + class_id); + VIR_FREE(class_id); + goto error; + } + VIR_FREE(class_id); } bandwidth_node = virXPathNode("./bandwidth", ctxt); @@ -12511,6 +12522,8 @@ virDomainActualNetDefFormat(virBufferPtr buf, break; case VIR_DOMAIN_NET_TYPE_NETWORK: + if (def->class_id) + virBufferAsprintf(buf, "<class id='%u'/>", def->class_id); break; default: virReportError(VIR_ERR_INTERNAL_ERROR, -- 1.7.8.6

On 12/11/2012 11:09 AM, Michal Privoznik wrote:
Interfaces keeps a class_id, which is an ID from which bridge part of QoS settings is derived. We need to store class_id in domain status file, so we can later pass it to virNetDevBandwidthUnplug. --- src/conf/domain_conf.c | 13 +++++++++++++ 1 files changed, 13 insertions(+), 0 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 18e65ca..f5e2f71 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4819,6 +4819,17 @@ virDomainActualNetDefParseXML(xmlNodePtr node, hostdev, flags) < 0) { goto error; } + } else if (actual->type == VIR_DOMAIN_NET_TYPE_NETWORK) { + char *class_id = virXPathString("string(./class/@id)", ctxt); + if (class_id && + virStrToLong_ui(class_id, NULL, 10, &actual->class_id) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to parse class id '%s'"), + class_id); + VIR_FREE(class_id); + goto error; + } + VIR_FREE(class_id); }
bandwidth_node = virXPathNode("./bandwidth", ctxt); @@ -12511,6 +12522,8 @@ virDomainActualNetDefFormat(virBufferPtr buf, break;
case VIR_DOMAIN_NET_TYPE_NETWORK: + if (def->class_id) + virBufferAsprintf(buf, "<class id='%u'/>", def->class_id); break; default: virReportError(VIR_ERR_INTERNAL_ERROR,
ACK.

On 11.12.2012 17:38, Laine Stump wrote:
On 12/11/2012 11:09 AM, Michal Privoznik wrote:
Interfaces keeps a class_id, which is an ID from which bridge part of QoS settings is derived. We need to store class_id in domain status file, so we can later pass it to virNetDevBandwidthUnplug. --- src/conf/domain_conf.c | 13 +++++++++++++ 1 files changed, 13 insertions(+), 0 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 18e65ca..f5e2f71 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4819,6 +4819,17 @@ virDomainActualNetDefParseXML(xmlNodePtr node, hostdev, flags) < 0) { goto error; } + } else if (actual->type == VIR_DOMAIN_NET_TYPE_NETWORK) { + char *class_id = virXPathString("string(./class/@id)", ctxt); + if (class_id && + virStrToLong_ui(class_id, NULL, 10, &actual->class_id) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unable to parse class id '%s'"), + class_id); + VIR_FREE(class_id); + goto error; + } + VIR_FREE(class_id); }
bandwidth_node = virXPathNode("./bandwidth", ctxt); @@ -12511,6 +12522,8 @@ virDomainActualNetDefFormat(virBufferPtr buf, break;
case VIR_DOMAIN_NET_TYPE_NETWORK: + if (def->class_id) + virBufferAsprintf(buf, "<class id='%u'/>", def->class_id); break; default: virReportError(VIR_ERR_INTERNAL_ERROR,
ACK.
Thanks, I've pushed the patches now. Michal
participants (2)
-
Laine Stump
-
Michal Privoznik