[libvirt] 802.1q vlan-tagging support for libvirt

Are there any plans to support configuring 802.1q VLAN tagging through libvirt? My understanding is that libvirt would use vconfig to create tagged interfaces, while using a physical interface as trunk (e.g., eth0 is the trunk, eth0.20 the interface with the '20' vlan tag). Then it would add the tagged interface (eth0.20) to a bridge with the guest virtual interface.

I don't know how libvirt would need to be extended to be able to properly 'manage' those 802.1q-enabled virtual switches. Another question would be how libvirt-CIM would support this.

Is anyone interested/working on this?

Thanks,
-Klaus

--
Klaus Heinrich Kiwi <klausk@linux.vnet.ibm.com>
Linux Security Development, IBM Linux Technology Center

On Thu, Apr 02, 2009 at 03:50:32PM -0300, Klaus Heinrich Kiwi wrote:
Are there any plans to support configuring 802.1q VLAN tagging through libvirt?
My understanding is that libvirt would use vconfig to create tagged interfaces, while using a physical interface as trunk (e.g., eth0 is the trunk, eth0.20 the interface with the '20' vlan tag).
Yes, this type of configuration is within the scope of the network interface management APIs that Laine & David Lutterkort are currently working on. The underlying config will be done by the netcf library.
I don't know how libvirt would need to be extended to be able to properly 'manage' those 802.1q-enabled virtual switches. Another question would be how libvirt-CIM would support this.
Once libvirt provides the network interface functionality, then the libvirt-CIM developers can map it into the appropriate DMTF schema.

Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

I don't know how libvirt would need to be extended to be able to properly 'manage' those 802.1q-enabled virtual switches. Another question would be how libvirt-CIM would support this.
Once libvirt provides the network interface functionality, then the libvirt-CIM developers can map it into the appropriate DMTF schema
Right - the support for VLANs would be similar to how libvirt-cim represents the virtual network pools.

--
Kaitlin Rupert
IBM Linux Technology Center
kaitlin@linux.vnet.ibm.com

Hi Klaus,

On Thu, 2009-04-02 at 15:50 -0300, Klaus Heinrich Kiwi wrote:
My understanding is that libvirt would use vconfig to create tagged interfaces, while using a physical interface as trunk (e.g., eth0 is the trunk, eth0.20 the interface with the '20' vlan tag).
Then it would add the tagged interface (eth0.20) to a bridge with the guest virtual interface.
as Dan said, the actual functionality will be provided by netcf[1]. VLAN is very high on the list of things that need to be done next - and any help with that would be much appreciated ;)

David

[1] https://fedorahosted.org/netcf/
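
For illustration, a minimal sketch of how a client program might drive such a host-interface API once it exists is shown below. The function names (virInterfaceDefineXML, virInterfaceCreate) and the netcf-style interface XML layout were still being designed at the time of this thread and are only assumptions here, not the final API.

/*
 * Sketch: define and start a bridge that enslaves the 802.1q-tagged
 * sub-interface eth0.20 (VLAN 20 on trunk eth0). Names and XML layout
 * are illustrative assumptions.
 */
#include <stdio.h>
#include <libvirt/libvirt.h>

int main(void)
{
    const char *xml =
        "<interface type='bridge' name='br20'>\n"
        "  <start mode='onboot'/>\n"
        "  <bridge>\n"
        "    <interface type='vlan' name='eth0.20'>\n"
        "      <vlan tag='20'>\n"
        "        <interface name='eth0'/>\n"
        "      </vlan>\n"
        "    </interface>\n"
        "  </bridge>\n"
        "</interface>";

    virConnectPtr conn = virConnectOpen(NULL);
    if (conn == NULL) {
        fprintf(stderr, "failed to connect to the hypervisor\n");
        return 1;
    }

    virInterfacePtr iface = virInterfaceDefineXML(conn, xml, 0);
    if (iface == NULL) {
        fprintf(stderr, "failed to define interface\n");
        virConnectClose(conn);
        return 1;
    }

    if (virInterfaceCreate(iface, 0) < 0)   /* bring br20 and eth0.20 up */
        fprintf(stderr, "failed to start interface\n");

    virInterfaceFree(iface);
    virConnectClose(conn);
    return 0;
}

The idea would be that defining and starting br20 creates the eth0.20 tagged sub-interface and enslaves it to the bridge in one step, so that guest <interface type='bridge'> definitions only need to reference br20.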

On Thu, 2009-04-02 at 16:34 -0700, David Lutterkort wrote:
Hi Klaus,
On Thu, 2009-04-02 at 15:50 -0300, Klaus Heinrich Kiwi wrote:
My understanding is that libvirt would use vconfig to create tagged interfaces, while using a physical interface as trunk (e.g., eth0 is the trunk, eth0.20 the interface with the '20' vlan tag).
Then it would add the tagged interface (eth0.20) to a bridge with the guest virtual interface.
I was thinking about the semantics I described above. It ultimately means that we'll have a bridge for each VLAN tag that crosses the trunk interface. So for example if guests A, B and C are all associated with VLAN ID 20, then:

eth0 -> eth0.20 -> br0 -> [tap0, tap1, tap2]

(where tap[0-2] are associated with guests A, B, C respectively)

Adding a new guest to a VLAN would mean searching the host system to check whether there is already a bridge running on that VLAN ID, then creating a new one if needed or using the existing one.

The things that concern me the most are:
1) How scalable this really is
2) The semantics are really different from how physical, 802.1q-enabled switches would work.

Because (2) really creates new switches for each new VLAN tag, I wonder how management would be different from what we have today with physical switches (i.e., defining a port with a VLAN ID, assigning that port to a physical machine) - unless we hide it behind libvirt somehow. Still, things like SNMP management can have issues with such an approach (SNMP-based network managers would go crazy identifying several switches/bridges per box - not really useful from a management PoV).

Are there other options? Since a tagged interface like eth0.20 is kind of a virtual interface itself, would it be appropriate to use those directly? Or maybe extend the existing bridging code to be 802.1q-aware?

Thanks,
-Klaus
as Dan said, the actual functionality will be provided by netcf[1]
VLAN is very high on the list of things that need to be done next - and any help with that would be much appreciated ;)
David
--
Klaus Heinrich Kiwi <klausk@linux.vnet.ibm.com>
Linux Security Development, IBM Linux Technology Center
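
A minimal sketch of the "look for an existing bridge for that VLAN ID, otherwise create it" step described above, done by shelling out to vconfig and brctl. This is illustrative only: the brvlan<ID> naming convention and the use of system() are assumptions for the example, and a real implementation would drive this through netcf rather than external commands.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static int ensure_vlan_bridge(const char *trunk, int vlan_id)
{
    char path[128], cmd[256];

    /* An existing bridge shows up as /sys/class/net/<name>/bridge. */
    snprintf(path, sizeof(path), "/sys/class/net/brvlan%d/bridge", vlan_id);
    if (access(path, F_OK) == 0)
        return 0;                      /* reuse the existing bridge */

    /* Create the tagged sub-interface, e.g. eth0.20, on the trunk. */
    snprintf(cmd, sizeof(cmd), "vconfig add %s %d", trunk, vlan_id);
    if (system(cmd) != 0)
        return -1;

    /* Create the per-VLAN bridge and enslave the tagged interface. */
    snprintf(cmd, sizeof(cmd), "brctl addbr brvlan%d", vlan_id);
    if (system(cmd) != 0)
        return -1;
    snprintf(cmd, sizeof(cmd), "brctl addif brvlan%d %s.%d", vlan_id, trunk, vlan_id);
    if (system(cmd) != 0)
        return -1;

    return 0;
}

int main(void)
{
    /* Guests A, B and C on VLAN 20 would all get their tap devices
     * attached to brvlan20, giving eth0 -> eth0.20 -> brvlan20 -> tapN. */
    return ensure_vlan_bridge("eth0", 20) == 0 ? 0 : 1;
}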

On Tue, Apr 07, 2009 at 06:39:17PM -0300, Klaus Heinrich Kiwi wrote:
I was thinking about the semantics I described above. It ultimately
I just caught this discussion. We have implemented basic vlan support (for xend, bridged only). Is there an existing patch from you guys for the user interface? Below is our current libvirt patch.

regards
john

Support setting of vlan ID for network interfaces

Signed-off-by: John Levon <john.levon@sun.com>
Signed-off-by: Max Zhen <max.zhen@sun.com>

diff --git a/docs/schemas/domain.rng b/docs/schemas/domain.rng
--- a/docs/schemas/domain.rng
+++ b/docs/schemas/domain.rng
@@ -692,6 +692,13 @@
         </optional>
       </element>
     </optional>
+    <optional>
+      <element name='vlan'>
+        <attribute name='id'>
+          <ref name='unsignedInt'/>
+        </attribute>
+      </element>
+    </optional>
   </interleave>
 </define>

diff --git a/src/domain_conf.c b/src/domain_conf.c
--- a/src/domain_conf.c
+++ b/src/domain_conf.c
@@ -918,6 +918,7 @@ virDomainNetDefParseXML(virConnectPtr co
     char *address = NULL;
     char *port = NULL;
     char *model = NULL;
+    char *vlanid = NULL;

     if (VIR_ALLOC(def) < 0) {
         virReportOOMError(conn);
@@ -983,6 +984,8 @@ virDomainNetDefParseXML(virConnectPtr co
                 model = virXMLPropString(cur, "type");
             } else if (xmlStrEqual (cur->name, BAD_CAST "networkresource")) {
                 virDomainNetDefParseXMLRate(def, cur);
+            } else if (xmlStrEqual (cur->name, BAD_CAST "vlan")) {
+                vlanid = virXMLPropString(cur, "id");
             }
         }
         cur = cur->next;
@@ -1093,6 +1096,16 @@ virDomainNetDefParseXML(virConnectPtr co
         model = NULL;
     }

+    def->vlanid = 0;
+
+    if (vlanid != NULL) {
+        if (virStrToLong_i(vlanid, NULL, 10, &def->vlanid) < 0) {
+            virDomainReportError(conn, VIR_ERR_INTERNAL_ERROR, "%s",
+                                 _("Cannot parse vlan ID"));
+            goto error;
+        }
+    }
+
 cleanup:
     VIR_FREE(macaddr);
     VIR_FREE(network);
@@ -1104,6 +1117,7 @@ cleanup:
     VIR_FREE(bridge);
     VIR_FREE(model);
     VIR_FREE(type);
+    VIR_FREE(vlanid);

     return def;

@@ -3035,6 +3049,9 @@ virDomainNetDefFormat(virConnectPtr conn
                           unit, period, def->rate.value);
     }

+    if (def->vlanid)
+        virBufferVSprintf(buf, " <vlan id='%d' />\n", def->vlanid);
+
     virBufferAddLit(buf, " </interface>\n");

     return 0;

diff --git a/src/domain_conf.h b/src/domain_conf.h
--- a/src/domain_conf.h
+++ b/src/domain_conf.h
@@ -186,6 +186,7 @@ struct _virDomainNetDef {
         int period;
         long value;
     } rate;
+    int vlanid;
 };

 enum virDomainChrSrcType {

diff --git a/src/virsh.c b/src/virsh.c
--- a/src/virsh.c
+++ b/src/virsh.c
@@ -4762,6 +4762,7 @@ static const vshCmdOptDef opts_attach_in
     {"mac", VSH_OT_DATA, 0, gettext_noop("MAC address")},
     {"script", VSH_OT_DATA, 0, gettext_noop("script used to bridge network interface")},
     {"capped-bandwidth", VSH_OT_STRING, 0, gettext_noop("bandwidth limit for this interface")},
+    {"vlanid", VSH_OT_INT, 0, gettext_noop("VLAN ID attached to this interface")},
     {NULL, 0, 0, NULL}
 };

@@ -4769,7 +4770,7 @@ cmdAttachInterface(vshControl *ctl, cons
 cmdAttachInterface(vshControl *ctl, const vshCmd *cmd)
 {
     virDomainPtr dom = NULL;
-    char *mac, *target, *script, *type, *source, *bw;
+    char *mac, *target, *script, *type, *source, *bw, *vlanid;
     int typ, ret = FALSE;
     char *buf = NULL, *tmp = NULL;

@@ -4787,6 +4788,7 @@ cmdAttachInterface(vshControl *ctl, cons
     mac = vshCommandOptString(cmd, "mac", NULL);
     script = vshCommandOptString(cmd, "script", NULL);
     bw = vshCommandOptString(cmd, "capped-bandwidth", NULL);
+    vlanid = vshCommandOptString(cmd, "vlanid", NULL);

     /* check interface type */
     if (STREQ(type, "network")) {
@@ -4866,6 +4868,21 @@ cmdAttachInterface(vshControl *ctl, cons
         tmp = vshRealloc(ctl, tmp, strlen(unit) + strlen(bw) + strlen(format));
         if (!tmp) goto cleanup;
         sprintf(tmp, format, unit, r);
+        buf = vshRealloc(ctl, buf, strlen(buf) + strlen(tmp) + 1);
+        if (!buf) goto cleanup;
+        strcat(buf, tmp);
+    }
+
+    if (vlanid != NULL) {
+        char *left;
+        long r = strtol(vlanid, &left, 10);
+        if ((r <= 0) || (r >= 4095) || (*left != '\0')) {
+            vshError(ctl, FALSE, _("Bad VLAN ID: %s in command 'attach-interface'"), vlanid);
+            goto cleanup;
+        }
+        tmp = vshRealloc(ctl, tmp, strlen(vlanid) + 20);
+        if (!tmp) goto cleanup;
+        sprintf(tmp, " <vlan id='%s'/>\n", vlanid);
         buf = vshRealloc(ctl, buf, strlen(buf) + strlen(tmp) + 1);
         if (!buf) goto cleanup;
         strcat(buf, tmp);

diff --git a/src/xend_internal.c b/src/xend_internal.c
--- a/src/xend_internal.c
+++ b/src/xend_internal.c
@@ -1896,7 +1896,10 @@ xend_parse_rate(const char *ratestr, int
     if (sscanf(ratestr, "%lu,%lu", &amount, &interval) == 2) {
         *unit = VIR_DOMAIN_NET_RATE_KB;
         *period = VIR_DOMAIN_NET_PERIOD_S;
-        *val = (amount * 8) / (interval / 1000000);
+        /* bytes to kilobits */
+        *val = (amount / 125);
+        /* factor in period interval */
+        *val /= (interval / 1000000);
         return 0;
     }

@@ -2046,6 +2049,15 @@ xenDaemonParseSxprNets(virConnectPtr con
                              _("ignoring malformed rate limit '%s'"), tmp);
             } else {
                 net->rate.enabled = 1;
+            }
+        }
+
+        tmp = sexpr_node(node, "device/vif/vlanid");
+        if (tmp) {
+            if (virStrToLong_i(tmp, NULL, 0, &net->vlanid) != 0) {
+                virXendError(conn, VIR_ERR_INTERNAL_ERROR,
+                             _("malformed vlanid '%s'"), tmp);
+                goto cleanup;
             }
         }

@@ -5553,6 +5565,9 @@ xenDaemonFormatSxprNet(virConnectPtr con
                           unit, period);
     }

+    if (def->vlanid)
+        virBufferVSprintf(buf, "(vlanid '%d')", def->vlanid);
+
     /*
      * apparently (type ioemu) breaks paravirt drivers on HVM so skip this
      * from Xen 3.1.0
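
For illustration, a minimal sketch of what the patch above enables from the API side: attaching an interface whose XML carries the proposed <vlan id='...'/> element, roughly the document that "virsh attach-interface ... --vlanid 20" would build. The domain name, bridge name and connection URI are placeholders.

#include <stdio.h>
#include <libvirt/libvirt.h>

int main(void)
{
    /* Interface XML with the proposed vlan element from the patch. */
    const char *xml =
        "<interface type='bridge'>\n"
        "  <source bridge='xenbr0'/>\n"
        "  <vlan id='20'/>\n"
        "</interface>";

    virConnectPtr conn = virConnectOpen(NULL);
    if (conn == NULL)
        return 1;

    virDomainPtr dom = virDomainLookupByName(conn, "guestA");
    if (dom != NULL) {
        if (virDomainAttachDevice(dom, xml) < 0)
            fprintf(stderr, "failed to attach VLAN-tagged interface\n");
        virDomainFree(dom);
    }

    virConnectClose(conn);
    return 0;
}

On the xend driver the patch then serializes this as a (vlanid '20') node in the vif S-expression, as shown in xenDaemonFormatSxprNet above.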

On Tue, 2009-04-07 at 18:39 -0300, Klaus Heinrich Kiwi wrote:
I was thinking about the semantics I described above. It ultimately means that we'll have a bridge for each VLAN tag that crosses the trunk interface. So for example if guests A, B and C are all associated with VLAN ID 20, then:
eth0 -> eth0.20 -> br0 -> [tap0, tap1, tap2]
(where tap[0-2] are associated with guests A, B, C respectively)
Yes, I think that's how it should work; it would also mean that you'd first set up eth0 as a separate interface, and new bridge/vlan interface combos afterwards. AFAIK, for the bridge, only bootproto=none would make sense.
The things that concern me the most are: 1) How scalable this really is
I don't know either ... we'll find out ;)
2) The semantics are really different from how physical, 802.1q-enabled switches would work.
Because (2) really creates new switches for each new VLAN tag, I wonder how management would be different from what we have today with physical switches (i.e., defining a port with a VLAN ID, assigning that port to a physical machine) - unless we hide it behind libvirt somehow.
The reason we are creating all those bridges isn't the VLANs - it's that we want to share the same physical interface amongst several guests. And I don't know of another way to do that.
Are there other options? Since a tagged interface like eth0.20 is kind of a virtual interface itself, would it be appropriate to use those directly?
You can use it directly; I just don't know how else you would share it amongst VMs without a bridge.

David

On Tue, Apr 07, 2009 at 06:32:43PM -0700, David Lutterkort wrote:
On Tue, 2009-04-07 at 18:39 -0300, Klaus Heinrich Kiwi wrote:
I was thinking about the semantics I described above. It ultimately means that we'll have a bridge for each VLAN tag that crosses the trunk interface. So for example if guests A, B and C are all associated with VLAN ID 20, then:
eth0 -> eth0.20 -> br0 -> [tap0, tap1, tap2]
(where tap[0-2] are associated with guests A, B, C respectively)
Yes, I think that's how it should work; it would also mean that you'd first set up eth0 as a separate interface, and new bridge/vlan interface combos afterwards. AFAIK, for the bridge, only bootproto=none would make sense.
The things that concern me the most are: 1) How scalable this really is
I don't know either ... we'll find out ;)
I don't think that's really a scalability problem from libvirt's POV. I know people use this setup quite widely already, even with plain ifcfg-XXX scripts. Any scalability problems most likely fall into the kernel / networking code and whether it is good at avoiding unnecessary data copies when you have stacked NIC -> VLAN -> BRIDGE -> TAP.
2) The semantics are really different from how physical, 802.1q-enabled switches would work.
Because (2) really creates new switches for each new VLAN tag, I wonder how management would be different from what we have today with physical switches (i.e., defining a port with a VLAN ID, assigning that port to a physical machine) - unless we hide it behind libvirt somehow.
I think one thing to consider is the difference between the physical and logical models. The libvirt API / representation here is fairly low level, dealing in individual NICs. I think management apps would likely want to present this in a slightly different way, dealing more in logical entities than physical NICs. E.g. oVirt's network model is closer to the one you describe, where the user defines a new switch for each VLAN tag. It then maps this into the low-level physical model of individual NICs as needed. I think it is important that libvirt use the physical model here to give apps flexibility in how they expose it to users.
The reason we are creating all those bridges isn't the VLANs - it's that we want to share the same physical interface amongst several guests. And I don't know of another way to do that.
Are there other options? Since a tagged interface like eth0.20 is kind of a virtual interface itself, would it be appropriate to use those directly?
You can use it directly; I just don't know how else you would share it amongst VMs without a bridge.
In the (nearish) future NICs will start appearing with SR-IOV capabilities. This gives you one physical PCI device, which exposes multiple functions. So a single physical NIC appears as 8 NICs to the OS. You can thus assign each of these virtual NICs to a different VM directly, avoiding the need to bridge them. I don't think it's worth spending too much time trying to come up with other non-bridged NIC sharing setups when hardware is about to do it all for us :-)

Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Anno domini 2009 Daniel P. Berrange scripsit: [...]
In the (nearish) future NICs will start appearing with SR-IOV capabilities. This gives you one physical PCI device, which exposes multiple functions. So a single physical NIC appears as 8 NICs to the OS. You can thus assign each of these virtual NICs to a different VM directly, avoiding the need to bridge them.
I don't think it's worth spending too much time trying to come up with other non-bridged NIC sharing setups when hardware is about to do it all for us :-)
Is this hope, or do you know about any timeline for this?

Ciao
Max
--
Follow the white penguin.

On Wed, Apr 08, 2009 at 01:06:48PM +0200, Maximilian Wilhelm wrote:
Anno domini 2009 Daniel P. Berrange scripsit:
[...]
In the (nearish) future NICs will start appearing with SR-IOV capabilities. This gives you one physical PCI device, which exposes multiple functions. So a single physical NIC appears as 8 NICs to the OS. You can thus assign each of these virtual NICs to a different VM directly, avoiding the need to bridge them.
I don't think it's worth spending too much time trying to come up with other non-bridged NIC sharing setups when hardware is about to do it all for us :-)
Is this hope, or do you know about any timeline for this?
Hardware exists today & patches are available for upstream Linux.

http://lwn.net/Articles/308238/

The question mark is more over how long it takes before the hardware is widely available & shipped to consumers by vendors.

Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Anno domini 2009 Daniel P. Berrange scripsit:
On Wed, Apr 08, 2009 at 01:06:48PM +0200, Maximilian Wilhelm wrote:
Anno domini 2009 Daniel P. Berrange scripsit:
That was a *really* _fast_ reply :)
[...]
In the (nearish) future NICs will start appearing with SR-IOV capabilities. This gives you one physical PCI device, which exposes multiple functions. So a single physical NIC appears as 8 NICs to the OS. You can thus assign each of these virtual NICs to a different VM directly, avoiding the need to bridge them.
I don't think it's worth spending too much time trying to come up with other non-bridged NIC sharing setups when hardware is about to do it all for us :-)
Is this hope, or do you know about any timeline for this?
Hardware exists today & patches are available for upstream Linux.
Oh, nice. That flew under my radar.
The question mark is more over how long it takes before the hardware is widely available & shipped to consumers by vendors.
*nod* Let's hope and see :)

Thanks for the info!

Ciao
Max
--
Follow the white penguin.

On Wed, Apr 08, 2009 at 12:09:06PM +0100, Daniel P. Berrange wrote:
On Wed, Apr 08, 2009 at 01:06:48PM +0200, Maximilian Wilhelm wrote:
Anno domini 2009 Daniel P. Berrange scripsit:
[...]
In the (nearish) future NICs will start appearing with SR-IOV capabilities. This gives you one physical PCI device, which exposes multiple functions. So a single physical NIC appears as 8 NICs to the OS. You can thus assign each of these virtual NICs to a different VM directly, avoiding the need to bridge them.
I don't think it's worth spending too much time trying to come up with other non-bridged NIC sharing setups when hardware is about to do it all for us :-)
Is this hope, or do you know about any timeline for this?
Hardware exists today & patches are available for upstream Linux.
http://lwn.net/Articles/308238/
The question mark is more over how long it takes before the hardware is widely available & shipped to consumers by vendors.
The hardware is not appropriate for all environments. For example, using the hardware removes the ability of the host to manage the packet flow to the guest (filtering, bandwidth limits, etc.). Domain migration is also complicated, though not impossible. SR-IOV isn't the answer to all problems.
participants (7)
- Daniel P. Berrange
- David Edmondson
- David Lutterkort
- John Levon
- Kaitlin Rupert
- Klaus Heinrich Kiwi
- Maximilian Wilhelm