Re: [libvirt] [PATCH 0/5] Interface pools and passthrough mode

6 Dec 2011

On 12/06/2011 10:16 AM, Daniel P. Berrange wrote:
...
On Mon, Dec 05, 2011 at 02:00:52PM -0500, Laine Stump wrote:
...
On 12/05/2011 06:37 AM, Daniel P. Berrange wrote:
...
On Tue, Nov 29, 2011 at 08:29:35PM -0500, Laine Stump wrote:
...
On 11/29/2011 02:53 PM, Daniel P. Berrange wrote:
...
On Tue, Nov 29, 2011 at 03:46:13PM +0000, Shradha Shah wrote:
...
Interface Pools and Passthrough mode:
Current Method:
The passthrough mode uses a macvtap direct connection to connect each guest to the network. The physical interface to be used is picked from among those listed in <interface> sub-elements of the <forward> element.
The current specification for <forward> extends to allow 0 or more <interface> sub-elements:
Example:
<forward mode='passthrough' dev='eth10'>
<interface dev='eth10'/>
<interface dev='eth12'/>
<interface dev='eth18'/>
<interface dev='eth20'/>
</forward>
However, with an ethernet card with 64 VFs or more, the above method gets tedious.
Ignoring the ABI issue, I'm concerned that as we get PFs with an increasingly
large number of VFs, we may well *not* want to associate all VFs with a single
virtual network definition. e.g., we might want to put 32 VFs in one network and
32 VFs in another network.  Or if we have 2 PFs, we might want to interleave
VFs from several PFs across virtual networks. If all we can do is list the
PF in the XML, we lose significant flexibility in how VFs are assigned.
My first concern too when I saw the patch was the semantic change
(but also the loss of flexibility), which is obviously a no-go. It's
a convenient capability to have though, so it would be nice to get
it in somehow. What if we allowed including all the VFs associated
with a PF by adding an extra attribute?  e.g.:
<interface dev='eth10' type='sriov'/>
This feels a little bit wrong to me.
...
(or whatever is more appropriate in place of "sriov"). Or possibly a
different element type could be used:
<pf dev='eth10'/>
I like this idea, because it is providing additional useful info,
rather than changing existing elements, so it is maximally
compatible.
...
(didn't want to spend time thinking of a better name than "pf"...).
At the time the network is created, this would cause libvirt to get
the list of all VFs for the given PF and put them into the pool.
This could be used instead of, or in combination with, the existing
<interface dev='eth1'/>  form. Thus the existing semantics would be
preserved, the flexibility of specifying individual devices would be
retained, and the desired convenience of adding all VFs of a PF with
a single line would be added.
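
To make the proposal concrete, here is a minimal sketch of how the VFs of a PF might be collected into the pool. It is not libvirt code. It assumes the usual Linux sysfs layout, where each VF appears as a virtfnN entry under the PF's device directory and exposes its interface name in a net/ subdirectory. The function names and the sysfs_root parameter are hypothetical and exist only for illustration.

```python
import os
import re


def list_vf_netdevs(pf, sysfs_root="/sys"):
    """Return the network interface names of the VFs that belong to PF `pf`.

    Assumes the layout <sysfs_root>/class/net/<pf>/device/virtfnN/net/<ifname>.
    """
    dev_dir = os.path.join(sysfs_root, "class", "net", pf, "device")
    if not os.path.isdir(dev_dir):
        return []
    # Sort virtfn0, virtfn1, ..., virtfn10 numerically rather than lexically.
    vf_links = sorted(
        (e for e in os.listdir(dev_dir) if re.fullmatch(r"virtfn\d+", e)),
        key=lambda e: int(e[len("virtfn"):]),
    )
    names = []
    for link in vf_links:
        net_dir = os.path.join(dev_dir, link, "net")
        # A VF with no network driver bound has no net/ directory; skip it.
        if os.path.isdir(net_dir):
            names.extend(sorted(os.listdir(net_dir)))
    return names


def build_pool(explicit, pfs, sysfs_root="/sys"):
    """Combine hand-listed <interface> devs with the VFs of each <pf>."""
    pool = list(explicit)
    for pf in pfs:
        for vf in list_vf_netdevs(pf, sysfs_root):
            if vf not in pool:  # avoid duplicates if a VF was also listed by hand
                pool.append(vf)
    return pool
```

Calling build_pool(['eth1'], ['eth10']) would keep eth1 as entered and append whatever VF interfaces the host reports for eth10, which matches the "instead of, or in combination with" semantics described above.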
IIUC, what you're suggesting is the following behaviour:
* Explicit interface list. App inputs:
<forward mode='passthrough'>
  <interface dev='eth10'/>
  <interface dev='eth11'/>
  <interface dev='eth12'/>
  <interface dev='eth13'/>
</forward>
libvirt does not change XML
* Automatic interface list from PF. App inputs:
<forward mode='passthrough'>
  <pf dev='eth0'/>
</forward>
libvirt expands XML to be
<forward mode='passthrough'>
  <pf dev='eth0'/>
  <interface dev='eth10'/>
  <interface dev='eth11'/>
  <interface dev='eth12'/>
  <interface dev='eth13'/>
</forward>
This is good because all previous info is still intact
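
The expansion described above can be sketched in a few lines. This is only an illustration of the proposed behaviour, not libvirt's implementation: expand_forward and the vfs_of_pf mapping are hypothetical names, and the mapping stands in for whatever mechanism actually enumerates the VFs of a PF.

```python
import xml.etree.ElementTree as ET


def expand_forward(xml_text, vfs_of_pf):
    """Return the <forward> element with <interface> children added per <pf>.

    vfs_of_pf maps a PF device name to the list of its VF interface names
    (a stand-in for real VF discovery). Existing elements are left untouched.
    """
    forward = ET.fromstring(xml_text)
    present = {i.get("dev") for i in forward.findall("interface")}
    for pf in forward.findall("pf"):
        for dev in vfs_of_pf.get(pf.get("dev"), []):
            if dev not in present:  # keep hand-written entries, add no duplicates
                ET.SubElement(forward, "interface", {"dev": dev})
                present.add(dev)
    return forward


src = "<forward mode='passthrough'><pf dev='eth0'/></forward>"
expanded = expand_forward(src, {"eth0": ["eth10", "eth11", "eth12", "eth13"]})
print(ET.tostring(expanded, encoding="unicode"))
```

With an explicit interface list and no <pf> element the function returns the input unchanged, which corresponds to the first case ("libvirt does not change XML").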
I actually hadn't thought of modifying the XML and displaying it in
net-dumpxml (or net-dumpxml --inactive), which is what I think you
may be implying here. This would have the advantage of making a
management application's job easier when displaying status
(available interfaces, etc), but could lead to confusion when a
host's hardware was changed (since there would be no detectable
difference between dev elements that were entered by hand, and those
that were automatically derived from a pf element). Also, it would
end up cluttering up the config file again, which is part of what
this is trying to avoid (although eliminating the need to type in
all N vf names is the primary concern).
Unless we come up with a way of differentiating between
auto-generated <interface> elements (including keeping track of the
parent <pf>) and those entered by hand, I think the XML itself
shouldn't be changed, but only the contents of the interface pool in
memory.
As with domains, every network has both an active and inactive
XML config. When the network is not running, we should only be
showing the user provided <interface> elements. Only once you
start the network, do we automatically fill in <interface>
elements based on <pf>. So if we add a flag for virNetworkGetXMLDesc()
like VIR_NETWORK_XML_INACTIVE, then you can distinguish by comparing
the live XML to the inactive XML.
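
A management application could do that comparison roughly as follows. This is a sketch under assumptions: it works on two XML strings (for example, the live description and the one obtained with the proposed VIR_NETWORK_XML_INACTIVE flag) and does not call libvirt itself. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def interface_devs(network_xml):
    """List the dev attribute of every <forward>/<interface> in a network XML."""
    root = ET.fromstring(network_xml)
    return [i.get("dev") for i in root.findall("./forward/interface")]


def auto_generated(live_xml, inactive_xml):
    """Return devs present in the live XML but not in the inactive XML.

    Under the scheme discussed here, these are the <interface> elements
    libvirt filled in from <pf>, as opposed to those entered by the user.
    """
    user_provided = set(interface_devs(inactive_xml))
    return [d for d in interface_devs(live_xml) if d not in user_provided]
```

Anything returned by auto_generated() came from a <pf> expansion; everything else in the live XML was supplied by the user.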
I agree that an inactive XML should show only the user provided data.
But in the case of active XML, would it be wise to display the entire VF pool in the XML?
I think this would clutter the config file when a NIC supports 127 VFs per port, as in the case of Solarflare.

Also, the free VFs are discovered only after virNetworkDefParseXML and networkAllocateActualDevice (in passthrough mode). Is there a way of modifying the active XML at this point?

Shradha
...
Daniel