On 02/04/2014 05:10 PM, Yoann Juet wrote:
Hi all,

I'm testing the SR-IOV feature on debian/unstable with Broadcom BCM57810 cards and the KVM hypervisor:

Compiled against library: libvirt 1.2.1
Using library: libvirt 1.2.1
Using API: QEMU 1.2.1
Running hypervisor: QEMU 1.7.0

bnx2x
-> firmware 7.8.17
-> driver from kernel 3.12.7

8 VFs are created on the first PF. For each VF, a specific MAC address is set manually using the "ip link set eth0 vf x mac xx:xx:xx:xx:xx:xx" command.

Instead of using <hostdev>, you should try using <interface type='hostdev'>, which will allow you to specify the mac address for the interface directly in the guest's XML config (rather than needing to do it separately). Here's a link to documentation on this feature:

  http://wiki.libvirt.org/page/Networking#PCI_Passthrough_of_host_network_devices

(look down to the section titled "Assignment with <interface type='hostdev'>")
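
As a rough sketch of what that could look like for your VF at 01:09.0 (the mac address below is made up for illustration; substitute the one you were setting with "ip link"):

    <interface type='hostdev' managed='yes'>
      <!-- hypothetical mac address; libvirt applies it to the VF before assignment -->
      <mac address='52:54:00:6d:90:02'/>
      <source>
        <!-- same PCI address as in your current <hostdev> element -->
        <address type='pci' domain='0x0000' bus='0x01' slot='0x09' function='0x0'/>
      </source>
    </interface>

This would replace the <hostdev> element in the guest config; note that the <address> element inside <source> needs type='pci' here.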

Or even better, use <interface type='network'> in your guest config (still put the <mac address='xx:xx:xx:xx:xx:xx'/> element in each one), and define a libvirt network which is a pool of SRIOV VFs - this is described further down the same page.
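
A minimal sketch of such a network, assuming the PF's netdev name is eth0 (taken from your "ip link" command; adjust if the VFs actually belong to the other port) and using "passthrough" as an arbitrary network name:

    <network>
      <name>passthrough</name>
      <forward mode='hostdev' managed='yes'>
        <!-- assumed PF netdev name; libvirt enumerates its VFs as the pool -->
        <pf dev='eth0'/>
      </forward>
    </network>

Each guest would then reference the pool rather than a specific PCI address (again, the mac address is only a placeholder):

    <interface type='network'>
      <mac address='52:54:00:6d:90:02'/>
      <source network='passthrough'/>
    </interface>

libvirt picks a free VF from the pool when the guest starts, so the guest config no longer depends on which VF happens to be available.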

This will not make a difference to the issue you describe below, but it should make managing your guest config and lifecycle much simpler.


I run several KVM guests with PCI passthrough (same kernel, bnx2x driver and firmware as the host); performance is close to bare metal.

Well, that sounds good, until I start capturing the traffic inside each VM: host traffic is visible, as well as traffic destined for other VMs. It's as if the card's internal switching were inoperable. I ran several tests with different kernels and different libvirt PCI passthrough assignment methods. All failed.

Define "failed". Do you mean that the cards communicated, but the guests can see each other's traffic? Or do you mean that they see traffic from each other, but can't seem to communicate normally?

If the problem is the latter, then make sure the PF (eth0 for you, I guess) has status UP and RUNNING before you start the guests.

For the former, I'm not clear on the internal switching rules of an SRIOV card. I think in most cases, the SRIOV card's internal switch may need to make everything from each VF visible to all other VFs, because the physical switch it's connected to may not mirror back traffic that really does need to go from one guest to the other. 802.1Qbh (which libvirt supports via the <virtualport type='802.1Qbh'> element) does this differently, requiring all traffic to travel out to the switch, with the switch making the decision about what gets mirrored back, but you need an 802.1Qbh-capable switch for that.
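
For reference, a sketch of how the <virtualport> element would sit inside an <interface type='hostdev'> definition. The profileid value and the mac address are hypothetical (the profile would have to be defined on the switch), and I don't know whether the BCM57810 or your switch supports 802.1Qbh at all:

    <interface type='hostdev' managed='yes'>
      <mac address='52:54:00:6d:90:02'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x01' slot='0x09' function='0x0'/>
      </source>
      <virtualport type='802.1Qbh'>
        <!-- hypothetical port profile name configured on the 802.1Qbh switch -->
        <parameters profileid='my-port-profile'/>
      </virtualport>
    </interface>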


Has anyone successfully experimented with SR-IOV on Broadcom cards under Linux?

-----

Some details:

01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)
01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)

01:09.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function
01:09.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function
01:09.2 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function
01:09.3 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function
01:09.4 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function
01:09.5 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function
01:09.6 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function
01:09.7 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function


# virsh nodedev-dumpxml pci_0000_01_09_0
<device>
  <name>pci_0000_01_09_0</name>
  <path>/sys/devices/pci0000:00/0000:00:01.0/0000:01:09.0</path>
  <parent>pci_0000_00_01_0</parent>
  <driver>
    <name>vfio-pci</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>1</bus>
    <slot>9</slot>
    <function>0</function>
    <product id='0x16af'>NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function</product>
    <vendor id='0x14e4'>Broadcom Corporation</vendor>
    <capability type='phys_function'>
      <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
    </capability>
    <iommuGroup number='35'>
      <address domain='0x0000' bus='0x01' slot='0x09' function='0x0'/>
    </iommuGroup>
  </capability>
</device>


# virsh nodedev-dumpxml pci_0000_01_09_1
<device>
  <name>pci_0000_01_09_1</name>
  <path>/sys/devices/pci0000:00/0000:00:01.0/0000:01:09.1</path>
  <parent>pci_0000_00_01_0</parent>
  <driver>
    <name>vfio-pci</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>1</bus>
    <slot>9</slot>
    <function>1</function>
    <product id='0x16af'>NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function</product>
    <vendor id='0x14e4'>Broadcom Corporation</vendor>
    <capability type='phys_function'>
      <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
    </capability>
    <iommuGroup number='36'>
      <address domain='0x0000' bus='0x01' slot='0x09' function='0x1'/>
    </iommuGroup>
  </capability>
</device>


Guest A XML:
    ...
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x09' function='0x0'/>
      </source>
    </hostdev>
    ...


Guest B XML:

    ...
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x09' function='0x1'/>
      </source>
    </hostdev>
    ...



