
On Aug 23, 2011, at 12:50 PM, Daniel P. Berrange wrote:
[...snip...]
This makes using SRIOV VFs via PCI passthrough very unpalatable. The problem can be solved by setting the MAC address of the ethernet device prior to assigning it to the guest, but of course the <hostdev> element used to assign PCI devices to guests has no place to specify a MAC address (and I'm not sure it would be appropriate to add something that function-specific to <hostdev>).
In discussions at the KVM forum, other related problems were noted too. Specifically, when using an SRIOV VF with VEPA/VNLink we need to be able to set the port profile on the VF before assigning it to the guest, to lock down what the guest can do. We also likely need to specify a VLAN tag on the NIC. The VLAN tag is actually something we need to be able to do for normal non-PCI passthrough usage of SRIOV networks too.
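For illustration only, the host-side setup discussed above (setting the MAC address and a VLAN tag on a VF before it is handed to the guest) could be sketched with iproute2's `ip link set ... vf` syntax. The helper name `vf_setup_commands`, the PF name `eth2`, and the MAC/VLAN values are assumptions made up for this example; the port profile step is not covered, and nothing here is what libvirt itself does.

```python
from typing import List, Optional


def vf_setup_commands(pf: str, vf: int, mac: str,
                      vlan: Optional[int] = None) -> List[List[str]]:
    """Build (but do not run) iproute2 commands that configure a VF via its PF.

    pf   -- name of the physical function netdev (hypothetical, e.g. "eth2")
    vf   -- index of the virtual function on that PF
    mac  -- MAC address to program into the VF
    vlan -- optional VLAN tag to apply to the VF
    """
    cmds = [["ip", "link", "set", "dev", pf, "vf", str(vf), "mac", mac]]
    if vlan is not None:
        cmds.append(["ip", "link", "set", "dev", pf,
                     "vf", str(vf), "vlan", str(vlan)])
    return cmds


if __name__ == "__main__":
    # Print the commands instead of executing them; running them needs root
    # and a real SR-IOV capable NIC.
    for cmd in vf_setup_commands("eth2", 0, "52:54:00:12:34:56", vlan=100):
        print(" ".join(cmd))
```

The commands are only built and printed, so the sketch can be inspected on a machine without SR-IOV hardware.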
I guess there is an issue with PCI passthrough here: if the VEPA link is set up prior to VM start, then that information is lost when the VM OS resets the device during initialization. Only on NICs with an integrated bridge can this setup be persistent, because the bridge can handle the VLAN tagging and port setup. I see a major drawback with storing MAC addresses in <hostdev> elements: it would require great care to make sure that MAC addresses are unique across a big datacenter.
Dave Allan and I have discussed a different possible method of eliminating this problem (using a new forward type for libvirt networks) that I've outlined below. Please let me know what you think - is this reasonable in general? If so, what about the details? If not, any counter-proposals to solve the problem?
The issue I see is that if an application wants to know what PCI devices have been assigned to a guest, it can no longer just look at <hostdev> elements. It also needs to look at <interface> elements. If we follow this proposed model in other areas, we could end up with PCI devices appearing as <disk>, <controller> and who knows what else. I think this is not very desirable for applications, and it is also not good for our internal code that manages PCI devices, i.e. the security drivers now have to look in many different places to find what PCI devices need labelling.
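To make the concern concrete, here is a minimal sketch of what an application would have to do if PCI devices could appear under both <hostdev> and <interface>. The domain XML below, in particular the `<interface type='hostdev'>` layout with a PCI `<address>` under `<source>`, is an assumption chosen for illustration and not something this thread has settled on; `assigned_pci_devices` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical guest definition: one plain PCI hostdev and one network
# interface that is backed by an assigned PCI device (assumed layout).
DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <hostdev mode='subsystem' type='pci'>
      <source>
        <address domain='0x0000' bus='0x06' slot='0x02' function='0x0'/>
      </source>
    </hostdev>
    <interface type='hostdev'>
      <mac address='52:54:00:12:34:56'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x1'/>
      </source>
    </interface>
  </devices>
</domain>
"""

# Every element type that may carry an assigned PCI device needs its own
# entry here; this list grows with each new place PCI devices can appear.
PCI_ADDRESS_PATHS = (
    "./devices/hostdev[@type='pci']/source/address",
    "./devices/interface[@type='hostdev']/source/address",
)


def assigned_pci_devices(domain_xml: str) -> list:
    """Return the PCI addresses (dddd:bb:ss.f) assigned to the guest."""
    root = ET.fromstring(domain_xml)
    found = []
    for path in PCI_ADDRESS_PATHS:
        for addr in root.findall(path):
            found.append("{:04x}:{:02x}:{:02x}.{:x}".format(
                int(addr.get("domain"), 16),
                int(addr.get("bus"), 16),
                int(addr.get("slot"), 16),
                int(addr.get("function"), 16),
            ))
    return found


if __name__ == "__main__":
    for dev in assigned_pci_devices(DOMAIN_XML):
        print(dev)
```

Each additional element type that can hold a PCI device means another entry in `PCI_ADDRESS_PATHS`, both in applications and in any internal code (such as the security drivers) that must find all assigned devices.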
The same is true for network setups: the available options are becoming more and more confusing.

Regards,
D.Herrendoerfer