On 09/03/2016 11:08 AM, Moshe Levi wrote:
> -----Original Message-----
> From: sendmail [mailto:justsendmailnothingelse@gmail.com] On Behalf Of
> Laine Stump
> Sent: Thursday, September 01, 2016 5:59 PM
> To: Libvirt <libvir-list(a)redhat.com>
> Cc: Moshe Levi <moshele(a)mellanox.com>; Edan David
> <edand(a)mellanox.com>
> Subject: Re: [libvirt] <interface type='direct'>
>
> On 09/01/2016 04:05 AM, Moshe Levi wrote:
>> Hi,
>>
>> In OpenStack we have a port type macvtap.
>> Macvtap port is just a tap device connected to a VF.
>>
>> In Libvirt the guest XML looks like
>> <interface type='direct'>
>> <mac address='fa:16:3e:b1:06:4e'/>
>> <source dev='p1p6' mode='passthrough'/>
>> <target dev='macvtap1'/>
>> <model type='virtio'/>
>> <driver name='vhost'/>
>> <alias name='net0'/>
>> <address type='pci' domain='0x0000' bus='0x00' slot='0x03'
>> function='0x0'/>
>> </interface>
>>
>>
>> In the hypervisor we can see that the mac of the VF which is
>> fa:16:3e:f3:9b:e8 - is set by OpenStack see [1]
>> 9: ens3f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT group default qlen 1000
>> link/ether 7c:fe:90:29:24:4e brd ff:ff:ff:ff:ff:ff
>> vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state disable
>> vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state disable
>> vf 2 MAC fa:16:3e:f3:9b:e8, vlan 48, spoof checking on, link-state enable
>> vf 3 MAC fa:16:3e:f6:02:c8, vlan 48, spoof checking on, link-state enable
>> 41: ens3f4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
>> link/ether fa:16:3e:f6:02:c8 brd ff:ff:ff:ff:ff:ff
>> 42: macvtap0@ens3f4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 500
>> link/ether fa:16:3e:f6:02:c8 brd ff:ff:ff:ff:ff:ff
>>
>> The netdevice of the VF, ens3f4, also has the same mac. This mac is set
>> when using Libvirt 1.2.2 (Ubuntu 14.04), but when we tested with newer
>> Libvirt versions >= 1.2.17 (Fedora 23/Ubuntu 16.04) the mac of the VF
>> netdevice (ens3f4) is not set.
>> This change in Libvirt prevents the guest from getting DHCP in OpenStack.
>> Do you know why the behavior changed in newer releases?
> The MAC address is now set with a netlink command to set the VFINFO of
> the particular VF# of the PF. This change was made in response to a bug
> report stating that once the MAC address had been set for a hostdev
> assignment of a VF (in which case this method is required), it was no longer
> possible to set the MAC address for macvtap passthrough (the VF driver
> would complain "MAC has been administratively set", on Intel igbvf at least).
> Unfortunately I recently found that when you set the MAC address in this
> manner, it doesn't take effect on the actual device
> - it's only saved in memory to be applied the *next time* the host driver is
> rebound to the VF.
Are you saying that the change was to update the MAC of the VF?
If so, I don't understand how this affects the issue that the VF netdevice MAC doesn't get set.
Look at the explanation in commit cb3fe38c and also
https://bugzilla.redhat.com/show_bug.cgi?id=1113474
That commit switched from using a simple ioctl(SIOCSIFHWADDR) to the
VF's netdev name, to using a netlink RTM_SETLINK message to the netdev
of the *PF* for the given VF.
This was done because the latter is the *only* way you can set the MAC
address for a VF that you're going to assign to the guest with vfio
device assignment, and once you've set the MAC address that way, future
attempts to set the MAC address with ioctl(SIOCSIFHWADDR) result in
failure and a kernel log like this:
kernel: igb 0000:0e:00.1: VF 1 attempted to override administratively set MAC address
kernel: Reload the VF driver to resume operations
Looking into the kernel, it appears that once the MAC address for a VF
has been set via RTM_SETLINK, the igb driver (and I believe also the
ixgbe driver, not sure about others) doesn't allow it to be changed via
ioctl until the PF driver is reloaded (which can't realistically be done
on an active system).
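
For illustration only, here is a minimal Python sketch of what the older method, ioctl(SIOCSIFHWADDR) on the VF's own netdev, amounts to. It is not libvirt's code (libvirt is C); the helper names build_hwaddr_ifreq and set_mac_via_ioctl are made up for this example, and actually calling set_mac_via_ioctl needs root (CAP_NET_ADMIN) and would be expected to fail in the situation described above.

```python
import fcntl
import socket
import struct

SIOCSIFHWADDR = 0x8924  # from <linux/sockios.h>
ARPHRD_ETHER = 1        # from <linux/if_arp.h>


def build_hwaddr_ifreq(ifname, mac):
    """Pack the leading part of a struct ifreq: name plus a sockaddr holding the MAC."""
    name = ifname.encode()
    if len(name) > 15:
        raise ValueError("interface name too long")
    mac_bytes = bytes(int(octet, 16) for octet in mac.split(":"))
    if len(mac_bytes) != 6:
        raise ValueError("expected a 6-octet MAC address")
    # char ifr_name[16]; unsigned short sa_family; char sa_data[14]
    return struct.pack("16sH14s", name, ARPHRD_ETHER, mac_bytes)


def set_mac_via_ioctl(ifname, mac):
    """Hypothetical helper: set the MAC directly on the netdev named ifname."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        fcntl.ioctl(sock.fileno(), SIOCSIFHWADDR, build_hwaddr_ifreq(ifname, mac))
```

With the names from this thread, the call would be set_mac_via_ioctl("ens3f4", "fa:16:3e:f6:02:c8"), i.e. it addresses the VF netdev rather than the PF.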
But recently there was another report of the MAC address not getting set
properly for macvtap passthrough mode when the device is an SRIOV VF (I
can't find it in bugzilla, so it must have been an email to one of the
lists) and when I tried it myself I found they were correct - in the
output of "ip link show" the MAC address shown in the list of VFs under
the PF is correctly modified, but it's not set properly in the VF's
netdev instance - apparently the MAC addresses in the VF list aren't set
in the VF's netdev immediately, they're just saved to be set *the next
time the VF is re-bound to the VF netdev driver*. I think in the past
the interface may have been in promiscuous mode so it didn't matter, but
now it isn't? I'm not sure as I haven't had much time to investigate.
Does that make any more sense now?
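
A rough way to check for the mismatch described above is to compare the MAC listed for the VF under the PF with the MAC on the VF's own netdev. The sketch below is Python and only parses text; it assumes the "vf N MAC ..." and "link/ether ..." line formats exactly as pasted earlier in this thread (other iproute2 versions may print the VF lines differently), and all function names are made up for the example.

```python
import re

VF_LINE = re.compile(r"\bvf\s+(\d+)\s+MAC\s+([0-9a-fA-F:]{17})")
ETHER_LINE = re.compile(r"link/ether\s+([0-9a-fA-F:]{17})")


def vf_macs(pf_output):
    """Map VF number -> MAC as listed under the PF in `ip link show <pf>`."""
    return {int(num): mac.lower() for num, mac in VF_LINE.findall(pf_output)}


def netdev_mac(vf_output):
    """MAC from the first link/ether line of `ip link show <vf-netdev>`, or None."""
    match = ETHER_LINE.search(vf_output)
    return match.group(1).lower() if match else None


def mac_is_applied(pf_output, vf_num, vf_output):
    """True only if the PF's VF list and the VF netdev agree on the MAC."""
    expected = vf_macs(pf_output).get(vf_num)
    return expected is not None and expected == netdev_mac(vf_output)
```

Feeding it the output of "ip link show ens3f0" and "ip link show ens3f4" for vf 3 would report whether the address set through the PF has actually reached the VF netdev.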
> Since I don't see a reasonably efficient way to get this to work, I need to
> make a patch to revert to the old behavior, and we'll then just have to tell
> people "If you do hostdev device assignment of VFs, then you can't later
> re-use the same device for macvtap passthrough mode".
>
> (actually, I *think* an alternative would be to unbind/rebind the host driver
> to the VF after setting the VF MAC address, but that seems a bit
> disruptive/extreme to work around a problem that is probably only seen in
> QE labs, but not in the real world (realistically, production systems likely use
> either hostdev or macvtap, and don't switch back and forth between them)).
>
> A question - I notice you have the vlan set for the VF. Does *that* properly
> take effect? (it's set in the same manner as the MAC address, via a netlink
> command to set the VFINFO)
I am not sure what you mean, but we set the vlan in OpenStack after we create the guest xml.
What command do you use to set it? Do you use "ip link set $PF vf $VF#
vlan $VLANID" ? I think that's what it's showing here:
https://review.openstack.org/#/c/364121/1/nova/network/linux_net.py
(I don't know my way around openstack code, but arrived at that page via
clicking on links from a google search)
In OpenStack we set the MAC of the VF and the vlan using iproute2.
I just want to know whether setting the mac/vlan should be Libvirt's job, or whether
Libvirt just creates the macvtap interface and we should set the mac/vlan ourselves?
libvirt *should* do it.
>> We have a WIP patch in OpenStack for also setting the mac for the
>> netdevice of the VF [2]. Just wanted to know whether this is the correct
>> approach.
Can you confirm that setting the VF netdevice mac in OpenStack is a reasonable workaround
for the newer Libvirt versions?
If libvirt isn't getting the job done, and you can set it yourself, then
that's a workaround. I don't know that I'd call it "reasonable"
though.
If everybody puts in special code to work around bugs in libvirt (which
is apparently what's been done) rather than actually reporting the bug
(what you're doing now - Thanks!) then we are tricked into thinking that
either the code works, or that nobody is using it so it doesn't matter
if it's broken.
The *best* way of overcoming this problem is to fix libvirt so it does
what it's supposed to do.
It's possible we can make it work by adding some operation after we send
the RTM_SETLINK (maybe unbind the VF from its netdev driver, then
re-bind, but that seems so drastic and time consuming!), or maybe we'll
have to revert to using ioctl(SIOCSIFHWADDR), but of course that will
fail if the interface has been used for hostdev assignment since the
last host reboot.
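
To make the unbind/re-bind idea concrete: on Linux this is normally done by writing the device's PCI address to the driver's "unbind" and then "bind" files in sysfs. The Python sketch below only shows the shape of that operation and is not a proposed libvirt patch; the function names and the example PCI address are made up, and the driver name is simply the igbvf mentioned earlier.

```python
def rebind_steps(pci_addr, driver, sysfs_root="/sys"):
    """Return the (path, value) writes that unbind and then re-bind a PCI device."""
    drv_dir = f"{sysfs_root}/bus/pci/drivers/{driver}"
    return [
        (f"{drv_dir}/unbind", pci_addr),
        (f"{drv_dir}/bind", pci_addr),
    ]


def rebind_vf(pci_addr, driver):
    """Hypothetical helper: perform the writes. Needs root; the netdev vanishes briefly."""
    for path, value in rebind_steps(pci_addr, driver):
        with open(path, "w") as sysfs_file:
            sysfs_file.write(value)
```

For example, rebind_vf("0000:0e:10.1", "igbvf") would tear down and recreate the VF's netdev, which is exactly why it feels drastic for something done at every guest start.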
It's interesting that openstack is apparently using the RTM_SETLINK
method to set the mac address (afaik, that's what is used by the "ip
link set $pf_ifname vf $vf_num mac $mac_addr vlan $vlanid" command
that's shown in the bit of code from nova/network/linux_net.py at the
link I posted above).
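
For reference, a minimal Python sketch of how a caller might build and run that "ip link set ... vf ..." command. This is not the actual nova code; the function names are invented for the example, and the interface name, MAC and vlan in the usage note are just the values from the output earlier in this thread.

```python
import subprocess


def vf_setlink_argv(pf_ifname, vf_num, mac=None, vlan=None):
    """Build the argv for `ip link set <pf> vf <n> [mac <addr>] [vlan <id>]`."""
    argv = ["ip", "link", "set", pf_ifname, "vf", str(vf_num)]
    if mac is not None:
        argv += ["mac", mac]
    if vlan is not None:
        argv += ["vlan", str(vlan)]
    return argv


def set_vf(pf_ifname, vf_num, mac=None, vlan=None):
    """Hypothetical helper: run the command; needs root and raises on failure."""
    subprocess.run(vf_setlink_argv(pf_ifname, vf_num, mac, vlan), check=True)
```

Calling set_vf("ens3f0", 3, mac="fa:16:3e:f6:02:c8", vlan=48) addresses the PF and the VF number, not the VF's own netdev, which is the same path that shows the delayed effect discussed above.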