On Thu, Feb 2, 2012 at 10:47 PM, Ansis Atteka <aatteka@nicira.com> wrote:

On Thu, Feb 2, 2012 at 10:10 PM, Laine Stump <laine@laine.org> wrote:

On 02/02/2012 01:28 PM, Ansis Atteka wrote:

On Thu, Feb 2, 2012 at 6:08 PM, Laine Stump <laine@laine.org> wrote:

On 02/02/2012 09:30 AM, Ansis Atteka wrote:

Libvirt has a function, virNetDevBridgeRemovePort(), which can
remove a port from a Linux bridge, but it seems that nothing calls it.

I wanted to confirm: does port removal happen automatically for Linux
bridges when a VM goes down?

* when it's time to detach the device or destroy the guest, libvirt just sends a monitor command to qemu, which ends up closing the tap device. Because the tap device was non-persistent, that automatically leads to 1) removal of the tap device from the bridge, and 2) deletion of the tap device itself.
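For readers wondering why closing the descriptor is enough: on Linux, a tap device created through /dev/net/tun exists only as long as its file descriptor is open, unless TUNSETPERSIST has been set on it. The following is a minimal sketch of that mechanism, not libvirt's code; the helper names tap_request_init() and tap_open_transient() are made up for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Hypothetical helper: fill in the request for a tap device called `name`.
 * Returns 0, or -ENAMETOOLONG if the name does not fit in IFNAMSIZ. */
int tap_request_init(struct ifreq *ifr, const char *name)
{
    if (strlen(name) >= IFNAMSIZ)
        return -ENAMETOOLONG;
    memset(ifr, 0, sizeof(*ifr));
    ifr->ifr_flags = IFF_TAP | IFF_NO_PI;  /* layer-2 device, no packet-info header */
    strcpy(ifr->ifr_name, name);           /* length was checked above */
    return 0;
}

/* Hypothetical helper: create a non-persistent tap device.
 * Returns an open fd, or a negative errno value. TUNSETPERSIST is never
 * set, so the kernel deletes the device when the fd is closed. */
int tap_open_transient(const char *name)
{
    struct ifreq ifr;
    int fd, rc;

    if ((rc = tap_request_init(&ifr, name)) < 0)
        return rc;
    if ((fd = open("/dev/net/tun", O_RDWR)) < 0)
        return -errno;
    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        rc = -errno;
        close(fd);
        return rc;
    }
    return fd;
}
```

Creating the device normally needs CAP_NET_ADMIN, so calling tap_open_transient("vnet0") as an unprivileged user is expected to return a negative errno value rather than a descriptor.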

Yeah, the problem is that OVS does not do 1) when the tap device gets destroyed.

Interesting. So what happens if there is traffic for a port on the switch that has a now-nonexistent tap device? Does it ignore it? Explode?

To be more precise, OVS has two parts:
user-space (the database that contains the bridge configuration); and
kernel-space (the actual state of the bridge)

When I say that the port remains attached to the bridge after the tap device is closed, I mean that this port's config is still in the user-space DB. If a tap device with exactly the same name reappeared later, the same config would be reapplied and kernel-space would end up doing exactly the same thing.

If a non-persistent tap device that is attached to an OVS is closed, does OVS not notice this and automatically detach it? You may want to experiment with that; possibly nothing is needed.

(it would be much better if not, because otherwise there will need to be special care taken to prevent dangling tap devices (or dangling references to deleted tap devices))

The difference between OVS and
Linux Bridge is that OVS will need a hook that removes all ports on
VM shutdown event (and maybe also for some other events?).

Not just when a guest is shutdown, but also if a network device is detached from a running domain.

If it's necessary to explicitly detach the tap from the OVS, whatever hook is added in to do that can hopefully just as well be identical for a Linux bridge (i.e., the only OVS-specific code should be in the lowest level function that does that bridge detach).

Another point - since a shutdown initiated by the guest would likely end up destroying the tap device, we can't just add in a hook to detach it from the bridge - too early and the guest won't be done with it yet, too late and it will already not exist. I'm thinking that instead we may need to create the tap as persistent, then explicitly detach it from the bridge and delete it after the domain is finished with it.

It wouldn't be too late. It's OK if the actual tap device is not alive anymore.

Well, as long as there are no negative consequences to the port being assigned past the time when the tap device is deleted.

As I mentioned before, you should modify virNetDevBridgeRemovePort to do this removal (and do it appropriately depending on the type of bridge), but change it so that it returns success if it would fail simply because the tap device is already not on the bridge. (This way we can leave the tap device as non-persistent, and it will be an effective NOP for tap devices on Linux bridges.)
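A rough sketch of the "return success if the port is already gone" behaviour described above. Everything here is hypothetical: bridge_kind, linux_detach(), ovs_detach() and bridge_remove_port() are stand-ins (the real virNetDevBridgeRemovePort() has a different signature), and treating -ENODEV and -ENOENT as "already gone" is an assumption, not something verified against the kernel or OVS.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical bridge kinds; not libvirt's real types. */
typedef enum { BRIDGE_KIND_LINUX, BRIDGE_KIND_OVS } bridge_kind;

/* Stand-in for the Linux bridge path: pretend the kernel already dropped
 * the port when the non-persistent tap device was closed. */
int linux_detach(const char *brname, const char *ifname)
{
    (void)brname;
    (void)ifname;
    return -ENODEV;
}

/* Stand-in for the OVS path: "vnet0" is still configured, any other port
 * is unknown, and an empty bridge name is a genuine error. */
int ovs_detach(const char *brname, const char *ifname)
{
    if (brname[0] == '\0')
        return -EINVAL;
    return strcmp(ifname, "vnet0") == 0 ? 0 : -ENOENT;
}

/* Detach ifname from brname. Succeeds if the port was removed now or had
 * already disappeared; any other error is passed back to the caller. */
int bridge_remove_port(bridge_kind kind, const char *brname, const char *ifname)
{
    int rc = (kind == BRIDGE_KIND_OVS) ? ovs_detach(brname, ifname)
                                       : linux_detach(brname, ifname);

    /* Assumption: these two errors only mean "nothing left to remove". */
    if (rc == -ENODEV || rc == -ENOENT)
        return 0;
    return rc;
}
```

With this shape, callers do not need to know which kind of bridge they are dealing with, and calling it for a tap device that a Linux bridge has already dropped costs nothing.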

Also, for consistency we should be just always calling the function to detach from the bridge if virDomainNetGetActualType(net) == VIR_DOMAIN_NET_TYPE_BRIDGE, regardless of whether it's a linux bridge or ovs.

As far as where to put the calls to this - look for where there are calls to networkReleaseActualDevice(), and do it up above that (for example, you can see in qemuDomainDetachNetDevice() how there is an "if (virDomainNetGetActualType(detach) == VIR_DOMAIN_NET_TYPE_DIRECT)" - you can change that to a switch (actualtype), and add a case for VIR_DOMAIN_NET_TYPE_BRIDGE that calls virNetDevBridgeRemovePort()).

OK. Thanks for the hints!
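To make the if-to-switch suggestion for the detach path concrete, here is a self-contained sketch. It does not use libvirt's types: net_type, net_def, cleanup_action and detach_cleanup() are invented for illustration, and the comments mark where the real calls would presumably go.

```c
#include <stddef.h>

/* Hypothetical mirror of a few interface types; real code would use
 * VIR_DOMAIN_NET_TYPE_* and virDomainNetGetActualType(). */
typedef enum { NET_TYPE_BRIDGE, NET_TYPE_DIRECT, NET_TYPE_USER } net_type;

typedef struct {
    net_type actual_type;
    const char *brname;   /* only meaningful for NET_TYPE_BRIDGE */
    const char *ifname;
} net_def;

/* Which type-specific cleanup was chosen; returned so it can be checked. */
typedef enum { CLEANUP_NONE, CLEANUP_DIRECT, CLEANUP_BRIDGE_PORT } cleanup_action;

cleanup_action detach_cleanup(const net_def *net)
{
    cleanup_action done = CLEANUP_NONE;

    switch (net->actual_type) {
    case NET_TYPE_DIRECT:
        /* the existing teardown for direct devices would stay here */
        done = CLEANUP_DIRECT;
        break;
    case NET_TYPE_BRIDGE:
        /* new case: remove the port from the bridge, Linux or OVS alike,
         * through the one common remove-port function */
        done = CLEANUP_BRIDGE_PORT;
        break;
    default:
        /* other types need no bridge-specific cleanup */
        break;
    }

    /* the release of the actual device would follow here, after the switch */
    return done;
}
```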

If that is needed, I think it should be done in a separate patch that is a prerequisite to the OVS patch. That way the two things can be tested independent of each other.

As soon as I get out the patch I'm working on now, I'll take a quick look at this and see if I can point you more in the right direction for this prerequisite patch. In the meantime, it would be useful if you could do the experiment I mentioned above (i.e. do nothing and see if it explodes), and modify virNetDevBridgeRemovePort for your patch to do the right thing in the case of the bridge device being an OVS.

If by experiment you meant "Whether OVS automatically detaches tap device from OVS bridge when tap device gets closed?" then I can confirm that in contrast to Linux Bridge it does not do that. I will look into possibilities to remove ports on "detach-interface" and "VM shutdown" events.

"detach-interface" = qemuDomainDetachNetDevice() - see above.

"VM shutdown" = qemuProcessStop() - almost exactly the same situation as detach (turn the if() into a switch() and add a case for bridged devices).
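The shutdown path differs from detach mainly in that it has to walk every interface of the domain. A small sketch under the same caveats as before: iface_kind, iface, remove_port() and stop_cleanup() are hypothetical names, and remove_port() only flips a flag instead of touching a real bridge.

```c
#include <stddef.h>

/* Hypothetical interface kinds; not libvirt's real enum. */
typedef enum { IFACE_BRIDGE, IFACE_DIRECT, IFACE_USER } iface_kind;

typedef struct {
    iface_kind kind;
    const char *brname;   /* only meaningful for IFACE_BRIDGE */
    const char *ifname;
    int attached;         /* 1 while the port is still on its bridge */
} iface;

/* Stand-in for the common remove-port call; always succeeds here. */
int remove_port(iface *net)
{
    net->attached = 0;
    return 0;
}

/* On domain stop, detach every bridged interface. Failures are skipped
 * rather than aborting, since this is best-effort cleanup.
 * Returns the number of ports removed. */
size_t stop_cleanup(iface *nets, size_t nnets)
{
    size_t i, removed = 0;

    for (i = 0; i < nnets; i++) {
        switch (nets[i].kind) {
        case IFACE_BRIDGE:
            if (remove_port(&nets[i]) == 0)
                removed++;
            break;
        default:
            /* non-bridged interfaces are left to their own teardown */
            break;
        }
    }
    return removed;
}
```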

Don't forget LXC support! :-) It can use bridge network devices too.

There are a few other places where it may be appropriate to do the bridge removal during error paths; this same search may show you some of them, and some others may show up when you search for where virNetDevTapCreateInBridgePort is called.

Wouldn't it be simpler to do the port removal just inside the networkReleaseActualDevice() function if this is an interface that was attached to an OVS bridge? Would this cause any problems for the overall design? The code seems to work...