On 04/28/2017 05:33 AM, Daniel P. Berrange wrote:
On Fri, Apr 28, 2017 at 05:23:19PM +0800, ZhiPeng Lu wrote:
> Creating a tap device and adding it to a bridge is not an atomic operation.
> Similarly, deleting a tap device and removing it from the bridge is not atomic.
> The problem occurs when one VM starts while another shuts down. When the VM
> with the NIC named "vnet0" is stopping, it deletes the tap device but has not
> yet removed the port from the bridge. At this point, another VM creates a tap
> device also named "vnet0" and adds the port to the same bridge. Then the first
> VM removes the "vnet0" port from that bridge. As a result, the tap device of
> the second VM is no longer attached to the bridge.
> So, we can add the domid to the VM's NIC name. For example, if the VM's domid
> is 1, vnet0 is renamed to vnet1.0.
Surely deleting the NIC automatically removes it from the bridge, so we
can just remove the code that deletes the bridge port.
That is true when using a Linux host bridge, but I recall that for
openvswitch (which I think is what ZhiPeng is using, based on an earlier
patch), you must explicitly remove the port from the bridge - apparently
the port is still there in openvswitch's table as some sort of "zombie"
connection even after the tap device itself no longer exists.
But instead of changing the naming scheme, maybe we should just delete
the bridge port *before* deleting the tap device instead of after. (Am I
recalling correctly that the tap device is deleted automatically when
the qemu process is killed? If so, then what's needed is to move the
loop in qemuProcessStop() that cleans up network interfaces so that it
happens before qemuProcessKill() is called.)
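
To make the ordering argument concrete, here is a toy Python model of the two
teardown orders. It is only a sketch, not libvirt or openvswitch code: the
Host and Bridge classes and every function name below are made up for
illustration, and it assumes that tap names are handed out as the lowest free
"vnetN" and that the bridge keeps a port entry until it is explicitly removed.

```python
class Bridge:
    """Hypothetical bridge whose port table is keyed only by device name."""

    def __init__(self):
        self.ports = set()

    def add_port(self, name):
        self.ports.add(name)

    def del_port(self, name):
        # Removing by name cannot tell a stale entry from a fresh one.
        self.ports.discard(name)


class Host:
    """Hypothetical host holding tap devices and a single bridge."""

    def __init__(self):
        self.taps = set()
        self.bridge = Bridge()

    def create_tap(self):
        # Assumption: the lowest free "vnetN" name is chosen.
        n = 0
        while f"vnet{n}" in self.taps:
            n += 1
        name = f"vnet{n}"
        self.taps.add(name)
        return name

    def delete_tap(self, name):
        self.taps.discard(name)


def start_vm(host):
    """Create a tap device and attach it to the bridge (two separate steps)."""
    name = host.create_tap()
    host.bridge.add_port(name)
    return name


def tap_first_teardown():
    """Current order: the tap goes away before its bridge port is removed."""
    host = Host()
    nic1 = start_vm(host)        # first VM gets vnet0
    host.delete_tap(nic1)        # first VM: tap deleted, stale port remains
    nic2 = start_vm(host)        # second VM reuses the name vnet0
    host.bridge.del_port(nic1)   # first VM: late removal hits the second VM's port
    return nic2, nic2 in host.bridge.ports


def port_first_teardown():
    """Proposed order: remove the bridge port while the tap still exists."""
    host = Host()
    nic1 = start_vm(host)        # first VM gets vnet0
    host.bridge.del_port(nic1)   # the name is still held, so nobody can reuse it yet
    host.delete_tap(nic1)        # only now is vnet0 free again
    nic2 = start_vm(host)        # second VM reuses vnet0 and stays attached
    return nic2, nic2 in host.bridge.ports
```

In this model tap_first_teardown() leaves the second VM's "vnet0" off the
bridge, while port_first_teardown() leaves it attached, because the name
cannot be reused until the port entry has already been removed.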