[libvirt] [PATCH] Fix macvtap device tear down problem on virsh destroy

This patch fixes the problem with the tear down of the macvtap device when issuing a 'virsh destroy' by moving the tear down block past the point of killing the Qemu process. It seems necessary to loop at least once in the case of 'virsh destory' since the device seems to be busy for a while after the Qemu process has been killed. This also still properly tears down the macvtap device when the VM is 'virsh shutdown'ed or halted from inside. Signed-off-by: Stefan Berger <stefanb@us.ibm.com>

On Mon, Feb 15, 2010 at 4:39 PM, Stefan Berger <stefanb@us.ibm.com> wrote:
This patch fixes the problem with the tear down of the macvtap device when issuing a 'virsh destroy' by moving the tear down block past the point of killing the Qemu process. It seems necessary to loop at least once in the case of 'virsh destory' since the device seems to be busy for a while after the Qemu process has been killed. This also still properly tears down the macvtap device when the VM is 'virsh shutdown'ed or halted from inside.
I think it's fine to check just once, before replacing an existing macvtap interface with a new one, that the existing interface's fd can be opened, to avoid stomping on an interface owned by someone else. On shutdown, I think it's safe to assume that an interface whose name, macaddr and type matches is in fact the interface it created. I think it should just go ahead and delete it without again trying to open the fd. Otherwise we end up with such annoying races, e.g. if qemu hangs or takes a long time to shut down, you end up leaving the orphaned interfaces sitting around as well as the hung qemu process. --Ed

Ed Swierk <eswierk@aristanetworks.com> wrote on 02/15/2010 11:15:11 PM:
On Mon, Feb 15, 2010 at 4:39 PM, Stefan Berger <stefanb@us.ibm.com>
wrote:
This patch fixes the problem with the tear down of the macvtap device when issuing a 'virsh destroy' by moving the tear down block past the point of killing the Qemu process. It seems necessary to loop at least once in the case of 'virsh destory' since the device seems to be busy for a while after the Qemu process has been killed. This also still properly tears down the macvtap device when the VM is 'virsh shutdown'ed or halted from inside.
I think it's fine to check just once, before replacing an existing macvtap interface with a new one, that the existing interface's fd can be opened, to avoid stomping on an interface owned by someone else.
This is for the case when the device gets created. In that case I do check for someone else 'accidentally' having the same MAC address.
On shutdown, I think it's safe to assume that an interface whose name, macaddr and type matches is in fact the interface it created. I think it should just go ahead and delete it without again trying to open the fd. Otherwise we end up with such annoying races, e.g. if qemu hangs or takes a long time to shut down, you end up leaving the orphaned interfaces sitting around as well as the hung qemu process.
This would be for the case that the SIGKILL on the Qemu process didn't actually tear down the process. I was hoping that that would not happen. I guess the solution is to introduce another function like 'delMacvtapByAddressNoBusyCheck' that skips the test that tries to open the tap device. Does this sound right ? Stefan
--Ed

Ed Swierk <eswierk@aristanetworks.com> wrote on 02/15/2010 11:15:11 PM:
On Mon, Feb 15, 2010 at 4:39 PM, Stefan Berger <stefanb@us.ibm.com>
wrote:
This patch fixes the problem with the tear down of the macvtap device when issuing a 'virsh destroy' by moving the tear down block past the point of killing the Qemu process. It seems necessary to loop at least once in the case of 'virsh destory' since the device seems to be busy for a while after the Qemu process has been killed. This also still properly tears down the macvtap device when the VM is 'virsh shutdown'ed or halted from inside.
I think it's fine to check just once, before replacing an existing macvtap interface with a new one, that the existing interface's fd can be opened, to avoid stomping on an interface owned by someone else.
On shutdown, I think it's safe to assume that an interface whose name, macaddr and type matches is in fact the interface it created. I think it should just go ahead and delete it without again trying to open the fd. Otherwise we end up with such annoying races, e.g. if qemu hangs or takes a long time to shut down, you end up leaving the orphaned interfaces sitting around as well as the hung qemu process.
As an RFC, now attached the patch that tears down the macvtap device upon termination of the Qemu process without checking whether the device(s) with the given MAC address are in use. Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
--Ed

On Tue, Feb 16, 2010 at 5:22 AM, Stefan Berger <stefanb@us.ibm.com> wrote:
As an RFC, now attached the patch that tears down the macvtap device upon termination of the Qemu process without checking whether the device(s) with the given MAC address are in use.
Another problem comes to mind: it is incorrect to assume that an existing macvtap interface with a given MAC address has anything to do with the one we want to delete. Consider a host with two NICs, eth0 and eth1. Suppose some application previously created macvtap0 with MAC 52:54:00:12:34:56 from lower interface eth0. Now we want to create macvtap1 with the same MAC address 52:54:00:12:34:56 from lower interface eth1; this is fine as long as eth0 and eth1 are on different LANs or VLANs. We would be unhappy if libvirt blew away macvtap0 before creating macvtap1. This can be solved by comparing not only the MAC address but also the lower interface name: if they both match, then it's safe to delete the macvtap interface. It should be possible to query the name of the lower interface for a given macvtap interface via netlink (ip link show reveals this information). I'd stick with just one function delMacvtap(char *macaddress, char *linkdev, int checkTapInUse), which can be called from qemudPhysIfaceConnect(), qemudDomainDetachNetDevice() and qemudShutdownVMDaemon(). I don't see any reason for a separate hasBusyDev parameter; it seems sufficient to return 1 if the tap device is in use. --Ed

Ed Swierk <eswierk@aristanetworks.com> wrote on 02/17/2010 02:05:44 AM:
On Tue, Feb 16, 2010 at 5:22 AM, Stefan Berger <stefanb@us.ibm.com>
wrote:
As an RFC, now attached the patch that tears down the macvtap device upon termination of the Qemu process without checking whether the device(s) with the given MAC address are in use.
Another problem comes to mind: it is incorrect to assume that an existing macvtap interface with a given MAC address has anything to do with the one we want to delete. Consider a host with two NICs, eth0 and eth1. Suppose some application previously created macvtap0 with MAC 52:54:00:12:34:56 from lower interface eth0. Now we want to create macvtap1 with the same MAC address 52:54:00:12:34:56 from lower interface eth1; this is fine as long as eth0 and eth1 are on different LANs or VLANs. We would be unhappy if libvirt blew away macvtap0 before creating macvtap1.
Right.
This can be solved by comparing not only the MAC address but also the lower interface name: if they both match, then it's safe to delete the macvtap interface. It should be possible to query the name of the lower interface for a given macvtap interface via netlink (ip link show reveals this information).
I now took the route through reading the necessary information from sysfs rather than going through what seems to be the more costly netlink route in terms of code that needs to be written. The new patch is attached. I still tear down all unused macvtap interfaces that happen to have the same MAC address and link device, but test for whether one of them is in use if requested. I assume that those interfaces with the same properties, as the one to be created, are stale.
I'd stick with just one function delMacvtap(char *macaddress, char *linkdev, int checkTapInUse), which can be called from qemudPhysIfaceConnect(), qemudDomainDetachNetDevice() and qemudShutdownVMDaemon(). I don't see any reason for a separate hasBusyDev parameter; it seems sufficient to return 1 if the tap device is in use.
Thanks for the hints and feedback. Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
--Ed
participants (2)
-
Ed Swierk
-
Stefan Berger