[libvirt PATCH 0/3] Eliminate old tap/macvtap teardown stomping on new tap setup

The problem and solution are very well described in patches 2 and 3, but in short: because we (libvirt for macvtap, the kernel for tap) always try to assign the lowest-numbered names possible to macvtap and tap devices, we sometimes create a new tap for a new guest using the same name as an old tap for an old guest that is shutting down at the same time the new guest/tap is being set up. This can lead to the old guest's teardown stomping on the new guest's setup.

These patches eliminate that problem by changing the strategy to do our best to *not* reuse tap / macvtap device names, but instead use a monotonically incrementing counter to name the devices.

One possibly undesirable side effect of these patches is that the longer a host runs without a reboot, the higher the numbers in tap device names will get. While users are accustomed to always seeing vnet0 and vnet1, they may be a bit surprised to now see vnet39283 or macvtap735. It has been pointed out to me that the same thing happened with PIDs a few years ago, and while it looked strange at first, everyone is now accustomed to it.

Laine Stump (3):
  util: make locking versions of virNetDevMacVLan(Reserve|Release)Name()
  util: assign macvtap names using a monotonically increasing integer
  util: assign tap device names using a monotonically increasing integer

 src/libvirt_private.syms    |   1 +
 src/qemu/qemu_process.c     |  22 +++++++-
 src/util/virnetdevmacvlan.c | 109 +++++++++++++++++++++++++++---------
 src/util/virnetdevtap.c     |  79 +++++++++++++++++++++++++-
 src/util/virnetdevtap.h     |   4 ++
 5 files changed, 186 insertions(+), 29 deletions(-)

--
2.26.2

When these functions are called from within virnetdevmacvlan.c, they are usually called with virNetDevMacVLanCreateMutex held, but when virNetDevMacVLanReserveName() is called from other places (hypervisor drivers keeping track of already-in-use macvlan/macvtap devices) the lock isn't acquired. This could lead to a situation where one thread is setting a bit in the bitmap to notify of a device already in-use, while another thread is checking/setting/clearing a bit while creating a new macvtap device.

In practice this *probably* doesn't happen, because the external calls to virNetDevMacVLanReserveName() only happen during hypervisor driver init routines when libvirtd is restarted, but there's no harm in protecting ourselves.

(NB: virNetDevMacVLanReleaseName() is actually never called from outside virnetdevmacvlan.c, so it could just as well be static, but I'm leaving it as-is for now. This locking version *is* called from within virnetdevmacvlan.c, since there are a couple places that we used to call the unlocked version after the lock was already released.)

Signed-off-by: Laine Stump <laine@redhat.com>
---
 src/util/virnetdevmacvlan.c | 42 ++++++++++++++++++++++++++++++-------
 1 file changed, 34 insertions(+), 8 deletions(-)

diff --git a/src/util/virnetdevmacvlan.c b/src/util/virnetdevmacvlan.c
index dcea93a5fe..69a9c784bb 100644
--- a/src/util/virnetdevmacvlan.c
+++ b/src/util/virnetdevmacvlan.c
@@ -196,7 +196,7 @@ virNetDevMacVLanReleaseID(int id, unsigned int flags)
 
 
 /**
- * virNetDevMacVLanReserveName:
+ * virNetDevMacVLanReserveNameInternal:
  *
  * @name: already-known name of device
  * @quietFail: don't log an error if this name is already in-use
@@ -208,8 +208,8 @@ virNetDevMacVLanReleaseID(int id, unsigned int flags)
  * Returns reserved ID# on success, -1 on failure, -2 if the name
  * doesn't fit the auto-pattern (so not reserveable).
  */
-int
-virNetDevMacVLanReserveName(const char *name, bool quietFail)
+static int
+virNetDevMacVLanReserveNameInternal(const char *name, bool quietFail)
 {
     unsigned int id;
     unsigned int flags = 0;
@@ -237,8 +237,21 @@ virNetDevMacVLanReserveName(const char *name, bool quietFail)
 }
 
 
+int
+virNetDevMacVLanReserveName(const char *name, bool quietFail)
+{
+    /* Call the internal function after locking the macvlan mutex */
+    int ret;
+
+    virMutexLock(&virNetDevMacVLanCreateMutex);
+    ret = virNetDevMacVLanReserveNameInternal(name, quietFail);
+    virMutexUnlock(&virNetDevMacVLanCreateMutex);
+    return ret;
+}
+
+
 /**
- * virNetDevMacVLanReleaseName:
+ * virNetDevMacVLanReleaseNameInternal:
  *
  * @name: already-known name of device
  *
@@ -248,8 +261,8 @@ virNetDevMacVLanReserveName(const char *name, bool quietFail)
  *
  * returns 0 on success, -1 on failure
  */
-int
-virNetDevMacVLanReleaseName(const char *name)
+static int
+virNetDevMacVLanReleaseNameInternal(const char *name)
 {
     unsigned int id;
     unsigned int flags = 0;
@@ -277,6 +290,19 @@ virNetDevMacVLanReleaseName(const char *name)
 }
 
 
+int
+virNetDevMacVLanReleaseName(const char *name)
+{
+    /* Call the internal function after locking the macvlan mutex */
+    int ret;
+
+    virMutexLock(&virNetDevMacVLanCreateMutex);
+    ret = virNetDevMacVLanReleaseNameInternal(name);
+    virMutexUnlock(&virNetDevMacVLanCreateMutex);
+    return ret;
+}
+
+
 /**
  * virNetDevMacVLanIsMacvtap:
  * @ifname: Name of the interface
@@ -967,7 +993,7 @@ virNetDevMacVLanCreateWithVPortProfile(const char *ifnameRequested,
             return -1;
         }
         if (isAutoName &&
-            (reservedID = virNetDevMacVLanReserveName(ifnameRequested, true)) < 0) {
+            (reservedID = virNetDevMacVLanReserveNameInternal(ifnameRequested, true)) < 0) {
             reservedID = -1;
             goto create_name;
         }
@@ -975,7 +1001,7 @@ virNetDevMacVLanCreateWithVPortProfile(const char *ifnameRequested,
         if (virNetDevMacVLanCreate(ifnameRequested, type, macaddress,
                                    linkdev, macvtapMode, &do_retry) < 0) {
             if (isAutoName) {
-                virNetDevMacVLanReleaseName(ifnameRequested);
+                virNetDevMacVLanReleaseNameInternal(ifnameRequested);
                 reservedID = -1;
                 goto create_name;
             }
--
2.26.2
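For readers unfamiliar with the pattern this patch applies, here is a minimal standalone sketch of an unlocked "internal" function paired with a public locking wrapper. It uses plain pthreads rather than libvirt's virMutex, and every name in it (id_lock, reserve_id_internal, reserve_id, reserve_pair, MAX_ID) is made up for illustration; it is not the libvirt code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_ID 63   /* hypothetical limit, kept small for illustration */

static pthread_mutex_t id_lock = PTHREAD_MUTEX_INITIALIZER;
static bool id_in_use[MAX_ID + 1];

/* Internal variant: the caller must already hold id_lock.
 * Returns 0 on success, -1 if id is out of range or already taken. */
static int reserve_id_internal(int id)
{
    if (id < 0 || id > MAX_ID || id_in_use[id])
        return -1;
    id_in_use[id] = true;
    return 0;
}

/* Public variant for callers that do not hold the lock. */
int reserve_id(int id)
{
    int ret;

    pthread_mutex_lock(&id_lock);
    ret = reserve_id_internal(id);
    pthread_mutex_unlock(&id_lock);
    return ret;
}

/* A function that already holds the lock must use the internal variant:
 * a default pthread mutex is not recursive, so calling reserve_id() here
 * could deadlock. */
int reserve_pair(int a, int b)
{
    int ret = -1;

    assert(a != b);
    pthread_mutex_lock(&id_lock);
    if (reserve_id_internal(a) == 0) {
        if (reserve_id_internal(b) == 0)
            ret = 0;
        else
            id_in_use[a] = false;   /* roll back the first reservation */
    }
    pthread_mutex_unlock(&id_lock);
    return ret;
}
```

The split keeps the locking decision at the API boundary: code inside the module that already owns the mutex calls the internal function, and everyone else goes through the wrapper.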

On 8/24/20 6:23 AM, Laine Stump wrote:
When these functions are called from within virnetdevmacvlan.c, they are usually called with virNetDevMacVLanCreateMutex held, but when virNetDevMacVLanReserveName() is called from other places (hypervisor drivers keeping track of already-in-use macvlan/macvtap devices) the lock isn't acquired. This could lead to a situation where one thread is setting a bit in the bitmap to notify of a device already in-use, while another thread is checking/setting/clearing a bit while creating a new macvtap device.
In practice this *probably* doesn't happen, because the external calls to virNetDevMacVLanReserveName() only happen during hypervisor driver init routines when libvirtd is restarted, but there's no harm in protecting ourselves.
(NB: virNetDevMacVLanReleaseName() is actually never called from outside virnetdevmacvlan.c, so it could just as well be static, but I'm leaving it as-is for now. This locking version *is* called from within virnetdevmacvlan.c, since there are a couple places that we used to call the unlocked version after the lock was already released.)
Signed-off-by: Laine Stump <laine@redhat.com>
---
 src/util/virnetdevmacvlan.c | 42 ++++++++++++++++++++++++++++++-------
 1 file changed, 34 insertions(+), 8 deletions(-)
diff --git a/src/util/virnetdevmacvlan.c b/src/util/virnetdevmacvlan.c
index dcea93a5fe..69a9c784bb 100644
--- a/src/util/virnetdevmacvlan.c
+++ b/src/util/virnetdevmacvlan.c
@@ -196,7 +196,7 @@ virNetDevMacVLanReleaseID(int id, unsigned int flags)
 /**
- * virNetDevMacVLanReserveName:
+ * virNetDevMacVLanReserveNameInternal:
  *
  * @name: already-known name of device
  * @quietFail: don't log an error if this name is already in-use
@@ -208,8 +208,8 @@ virNetDevMacVLanReleaseID(int id, unsigned int flags)
  * Returns reserved ID# on success, -1 on failure, -2 if the name
  * doesn't fit the auto-pattern (so not reserveable).
  */
-int
-virNetDevMacVLanReserveName(const char *name, bool quietFail)
+static int
+virNetDevMacVLanReserveNameInternal(const char *name, bool quietFail)
 {
     unsigned int id;
     unsigned int flags = 0;
@@ -237,8 +237,21 @@ virNetDevMacVLanReserveName(const char *name, bool quietFail)
 }
+int
+virNetDevMacVLanReserveName(const char *name, bool quietFail)
+{
+    /* Call the internal function after locking the macvlan mutex */
+    int ret;
+
+    virMutexLock(&virNetDevMacVLanCreateMutex);
+    ret = virNetDevMacVLanReserveNameInternal(name, quietFail);
+    virMutexUnlock(&virNetDevMacVLanCreateMutex);
+    return ret;
+}
Hopefully, we won't use any of these in a forked off process because these are not async-signal safe anymore.
Michal

On 8/24/20 6:23 AM, Michal Privoznik wrote:
Hopefully, we won't use any of these in a forked off process because these are not async-signal safe anymore.
Interesting point (not because I think it could happen in this case, but because I hadn't even been thinking about it when I added to the mutex usage (and created a new mutex in the next patch)). But of course this could be said for any code that uses a mutex (and in this case, even without the mutex we can't use the global counter in a forked off process and expect to get unique numbers). I wonder if there's a way a static code checker could verify that certain bits of code can never be in the call chain in a forked process...

There have been some reports that, due to libvirt always trying to assign the lowest numbered macvtap / tap device name possible, a new guest would sometimes be started using the same tap device name as previously used by another guest that is in the process of being destroyed *as the new guest is starting*.

In some cases this has led to, for example, the old guest's qemuProcessStop() code deleting a port from an OVS switch that had just been re-added by the new guest (because the port name is based on only the device name using the port). Similar problems can happen (and I believe have) with nwfilter rules and bandwidth rules (which are both instantiated based on the name of the tap device).

A couple patches have been previously proposed to change the ordering of startup and shutdown processing, or to put a mutex around everything related to the tap/macvtap device name usage, but in the end no matter what you do there will still be possible holes, because the device could be deleted outside libvirt's control (for example, regular tap devices are automatically deleted when the qemu process terminates, and that isn't always initiated by libvirt but could instead happen completely asynchronously - libvirt then has no control over the ordering of shutdown operations, and no opportunity to protect it with a mutex.)

But this only happens if a new device is created at the same time as one is being deleted. We can effectively eliminate the chance of this happening if we end the practice of always looking for the lowest numbered available device name, and instead just keep an integer that is incremented each time we need a new device name. At some point it will need to wrap back around to 0 (in order to avoid the IFNAMSIZ 15 character limit if nothing else), and we can't guarantee that the new name really will be the *least* recently used name, but "math" suggests that it will be *much* less common that we'll try to re-use the *most* recently used name.

This patch implements such a counter for macvtap/macvlan only. It does it on top of the existing "ID reservation" system (I'm thinking about making a followup that gets rid of the bitmap, as it's now overkill and just serves to make the code more confusing).

Signed-off-by: Laine Stump <laine@redhat.com>
---
 src/util/virnetdevmacvlan.c | 67 +++++++++++++++++++++++++++----------
 1 file changed, 50 insertions(+), 17 deletions(-)

diff --git a/src/util/virnetdevmacvlan.c b/src/util/virnetdevmacvlan.c
index 69a9c784bb..c086aa3eb0 100644
--- a/src/util/virnetdevmacvlan.c
+++ b/src/util/virnetdevmacvlan.c
@@ -74,6 +74,8 @@ VIR_LOG_INIT("util.netdevmacvlan");
 virMutex virNetDevMacVLanCreateMutex = VIR_MUTEX_INITIALIZER;
 virBitmapPtr macvtapIDs = NULL;
 virBitmapPtr macvlanIDs = NULL;
+static int macvtapLastID = -1;
+static int macvlanLastID = -1;
 
 static int
 virNetDevMacVLanOnceInit(void)
@@ -108,12 +110,18 @@ virNetDevMacVLanReserveID(int id, unsigned int flags,
                           bool quietFail, bool nextFree)
 {
     virBitmapPtr bitmap;
+    int *lastID;
 
     if (virNetDevMacVLanInitialize() < 0)
        return -1;
 
-    bitmap = (flags & VIR_NETDEV_MACVLAN_CREATE_WITH_TAP) ?
-        macvtapIDs : macvlanIDs;
+    if (flags & VIR_NETDEV_MACVLAN_CREATE_WITH_TAP) {
+        bitmap = macvtapIDs;
+        lastID = &macvtapLastID;
+    } else {
+        bitmap = macvlanIDs;
+        lastID = &macvlanLastID;
+    }
 
     if (id > MACVLAN_MAX_ID) {
         virReportError(VIR_ERR_INTERNAL_ERROR,
@@ -122,24 +130,49 @@ virNetDevMacVLanReserveID(int id, unsigned int flags,
         return -1;
     }
 
-    if ((id < 0 || nextFree) &&
-        (id = virBitmapNextClearBit(bitmap, id)) < 0) {
-        virReportError(VIR_ERR_INTERNAL_ERROR,
-                       _("no unused %s names available"),
-                       VIR_NET_GENERATED_PREFIX);
-        return -1;
-    }
+    if (id < 0 || nextFree) {
+        /* starting with *lastID + 1, do a loop looking for an unused
+         * device name, wrapping around at MACVLAN_MAX_ID.
+         */
+        int start = (++(*lastID)) % (MACVLAN_MAX_ID + 1);
+        bool found = false;
 
-    if (virBitmapIsBitSet(bitmap, id)) {
-        if (quietFail) {
-            VIR_INFO("couldn't reserve name %s%d - already in use",
-                     VIR_NET_GENERATED_PREFIX, id);
-        } else {
+        for (id = start;
+             id + 1 != start;
+             id = (++(*lastID)) % (MACVLAN_MAX_ID + 1)) {
+
+            if (!virBitmapIsBitSet(bitmap, id)) {
+                found = true;
+                break;
+            }
+        }
+        if (!found) {
             virReportError(VIR_ERR_INTERNAL_ERROR,
-                           _("couldn't reserve name %s%d - already in use"),
-                           VIR_NET_GENERATED_PREFIX, id);
+                           _("no unused %s names available"),
+                           VIR_NET_GENERATED_PREFIX);
+            return -1;
         }
-        return -1;
+    } else {
+        /* A specific ID was requested, we just fail if that
+         * ID isn't available
+         */
+        if (virBitmapIsBitSet(bitmap, id)) {
+            if (quietFail) {
+                VIR_INFO("couldn't reserve name %s%d - already in use",
+                         VIR_NET_GENERATED_PREFIX, id);
+            } else {
+                virReportError(VIR_ERR_INTERNAL_ERROR,
+                               _("couldn't reserve name %s%d - already in use"),
+                               VIR_NET_GENERATED_PREFIX, id);
+            }
+            return -1;
+        }
+        /* adjust lastID to not look below this ID (even though
+         * eventually we will wrap around and look below it - this is
+         * just a delay tactic
+         */
+        if (*lastID % (MACVLAN_MAX_ID + 1) < id)
+            *lastID = id;
     }
 
     if (virBitmapSetBit(bitmap, id) < 0) {
--
2.26.2
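To make the allocation strategy described above concrete, here is a small standalone sketch of a wrap-around "next ID" search over an in-use table. It is a simplified illustration under stated assumptions, not the patch's code: the names (in_use, last_id, reserve_next_id, reserve_known_id, release_id) and the tiny MAX_ID are hypothetical, the counter is kept inside the valid range instead of growing and being reduced with a modulo, and the search is bounded by counting attempts rather than by comparing against the starting value.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ID 7   /* hypothetical; tiny so that wrap-around is easy to see */

static bool in_use[MAX_ID + 1];
static int last_id = -1;

/* Reserve the first free ID after last_id, wrapping at MAX_ID.
 * Returns the ID, or -1 if every ID is in use. */
int reserve_next_id(void)
{
    int tries;

    /* At most MAX_ID + 1 candidates exist, so try each exactly once. */
    for (tries = 0; tries <= MAX_ID; tries++) {
        int id = (last_id + 1) % (MAX_ID + 1);

        last_id = id;
        if (!in_use[id]) {
            in_use[id] = true;
            return id;
        }
    }
    return -1;
}

/* Reserve a specific, already-known ID (for example one found on a
 * device that existed before the daemon restarted), and push the
 * counter past it so it is not handed out again soon. */
int reserve_known_id(int id)
{
    if (id < 0 || id > MAX_ID || in_use[id])
        return -1;
    in_use[id] = true;
    if (last_id < id)
        last_id = id;
    return 0;
}

void release_id(int id)
{
    assert(id >= 0 && id <= MAX_ID);
    in_use[id] = false;
}
```

With this scheme, releasing ID 0 and immediately asking for a new ID yields the next higher number rather than 0 again; a freed ID only becomes a candidate once the counter has wrapped all the way around.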

On 8/24/20 6:23 AM, Laine Stump wrote:
There have been some reports that, due to libvirt always trying to assign the lowest numbered macvtap / tap device name possible, a new guest would sometimes be started using the same tap device name as previously used by another guest that is in the process of being destroyed *as the new guest is starting*.
Realign please :-)
In some cases this has led to, for example, the old guest's qemuProcessStop() code deleting a port from an OVS switch that had just been re-added by the new guest (because the port name is based on only the device name using the port). Similar problems can happen (and I believe have) with nwfilter rules and bandwidth rules (which are both instantiated based on the name of the tap device).
A couple patches have been previously proposed to change the ordering of startup and shutdown processing, or to put a mutex around everything related to the tap/macvtap device name usage, but in the end no matter what you do there will still be possible holes, because the device could be deleted outside libvirt's control (for example, regular tap devices are automatically deleted when the qemu process terminates, and that isn't always initiated by libvirt but could instead happen completely asynchronously - libvirt then has no control over the ordering of shutdown operations, and no opportunity to protect it with a mutex.)
But this only happens if a new device is created at the same time as one is being deleted. We can effectively eliminate the chance of this happening if we end the practice of always looking for the lowest numbered available device name, and instead just keep an integer that is incremented each time we need a new device name. At some point it will need to wrap back around to 0 (in order to avoid the IFNAMSIZ 15 character limit if nothing else), and we can't guarantee that the new name really will be the *least* recently used name, but "math" suggests that it will be *much* less common that we'll try to re-use the *most* recently used name.
This patch implements such a counter for macvtap/macvlan only. It does it on top of the existing "ID reservation" system (I'm thinking about making a followup that gets rid of the bitmap, as it's now overkill and just serves to make the code more confusing).
Signed-off-by: Laine Stump <laine@redhat.com>
---
 src/util/virnetdevmacvlan.c | 67 +++++++++++++++++++++++++++----------
 1 file changed, 50 insertions(+), 17 deletions(-)
Michal

When creating a standard tap device, if provided with an ifname that contains "%d", rather than taking that literally as the name to use for the new device, the kernel will instead use that string as a template, and search for the lowest number that could be put in place of %d and produce an otherwise unused and unique name for the new device. For example, if there is no tap device name given in the XML, libvirt will always send "vnet%d" as the device name, and the kernel will create new devices named "vnet0", "vnet1", etc. If one of those devices is deleted, creating a "hole" in the name list, the kernel will always attempt to reuse the name in the hole first before using a name with a higher number (i.e. it finds the lowest possible unused number).

The problem with this, as described in the previous patch dealing with macvtap device naming, is that it makes "immediate reuse" of a newly freed tap device name *much* more common, and in the aftermath of deleting a tap device, there is some other necessary cleanup of things which are named based on the device name (nwfilter rules, bandwidth rules, OVS switch ports, to name a few) that could end up stomping over the top of the setup of a new device of the same name for a different guest.

Since the kernel "create a name based on a template" functionality for tap devices doesn't exist for macvtap, this patch is a bit different from the previous patch - we look for a requested ifname of "vnet%d", and when we see that, we use it as a sprintf format string to find an unused device name ourselves, then pass that exact name on to the kernel; in this way we can avoid the perils of the kernel's "lowest number first" algorithm, and instead use a "hopefully never-before used number" algorithm.

(NB: It is still possible for a user to provide their own parameterized template name (e.g. "mytap%d") in the XML, and libvirt will just pass that through to the kernel as it always has.)

Signed-off-by: Laine Stump <laine@redhat.com>
---
 src/libvirt_private.syms |  1 +
 src/qemu/qemu_process.c  | 22 ++++++++++-
 src/util/virnetdevtap.c  | 79 +++++++++++++++++++++++++++++++++++++++-
 src/util/virnetdevtap.h  |  4 ++
 4 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 01c2e710cd..a9d5af9dde 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2674,6 +2674,7 @@ virNetDevTapGetName;
 virNetDevTapGetRealDeviceName;
 virNetDevTapInterfaceStats;
 virNetDevTapReattachBridge;
+virNetDevTapReserveName;
 
 
 # util/virnetdevveth.h
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 832b2e6870..2a1c1a3732 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3319,8 +3319,26 @@ qemuProcessNotifyNets(virDomainDefPtr def)
          * domain to be unceremoniously killed, which would be *very*
          * impolite.
          */
-        if (virDomainNetGetActualType(net) == VIR_DOMAIN_NET_TYPE_DIRECT)
-           ignore_value(virNetDevMacVLanReserveName(net->ifname, false));
+        switch (virDomainNetGetActualType(net)) {
+        case VIR_DOMAIN_NET_TYPE_DIRECT:
+            ignore_value(virNetDevMacVLanReserveName(net->ifname, false));
+            break;
+        case VIR_DOMAIN_NET_TYPE_BRIDGE:
+        case VIR_DOMAIN_NET_TYPE_NETWORK:
+        case VIR_DOMAIN_NET_TYPE_ETHERNET:
+            virNetDevTapReserveName(net->ifname);
+            break;
+        case VIR_DOMAIN_NET_TYPE_USER:
+        case VIR_DOMAIN_NET_TYPE_VHOSTUSER:
+        case VIR_DOMAIN_NET_TYPE_SERVER:
+        case VIR_DOMAIN_NET_TYPE_CLIENT:
+        case VIR_DOMAIN_NET_TYPE_MCAST:
+        case VIR_DOMAIN_NET_TYPE_INTERNAL:
+        case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+        case VIR_DOMAIN_NET_TYPE_UDP:
+        case VIR_DOMAIN_NET_TYPE_LAST:
+            break;
+        }
 
         if (net->type == VIR_DOMAIN_NET_TYPE_NETWORK) {
             if (!conn && !(conn = virGetConnectNetwork()))
diff --git a/src/util/virnetdevtap.c b/src/util/virnetdevtap.c
index c0a7c3019e..4c3b68e582 100644
--- a/src/util/virnetdevtap.c
+++ b/src/util/virnetdevtap.c
@@ -54,6 +54,45 @@
 
 VIR_LOG_INIT("util.netdevtap");
 
+#define VIR_TAP_MAX_ID 99999
+
+virMutex virNetDevTapCreateMutex = VIR_MUTEX_INITIALIZER;
+static int virNetDevTapLastID = -1;
+
+
+/**
+ * virNetDevTapReserveName:
+ * @name: name of an existing tap device
+ *
+ * Set the value of virNetDevTapLastID to assure that any new tap
+ * device created with an autogenerated name will use a number higher
+ * than the number in the given tap device name.
+ *
+ * Returns nothing.
+ */
+void
+virNetDevTapReserveName(const char *name)
+{
+    unsigned int id;
+    const char *idstr = NULL;
+
+
+    if (STRPREFIX(name, VIR_NET_GENERATED_TAP_PREFIX)) {
+
+        idstr = name + strlen(VIR_NET_GENERATED_TAP_PREFIX);
+
+        if (virStrToLong_ui(idstr, NULL, 10, &id) >= 0) {
+            virMutexLock(&virNetDevTapCreateMutex);
+
+            if (virNetDevTapLastID % (VIR_TAP_MAX_ID + 1) < (int)id)
+                virNetDevTapLastID = id;
+
+            virMutexUnlock(&virNetDevTapCreateMutex);
+        }
+    }
+}
+
+
 /**
  * virNetDevTapGetName:
  * @tapfd: a tun/tap file descriptor
@@ -230,10 +269,45 @@ int virNetDevTapCreate(char **ifname,
                        size_t tapfdSize,
                        unsigned int flags)
 {
-    size_t i;
+    size_t i = 0;
     struct ifreq ifr;
     int ret = -1;
-    int fd;
+    int fd = 0;
+
+    virMutexLock(&virNetDevTapCreateMutex);
+
+    /* if ifname is "vnet%d", then create the actual name to use by
+     * replacing %d with ++virNetDevTapLastID. Keep trying new values
+     * until one is found that doesn't already exist, or we've
+     * completed one full circle of the number space.
+     *
+     */
+
+    if (STREQ(*ifname, VIR_NET_GENERATED_TAP_PREFIX "%d")) {
+        int id;
+        int start = ++virNetDevTapLastID % (VIR_TAP_MAX_ID + 1);
+        bool found = false;
+
+        for (id = start;
+             id + 1 != start;
+             id = ++virNetDevTapLastID % (VIR_TAP_MAX_ID + 1)) {
+
+            g_autofree char *try = g_strdup_printf(*ifname, id);
+
+            if (!virNetDevExists(try)) {
+                g_free(*ifname);
+                *ifname = g_steal_pointer(&try);
+                found = true;
+                break;
+            }
+        }
+        if (!found) {
+            virReportError(VIR_ERR_INTERNAL_ERROR,
+                           _("no unused %s names available"),
+                           VIR_NET_GENERATED_TAP_PREFIX);
+            goto cleanup;
+        }
+    }
 
     if (!tunpath)
         tunpath = "/dev/net/tun";
@@ -302,6 +376,7 @@ int virNetDevTapCreate(char **ifname,
     ret = 0;
 
  cleanup:
+    virMutexUnlock(&virNetDevTapCreateMutex);
     if (ret < 0) {
         VIR_FORCE_CLOSE(fd);
         while (i--)
diff --git a/src/util/virnetdevtap.h b/src/util/virnetdevtap.h
index c6bd9285ba..dea8aec3af 100644
--- a/src/util/virnetdevtap.h
+++ b/src/util/virnetdevtap.h
@@ -29,6 +29,10 @@
 # define VIR_NETDEV_TAP_REQUIRE_MANUAL_CLEANUP 1
 #endif
 
+void
+virNetDevTapReserveName(const char *name)
+    ATTRIBUTE_NONNULL(1);
+
 int virNetDevTapCreate(char **ifname,
                        const char *tunpath,
                        int *tapfd,
--
2.26.2
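As a rough standalone illustration of the two pieces this patch adds (remembering the number in an existing "vnet" name, and generating a new name from a counter instead of handing the template to the kernel), here is a sketch in plain C. Everything in it is an assumption made for the example: the names tap_reserve_name, tap_generate_name, exists_fn, and demo_exists are hypothetical, the existence check is a callback standing in for a real "does this interface exist" query, locking is omitted, and the search is bounded by counting attempts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TAP_PREFIX "vnet"
#define TAP_MAX_ID 99999
#define NAME_LEN 16   /* room for 15 characters plus the terminating NUL */

static int tap_last_id = -1;

/* Hypothetical stand-in for an "interface exists?" query. */
typedef bool (*exists_fn)(const char *name);

/* Demo predicate: pretend that only "vnet8" currently exists. */
static bool demo_exists(const char *name)
{
    return strcmp(name, "vnet8") == 0;
}

/* Note an existing device name so that generated names use higher
 * numbers. Names that are not "vnet<digits>" are ignored. */
void tap_reserve_name(const char *name)
{
    size_t plen = strlen(TAP_PREFIX);
    char *end;
    unsigned long id;

    if (strncmp(name, TAP_PREFIX, plen) != 0)
        return;
    if (name[plen] < '0' || name[plen] > '9')
        return;
    id = strtoul(name + plen, &end, 10);
    if (*end != '\0' || id > TAP_MAX_ID)
        return;
    if (tap_last_id < (int)id)
        tap_last_id = (int)id;
}

/* Write an unused "vnet<N>" name into buf, starting after the last
 * number handed out and wrapping at TAP_MAX_ID.
 * Returns 0 on success, -1 if every candidate name exists. */
int tap_generate_name(char buf[NAME_LEN], exists_fn exists)
{
    long tries;

    assert(buf != NULL && exists != NULL);
    for (tries = 0; tries <= TAP_MAX_ID; tries++) {
        tap_last_id = (tap_last_id + 1) % (TAP_MAX_ID + 1);
        snprintf(buf, NAME_LEN, TAP_PREFIX "%d", tap_last_id);
        if (!exists(buf))
            return 0;
    }
    return -1;
}
```

For example, after tap_reserve_name("vnet6"), successive calls to tap_generate_name() with demo_exists produce "vnet7" and then "vnet9", skipping the name that is reported as existing. A name such as "mytap3" does not match the prefix and leaves the counter untouched.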

On 8/24/20 6:23 AM, Laine Stump wrote:
The problem and solution are very well described in patches 2 and 3, but in short: because we (libvirt for macvtap, the kernel for tap) always try to assign the lowest-numbered names possible to macvtap and tap devices, we sometimes create a new tap for a new guest using the same name as an old tap for an old guest that is shutting down at the same time the new guest/tap is being set up. This can lead to the old guest's teardown stomping on the new guest's setup.
These patches eliminate that problem by changing the strategy to do our best to *not* reuse tap / macvtap device names, but instead use a monotonically incrementing counter to name the devices.
One possibly undesirable side effect of these patches is that the longer a host runs without a reboot, the higher the numbers in tap device names will get. While users are accustomed to always seeing vnet0 and vnet1, they may be a bit surprised to now see vnet39283 or macvtap735. It has been pointed out to me that the same thing happened with PIDs a few years ago, and while it looked strange at first, everyone is now accustomed to it.
Laine Stump (3):
  util: make locking versions of virNetDevMacVLan(Reserve|Release)Name()
  util: assign macvtap names using a monotonically increasing integer
  util: assign tap device names using a monotonically increasing integer
 src/libvirt_private.syms    |   1 +
 src/qemu/qemu_process.c     |  22 +++++++-
 src/util/virnetdevmacvlan.c | 109 +++++++++++++++++++++++++++---------
 src/util/virnetdevtap.c     |  79 +++++++++++++++++++++++++-
 src/util/virnetdevtap.h     |   4 ++
 5 files changed, 186 insertions(+), 29 deletions(-)
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Michal