[PATCH safe for 10.10.0] qemu: re-use existing ActualNetDef for more interface types during update-device

For the full history behind this patch, look at the following: https://issues.redhat.com/browse/RHEL-7036 commit v10.7.0-101-ga37bd2a15b commit v10.8.0-rc2-8-gbcd5ae4e73 Summary: original problem was unexpected failure of update-device when the user hadn't changed anything other than online status of the guest NIC (which should always be allowed). The first commit "fixed" this by avoiding the allocation of a new ActualNetDef (i.e. creating a new networkport) for *all* network device updates (because that was inappropriately changing which ethernet physdev should be used for a macvtap connection, which by design can't be handled in an update-device). But this commit caused a regression for update-device of bridge-based network devices (because some the updates of certain attributes *do* require the ActualNetDef be re-allocated), so... The 2nd commit narrowed the list of network types that get the "don't allocate new ActualNetDef" treatment (so that only interfaces connected to a network that uses a pool of ethernet VFs *being used in passthrough mode* qualify). But then it was pointed out that this re-broke simple updates of devices that used a direct/macvtap network in "bridge" mode (because it's possible to list multiple physdevs to use for bridge mode, in which case the network driver attempts to "load balance" (and so a new allocation might have a different ethernet physdev which, again, can't be supported in a device-update). So this (single line of code) patch *widens* the list of network types that don't allocate a new ActualNetDef to also include the other direct (macvtap) modes, e.g. bridge, private, etc. Signed-off-by: Laine Stump <laine@redhat.com> --- There is a more comprehensive fix that also, e.g., makes updating the bandwidth or vlan info of a direct interface work correctly, but that is much more invasive (and also isn't done yet). This patch fixes the case of updating a direct interface's online status (for example) without breaking anything else. src/qemu/qemu_hotplug.c | 38 +++++++++++++++++++++----------------- 1 file changed, 21 insertions(+), 17 deletions(-) diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 3c18af6b0c..ff8c1263c6 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3935,25 +3935,29 @@ qemuDomainChangeNet(virQEMUDriver *driver, if (newdev->type == VIR_DOMAIN_NET_TYPE_NETWORK) { if (olddev->type == VIR_DOMAIN_NET_TYPE_NETWORK && oldType == VIR_DOMAIN_NET_TYPE_DIRECT && - virDomainNetGetActualDirectMode(olddev) == VIR_NETDEV_MACVLAN_MODE_PASSTHRU && STREQ(olddev->data.network.name, newdev->data.network.name)) { /* old and new are type='network', and the network name - * hasn't changed *and* this is a network where each - * connection is allocated exclusive use of a VF - * device. In this case we *don't* want to get a new port - * ("actual device") from the network because attempting - * to allocate a new device would also allocate a - * new/different VF, causing the update to fail. And - * anyway we can use olddev's actualNetDef (since it - * hasn't changed). - * - * So instead we just duplicate *the pointer to* the - * actualNetDef from olddev to newdev so that comparisons - * of actualNetDef will show no change. If the update is - * successful, we will clear the actualNetDef pointer from - * olddev before destroying it (or if the update fails, - * then we need to clear the pointer from newdev before - * destroying it) + * hasn't changed *and* this is a "direct" network (a pool + * of 1 or more host ethernet devices where each guest + * interface is allocated one device that it connects to + * via macvtap. In this case we *don't* want to get a new + * port ("actual device") from the network because + * attempting to allocate a new device would also allocate + * a new/different ethernet, causing the update to fail + * (because the physical device of a macvtap-based + * interface can't be changed without completely + * unplugging and re-plugging the guest NIC). + + + * We can work around this issue by just re-using olddev's + * actualNetDef (since it hasn't changed) rather than + * allocating a new one. We just duplicate *the pointer + * to* the actualNetDef from olddev to newdev so that + * comparisons of actualNetDef will show no change. If the + * update is successful, we will clear the actualNetDef + * pointer from olddev before destroying it (or if the + * update fails, then we need to clear the pointer from + * newdev before destroying it) */ newdev->data.network.actual = olddev->data.network.actual; memcpy(newdev->data.network.portid, olddev->data.network.portid, -- 2.47.0

On 11/27/24 20:34, Laine Stump wrote:
For the full history behind this patch, look at the following:
https://issues.redhat.com/browse/RHEL-7036 commit v10.7.0-101-ga37bd2a15b commit v10.8.0-rc2-8-gbcd5ae4e73
Summary: original problem was unexpected failure of update-device when the user hadn't changed anything other than online status of the guest NIC (which should always be allowed).
The first commit "fixed" this by avoiding the allocation of a new ActualNetDef (i.e. creating a new networkport) for *all* network device updates (because that was inappropriately changing which ethernet physdev should be used for a macvtap connection, which by design can't be handled in an update-device).
But this commit caused a regression for update-device of bridge-based network devices (because some the updates of certain attributes *do* require the ActualNetDef be re-allocated), so...
The 2nd commit narrowed the list of network types that get the "don't allocate new ActualNetDef" treatment (so that only interfaces connected to a network that uses a pool of ethernet VFs *being used in passthrough mode* qualify).
But then it was pointed out that this re-broke simple updates of devices that used a direct/macvtap network in "bridge" mode (because it's possible to list multiple physdevs to use for bridge mode, in which case the network driver attempts to "load balance" (and so a new allocation might have a different ethernet physdev which, again, can't be supported in a device-update).
So this (single line of code) patch *widens* the list of network types that don't allocate a new ActualNetDef to also include the other direct (macvtap) modes, e.g. bridge, private, etc.
Signed-off-by: Laine Stump <laine@redhat.com> ---
There is a more comprehensive fix that also, e.g., makes updating the bandwidth or vlan info of a direct interface work correctly, but that is much more invasive (and also isn't done yet). This patch fixes the case of updating a direct interface's online status (for example) without breaking anything else.
src/qemu/qemu_hotplug.c | 38 +++++++++++++++++++++----------------- 1 file changed, 21 insertions(+), 17 deletions(-)
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Sorry for letting this miss the release. Michal

On 12/12/24 5:13 AM, Michal Prívozník wrote:
On 11/27/24 20:34, Laine Stump wrote:
For the full history behind this patch, look at the following:
https://issues.redhat.com/browse/RHEL-7036 commit v10.7.0-101-ga37bd2a15b commit v10.8.0-rc2-8-gbcd5ae4e73
Summary: original problem was unexpected failure of update-device when the user hadn't changed anything other than online status of the guest NIC (which should always be allowed).
The first commit "fixed" this by avoiding the allocation of a new ActualNetDef (i.e. creating a new networkport) for *all* network device updates (because that was inappropriately changing which ethernet physdev should be used for a macvtap connection, which by design can't be handled in an update-device).
But this commit caused a regression for update-device of bridge-based network devices (because some the updates of certain attributes *do* require the ActualNetDef be re-allocated), so...
The 2nd commit narrowed the list of network types that get the "don't allocate new ActualNetDef" treatment (so that only interfaces connected to a network that uses a pool of ethernet VFs *being used in passthrough mode* qualify).
But then it was pointed out that this re-broke simple updates of devices that used a direct/macvtap network in "bridge" mode (because it's possible to list multiple physdevs to use for bridge mode, in which case the network driver attempts to "load balance" (and so a new allocation might have a different ethernet physdev which, again, can't be supported in a device-update).
So this (single line of code) patch *widens* the list of network types that don't allocate a new ActualNetDef to also include the other direct (macvtap) modes, e.g. bridge, private, etc.
Signed-off-by: Laine Stump <laine@redhat.com> ---
There is a more comprehensive fix that also, e.g., makes updating the bandwidth or vlan info of a direct interface work correctly, but that is much more invasive (and also isn't done yet). This patch fixes the case of updating a direct interface's online status (for example) without breaking anything else.
src/qemu/qemu_hotplug.c | 38 +++++++++++++++++++++----------------- 1 file changed, 21 insertions(+), 17 deletions(-)
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Sorry for letting this miss the release.
Not a big problem; I'm guessing that not many people actually use the specific configuration this patch targets, and among those even fewer will need to be externally fiddling with the guest interfaces' online status. It will still be good to have the patch in there - even if/when I get the more comprehensive fix pushed, someone may need this simpler fix backported to a prior release. Thanks for the review!
participants (2)
-
Laine Stump
-
Michal Prívozník