[libvirt] [PATCH 0/3] Fix starting/stopping of netdevs when starting/stopping CPUs

The first two patches provide a cleaner, more complete fix to a fix that was recently pushed for https://bugzilla.redhat.com/show_bug.cgi?id=1081461 While the original patch does fix the symptoms in the report, applying these two additional patches handle some situations that weren't addressed by the original. Patch 3 fixes a similar problem with tap devices when the newly added macTableManager='libvirt' is used. It should make migration of guests with this new type of network connection work properly with no loss of network connectivity (including once post-copy migration is working). Laine Stump (3): qemu: always call qemuInterfaceStartDevices() when starting CPUs qemu: add a qemuInterfaceStopDevices(), called when guest CPUs stop qemu: add/remove bridge fdb entries as guest CPUs are started/stopped src/qemu/qemu_command.c | 10 ++---- src/qemu/qemu_hotplug.c | 8 +++++ src/qemu/qemu_interface.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_interface.h | 2 ++ src/qemu/qemu_process.c | 7 ++-- 5 files changed, 109 insertions(+), 10 deletions(-) -- 1.9.3

The patch that added qemuInterfaceStartDevices() (upstream commit 82977058f5b1d143a355079900029e9cbfee2fe4) had an extra conditional to prevent calling it if the reason for starting the CPUs was VIR_DOMAIN_RUNNING_UNPAUSED or VIR_DOMAIN_RUNNING_SAVE_CANCELED. This was put in by the author as the result of a reviewer asking if it was necessary to ifup the interfaces in *all* occasions (because these were the two cases where the CPU would have already been started (and stopped) once, so the interface would already be ifup'ed). It turns out that, as long as there is no corresponding qemuInterfaceStopDevices() to ifdown the interfaces anytime the CPUs are stopped, neglecting to ifup when reason is RUNNING_UNPAUSED or RUNNING_SAVE_CANCELED doesn't cause any problems (because it just happens that the interface will have already been ifup'ed by a prior call when the CPU was previously started for some other reason). However, it also doesn't *help*, and there will soon be a qemuInterfaceStopDevices() function which *will* ifdown these interfaces when the guest CPUs are stopped, and once that is done, the interfaces will be left down in some cases when they should be up (for example, if a domain is paused and then unpaused). So, this patch is removing the condition in favor of always calling qemuInterfaeStartDevices() when the guest CPUs are started. This patch (and the aforementioned patch) resolve: https://bugzilla.redhat.com/show_bug.cgi?id=1081461 --- src/qemu/qemu_process.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index ab4df9b..0028283 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -3126,9 +3126,7 @@ qemuProcessStartCPUs(virQEMUDriverPtr driver, virDomainObjPtr vm, virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver); /* Bring up netdevs before starting CPUs */ - if (reason != VIR_DOMAIN_RUNNING_UNPAUSED && - reason != VIR_DOMAIN_RUNNING_SAVE_CANCELED && - qemuInterfaceStartDevices(vm->def) < 0) + if (qemuInterfaceStartDevices(vm->def) < 0) goto cleanup; VIR_DEBUG("Using lock state '%s'", NULLSTR(priv->lockState)); -- 1.9.3

On 12/12/2014 11:34 AM, Laine Stump wrote:
The patch that added qemuInterfaceStartDevices() (upstream commit 82977058f5b1d143a355079900029e9cbfee2fe4) had an extra conditional to prevent calling it if the reason for starting the CPUs was VIR_DOMAIN_RUNNING_UNPAUSED or VIR_DOMAIN_RUNNING_SAVE_CANCELED. This was put in by the author as the result of a reviewer asking if it was necessary to ifup the interfaces in *all* occasions (because these were the two cases where the CPU would have already been started (and stopped) once, so the interface would already be ifup'ed).
It turns out that, as long as there is no corresponding qemuInterfaceStopDevices() to ifdown the interfaces anytime the CPUs are stopped, neglecting to ifup when reason is RUNNING_UNPAUSED or RUNNING_SAVE_CANCELED doesn't cause any problems (because it just happens that the interface will have already been ifup'ed by a prior call when the CPU was previously started for some other reason).
However, it also doesn't *help*, and there will soon be a qemuInterfaceStopDevices() function which *will* ifdown these interfaces when the guest CPUs are stopped, and once that is done, the interfaces will be left down in some cases when they should be up (for example, if a domain is paused and then unpaused).
So, this patch is removing the condition in favor of always calling qemuInterfaeStartDevices() when the guest CPUs are started.
This patch (and the aforementioned patch) resolve:
https://bugzilla.redhat.com/show_bug.cgi?id=1081461 --- src/qemu/qemu_process.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index ab4df9b..0028283 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -3126,9 +3126,7 @@ qemuProcessStartCPUs(virQEMUDriverPtr driver, virDomainObjPtr vm, virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
/* Bring up netdevs before starting CPUs */ - if (reason != VIR_DOMAIN_RUNNING_UNPAUSED && - reason != VIR_DOMAIN_RUNNING_SAVE_CANCELED && - qemuInterfaceStartDevices(vm->def) < 0) + if (qemuInterfaceStartDevices(vm->def) < 0) goto cleanup;
VIR_DEBUG("Using lock state '%s'", NULLSTR(priv->lockState));
I agreed to this in a separate thread & code looks good so: Reviewed by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>

We now have a qemuInterfaceStartDevices() which does the final activation needed for the host-side tap/macvtap devices that are used for qemu network connections. It will soon make sense to have the converse qemuInterfaceStopDevices() which will undo whatever was done during qemuInterfaceStartDevices(). A function to "stop" a single device has also been added, and is called from the appropriate place in qemuDomainDetachNetDevice(), although this is currently unnecessary - the device is going to immediately be deleted anyway, so any extra "deactivation" will be for naught. The call is included for completeness, though, in anticipation that in the future there may be some required action that *isn't* nullified by deleting the device. This patch is a part of a more complete fix for: https://bugzilla.redhat.com/show_bug.cgi?id=1081461 --- src/qemu/qemu_hotplug.c | 8 ++++++ src/qemu/qemu_interface.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_interface.h | 2 ++ src/qemu/qemu_process.c | 3 +++ 4 files changed, 79 insertions(+) diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 8c0642e..e56cffe 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3513,6 +3513,14 @@ qemuDomainDetachNetDevice(virQEMUDriverPtr driver, VIR_WARN("cannot clear bandwidth setting for device : %s", detach->ifname); + /* deactivate the tap/macvtap device on the host (currently this + * isn't necessary, as everything done in + * qemuInterfaceStopDevice() is made meaningless when the device + * is deleted anyway, but in the future it may be important, and + * doesn't hurt anything for now) + */ + ignore_value(qemuInterfaceStopDevice(detach)); + qemuDomainMarkDeviceForRemoval(vm, &detach->info); qemuDomainObjEnterMonitor(driver, vm); diff --git a/src/qemu/qemu_interface.c b/src/qemu/qemu_interface.c index b0f0c5d..b9694c8 100644 --- a/src/qemu/qemu_interface.c +++ b/src/qemu/qemu_interface.c @@ -98,3 +98,69 @@ qemuInterfaceStartDevices(virDomainDefPtr def) } return 0; } + + +/** + * qemuInterfaceStopDevice: + * @net: net device to stop + * + * Based upon the type of device provided, perform the appropriate + * work to deactivate the device so that packets aren't forwarded to + * it from the rest of the network. + */ +int +qemuInterfaceStopDevice(virDomainNetDefPtr net) +{ + int ret = -1; + + switch (virDomainNetGetActualType(net)) { + case VIR_DOMAIN_NET_TYPE_BRIDGE: + case VIR_DOMAIN_NET_TYPE_NETWORK: + break; + + case VIR_DOMAIN_NET_TYPE_DIRECT: + /* macvtap interfaces need to be marked !IFF_UP (ie "down") to + * prevent any host-generated traffic sent from this interface + * from putting bad info into the arp caches of other machines + * on this network. + */ + if (virNetDevSetOnline(net->ifname, false) < 0) + goto cleanup; + break; + + case VIR_DOMAIN_NET_TYPE_USER: + case VIR_DOMAIN_NET_TYPE_ETHERNET: + case VIR_DOMAIN_NET_TYPE_VHOSTUSER: + case VIR_DOMAIN_NET_TYPE_SERVER: + case VIR_DOMAIN_NET_TYPE_CLIENT: + case VIR_DOMAIN_NET_TYPE_MCAST: + case VIR_DOMAIN_NET_TYPE_INTERNAL: + case VIR_DOMAIN_NET_TYPE_HOSTDEV: + case VIR_DOMAIN_NET_TYPE_LAST: + /* these types all require no action */ + break; + } + + ret = 0; + cleanup: + return ret; +} + +/** + * qemuInterfaceStopDevices: + * @def: domain definition + * + * Make all interfaces associated with this domain inaccessible from + * the rest of the network. + */ +int +qemuInterfaceStopDevices(virDomainDefPtr def) +{ + size_t i; + + for (i = 0; i < def->nnets; i++) { + if (qemuInterfaceStopDevice(def->nets[i]) < 0) + return -1; + } + return 0; +} diff --git a/src/qemu/qemu_interface.h b/src/qemu/qemu_interface.h index d040f52..b4c1efc 100644 --- a/src/qemu/qemu_interface.h +++ b/src/qemu/qemu_interface.h @@ -28,5 +28,7 @@ int qemuInterfaceStartDevice(virDomainNetDefPtr net); int qemuInterfaceStartDevices(virDomainDefPtr def); +int qemuInterfaceStopDevice(virDomainNetDefPtr net); +int qemuInterfaceStopDevices(virDomainDefPtr def); #endif /* __QEMU_INTERFACE_H__ */ diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 0028283..a19e71a 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -3182,6 +3182,9 @@ int qemuProcessStopCPUs(virQEMUDriverPtr driver, if (ret < 0) goto cleanup; + /* de-activate netdevs after stopping CPUs */ + ignore_value(qemuInterfaceStopDevices(vm->def)); + if (priv->job.current) ignore_value(virTimeMillisNow(&priv->job.current->stopped)); -- 1.9.3

On 12/12/2014 11:34 AM, Laine Stump wrote:
We now have a qemuInterfaceStartDevices() which does the final activation needed for the host-side tap/macvtap devices that are used for qemu network connections. It will soon make sense to have the converse qemuInterfaceStopDevices() which will undo whatever was done during qemuInterfaceStartDevices().
A function to "stop" a single device has also been added, and is called from the appropriate place in qemuDomainDetachNetDevice(), although this is currently unnecessary - the device is going to immediately be deleted anyway, so any extra "deactivation" will be for naught. The call is included for completeness, though, in anticipation that in the future there may be some required action that *isn't* nullified by deleting the device.
This patch is a part of a more complete fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=1081461 --- src/qemu/qemu_hotplug.c | 8 ++++++ src/qemu/qemu_interface.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_interface.h | 2 ++ src/qemu/qemu_process.c | 3 +++ 4 files changed, 79 insertions(+)
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 8c0642e..e56cffe 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3513,6 +3513,14 @@ qemuDomainDetachNetDevice(virQEMUDriverPtr driver, VIR_WARN("cannot clear bandwidth setting for device : %s", detach->ifname);
+ /* deactivate the tap/macvtap device on the host (currently this + * isn't necessary, as everything done in + * qemuInterfaceStopDevice() is made meaningless when the device + * is deleted anyway, but in the future it may be important, and + * doesn't hurt anything for now) + */ + ignore_value(qemuInterfaceStopDevice(detach)); + qemuDomainMarkDeviceForRemoval(vm, &detach->info);
qemuDomainObjEnterMonitor(driver, vm); diff --git a/src/qemu/qemu_interface.c b/src/qemu/qemu_interface.c index b0f0c5d..b9694c8 100644 --- a/src/qemu/qemu_interface.c +++ b/src/qemu/qemu_interface.c @@ -98,3 +98,69 @@ qemuInterfaceStartDevices(virDomainDefPtr def) } return 0; } + + +/** + * qemuInterfaceStopDevice: + * @net: net device to stop + * + * Based upon the type of device provided, perform the appropriate + * work to deactivate the device so that packets aren't forwarded to + * it from the rest of the network. + */ +int +qemuInterfaceStopDevice(virDomainNetDefPtr net) +{ + int ret = -1; + + switch (virDomainNetGetActualType(net)) { + case VIR_DOMAIN_NET_TYPE_BRIDGE: + case VIR_DOMAIN_NET_TYPE_NETWORK: + break; + + case VIR_DOMAIN_NET_TYPE_DIRECT: + /* macvtap interfaces need to be marked !IFF_UP (ie "down") to + * prevent any host-generated traffic sent from this interface + * from putting bad info into the arp caches of other machines + * on this network. + */ + if (virNetDevSetOnline(net->ifname, false) < 0) + goto cleanup; + break; + + case VIR_DOMAIN_NET_TYPE_USER: + case VIR_DOMAIN_NET_TYPE_ETHERNET: + case VIR_DOMAIN_NET_TYPE_VHOSTUSER: + case VIR_DOMAIN_NET_TYPE_SERVER: + case VIR_DOMAIN_NET_TYPE_CLIENT: + case VIR_DOMAIN_NET_TYPE_MCAST: + case VIR_DOMAIN_NET_TYPE_INTERNAL: + case VIR_DOMAIN_NET_TYPE_HOSTDEV: + case VIR_DOMAIN_NET_TYPE_LAST: + /* these types all require no action */ + break; + } + + ret = 0; + cleanup: + return ret; +} + +/** + * qemuInterfaceStopDevices: + * @def: domain definition + * + * Make all interfaces associated with this domain inaccessible from + * the rest of the network. + */ +int +qemuInterfaceStopDevices(virDomainDefPtr def) +{ + size_t i; + + for (i = 0; i < def->nnets; i++) { + if (qemuInterfaceStopDevice(def->nets[i]) < 0) + return -1; + } + return 0; +} diff --git a/src/qemu/qemu_interface.h b/src/qemu/qemu_interface.h index d040f52..b4c1efc 100644 --- a/src/qemu/qemu_interface.h +++ b/src/qemu/qemu_interface.h @@ -28,5 +28,7 @@
int qemuInterfaceStartDevice(virDomainNetDefPtr net); int qemuInterfaceStartDevices(virDomainDefPtr def); +int qemuInterfaceStopDevice(virDomainNetDefPtr net); +int qemuInterfaceStopDevices(virDomainDefPtr def);
#endif /* __QEMU_INTERFACE_H__ */ diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 0028283..a19e71a 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -3182,6 +3182,9 @@ int qemuProcessStopCPUs(virQEMUDriverPtr driver, if (ret < 0) goto cleanup;
+ /* de-activate netdevs after stopping CPUs */ + ignore_value(qemuInterfaceStopDevices(vm->def)); + if (priv->job.current) ignore_value(virTimeMillisNow(&priv->job.current->stopped));
Reviewed by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>

When libvirt is managing a bridge's forwarding database (FDB) (macTableManager='libvirt'), if we add FDB entries for a new guest interface even before the qemu process is created, then in the case of a migration any other guest attached to the "destination" bridge will have its traffic immediately sent to the destination of the migration even while the source domain is still running (and the destination, of course, isn't). To make sure that traffic from other guests on the new host continues flowing to the old guest until the new one is ready, we have to wait until the new guest CPUs are started to add the FDB entries. Conversely, we need to remove the FDB entries from the bridge any time the guest CPUs are stopped; among other things, this will assure proper operation during a post-copy migration (which is just the opposite of the problem described in the previous paragraph). --- src/qemu/qemu_command.c | 10 +++------- src/qemu/qemu_interface.c | 26 ++++++++++++++++++++++++++ 2 files changed, 29 insertions(+), 7 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 48bdf4e..f4d2f85 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -335,18 +335,14 @@ qemuNetworkIfaceConnect(virDomainDefPtr def, /* libvirt is managing the FDB of the bridge this device * is attaching to, so we need to turn off learning and * unicast_flood on the device to prevent the kernel from - * adding any FDB entries for it, then add an fdb entry - * outselves, using the MAC address from the interface - * config. + * adding any FDB entries for it. We will add add an fdb + * entry ourselves (during qemuInterfaceStartDevices(), + * using the MAC address from the interface config. */ if (virNetDevBridgePortSetLearning(brname, net->ifname, false) < 0) goto cleanup; if (virNetDevBridgePortSetUnicastFlood(brname, net->ifname, false) < 0) goto cleanup; - if (virNetDevBridgeFDBAdd(&net->mac, net->ifname, - VIR_NETDEVBRIDGE_FDB_FLAG_MASTER | - VIR_NETDEVBRIDGE_FDB_FLAG_TEMP) < 0) - goto cleanup; } } else { if (qemuCreateInBridgePortWithHelper(cfg, brname, diff --git a/src/qemu/qemu_interface.c b/src/qemu/qemu_interface.c index b9694c8..4e16521 100644 --- a/src/qemu/qemu_interface.c +++ b/src/qemu/qemu_interface.c @@ -23,10 +23,12 @@ #include <config.h> +#include "network_conf.h" #include "qemu_interface.h" #include "virnetdev.h" #include "virnetdevtap.h" #include "virnetdevmacvlan.h" +#include "virnetdevbridge.h" #include "virnetdevvportprofile.h" /** @@ -45,6 +47,20 @@ qemuInterfaceStartDevice(virDomainNetDefPtr net) switch (virDomainNetGetActualType(net)) { case VIR_DOMAIN_NET_TYPE_BRIDGE: case VIR_DOMAIN_NET_TYPE_NETWORK: + if (virDomainNetGetActualBridgeMACTableManager(net) + == VIR_NETWORK_BRIDGE_MAC_TABLE_MANAGER_LIBVIRT) { + /* libvirt is managing the FDB of the bridge this device + * is attaching to, so we have turned off learning and + * unicast_flood on the device to prevent the kernel from + * adding any FDB entries for it. This means we need to + * add an fdb entry ourselves, using the MAC address from + * the interface config. + */ + if (virNetDevBridgeFDBAdd(&net->mac, net->ifname, + VIR_NETDEVBRIDGE_FDB_FLAG_MASTER | + VIR_NETDEVBRIDGE_FDB_FLAG_TEMP) < 0) + goto cleanup; + } break; case VIR_DOMAIN_NET_TYPE_DIRECT: /* macvtap devices share their MAC address with the guest @@ -116,6 +132,16 @@ qemuInterfaceStopDevice(virDomainNetDefPtr net) switch (virDomainNetGetActualType(net)) { case VIR_DOMAIN_NET_TYPE_BRIDGE: case VIR_DOMAIN_NET_TYPE_NETWORK: + if (virDomainNetGetActualBridgeMACTableManager(net) + == VIR_NETWORK_BRIDGE_MAC_TABLE_MANAGER_LIBVIRT) { + /* remove the FDB entries that were added during + * qemuInterfaceStartDevices() + */ + if (virNetDevBridgeFDBDel(&net->mac, net->ifname, + VIR_NETDEVBRIDGE_FDB_FLAG_MASTER | + VIR_NETDEVBRIDGE_FDB_FLAG_TEMP) < 0) + goto cleanup; + } break; case VIR_DOMAIN_NET_TYPE_DIRECT: -- 1.9.3

On 12.12.2014 17:34, Laine Stump wrote:
When libvirt is managing a bridge's forwarding database (FDB) (macTableManager='libvirt'), if we add FDB entries for a new guest interface even before the qemu process is created, then in the case of a migration any other guest attached to the "destination" bridge will have its traffic immediately sent to the destination of the migration even while the source domain is still running (and the destination, of course, isn't). To make sure that traffic from other guests on the new host continues flowing to the old guest until the new one is ready, we have to wait until the new guest CPUs are started to add the FDB entries.
Conversely, we need to remove the FDB entries from the bridge any time the guest CPUs are stopped; among other things, this will assure proper operation during a post-copy migration (which is just the opposite of the problem described in the previous paragraph). --- src/qemu/qemu_command.c | 10 +++------- src/qemu/qemu_interface.c | 26 ++++++++++++++++++++++++++ 2 files changed, 29 insertions(+), 7 deletions(-)
ACK Michal
participants (3)
-
Laine Stump
-
Matthew Rosato
-
Michal Privoznik