On 2/22/23 5:21 AM, Michal Prívozník wrote:
On 2/22/23 01:35, Laine Stump wrote:
> When a QEMU netdev is of type "stream", if the socket it uses for
> connectivity to the host network gets closed, then QEMU will send a
> NETDEV_STREAM_DISCONNECTED event. We know that any stream netdev we've
> created is backed by a passt process, and if the socket was closed,
> that means the passt process has disappeared.
>
> When we receive this event, we can respond by starting a new passt
> process with the same options (including socket path) we originally
> used. If we have previously created the stream netdev device with a
> "reconnect" option, then QEMU will automatically reconnect to this new
> passt process. (If we hadn't used "reconnect", then QEMU will never
> try to reconnect to the new passt process, so there's no point in
> starting it.)
>
> Note that NETDEV_STREAM_DISCONNECTED is an event sent for the netdev
> (ie "host side") of the network device, and so it sends the
> "netdev-id" to specify which device was disconnected. But libvirt's
> virDomainNetDef (the object used to keep track of network devices) is
> the internal representation of both the host-side "netdev", and the
> guest side device, and virDomainNetDef doesn't directly keep track of
> the netdev-id, only of the device's "alias" (which is the
"id"
> parameter of the *guest* side of the device). Fortunately, by convention
> libvirt always names the host-side of devices as "host" + alias, so in
> order to search for the affected NetDef, all we need to do is trim the
> 1st 4 characters from the netdev-id and look for the NetDef having
> that resulting trimmed string as its alias. (Contrast this to
> NIC_RX_FILTER_CHANGED, which is an event received for the guest side
> of the device, and so directly contains the device alias.)
>
> Resolves:
https://bugzilla.redhat.com/2172098
> Signed-off-by: Laine Stump <laine(a)redhat.com>
> ---
> src/qemu/qemu_domain.c | 1 +
> src/qemu/qemu_domain.h | 1 +
> src/qemu/qemu_driver.c | 82 ++++++++++++++++++++++++++++++++++++
> src/qemu/qemu_monitor.c | 11 +++++
> src/qemu/qemu_monitor.h | 6 +++
> src/qemu/qemu_monitor_json.c | 16 +++++++
> src/qemu/qemu_process.c | 18 ++++++++
> 7 files changed, 135 insertions(+)
>
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index e9bc0f375d..4cf9a259ea 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -11238,6 +11238,7 @@ qemuProcessEventFree(struct qemuProcessEvent *event)
> break;
> case QEMU_PROCESS_EVENT_WATCHDOG:
> case QEMU_PROCESS_EVENT_DEVICE_DELETED:
> + case QEMU_PROCESS_EVENT_NETDEV_STREAM_DISCONNECTED:
> case QEMU_PROCESS_EVENT_NIC_RX_FILTER_CHANGED:
> case QEMU_PROCESS_EVENT_SERIAL_CHANGED:
> case QEMU_PROCESS_EVENT_MONITOR_EOF:
> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
> index 1053d1d4cb..6adc067681 100644
> --- a/src/qemu/qemu_domain.h
> +++ b/src/qemu/qemu_domain.h
> @@ -447,6 +447,7 @@ typedef enum {
> QEMU_PROCESS_EVENT_WATCHDOG = 0,
> QEMU_PROCESS_EVENT_GUESTPANIC,
> QEMU_PROCESS_EVENT_DEVICE_DELETED,
> + QEMU_PROCESS_EVENT_NETDEV_STREAM_DISCONNECTED,
> QEMU_PROCESS_EVENT_NIC_RX_FILTER_CHANGED,
> QEMU_PROCESS_EVENT_SERIAL_CHANGED,
> QEMU_PROCESS_EVENT_JOB_STATUS_CHANGE,
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 6154fe9bfe..47d6a0dd95 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -40,6 +40,7 @@
> #include "qemu_hostdev.h"
> #include "qemu_hotplug.h"
> #include "qemu_monitor.h"
> +#include "qemu_passt.h"
> #include "qemu_process.h"
> #include "qemu_migration.h"
> #include "qemu_migration_params.h"
> @@ -3622,6 +3623,84 @@ processDeviceDeletedEvent(virQEMUDriver *driver,
> }
>
>
> +static void
> +processNetdevStreamDisconnectedEvent(virDomainObj *vm,
> + const char *netdevId)
> +{
> + virDomainDeviceDef dev;
> + virDomainNetDef *def;
> + qemuDomainObjPrivate *priv;
> + virQEMUCaps *qemuCaps;
> + const char *devAlias = NULL;
> +
> + /* The event sends us the "netdev-id", but we don't store the
> + * netdev-id in the NetDef and thus can't use it to find the
> + * correct NetDef. We *do* keep the device alias in the NetDef.
> + * By definition, the netdev-id is "host" + devAlias, so we just
> + * need to remove "host" from the front of netdev-id to get
> + * something we can use to find the proper NetDef.
> + */
> + if (STREQLEN(netdevId, "host", 4))
> + devAlias = &netdevId[4];
This is open coding STRSKIP():
devAlias = STRSKIP(netdevId, "host");
Ah yes, right you are! Changed.
> +
> + if (!devAlias) {
> + VIR_WARN("Received NETDEV_STREAM_DISCONNECTED event for unrecognized
netdev %s from domain %p %s",
> + netdevId, vm, vm->def->name);
> + return;
> + }
> +
> + VIR_DEBUG("Received NETDEV_STREAM_DISCONNECTED event for device %s from
domain %p %s",
> + devAlias, vm, vm->def->name);
> +
> + if (virDomainObjBeginJob(vm, VIR_JOB_QUERY) < 0)
> + return;
> +
> + if (!virDomainObjIsActive(vm)) {
> + VIR_DEBUG("Domain is not running");
> + goto endjob;
> + }
> +
> + if (virDomainDefFindDevice(vm->def, devAlias, &dev, true) < 0) {
> + VIR_WARN("NETDEV_STREAM_DISCONNECTED event received for non-existent
device %s in domain %s",
> + devAlias, vm->def->name);
> + goto endjob;
> + }
> + if (dev.type != VIR_DOMAIN_DEVICE_NET) {
> + VIR_WARN("NETDEV_STREAM_DISCONNECTED event received for non-network
device %s in domain %s",
> + devAlias, vm->def->name);
> + goto endjob;
> + }
> + def =
dev.data.net;
> +
> + if (def->backend.type != VIR_DOMAIN_NET_BACKEND_PASST) {
> + VIR_DEBUG("ignore NETDEV_STREAM_DISCONNECTED event for non-passt
network device %s in domain %s",
> + def->info.alias, vm->def->name);
> + goto endjob;
> + }
> +
> + priv = vm->privateData;
> + qemuCaps = priv->qemuCaps;
> +
These two can be done in variable declaration. Alternatively, if you
want to lose @priv variable completely you can go with:
virQEMUCaps *caps = QEMU_DOMAIN_PRIVATE(vm)->qemuCaps;
I had originally had them initialized, but moved the assignment down
until it was actually needed because of some warped idea about saving
cycles, or avoiding a double de-reference in case one of them might be
NULL or something. But in retrospect, just delaying wouldn't have solved
that problem anyway. I'm changing it to use QEMU_DOMAIN_PRIVATE before
pushing.
Thanks for the review!
> + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_NETDEV_STREAM_RECONNECT)) {
> + VIR_WARN("ignore NETDEV_STREAM_DISCONNECTED event for passt network
device %s in domain %s - QEMU binary does not support reconnect",
> + def->info.alias, vm->def->name);
> + goto endjob;
> + }
> +
> + /* handle the event - restart the passt process with its original
> + * parameters
> + */
> + VIR_DEBUG("process NETDEV_STREAM_DISCONNECTED event for network device %s
in domain %s",
> + def->info.alias, vm->def->name);
> +
> + if (qemuPasstStart(vm, def) < 0)
> + goto endjob;
> +
> + endjob:
> + virDomainObjEndJob(vm);
> +}
> +
Michal