[PATCH v2 00/12] qemu: support passt as a backend for vhost-user network interfaces

====
Changes from V1:

* fixed missing change to error log message pointed out by abologna

* added a validation check to ensure that shared memory is enabled if
  there is a type='vhostuser' interface in the domain definition

* included a patch documenting differences between type='user' SLIRP
  and passt behaviors (because I had to do it anyway, and the
  reorganization made documenting type='vhostuser' passt slightly
  easier)

* added documentation for type='vhostuser' backend type='passt'
=====

passt (https://passt.top) provides a method of connecting QEMU virtual
machines to the external network without requiring special privileges
or capabilities of any participating processes - even libvirt itself
can run unprivileged and create an instance of passt (which *always*
runs unprivileged) that is then connected to the qemu process (and
thus the virtual machine) with a unix socket.

Originally passt used its own protocol for this socket, sending both
control messages and data packets over the socket. This works, and is
already much more efficient than slirp, which was previously the only
unprivileged networking solution. But recently passt added support for
using the vhost-user protocol for communication between the passt
process (which is connected to the external network) and the QEMU
process (and thus the VM). vhost-user also uses a unix socket, but
only for control plane messages - all data packets are "sent" between
the VM and passt process via a shared memory region. This is
unsurprisingly much more efficient.

From the point of view of QEMU, the passt process looks identical to
any normal vhost-user backend, so we can run QEMU with exactly the
same interface commandline options as normal vhost-user. Also, the
passt process supports all of the same options as it does when used in
its "traditional" mode, so really in the end all we need to do is
twist libvirt around so that when <backend type='passt'/> is specified
for an <interface type='vhostuser'>, it will run passt just as before
(except with the added "--vhost-user" option so that passt will know
to use that), and then force feed the vhost-user code in libvirt with
the same socket path used by passt.

This series does that, while also switching up a few bits of code
prior to adding in the new functionality.

So far this has been tested both unprivileged and privileged on Fedora
40 (with latest passt package) and selinux enabled (there are a couple
of selinux policy tweaks that still need to be pushed to
passt-selinux) as well as unprivileged on debian (I *think* with
AppArmor enabled) and everything seems to work. (I haven't gotten to
testing hotplug, but it *should* work, and I'll be testing it while
(hopefully) someone is reviewing these patches.)

To test, you will need the latest (20250121) passt package and the
aforementioned upstream passt-selinux patch if you're using selinux.

This Resolves: https://issues.redhat.com/browse/RHEL-69455

Laine Stump (12):
  conf: change virDomainHostdevInsert() to return void
  qemu: fix qemu validation to forbid guest-side IP address for
    type='vdpa'
  qemu: validate that model is virtio for vhostuser and vdpa interfaces
    in the same place
  qemu: automatically set model type='virtio' for interface
    type='vhostuser'
  qemu: do all vhostuser attribute validation in qemu driver
  conf/qemu: make <source> element *almost* optional for type=vhostuser
  qemu: use switch instead of if in qemuProcessPrepareDomainNetwork()
  qemu: make qemuPasstCreateSocketPath() public
  qemu: complete vhostuser + passt support
  qemu: fail validation if a domain def has vhostuser/passt but no
    shared mem
  docs: improve type='user' docs to higlight differences between SLIRP
    and passt
  docs: document using passt backend with <interface type='vhostuser'>

 docs/formatdomain.rst                         | 189 +++++++++++++-----
 src/conf/domain_conf.c                        | 107 +++++-----
 src/conf/domain_conf.h                        |   2 +-
 src/conf/domain_validate.c                    |  85 +++-----
 src/conf/schemas/domaincommon.rng             |  32 ++-
 src/libxl/libxl_domain.c                      |   5 +-
 src/libxl/libxl_driver.c                      |   3 +-
 src/lxc/lxc_driver.c                          |   3 +-
 src/qemu/qemu_command.c                       |   7 +-
 src/qemu/qemu_driver.c                        |   3 +-
 src/qemu/qemu_extdevice.c                     |   6 +-
 src/qemu/qemu_hotplug.c                       |  21 +-
 src/qemu/qemu_passt.c                         |   5 +-
 src/qemu/qemu_passt.h                         |   3 +
 src/qemu/qemu_postparse.c                     |   3 +-
 src/qemu/qemu_process.c                       |  85 +++++---
 src/qemu/qemu_validate.c                      |  65 ++++--
 ...t-user-slirp-portforward.x86_64-latest.err |   2 +-
 ...vhostuser-passt-no-shmem.x86_64-latest.err |   1 +
 .../net-vhostuser-passt-no-shmem.xml          |  70 +++++++
 .../net-vhostuser-passt.x86_64-latest.args    |  42 ++++
 .../net-vhostuser-passt.x86_64-latest.xml     |  75 +++++++
 tests/qemuxmlconfdata/net-vhostuser-passt.xml |  73 +++++++
 tests/qemuxmlconftest.c                       |   2 +
 24 files changed, 657 insertions(+), 232 deletions(-)
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt-no-shmem.x86_64-latest.err
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt-no-shmem.xml
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.args
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.xml
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.xml

-- 
2.47.1

We haven't checked for memalloc failure in many years, and that was the only reason this function would have ever failed. Signed-off-by: Laine Stump <laine@redhat.com> --- src/conf/domain_conf.c | 15 +++++---------- src/conf/domain_conf.h | 2 +- src/libxl/libxl_domain.c | 5 +---- src/libxl/libxl_driver.c | 3 +-- src/lxc/lxc_driver.c | 3 +-- src/qemu/qemu_driver.c | 3 +-- src/qemu/qemu_process.c | 3 +-- 7 files changed, 11 insertions(+), 23 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 87f87bbe56..50dc4a33a6 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -14465,12 +14465,10 @@ virDomainChrTargetTypeToString(int deviceType, return type; } -int +void virDomainHostdevInsert(virDomainDef *def, virDomainHostdevDef *hostdev) { VIR_APPEND_ELEMENT(def->hostdevs, def->nhostdevs, hostdev); - - return 0; } virDomainHostdevDef * @@ -14886,9 +14884,8 @@ virDomainDiskRemoveByName(virDomainDef *def, const char *name) int virDomainNetInsert(virDomainDef *def, virDomainNetDef *net) { /* hostdev net devices must also exist in the hostdevs array */ - if (net->type == VIR_DOMAIN_NET_TYPE_HOSTDEV && - virDomainHostdevInsert(def, &net->data.hostdev.def) < 0) - return -1; + if (net->type == VIR_DOMAIN_NET_TYPE_HOSTDEV) + virDomainHostdevInsert(def, &net->data.hostdev.def); VIR_APPEND_ELEMENT(def->nets, def->nnets, net); return 0; @@ -19281,10 +19278,8 @@ virDomainDefParseXML(xmlXPathContextPtr ctxt, * where the actual network type is already known to be * hostdev) must also be in the hostdevs array. 
*/ - if (virDomainNetGetActualType(net) == VIR_DOMAIN_NET_TYPE_HOSTDEV && - virDomainHostdevInsert(def, virDomainNetGetActualHostdev(net)) < 0) { - return NULL; - } + if (virDomainNetGetActualType(net) == VIR_DOMAIN_NET_TYPE_HOSTDEV) + virDomainHostdevInsert(def, virDomainNetGetActualHostdev(net)); } VIR_FREE(nodes); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index e51c74b6d1..9da6586e66 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -3994,7 +3994,7 @@ virDomainNetDef *virDomainNetRemove(virDomainDef *def, size_t i); virDomainNetDef *virDomainNetRemoveByObj(virDomainDef *def, virDomainNetDef *net); void virDomainNetRemoveHostdev(virDomainDef *def, virDomainNetDef *net); -int virDomainHostdevInsert(virDomainDef *def, virDomainHostdevDef *hostdev); +void virDomainHostdevInsert(virDomainDef *def, virDomainHostdevDef *hostdev); virDomainHostdevDef * virDomainHostdevRemove(virDomainDef *def, size_t i); int virDomainHostdevFind(virDomainDef *def, virDomainHostdevDef *match, diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index a049cdb30f..6805160923 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -1014,10 +1014,7 @@ libxlNetworkPrepareDevices(virDomainDef *def) /* Each type='hostdev' network device must also have a * corresponding entry in the hostdevs array. 
*/ - virDomainHostdevDef *hostdev = virDomainNetGetActualHostdev(net); - - if (virDomainHostdevInsert(def, hostdev) < 0) - return -1; + virDomainHostdevInsert(def, virDomainNetGetActualHostdev(net)); } } diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c index bd858d8127..edf7b37581 100644 --- a/src/libxl/libxl_driver.c +++ b/src/libxl/libxl_driver.c @@ -3574,8 +3574,7 @@ libxlDomainAttachDeviceConfig(virDomainDef *vmdef, virDomainDeviceDef *dev) return -1; } - if (virDomainHostdevInsert(vmdef, hostdev) < 0) - return -1; + virDomainHostdevInsert(vmdef, hostdev); dev->data.hostdev = NULL; break; diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index e63732dbea..22266c1ab6 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ -2993,8 +2993,7 @@ lxcDomainAttachDeviceConfig(virDomainDef *vmdef, _("device is already in the domain configuration")); return -1; } - if (virDomainHostdevInsert(vmdef, hostdev) < 0) - return -1; + virDomainHostdevInsert(vmdef, hostdev); dev->data.hostdev = NULL; ret = 0; break; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 772cb405d6..1d0da1028f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -6732,8 +6732,7 @@ qemuDomainAttachDeviceConfig(virDomainDef *vmdef, _("device is already in the domain configuration")); return -1; } - if (virDomainHostdevInsert(vmdef, hostdev)) - return -1; + virDomainHostdevInsert(vmdef, hostdev); dev->data.hostdev = NULL; break; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index d015285b0d..910229a616 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -5928,8 +5928,7 @@ qemuProcessPrepareDomainNetwork(virDomainObj *vm) if (qemuDomainPrepareHostdev(hostdev, priv) < 0) return -1; - if (virDomainHostdevInsert(def, hostdev) < 0) - return -1; + virDomainHostdevInsert(def, hostdev); } } return 0; -- 2.47.1
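
The reasoning behind the void return can be shown in a standalone sketch. It assumes, as the commit message implies, that the append helper never reports an allocation failure to its caller. The int_list type and int_list_append() below are hypothetical stand-ins, not libvirt API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an array such as def->hostdevs. */
typedef struct {
    int *items;
    size_t nitems;
} int_list;

/* Grow the array by one element. Allocation failure aborts the
 * process instead of being reported, so there is no error for the
 * caller to check and the function can return void. */
void int_list_append(int_list *list, int value)
{
    int *grown;

    assert(list != NULL);
    grown = realloc(list->items, (list->nitems + 1) * sizeof(*grown));
    if (!grown)
        abort();
    list->items = grown;
    list->items[list->nitems++] = value;
}
```

With a helper of this shape, a caller that still wrote "if (append(...) < 0) return -1;" would be carrying a branch that can never be taken, which is what the hunks above remove.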

Because all the checks for VIR_DOMAIN_NET_TYPE_VDPA were inside an
else-if clause that was immediately followed by another else-if clause
that forbade setting guestIP.ips or guestIP.routes, we've been allowing
users to set guestIP.* for vdpa interfaces (but then not doing the
validation of the attributes that should have been done if we *did*
support setting IPs for vdpa; we don't anyway, so :shrug:).

This can be fixed by turning the vdpa else-if clause into a top-level
if - this way vdpa interfaces will hit the "else if
(net->guestIP.nips)" clause and reject guest-side IP address setting.

Also, since there are currently *no* interface types for QEMU that
support adding guest-side routes, we put that check by itself (I think
it may be possible to set some guest routes for passt interfaces, but
we don't do that).

Signed-off-by: Laine Stump <laine@redhat.com>
---
 src/qemu/qemu_validate.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c
index 76f2eafe49..06093bc42b 100644
--- a/src/qemu/qemu_validate.c
+++ b/src/qemu/qemu_validate.c
@@ -1745,6 +1745,12 @@ qemuValidateDomainDeviceDefNetwork(const virDomainNetDef *net,
     bool hasIPv6 = false;
     size_t i;
 
+    if (net->guestIP.nroutes) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Invalid attempt to set network interface guest-side IP route, not supported by QEMU"));
+        return -1;
+    }
+
     if (net->type == VIR_DOMAIN_NET_TYPE_USER) {
         virDomainCapsDeviceNet netCaps = { };
 
@@ -1758,12 +1764,6 @@ qemuValidateDomainDeviceDefNetwork(const virDomainNetDef *net,
             return -1;
         }
 
-        if (net->guestIP.nroutes) {
-            virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
-                           _("Invalid attempt to set network interface guest-side IP route, not supported by QEMU"));
-            return -1;
-        }
-
         for (i = 0; i < net->guestIP.nips; i++) {
             const virNetDevIPAddr *ip = net->guestIP.ips[i];
 
@@ -1811,7 +1811,13 @@ qemuValidateDomainDeviceDefNetwork(const virDomainNetDef *net,
                 }
             }
         }
-    } else if (net->type == VIR_DOMAIN_NET_TYPE_VDPA) {
+    } else if (net->guestIP.nips) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Invalid attempt to set network interface guest-side IP address info, not supported by QEMU"));
+        return -1;
+    }
+
+    if (net->type == VIR_DOMAIN_NET_TYPE_VDPA) {
         if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_NETDEV_VHOST_VDPA)) {
             virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                            _("vDPA devices are not supported with this QEMU binary"));
@@ -1825,10 +1831,6 @@ qemuValidateDomainDeviceDefNetwork(const virDomainNetDef *net,
                            virDomainNetModelTypeToString(net->model));
             return -1;
         }
-    } else if (net->guestIP.nroutes || net->guestIP.nips) {
-        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
-                       _("Invalid attempt to set network interface guest-side IP route and/or address info, not supported by QEMU"));
-        return -1;
     }
 
     if (virDomainNetIsVirtioModel(net)) {
-- 
2.47.1
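
The control-flow problem fixed by this patch can be reproduced in isolation. The sketch below is not libvirt code: the enum, struct, and function names are hypothetical stand-ins, and only the shape of the if/else-if chains follows the diff above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for virDomainNetType and
 * virDomainNetDef. */
typedef enum { NET_TYPE_USER, NET_TYPE_VDPA, NET_TYPE_BRIDGE } net_type;

typedef struct {
    net_type type;
    size_t nips;     /* number of guest-side IP addresses */
    size_t nroutes;  /* number of guest-side routes */
} net_def;

/* Old shape: the vdpa branch is part of the else-if chain, so a vdpa
 * interface never reaches the final guest-IP check. */
int validate_old(const net_def *net)
{
    assert(net != NULL);
    if (net->type == NET_TYPE_USER) {
        if (net->nroutes)
            return -1;
        /* per-address checks for type='user' would go here */
    } else if (net->type == NET_TYPE_VDPA) {
        /* vdpa capability and model checks would go here */
    } else if (net->nroutes || net->nips) {
        return -1;
    }
    return 0;
}

/* New shape: routes are rejected up front for every type, addresses
 * are rejected for every type except 'user', and the vdpa checks
 * become a separate top-level if. */
int validate_new(const net_def *net)
{
    assert(net != NULL);
    if (net->nroutes)
        return -1;

    if (net->type == NET_TYPE_USER) {
        /* per-address checks for type='user' would go here */
    } else if (net->nips) {
        return -1;
    }

    if (net->type == NET_TYPE_VDPA) {
        /* vdpa capability and model checks would go here */
    }
    return 0;
}
```

In this model, a vdpa interface with one guest-side address passes validate_old() and is rejected by validate_new(), which is the behavior change the commit message describes.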

Both vhostuser and vdpa interface types must use the virtio model in the guest (because part of the functionality is implemented in the guest virtio driver). Due to ["because that's the way it happened"] this has been validated for vhostuser in the hypervisor-agnostic validate function, but for vdpa it has been done in the QEMU-specific validate. Since these interface models are only supported by QEMU anyway, validate for both of them in the QEMU validation function. Take advantage of this change to switch to using virDomainNetIsVirtioModel(net) instead of "net->model == VIR_DOMAIN_NET_MODEL_VIRTIO" (the former also matches ...VIRTIO_TRANSITIONAL and ...VIRTIO_NON_TRANSITIONAL, so is more correct). Signed-off-by: Laine Stump <laine@redhat.com> --- src/conf/domain_validate.c | 6 ------ src/qemu/qemu_validate.c | 11 ++++++----- 2 files changed, 6 insertions(+), 11 deletions(-) diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c index eb5e764c02..d0e2bcaccf 100644 --- a/src/conf/domain_validate.c +++ b/src/conf/domain_validate.c @@ -2186,12 +2186,6 @@ virDomainNetDefValidate(const virDomainNetDef *net) switch (net->type) { case VIR_DOMAIN_NET_TYPE_VHOSTUSER: - if (!virDomainNetIsVirtioModel(net)) { - virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("Wrong or no <model> 'type' attribute specified with <interface type='vhostuser'/>. 
vhostuser requires the virtio-net* frontend")); - return -1; - } - if (net->data.vhostuser->data.nix.listen && net->data.vhostuser->data.nix.reconnect.enabled == VIR_TRISTATE_BOOL_YES) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c index 06093bc42b..243c499a33 100644 --- a/src/qemu/qemu_validate.c +++ b/src/qemu/qemu_validate.c @@ -1823,17 +1823,18 @@ qemuValidateDomainDeviceDefNetwork(const virDomainNetDef *net, _("vDPA devices are not supported with this QEMU binary")); return -1; } + } - if (net->model != VIR_DOMAIN_NET_MODEL_VIRTIO) { + if (!virDomainNetIsVirtioModel(net)) { + if (net->type == VIR_DOMAIN_NET_TYPE_VDPA || + net->type == VIR_DOMAIN_NET_TYPE_VHOSTUSER) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("invalid model for interface of type '%1$s': '%2$s'"), + _("invalid model for interface of type '%1$s': '%2$s' - must be 'virtio'"), virDomainNetTypeToString(net->type), virDomainNetModelTypeToString(net->model)); return -1; } - } - - if (virDomainNetIsVirtioModel(net)) { + } else { if (net->driver.virtio.rx_queue_size) { if (!VIR_IS_POW2(net->driver.virtio.rx_queue_size)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", -- 2.47.1
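
As a rough illustration of why the helper is "more correct" than comparing against a single enum value, here is a minimal sketch. The enums and functions are hypothetical simplifications; the only facts taken from the commit message are that the helper also matches the transitional and non-transitional virtio models, and that vdpa and vhostuser both require a virtio model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced versions of the interface type and model enums. */
typedef enum { IFACE_BRIDGE, IFACE_VDPA, IFACE_VHOSTUSER } iface_type;

typedef enum {
    MODEL_UNKNOWN,
    MODEL_E1000,
    MODEL_VIRTIO,
    MODEL_VIRTIO_TRANSITIONAL,
    MODEL_VIRTIO_NON_TRANSITIONAL
} iface_model;

typedef struct {
    iface_type type;
    iface_model model;
} iface_def;

/* Matches all three virtio variants, not just plain 'virtio'. */
bool iface_is_virtio_model(const iface_def *net)
{
    assert(net != NULL);
    return net->model == MODEL_VIRTIO ||
           net->model == MODEL_VIRTIO_TRANSITIONAL ||
           net->model == MODEL_VIRTIO_NON_TRANSITIONAL;
}

/* vdpa and vhostuser are only valid with a virtio model; other
 * interface types may use any model. */
int iface_validate_model(const iface_def *net)
{
    if (!iface_is_virtio_model(net)) {
        if (net->type == IFACE_VDPA || net->type == IFACE_VHOSTUSER)
            return -1;
    }
    return 0;
}
```

A check written as "model != MODEL_VIRTIO" would wrongly reject the transitional variants in this model, whereas the helper accepts them.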

Both vdpa and vhostuser require that the guest device be virtio, and
for interface type='vdpa', we already set <model type='virtio'/> if it
is unspecified in the input XML, so let's be just as courteous for
interface type='vhostuser'.

Signed-off-by: Laine Stump <laine@redhat.com>
---
 src/qemu/qemu_postparse.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_postparse.c b/src/qemu/qemu_postparse.c
index 20ee333e0d..49009ae2e4 100644
--- a/src/qemu/qemu_postparse.c
+++ b/src/qemu/qemu_postparse.c
@@ -100,7 +100,8 @@ qemuDomainDeviceNetDefPostParse(virDomainNetDef *net,
                                 const virDomainDef *def,
                                 virQEMUCaps *qemuCaps)
 {
-    if (net->type == VIR_DOMAIN_NET_TYPE_VDPA &&
+    if ((net->type == VIR_DOMAIN_NET_TYPE_VDPA ||
+         net->type == VIR_DOMAIN_NET_TYPE_VHOSTUSER) &&
         !virDomainNetGetModelString(net)) {
         net->model = VIR_DOMAIN_NET_MODEL_VIRTIO;
     } else if (net->type != VIR_DOMAIN_NET_TYPE_HOSTDEV &&
-- 
2.47.1
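
The defaulting rule can be sketched on its own as follows. The names are hypothetical, and the sketch covers only the first branch of the real post-parse function (the branch that picks a default model for other interface types is omitted):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the libvirt enums and struct. */
typedef enum { NIC_BRIDGE, NIC_VDPA, NIC_VHOSTUSER } nic_type;
typedef enum { NIC_MODEL_UNSET, NIC_MODEL_E1000, NIC_MODEL_VIRTIO } nic_model;

typedef struct {
    nic_type type;
    nic_model model;
} nic_def;

/* Fill in 'virtio' only when the user left the model unspecified;
 * an explicitly chosen model is left alone so that validation can
 * reject it later if it is not a virtio model. */
void nic_post_parse(nic_def *nic)
{
    assert(nic != NULL);
    if ((nic->type == NIC_VDPA || nic->type == NIC_VHOSTUSER) &&
        nic->model == NIC_MODEL_UNSET) {
        nic->model = NIC_MODEL_VIRTIO;
    }
}
```

Keeping the default in post-parse and the rejection in validation means a user who omits the model gets a working interface, while a user who asks for a wrong model still gets an error rather than a silent override.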

Since vhostuser is only used/supported by the QEMU driver, and all the rest of the vhostuser-specific validation is done in QEMU's validation, lets move the final check (to see if they've tried to enable auto-reconnect when this interface is on the server side of the vhostuser socket) to the QEMU validate. Signed-off-by: Laine Stump <laine@redhat.com> --- src/conf/domain_validate.c | 10 +--------- src/qemu/qemu_validate.c | 8 ++++++++ 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c index d0e2bcaccf..577dbab0af 100644 --- a/src/conf/domain_validate.c +++ b/src/conf/domain_validate.c @@ -2185,15 +2185,6 @@ virDomainNetDefValidate(const virDomainNetDef *net) } switch (net->type) { - case VIR_DOMAIN_NET_TYPE_VHOSTUSER: - if (net->data.vhostuser->data.nix.listen && - net->data.vhostuser->data.nix.reconnect.enabled == VIR_TRISTATE_BOOL_YES) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("'reconnect' attribute unsupported 'server' mode for <interface type='vhostuser'>")); - return -1; - } - break; - case VIR_DOMAIN_NET_TYPE_USER: if (net->backend.type == VIR_DOMAIN_NET_BACKEND_PASST) { size_t p; @@ -2217,6 +2208,7 @@ virDomainNetDefValidate(const virDomainNetDef *net) } break; + case VIR_DOMAIN_NET_TYPE_VHOSTUSER: case VIR_DOMAIN_NET_TYPE_NETWORK: case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_BRIDGE: diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c index 243c499a33..351fe38830 100644 --- a/src/qemu/qemu_validate.c +++ b/src/qemu/qemu_validate.c @@ -1825,6 +1825,14 @@ qemuValidateDomainDeviceDefNetwork(const virDomainNetDef *net, } } + if (net->type == VIR_DOMAIN_NET_TYPE_VHOSTUSER && + net->data.vhostuser->data.nix.listen && + net->data.vhostuser->data.nix.reconnect.enabled == VIR_TRISTATE_BOOL_YES) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("'reconnect' attribute is not supported when source mode='server' for <interface type='vhostuser'>")); + return 
-1; + } + if (!virDomainNetIsVirtioModel(net)) { if (net->type == VIR_DOMAIN_NET_TYPE_VDPA || net->type == VIR_DOMAIN_NET_TYPE_VHOSTUSER) { -- 2.47.1

For some reason, when vhostuser interface support was added in 2014,
the parser required that the XML for the <interface> have a <source>
element with type, mode, and path, all 3 of them required. This in
spite of the fact that 'unix' is the only possible valid setting for
type, and 95% of the time the mode is set to 'client'. (As I
understand from comments in the code, normally a guest will use
mode='client' to connect to an existing socket that is precreated (by
OVS?), and the only use for mode='server' is for test setups where one
guest is set up with a listening vhostuser socket (i.e. 'server') and
another guest connects to that socket (i.e. 'client'). Or maybe one
guest connects to OVS in server mode, and all the others connect in
client mode, not sure - I don't claim to be an expert on vhost-user.)

So from the point of view of existing vhost-user functionality, it
seems reasonable to make 'type' and 'mode' optional, and by default
fill in the vhostuser part of the NetDef as if they were 'unix' and
'client'.

In theory, the <source> element itself is also not *directly* required
after this patch; however, the path attribute of <source> *is*
required (for now), so effectively the <source> element is still
required.

Signed-off-by: Laine Stump <laine@redhat.com> --- src/conf/domain_conf.c | 56 ++++++++++++------------------- src/conf/schemas/domaincommon.rng | 4 ++- src/qemu/qemu_validate.c | 20 +++++++---- 3 files changed, 39 insertions(+), 41 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 50dc4a33a6..6b382eb63f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -9776,50 +9776,38 @@ virDomainNetDefParseXML(virDomainXMLOption *xmlopt, g_autofree char *vhostuser_type = NULL; virDomainNetVhostuserMode vhostuser_mode; - if (virDomainNetDefParseXMLRequireSource(def, source_node) < 0) - return NULL; - - if (!(vhostuser_type = virXMLPropStringRequired(source_node, "type"))) - return NULL; - - if (STRNEQ_NULLABLE(vhostuser_type, "unix")) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("Type='%1$s' unsupported for <interface type='vhostuser'>"), - vhostuser_type); - return NULL; - } - if (!(def->data.vhostuser = virDomainChrSourceDefNew(xmlopt))) return NULL; + /* Default (and only valid) value of type is "unix". + * Everything else's default value is 0/NULL. 
+ */ def->data.vhostuser->type = VIR_DOMAIN_CHR_TYPE_UNIX; - if (!(def->data.vhostuser->data.nix.path = virXMLPropStringRequired(source_node, "path"))) - return NULL; + if (source_node) { + if ((vhostuser_type = virXMLPropString(source_node, "type"))) { + if (STRNEQ(vhostuser_type, "unix")) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Type='%1$s' unsupported for <interface type='vhostuser'>"), + vhostuser_type); + return NULL; + } + } - if (virXMLPropEnum(source_node, "mode", - virDomainNetVhostuserModeTypeFromString, - VIR_XML_PROP_REQUIRED | VIR_XML_PROP_NONZERO, - &vhostuser_mode) < 0) - return NULL; + def->data.vhostuser->data.nix.path = virXMLPropString(source_node, "path"); - switch (vhostuser_mode) { - case VIR_DOMAIN_NET_VHOSTUSER_MODE_CLIENT: - def->data.vhostuser->data.nix.listen = false; - break; + if (virXMLPropEnum(source_node, "mode", virDomainNetVhostuserModeTypeFromString, + VIR_XML_PROP_NONZERO, &vhostuser_mode) < 0) { + return NULL; + } - case VIR_DOMAIN_NET_VHOSTUSER_MODE_SERVER: - def->data.vhostuser->data.nix.listen = true; - break; + if (vhostuser_mode == VIR_DOMAIN_NET_VHOSTUSER_MODE_SERVER) + def->data.vhostuser->data.nix.listen = true; - case VIR_DOMAIN_NET_VHOSTUSER_MODE_NONE: - case VIR_DOMAIN_NET_VHOSTUSER_MODE_LAST: - break; + if (virDomainChrSourceReconnectDefParseXML(&def->data.vhostuser->data.nix.reconnect, + source_node, ctxt) < 0) + return NULL; } - - if (virDomainChrSourceReconnectDefParseXML(&def->data.vhostuser->data.nix.reconnect, - source_node, ctxt) < 0) - return NULL; } break; diff --git a/src/conf/schemas/domaincommon.rng b/src/conf/schemas/domaincommon.rng index 96cedb85e8..e5da550e45 100644 --- a/src/conf/schemas/domaincommon.rng +++ b/src/conf/schemas/domaincommon.rng @@ -3485,7 +3485,9 @@ <value>vhostuser</value> </attribute> <interleave> - <ref name="unixSocketSource"/> + <optional> + <ref name="unixSocketSource"/> + </optional> <ref name="interface-options"/> </interleave> </group> diff --git 
a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c index 351fe38830..b0cf5e866c 100644 --- a/src/qemu/qemu_validate.c +++ b/src/qemu/qemu_validate.c @@ -1825,12 +1825,20 @@ qemuValidateDomainDeviceDefNetwork(const virDomainNetDef *net, } } - if (net->type == VIR_DOMAIN_NET_TYPE_VHOSTUSER && - net->data.vhostuser->data.nix.listen && - net->data.vhostuser->data.nix.reconnect.enabled == VIR_TRISTATE_BOOL_YES) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("'reconnect' attribute is not supported when source mode='server' for <interface type='vhostuser'>")); - return -1; + if (net->type == VIR_DOMAIN_NET_TYPE_VHOSTUSER) { + if (!net->data.vhostuser->data.nix.path) { + virReportError(VIR_ERR_XML_ERROR, + _("Missing required attribute '%1$s' in element '%2$s'"), + "path", "source"); + return -1; + } + + if (net->data.vhostuser->data.nix.listen && + net->data.vhostuser->data.nix.reconnect.enabled == VIR_TRISTATE_BOOL_YES) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("'reconnect' attribute is not supported when source mode='server' for <interface type='vhostuser'>")); + return -1; + } } if (!virDomainNetIsVirtioModel(net)) { -- 2.47.1
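
The split between "optional at parse time" and "required at validation time" can be sketched like this. The struct and both functions are hypothetical simplifications that take attribute values as plain strings (NULL meaning "attribute absent"); they are not the libvirt parser API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reduction of the vhostuser <source> data. */
typedef struct {
    const char *path;  /* NULL when <source> or its path is absent */
    bool listen;       /* true only for mode='server' */
    bool reconnect;    /* true when reconnect was explicitly enabled */
} vhostuser_source;

/* Parse step: type and mode are optional. A missing type is treated
 * as 'unix' and a missing mode as 'client'; a missing path is NOT an
 * error at this stage. */
int parse_source(vhostuser_source *src, const char *type,
                 const char *mode, const char *path, bool reconnect)
{
    assert(src != NULL);
    src->path = NULL;
    src->listen = false;
    src->reconnect = false;

    if (type && strcmp(type, "unix") != 0)
        return -1;

    if (mode) {
        if (strcmp(mode, "server") == 0)
            src->listen = true;
        else if (strcmp(mode, "client") != 0)
            return -1;
    }

    src->path = path;
    src->reconnect = reconnect;
    return 0;
}

/* Validation step: this is where a missing path and the
 * server-mode/reconnect combination are rejected. */
int validate_source(const vhostuser_source *src)
{
    assert(src != NULL);
    if (!src->path)
        return -1;
    if (src->listen && src->reconnect)
        return -1;
    return 0;
}
```

Deferring the path check to validation is what leaves room for a later patch to supply the path from somewhere other than the XML.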

qemuProcessPrepareDomain()'s comments say that it should be the only place to change the "live XML" of a domain (i.e. the public parts of the virDomainDef object that is shown in the domain's status XML), and that seems like a reasonable idea (although there aren't many users of it to date). qemuProcessPrepareDomainNetwork() is called by the aforementioned qemuProcessPrepareDomain() - this patch changes the "if (type == HOSTDEV)" in that function to a "switch(type)" so it's simpler to add DomainDef modifications for various other types of virDomainNetDef, and also so that anyone who adds a new interface type is forced to look at the code and decide if anything needs to be done here for the new type. Signed-off-by: Laine Stump <laine@redhat.com> --- src/qemu/qemu_process.c | 75 ++++++++++++++++++++++++++--------------- 1 file changed, 47 insertions(+), 28 deletions(-) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 910229a616..963d090963 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -5887,7 +5887,6 @@ qemuProcessPrepareDomainNetwork(virDomainObj *vm) for (i = 0; i < def->nnets; i++) { virDomainNetDef *net = def->nets[i]; - virDomainNetType actualType; /* If appropriate, grab a physical device from the configured * network's pool of devices, or resolve bridge device name @@ -5900,36 +5899,56 @@ qemuProcessPrepareDomainNetwork(virDomainObj *vm) return -1; } - actualType = virDomainNetGetActualType(net); - if (actualType == VIR_DOMAIN_NET_TYPE_HOSTDEV && - net->type == VIR_DOMAIN_NET_TYPE_NETWORK) { - /* Each type='hostdev' network device must also have a - * corresponding entry in the hostdevs array. For netdevs - * that are hardcoded as type='hostdev', this is already - * done by the parser, but for those allocated from a - * network / determined at runtime, we need to do it - * separately. 
- */ - virDomainHostdevDef *hostdev = virDomainNetGetActualHostdev(net); - virDomainHostdevSubsysPCI *pcisrc = &hostdev->source.subsys.u.pci; - - if (virDomainHostdevFind(def, hostdev, NULL) >= 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("PCI device %1$04x:%2$02x:%3$02x.%4$x allocated from network %5$s is already in use by domain %6$s"), - pcisrc->addr.domain, pcisrc->addr.bus, - pcisrc->addr.slot, pcisrc->addr.function, - net->data.network.name, def->name); - return -1; - } + switch (virDomainNetGetActualType(net)) { + case VIR_DOMAIN_NET_TYPE_HOSTDEV: + if (net->type == VIR_DOMAIN_NET_TYPE_NETWORK) { + /* Each type='hostdev' network device must also have a + * corresponding entry in the hostdevs array. For netdevs + * that are hardcoded as type='hostdev', this is already + * done by the parser, but for those allocated from a + * network / determined at runtime, we need to do it + * separately. + */ + virDomainHostdevDef *hostdev = virDomainNetGetActualHostdev(net); + virDomainHostdevSubsysPCI *pcisrc = &hostdev->source.subsys.u.pci; + + if (virDomainHostdevFind(def, hostdev, NULL) >= 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("PCI device %1$04x:%2$02x:%3$02x.%4$x allocated from network %5$s is already in use by domain %6$s"), + pcisrc->addr.domain, pcisrc->addr.bus, + pcisrc->addr.slot, pcisrc->addr.function, + net->data.network.name, def->name); + return -1; + } - /* For hostdev present in qemuProcessPrepareDomain() phase this was - * done already, but this code runs after that, so we have to call - * it ourselves. */ - if (qemuDomainPrepareHostdev(hostdev, priv) < 0) - return -1; + /* For hostdev present in qemuProcessPrepareDomain() phase this was + * done already, but this code runs after that, so we have to call + * it ourselves. 
*/ + if (qemuDomainPrepareHostdev(hostdev, priv) < 0) + return -1; - virDomainHostdevInsert(def, hostdev); + virDomainHostdevInsert(def, hostdev); + } + break; + + case VIR_DOMAIN_NET_TYPE_DIRECT: + case VIR_DOMAIN_NET_TYPE_BRIDGE: + case VIR_DOMAIN_NET_TYPE_NETWORK: + case VIR_DOMAIN_NET_TYPE_ETHERNET: + case VIR_DOMAIN_NET_TYPE_USER: + case VIR_DOMAIN_NET_TYPE_VHOSTUSER: + case VIR_DOMAIN_NET_TYPE_SERVER: + case VIR_DOMAIN_NET_TYPE_CLIENT: + case VIR_DOMAIN_NET_TYPE_MCAST: + case VIR_DOMAIN_NET_TYPE_INTERNAL: + case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: + case VIR_DOMAIN_NET_TYPE_NULL: + case VIR_DOMAIN_NET_TYPE_VDS: + case VIR_DOMAIN_NET_TYPE_LAST: + break; } + } return 0; } -- 2.47.1
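
The "forced to look at the code" effect comes from switching over an enum without a default label: GCC and Clang (with -Wswitch, which -Wall enables) warn when an enumerator has no case. A minimal sketch, using a hypothetical reduced enum rather than virDomainNetType:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced interface-type enum. */
typedef enum {
    KIND_HOSTDEV,
    KIND_NETWORK,
    KIND_BRIDGE,
    KIND_VHOSTUSER,
    KIND_LAST
} iface_kind;

typedef struct {
    iface_kind configured;  /* type written in the XML */
    iface_kind actual;      /* type after resolving a network */
} iface;

/* Returns true when the interface needs an extra entry in the
 * hostdevs array, i.e. it resolved to hostdev via a network. There is
 * deliberately no 'default:' label, so adding a new iface_kind value
 * makes the compiler point at this switch. */
bool needs_hostdev_entry(const iface *net)
{
    assert(net != NULL);
    switch (net->actual) {
    case KIND_HOSTDEV:
        return net->configured == KIND_NETWORK;

    case KIND_NETWORK:
    case KIND_BRIDGE:
    case KIND_VHOSTUSER:
    case KIND_LAST:
        break;
    }
    return false;
}
```

The cost is the long list of empty case labels seen in the diff above; the benefit is that a missed type becomes a build-time warning instead of a silent no-op.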

When passt is used with vhostuser, the vhostuser code that builds the
qemu commandline will need to have the same socket path that is given
to the passt command, so this patch makes it visible outside of
qemu_passt.c.

Signed-off-by: Laine Stump <laine@redhat.com>
---
 src/qemu/qemu_passt.c | 2 +-
 src/qemu/qemu_passt.h | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c
index dd4a8bb997..8a3ac4e988 100644
--- a/src/qemu/qemu_passt.c
+++ b/src/qemu/qemu_passt.c
@@ -54,7 +54,7 @@ qemuPasstCreatePidFilename(virDomainObj *vm,
 }
 
 
-static char *
+char *
 qemuPasstCreateSocketPath(virDomainObj *vm,
                           virDomainNetDef *net)
 {
diff --git a/src/qemu/qemu_passt.h b/src/qemu/qemu_passt.h
index 623b494b7a..e0b9aaac8d 100644
--- a/src/qemu/qemu_passt.h
+++ b/src/qemu/qemu_passt.h
@@ -36,3 +36,6 @@ void qemuPasstStop(virDomainObj *vm,
 int qemuPasstSetupCgroup(virDomainObj *vm,
                          virDomainNetDef *net,
                          virCgroup *cgroup);
+
+char *qemuPasstCreateSocketPath(virDomainObj *vm,
+                                virDomainNetDef *net);
-- 
2.47.1

<interface type='vhostuser'><backend type='passt'/> needs to run the
passt command just as is done for interface type='user', but then add
vhostuser bits to the qemu commandline/monitor command.

There are some changes to the parsing/validation along with changes to
the vhostuser codepath to do the extra stuff for passt. I tried
keeping them separated into different patches, but then the unit test
failed in a strange way deep down in the bowels of the commandline
generation, so this patch both 1) makes the final changes to
parsing/formatting and 2) adds passt stuff at appropriate places for
vhostuser (as well as making a couple of things *not* happen when the
passt backend is chosen).

The result is that you can now have:

  <interface type='vhostuser'>
    <backend type='passt'/>
    ...
  </interface>

Then as long as you also have the following as a subelement of
<domain>:

  <memoryBacking>
    <access mode='shared'/>
  </memoryBacking>

your passt interfaces will benefit from the greatly improved
efficiency of a vhost-user data path, and all without requiring
special privileges or capabilities *anywhere* (i.e. it works for
unprivileged libvirt (qemu:///session) as well as privileged libvirt).

Signed-off-by: Laine Stump <laine@redhat.com> --- src/conf/domain_conf.c | 36 ++++++--- src/conf/domain_validate.c | 77 +++++++------------ src/conf/schemas/domaincommon.rng | 32 +++++++- src/qemu/qemu_command.c | 7 +- src/qemu/qemu_extdevice.c | 6 +- src/qemu/qemu_hotplug.c | 21 ++++- src/qemu/qemu_passt.c | 3 + src/qemu/qemu_process.c | 15 +++- src/qemu/qemu_validate.c | 7 +- ...t-user-slirp-portforward.x86_64-latest.err | 2 +- .../net-vhostuser-passt.x86_64-latest.args | 42 ++++++++++ .../net-vhostuser-passt.x86_64-latest.xml | 72 +++++++++++++++++ tests/qemuxmlconfdata/net-vhostuser-passt.xml | 70 +++++++++++++++++ tests/qemuxmlconftest.c | 1 + 14 files changed, 317 insertions(+), 74 deletions(-) create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.args create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.xml create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.xml diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 6b382eb63f..49555efc56 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -9457,9 +9457,25 @@ virDomainNetBackendParseXML(xmlNodePtr node, g_autofree char *tap = virXMLPropString(node, "tap"); g_autofree char *vhost = virXMLPropString(node, "vhost"); - /* The VIR_DOMAIN_NET_BACKEND_DEFAULT really means 'use hypervisor's - * builtin SLIRP'. It's reported in domain caps and thus we need to accept - * it. Hence VIR_XML_PROP_NONE instead of VIR_XML_PROP_NONZERO. */ + /* In the case of NET_TYPE_USER, backend type can be unspecified + * (i.e. VIR_DOMAIN_NET_BACKEND_DEFAULT) and that means 'use + * hypervisor's builtin SLIRP (or if that isn't available, use + * passt)'. 
Similarly, it can also be left unspecified in the case + * of NET_TYPE_VHOSTUSER, and then it means "use the traditional + * vhost-user backend (which auto-detects between connecting to a + * socket created by OVS, or connecting to a standalone socket + * used (mostly in testing) to connect the vhost-user interface of + * one guest directly to the vhost-user interface of another + * guest. + * + * If backend type is set to 'passt', then in both cases a passt + * process will be started, and libvirt will connect that to the + * guest interface (either communicating everything over the + * socket created by passt using a specific-to-passt protocol + * (interface type='user'>), or by using the socket for control + * plane messages and shared memory for data using the vhost-user + * protocol (<interface type='vhostuser'>)). + */ if (virXMLPropEnum(node, "type", virDomainNetBackendTypeFromString, VIR_XML_PROP_NONE, &def->backend.type) < 0) { return -1; @@ -24616,7 +24632,11 @@ virDomainNetDefFormat(virBuffer *buf, break; case VIR_DOMAIN_NET_TYPE_VHOSTUSER: - if (def->data.vhostuser->type == VIR_DOMAIN_CHR_TYPE_UNIX) { + if (def->data.vhostuser->type == VIR_DOMAIN_CHR_TYPE_UNIX && + def->backend.type != VIR_DOMAIN_NET_BACKEND_PASST) { + /* in the case of BACKEND_PASST, the values of all of these are either + * fixed (type, mode, reconnect), or derived from elsewhere (path) + */ virBufferAddLit(&sourceAttrBuf, " type='unix'"); virBufferEscapeString(&sourceAttrBuf, " path='%s'", def->data.vhostuser->data.nix.path); @@ -24627,7 +24647,6 @@ virDomainNetDefFormat(virBuffer *buf, virDomainChrSourceReconnectDefFormat(&sourceChildBuf, &def->data.vhostuser->data.nix.reconnect); } - } break; @@ -24689,15 +24708,14 @@ virDomainNetDefFormat(virBuffer *buf, } case VIR_DOMAIN_NET_TYPE_USER: - if (def->backend.type == VIR_DOMAIN_NET_BACKEND_PASST) - virBufferEscapeString(&sourceAttrBuf, " dev='%s'", def->sourceDev); - break; - case VIR_DOMAIN_NET_TYPE_NULL: case 
VIR_DOMAIN_NET_TYPE_LAST: break; } + if (def->backend.type == VIR_DOMAIN_NET_BACKEND_PASST) + virBufferEscapeString(&sourceAttrBuf, " dev='%s'", def->sourceDev); + if (def->hostIP.nips || def->hostIP.nroutes) { if (virDomainNetIPInfoFormat(&sourceChildBuf, &def->hostIP) < 0) return -1; diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c index 577dbab0af..563558d920 100644 --- a/src/conf/domain_validate.c +++ b/src/conf/domain_validate.c @@ -2163,67 +2163,46 @@ virDomainNetDefValidate(const virDomainNetDef *net) return -1; } - if (net->type != VIR_DOMAIN_NET_TYPE_USER) { + if (net->type != VIR_DOMAIN_NET_TYPE_USER && + net->type != VIR_DOMAIN_NET_TYPE_VHOSTUSER) { if (net->backend.type == VIR_DOMAIN_NET_BACKEND_PASST) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("The 'passt' backend can only be used with interface type='user'")); + _("The 'passt' backend can only be used with interface type='user' or type='vhostuser'")); return -1; } } - if (net->nPortForwards > 0 && - (net->type != VIR_DOMAIN_NET_TYPE_USER || - (net->type == VIR_DOMAIN_NET_TYPE_USER && - net->backend.type != VIR_DOMAIN_NET_BACKEND_PASST))) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("The <portForward> element can only be used with <interface type='user'> and its 'passt' backend")); - return -1; - } + if (net->nPortForwards > 0) { + size_t p; - if (!virNetDevBandwidthValidate(net->bandwidth)) { - return -1; - } + if ((net->type != VIR_DOMAIN_NET_TYPE_USER && + net->type != VIR_DOMAIN_NET_TYPE_VHOSTUSER) || + net->backend.type != VIR_DOMAIN_NET_BACKEND_PASST) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("The <portForward> element can only be used with the 'passt' backend of interface type='user' or type='vhostuser'")); + return -1; + } - switch (net->type) { - case VIR_DOMAIN_NET_TYPE_USER: - if (net->backend.type == VIR_DOMAIN_NET_BACKEND_PASST) { - size_t p; - - for (p = 0; p < net->nPortForwards; p++) { - size_t r; - virDomainNetPortForward *pf = 
net->portForwards[p]; - - for (r = 0; r < pf->nRanges; r++) { - virDomainNetPortForwardRange *range = pf->ranges[r]; - - if (!range->start - && (range->end || range->to - || range->exclude != VIR_TRISTATE_BOOL_ABSENT)) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("The 'range' of a 'portForward' requires 'start' attribute if 'end', 'to', or 'exclude' is specified")); - return -1; - } + for (p = 0; p < net->nPortForwards; p++) { + size_t r; + virDomainNetPortForward *pf = net->portForwards[p]; + + for (r = 0; r < pf->nRanges; r++) { + virDomainNetPortForwardRange *range = pf->ranges[r]; + + if (!range->start + && (range->end || range->to + || range->exclude != VIR_TRISTATE_BOOL_ABSENT)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("The 'range' of a 'portForward' requires 'start' attribute if 'end', 'to', or 'exclude' is specified")); + return -1; } } } - break; + } - case VIR_DOMAIN_NET_TYPE_VHOSTUSER: - case VIR_DOMAIN_NET_TYPE_NETWORK: - case VIR_DOMAIN_NET_TYPE_VDPA: - case VIR_DOMAIN_NET_TYPE_BRIDGE: - case VIR_DOMAIN_NET_TYPE_CLIENT: - case VIR_DOMAIN_NET_TYPE_SERVER: - case VIR_DOMAIN_NET_TYPE_MCAST: - case VIR_DOMAIN_NET_TYPE_UDP: - case VIR_DOMAIN_NET_TYPE_INTERNAL: - case VIR_DOMAIN_NET_TYPE_DIRECT: - case VIR_DOMAIN_NET_TYPE_HOSTDEV: - case VIR_DOMAIN_NET_TYPE_VDS: - case VIR_DOMAIN_NET_TYPE_ETHERNET: - case VIR_DOMAIN_NET_TYPE_NULL: - case VIR_DOMAIN_NET_TYPE_LAST: - break; + if (!virNetDevBandwidthValidate(net->bandwidth)) { + return -1; } return 0; diff --git a/src/conf/schemas/domaincommon.rng b/src/conf/schemas/domaincommon.rng index e5da550e45..3328a63205 100644 --- a/src/conf/schemas/domaincommon.rng +++ b/src/conf/schemas/domaincommon.rng @@ -3486,8 +3486,36 @@ </attribute> <interleave> <optional> - <ref name="unixSocketSource"/> - </optional> + <element name="source"> + <optional> + <attribute name="type"> + <value>unix</value> + </attribute> + </optional> + <optional> + <attribute name="path"> + <ref name="absFilePath"/> + 
</attribute> + </optional> + <optional> + <attribute name="mode"> + <choice> + <value>server</value> + <value>client</value> + </choice> + </attribute> + </optional> + <optional> + <attribute name="dev"> + <ref name="deviceName"/> + </attribute> + </optional> + <optional> + <ref name="reconnect"/> + </optional> + <empty/> + </element> + </optional> <ref name="interface-options"/> </interleave> </group> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 7370711918..54130ac4f0 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -8649,11 +8649,12 @@ qemuBuildInterfaceCommandLine(virQEMUDriver *driver, if (qemuInterfaceVhostuserConnect(cmd, net, qemuCaps) < 0) goto cleanup; - if (virNetDevOpenvswitchGetVhostuserIfname(net->data.vhostuser->data.nix.path, + if (net->backend.type != VIR_DOMAIN_NET_BACKEND_PASST && + virNetDevOpenvswitchGetVhostuserIfname(net->data.vhostuser->data.nix.path, net->data.vhostuser->data.nix.listen, - &net->ifname) < 0) + &net->ifname) < 0) { goto cleanup; - + } break; case VIR_DOMAIN_NET_TYPE_VDPA: diff --git a/src/qemu/qemu_extdevice.c b/src/qemu/qemu_extdevice.c index 954cb323a4..2384bab7a6 100644 --- a/src/qemu/qemu_extdevice.c +++ b/src/qemu/qemu_extdevice.c @@ -212,13 +212,15 @@ qemuExtDevicesStart(virQEMUDriver *driver, for (i = 0; i < def->nnets; i++) { virDomainNetDef *net = def->nets[i]; - if (net->type != VIR_DOMAIN_NET_TYPE_USER) + if (net->type != VIR_DOMAIN_NET_TYPE_USER && + net->type != VIR_DOMAIN_NET_TYPE_VHOSTUSER) { continue; + } if (net->backend.type == VIR_DOMAIN_NET_BACKEND_PASST) { if (qemuPasstStart(vm, net) < 0) return -1; - } else { + } else if (net->type == VIR_DOMAIN_NET_TYPE_USER) { if (qemuSlirpStart(vm, net, incomingMigration) < 0) return -1; } diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 6c224c9793..28ca321c5c 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -1262,10 +1262,23 @@ qemuDomainAttachNetDevice(virQEMUDriver *driver, if 
(!(charDevAlias = qemuAliasChardevFromDevAlias(net->info.alias))) goto cleanup; - if (virNetDevOpenvswitchGetVhostuserIfname(net->data.vhostuser->data.nix.path, - net->data.vhostuser->data.nix.listen, - &net->ifname) < 0) - goto cleanup; + if (net->backend.type == VIR_DOMAIN_NET_BACKEND_PASST) { + + /* vhostuser needs socket path in this location, and when + * backend is passt, the path is derived from other info, + * not taken from config. + */ + g_free(net->data.vhostuser->data.nix.path); + net->data.vhostuser->data.nix.path = qemuPasstCreateSocketPath(vm, net); + + if (qemuPasstStart(vm, net) < 0) + goto cleanup; + } else { + if (virNetDevOpenvswitchGetVhostuserIfname(net->data.vhostuser->data.nix.path, + net->data.vhostuser->data.nix.listen, + &net->ifname) < 0) + goto cleanup; + } if (qemuSecuritySetNetdevLabel(driver, vm, net) < 0) goto cleanup; diff --git a/src/qemu/qemu_passt.c b/src/qemu/qemu_passt.c index 8a3ac4e988..b9616d1c63 100644 --- a/src/qemu/qemu_passt.c +++ b/src/qemu/qemu_passt.c @@ -180,6 +180,9 @@ qemuPasstStart(virDomainObj *vm, virCommandClearCaps(cmd); + if (virDomainNetGetActualType(net) == VIR_DOMAIN_NET_TYPE_VHOSTUSER) + virCommandAddArg(cmd, "--vhost-user"); + virCommandAddArgList(cmd, "--one-off", "--socket", passtSocketName, diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 963d090963..0d9b8bcb93 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -64,6 +64,7 @@ #include "qemu_backup.h" #include "qemu_dbus.h" #include "qemu_snapshot.h" +#include "qemu_passt.h" #include "cpu/cpu.h" #include "cpu/cpu_x86.h" @@ -5931,12 +5932,23 @@ qemuProcessPrepareDomainNetwork(virDomainObj *vm) } break; + case VIR_DOMAIN_NET_TYPE_VHOSTUSER: + if (net->backend.type == VIR_DOMAIN_NET_BACKEND_PASST) { + /* when using the passt backend, the path of the + * unix socket is always derived from other info + * *not* manually given in the config, but all the + * vhostuser code looks for it there. 
+ */ + g_free(net->data.vhostuser->data.nix.path); + net->data.vhostuser->data.nix.path = qemuPasstCreateSocketPath(vm, net); + } + break; + case VIR_DOMAIN_NET_TYPE_DIRECT: case VIR_DOMAIN_NET_TYPE_BRIDGE: case VIR_DOMAIN_NET_TYPE_NETWORK: case VIR_DOMAIN_NET_TYPE_ETHERNET: case VIR_DOMAIN_NET_TYPE_USER: - case VIR_DOMAIN_NET_TYPE_VHOSTUSER: case VIR_DOMAIN_NET_TYPE_SERVER: case VIR_DOMAIN_NET_TYPE_CLIENT: case VIR_DOMAIN_NET_TYPE_MCAST: @@ -5948,7 +5960,6 @@ qemuProcessPrepareDomainNetwork(virDomainObj *vm) case VIR_DOMAIN_NET_TYPE_LAST: break; } - } return 0; } diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c index b0cf5e866c..92e745cea1 100644 --- a/src/qemu/qemu_validate.c +++ b/src/qemu/qemu_validate.c @@ -1751,7 +1751,9 @@ qemuValidateDomainDeviceDefNetwork(const virDomainNetDef *net, return -1; } - if (net->type == VIR_DOMAIN_NET_TYPE_USER) { + if (net->type == VIR_DOMAIN_NET_TYPE_USER || + (net->type == VIR_DOMAIN_NET_TYPE_VHOSTUSER && + net->backend.type == VIR_DOMAIN_NET_BACKEND_PASST)) { virDomainCapsDeviceNet netCaps = { }; virQEMUCapsFillDomainDeviceNetCaps(qemuCaps, &netCaps); @@ -1826,7 +1828,8 @@ qemuValidateDomainDeviceDefNetwork(const virDomainNetDef *net, } if (net->type == VIR_DOMAIN_NET_TYPE_VHOSTUSER) { - if (!net->data.vhostuser->data.nix.path) { + if (!net->data.vhostuser->data.nix.path && + net->backend.type != VIR_DOMAIN_NET_BACKEND_PASST) { virReportError(VIR_ERR_XML_ERROR, _("Missing required attribute '%1$s' in element '%2$s'"), "path", "source"); diff --git a/tests/qemuxmlconfdata/net-user-slirp-portforward.x86_64-latest.err b/tests/qemuxmlconfdata/net-user-slirp-portforward.x86_64-latest.err index eaa934742e..e231677e57 100644 --- a/tests/qemuxmlconfdata/net-user-slirp-portforward.x86_64-latest.err +++ b/tests/qemuxmlconfdata/net-user-slirp-portforward.x86_64-latest.err @@ -1 +1 @@ -unsupported configuration: The <portForward> element can only be used with <interface type='user'> and its 'passt' backend +unsupported 
configuration: The <portForward> element can only be used with the 'passt' backend of interface type='user' or type='vhostuser' diff --git a/tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.args b/tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.args new file mode 100644 index 0000000000..21d78d6072 --- /dev/null +++ b/tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.args @@ -0,0 +1,42 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/var/lib/libvirt/qemu/domain--1-QEMUGuest1 \ +USER=test \ +LOGNAME=test \ +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain--1-QEMUGuest1/.local/share \ +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain--1-QEMUGuest1/.cache \ +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain--1-QEMUGuest1/.config \ +/usr/bin/qemu-system-x86_64 \ +-name guest=QEMUGuest1,debug-threads=on \ +-S \ +-object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain--1-QEMUGuest1/master-key.aes"}' \ +-machine pc,usb=off,dump-guest-core=off,memory-backend=pc.ram,acpi=off \ +-accel tcg \ +-cpu qemu64 \ +-m size=219136k \ +-object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":224395264}' \ +-overcommit mem-lock=off \ +-smp 1,sockets=1,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,fd=1729,server=on,wait=off \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-boot strict=on \ +-blockdev '{"driver":"host_device","filename":"/dev/HostVG/QEMUGuest1","node-name":"libvirt-1-storage","read-only":false}' \ +-device '{"driver":"ide-hd","bus":"ide.0","unit":0,"drive":"libvirt-1-storage","id":"ide0-0-0","bootindex":1}' \ +-chardev socket,id=charnet0,path=/var/run/libvirt/qemu/passt/-1-QEMUGuest1-net0.socket \ +-netdev '{"type":"vhost-user","chardev":"charnet0","id":"hostnet0"}' \ +-device '{"driver":"virtio-net-pci","netdev":"hostnet0","id":"net0","mac":"00:11:22:33:44:55","bus":"pci.0","addr":"0x2"}' \ 
+-chardev socket,id=charnet1,path=/var/run/libvirt/qemu/passt/-1-QEMUGuest1-net1.socket \ +-netdev '{"type":"vhost-user","chardev":"charnet1","id":"hostnet1"}' \ +-device '{"driver":"virtio-net-pci","netdev":"hostnet1","id":"net1","mac":"00:11:22:33:44:11","bus":"pci.0","addr":"0x3"}' \ +-chardev socket,id=charnet2,path=/var/run/libvirt/qemu/passt/-1-QEMUGuest1-net2.socket \ +-netdev '{"type":"vhost-user","chardev":"charnet2","id":"hostnet2"}' \ +-device '{"driver":"virtio-net-pci","netdev":"hostnet2","id":"net2","mac":"00:11:22:33:44:11","bus":"pci.0","addr":"0x4"}' \ +-audiodev '{"id":"audio1","driver":"none"}' \ +-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \ +-msg timestamp=on diff --git a/tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.xml b/tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.xml new file mode 100644 index 0000000000..26aa4c8d05 --- /dev/null +++ b/tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.xml @@ -0,0 +1,72 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <cpu mode='custom' match='exact' check='none'> + <model fallback='forbid'>qemu64</model> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-x86_64</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest1'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0' model='none'/> + <controller type='ide' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + 
</controller> + <controller type='pci' index='0' model='pci-root'/> + <interface type='vhostuser'> + <mac address='00:11:22:33:44:55'/> + <ip address='172.17.2.0' family='ipv4' prefix='24'/> + <ip address='2001:db8:ac10:fd01::feed' family='ipv6'/> + <portForward proto='tcp' address='2001:db8:ac10:fd01::1:10'> + <range start='22' to='2022'/> + <range start='1000' end='1050'/> + <range start='1020' exclude='yes'/> + <range start='1030' end='1040' exclude='yes'/> + </portForward> + <portForward proto='udp' address='1.2.3.4' dev='eth0'> + <range start='5000' end='5020' to='6000'/> + <range start='5010' end='5015' exclude='yes'/> + </portForward> + <portForward proto='tcp'> + <range start='80'/> + </portForward> + <portForward proto='tcp'> + <range start='443' to='344'/> + </portForward> + <model type='virtio'/> + <backend type='passt' logFile='/var/log/loglaw.blog'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> + </interface> + <interface type='vhostuser'> + <mac address='00:11:22:33:44:11'/> + <model type='virtio'/> + <backend type='passt'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </interface> + <interface type='vhostuser'> + <mac address='00:11:22:33:44:11'/> + <model type='virtio'/> + <backend type='passt'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </interface> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <audio id='1' type='none'/> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxmlconfdata/net-vhostuser-passt.xml b/tests/qemuxmlconfdata/net-vhostuser-passt.xml new file mode 100644 index 0000000000..e44c91e541 --- /dev/null +++ b/tests/qemuxmlconfdata/net-vhostuser-passt.xml @@ -0,0 +1,70 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu 
placement='static'>1</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-x86_64</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest1'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0' model='none'/> + <controller type='ide' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <interface type='vhostuser'> + <mac address='00:11:22:33:44:55'/> + <ip address='172.17.2.0' family='ipv4' prefix='24'/> + <ip address='2001:db8:ac10:fd01::feed' family='ipv6'/> + <source dev='eth42'/> + <portForward proto='tcp' address='2001:db8:ac10:fd01::1:10'> + <range start='22' to='2022'/> + <range start='1000' end='1050'/> + <range start='1020' exclude='yes'/> + <range start='1030' end='1040' exclude='yes'/> + </portForward> + <portForward proto='udp' address='1.2.3.4' dev='eth0'> + <range start='5000' end='5020' to='6000'/> + <range start='5010' end='5015' exclude='yes'/> + </portForward> + <portForward proto='tcp'> + <range start='80'/> + </portForward> + <portForward proto='tcp'> + <range start='443' to='344'/> + </portForward> + <model type='virtio'/> + <backend type='passt' logFile='/var/log/loglaw.blog'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> + </interface> + <interface type='vhostuser'> + <mac address='00:11:22:33:44:11'/> + <backend type='passt'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </interface> + <interface type='vhostuser'> + <mac address='00:11:22:33:44:11'/> + <source dev='eth43'/> + <model type='virtio'/> + <backend 
type='passt'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </interface> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <audio id='1' type='none'/> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxmlconftest.c b/tests/qemuxmlconftest.c index 1f0068864a..34674551a4 100644 --- a/tests/qemuxmlconftest.c +++ b/tests/qemuxmlconftest.c @@ -1793,6 +1793,7 @@ mymain(void) DO_TEST_CAPS_LATEST("net-user-passt"); DO_TEST_CAPS_VER("net-user-passt", "7.2.0"); DO_TEST_CAPS_LATEST_PARSE_ERROR("net-user-slirp-portforward"); + DO_TEST_CAPS_LATEST("net-vhostuser-passt"); DO_TEST_CAPS_LATEST("net-virtio"); DO_TEST_CAPS_LATEST("net-virtio-device"); DO_TEST_CAPS_LATEST("net-virtio-disable-offloads"); -- 2.47.1
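
A note for readers of the expected .args above: the chardev paths follow a visible pattern, and the qemu_passt.c hunk adds "--vhost-user" ahead of the existing arguments. The sketch below is only an illustration of those two observations. The helper names are hypothetical, the path layout is inferred from the test output rather than taken from qemuPasstCreateSocketPath(), and only the leading passt arguments visible in the hunk are reproduced.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format "<state_dir>/<id>-<name>-<alias>.socket".
 * The layout is inferred from the expected .args test data; this is not
 * libvirt's qemuPasstCreateSocketPath().
 * Returns 0 on success, -1 if the path does not fit in buf. */
int sketch_passt_socket_path(char *buf, size_t buflen,
                             const char *state_dir, int domain_id,
                             const char *domain_name, const char *alias)
{
    int n = snprintf(buf, buflen, "%s/%d-%s-%s.socket",
                     state_dir, domain_id, domain_name, alias);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Hypothetical helper: fill argv with the leading passt arguments seen in
 * the qemu_passt.c hunk. "--vhost-user" is added only when the interface
 * is type='vhostuser'. argv must have room for at least 5 entries.
 * Returns the number of entries written. */
size_t sketch_passt_leading_args(const char **argv, bool vhostuser,
                                 const char *socket_path)
{
    size_t n = 0;

    argv[n++] = "passt";
    if (vhostuser)
        argv[n++] = "--vhost-user";
    argv[n++] = "--one-off";
    argv[n++] = "--socket";
    argv[n++] = socket_path;
    return n;
}
```

Called with the state directory /var/run/libvirt/qemu/passt, domain id -1, name QEMUGuest1 and alias net0, the first helper reproduces the charnet0 path from the test data. The same path is what the patch stores in net->data.vhostuser->data.nix.path so that the unmodified vhost-user code finds it.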

On a Saturday in 2025, Laine Stump wrote:
<interface type='vhostuser'><backend type='passt'/> needs to run the passt command just as is done for interface type='user', but then add vhostuser bits to the qemu commandline/monitor command.
There are some changes to the parsing/validation along with changes to the vhostuser codepath to do the extra stuff for passt. I tried keeping them separated into different patches, but then the unit test failed in a strange way deep down in the bowels of the commandline generation, so this patch both 1) makes the final changes to parsing/formatting and 2) adds passt stuff at appropriate places for vhostuser (as well as making a couple of things *not* happen when the passt backend is chosen). The result is that you can now have:

  <interface type='vhostuser'>
    <backend type='passt'/>
    ...
  </interface>

Then as long as you also have the following as a subelement of <domain>:

  <memoryBacking>
    <access mode='shared'/>
  </memoryBacking>

your passt interfaces will benefit from the greatly improved efficiency of a vhost-user data path, and all without requiring special privileges or capabilities *anywhere* (i.e. it works for unprivileged libvirt (qemu:///session) as well as privileged libvirt).
Signed-off-by: Laine Stump <laine@redhat.com>
---
 src/conf/domain_conf.c                        | 36 ++++++---
 src/conf/domain_validate.c                    | 77 +++++++------------
 src/conf/schemas/domaincommon.rng             | 32 +++++++-
 src/qemu/qemu_command.c                       |  7 +-
 src/qemu/qemu_extdevice.c                     |  6 +-
 src/qemu/qemu_hotplug.c                       | 21 ++++-
 src/qemu/qemu_passt.c                         |  3 +
 src/qemu/qemu_process.c                       | 15 +++-
 src/qemu/qemu_validate.c                      |  7 +-
 ...t-user-slirp-portforward.x86_64-latest.err |  2 +-
 .../net-vhostuser-passt.x86_64-latest.args    | 42 ++++++++++
 .../net-vhostuser-passt.x86_64-latest.xml     | 72 +++++++++++++++++
 tests/qemuxmlconfdata/net-vhostuser-passt.xml | 70 +++++++++++++++++
 tests/qemuxmlconftest.c                       |  1 +
 14 files changed, 317 insertions(+), 74 deletions(-)
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.args
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.xml
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.xml

Reviewed-by: Ján Tomko <jtomko@redhat.com>

Jano
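
For anyone skimming the thread, the validation rules that the domain_validate.c hunk ends up with can be restated as three small predicates. This is a minimal standalone sketch, not the libvirt code: the enum and struct names are hypothetical stand-ins for the VIR_DOMAIN_NET_TYPE_* and VIR_DOMAIN_NET_BACKEND_* values and for virDomainNetPortForwardRange, and error reporting is reduced to a boolean.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for libvirt's enums; only the values needed here. */
typedef enum {
    SKETCH_NET_USER,
    SKETCH_NET_VHOSTUSER,
    SKETCH_NET_BRIDGE
} sketch_net_type;

typedef enum {
    SKETCH_BACKEND_DEFAULT,
    SKETCH_BACKEND_PASST
} sketch_backend;

/* Hypothetical stand-in for one <range> of a <portForward>. */
typedef struct {
    unsigned int start;
    unsigned int end;
    unsigned int to;
    bool exclude_present; /* true if the 'exclude' attribute was given */
} sketch_port_range;

static bool sketch_type_supports_passt(sketch_net_type type)
{
    return type == SKETCH_NET_USER || type == SKETCH_NET_VHOSTUSER;
}

/* The passt backend is only valid for type='user' or type='vhostuser'. */
bool sketch_backend_ok(sketch_net_type type, sketch_backend backend)
{
    return backend != SKETCH_BACKEND_PASST || sketch_type_supports_passt(type);
}

/* <portForward> requires the passt backend on one of those two types. */
bool sketch_port_forward_ok(sketch_net_type type, sketch_backend backend,
                            size_t nforwards)
{
    if (nforwards == 0)
        return true;
    return sketch_type_supports_passt(type) && backend == SKETCH_BACKEND_PASST;
}

/* A range needs 'start' if 'end', 'to', or 'exclude' is specified. */
bool sketch_range_ok(const sketch_port_range *r)
{
    return r->start != 0 || !(r->end || r->to || r->exclude_present);
}
```

The second predicate is the one behind the updated net-user-slirp-portforward error message: a type='user' interface with the default (SLIRP) backend and any <portForward> element is still rejected.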

This can/should also be done for a traditional vhost-user interface (ie not backend type='passt') but that will be a separate change. Signed-off-by: Laine Stump <laine@redhat.com> --- src/qemu/qemu_validate.c | 9 ++- ...vhostuser-passt-no-shmem.x86_64-latest.err | 1 + .../net-vhostuser-passt-no-shmem.xml | 70 +++++++++++++++++++ .../net-vhostuser-passt.x86_64-latest.args | 2 +- .../net-vhostuser-passt.x86_64-latest.xml | 3 + tests/qemuxmlconfdata/net-vhostuser-passt.xml | 3 + tests/qemuxmlconftest.c | 1 + 7 files changed, 87 insertions(+), 2 deletions(-) create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt-no-shmem.x86_64-latest.err create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt-no-shmem.xml diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c index 92e745cea1..3e3e368da3 100644 --- a/src/qemu/qemu_validate.c +++ b/src/qemu/qemu_validate.c @@ -1739,6 +1739,7 @@ qemuValidateDomainDefVhostUserRequireSharedMemory(const virDomainDef *def, static int qemuValidateDomainDeviceDefNetwork(const virDomainNetDef *net, + const virDomainDef *def, virQEMUCaps *qemuCaps) { bool hasIPv4 = false; @@ -1819,6 +1820,12 @@ qemuValidateDomainDeviceDefNetwork(const virDomainNetDef *net, return -1; } + if (net->type == VIR_DOMAIN_NET_TYPE_VHOSTUSER && + net->backend.type == VIR_DOMAIN_NET_BACKEND_PASST) { + if (qemuValidateDomainDefVhostUserRequireSharedMemory(def, "interface type=\"vhostuser\" backend type=\"passt\"") < 0) + return -1; + } + if (net->type == VIR_DOMAIN_NET_TYPE_VDPA) { if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_NETDEV_VHOST_VDPA)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", @@ -5443,7 +5450,7 @@ qemuValidateDomainDeviceDef(const virDomainDeviceDef *dev, switch (dev->type) { case VIR_DOMAIN_DEVICE_NET: - return qemuValidateDomainDeviceDefNetwork(dev->data.net, qemuCaps); + return qemuValidateDomainDeviceDefNetwork(dev->data.net, def, qemuCaps); case VIR_DOMAIN_DEVICE_CHR: return qemuValidateDomainChrDef(dev->data.chr, def, qemuCaps); 
diff --git a/tests/qemuxmlconfdata/net-vhostuser-passt-no-shmem.x86_64-latest.err b/tests/qemuxmlconfdata/net-vhostuser-passt-no-shmem.x86_64-latest.err new file mode 100644 index 0000000000..274af5c722 --- /dev/null +++ b/tests/qemuxmlconfdata/net-vhostuser-passt-no-shmem.x86_64-latest.err @@ -0,0 +1 @@ +unsupported configuration: 'interface type="vhostuser" backend type="passt"' requires shared memory diff --git a/tests/qemuxmlconfdata/net-vhostuser-passt-no-shmem.xml b/tests/qemuxmlconfdata/net-vhostuser-passt-no-shmem.xml new file mode 100644 index 0000000000..e44c91e541 --- /dev/null +++ b/tests/qemuxmlconfdata/net-vhostuser-passt-no-shmem.xml @@ -0,0 +1,70 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-x86_64</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest1'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0' model='none'/> + <controller type='ide' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <interface type='vhostuser'> + <mac address='00:11:22:33:44:55'/> + <ip address='172.17.2.0' family='ipv4' prefix='24'/> + <ip address='2001:db8:ac10:fd01::feed' family='ipv6'/> + <source dev='eth42'/> + <portForward proto='tcp' address='2001:db8:ac10:fd01::1:10'> + <range start='22' to='2022'/> + <range start='1000' end='1050'/> + <range start='1020' exclude='yes'/> + 
<range start='1030' end='1040' exclude='yes'/> + </portForward> + <portForward proto='udp' address='1.2.3.4' dev='eth0'> + <range start='5000' end='5020' to='6000'/> + <range start='5010' end='5015' exclude='yes'/> + </portForward> + <portForward proto='tcp'> + <range start='80'/> + </portForward> + <portForward proto='tcp'> + <range start='443' to='344'/> + </portForward> + <model type='virtio'/> + <backend type='passt' logFile='/var/log/loglaw.blog'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> + </interface> + <interface type='vhostuser'> + <mac address='00:11:22:33:44:11'/> + <backend type='passt'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </interface> + <interface type='vhostuser'> + <mac address='00:11:22:33:44:11'/> + <source dev='eth43'/> + <model type='virtio'/> + <backend type='passt'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </interface> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <audio id='1' type='none'/> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.args b/tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.args index 21d78d6072..7c030d7067 100644 --- a/tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.args +++ b/tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.args @@ -14,7 +14,7 @@ XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain--1-QEMUGuest1/.config \ -accel tcg \ -cpu qemu64 \ -m size=219136k \ --object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":224395264}' \ +-object '{"qom-type":"memory-backend-file","id":"pc.ram","mem-path":"/var/lib/libvirt/qemu/ram/-1-QEMUGuest1/pc.ram","share":true,"x-use-canonical-path-for-ramblock-id":false,"size":224395264}' \ -overcommit mem-lock=off \ -smp 1,sockets=1,cores=1,threads=1 \ -uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ diff --git 
a/tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.xml b/tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.xml index 26aa4c8d05..a1f9366722 100644 --- a/tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.xml +++ b/tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.xml @@ -3,6 +3,9 @@ <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> <memory unit='KiB'>219136</memory> <currentMemory unit='KiB'>219136</currentMemory> + <memoryBacking> + <access mode='shared'/> + </memoryBacking> <vcpu placement='static'>1</vcpu> <os> <type arch='x86_64' machine='pc'>hvm</type> diff --git a/tests/qemuxmlconfdata/net-vhostuser-passt.xml b/tests/qemuxmlconfdata/net-vhostuser-passt.xml index e44c91e541..71b845329b 100644 --- a/tests/qemuxmlconfdata/net-vhostuser-passt.xml +++ b/tests/qemuxmlconfdata/net-vhostuser-passt.xml @@ -3,6 +3,9 @@ <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> <memory unit='KiB'>219136</memory> <currentMemory unit='KiB'>219136</currentMemory> + <memoryBacking> + <access mode='shared'/> + </memoryBacking> <vcpu placement='static'>1</vcpu> <os> <type arch='x86_64' machine='pc'>hvm</type> diff --git a/tests/qemuxmlconftest.c b/tests/qemuxmlconftest.c index 34674551a4..c271170d25 100644 --- a/tests/qemuxmlconftest.c +++ b/tests/qemuxmlconftest.c @@ -1794,6 +1794,7 @@ mymain(void) DO_TEST_CAPS_VER("net-user-passt", "7.2.0"); DO_TEST_CAPS_LATEST_PARSE_ERROR("net-user-slirp-portforward"); DO_TEST_CAPS_LATEST("net-vhostuser-passt"); + DO_TEST_CAPS_LATEST_PARSE_ERROR("net-vhostuser-passt-no-shmem"); DO_TEST_CAPS_LATEST("net-virtio"); DO_TEST_CAPS_LATEST("net-virtio-device"); DO_TEST_CAPS_LATEST("net-virtio-disable-offloads"); -- 2.47.1

On a Saturday in 2025, Laine Stump wrote:
This can/should also be done for a traditional vhost-user interface (i.e. not backend type='passt') but that will be a separate change.
Signed-off-by: Laine Stump <laine@redhat.com>
---
 src/qemu/qemu_validate.c                      |  9 ++-
 ...vhostuser-passt-no-shmem.x86_64-latest.err |  1 +
 .../net-vhostuser-passt-no-shmem.xml          | 70 +++++++++++++++++++
 .../net-vhostuser-passt.x86_64-latest.args    |  2 +-
 .../net-vhostuser-passt.x86_64-latest.xml     |  3 +
 tests/qemuxmlconfdata/net-vhostuser-passt.xml |  3 +
 tests/qemuxmlconftest.c                       |  1 +
 7 files changed, 87 insertions(+), 2 deletions(-)
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt-no-shmem.x86_64-latest.err
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt-no-shmem.xml

Reviewed-by: Ján Tomko <jtomko@redhat.com>

Jano
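
The check this patch adds can be pictured roughly as follows. This is a simplified sketch under stated assumptions: it only looks at a domain-wide <memoryBacking><access mode='shared'/> setting, whereas the real qemuValidateDomainDefVhostUserRequireSharedMemory() is not shown in the diff and may accept other ways of configuring shared memory. All names below are hypothetical; the error text is modelled on the new .err test file.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the domain-wide memory access mode. */
typedef enum {
    SKETCH_MEM_ACCESS_DEFAULT,
    SKETCH_MEM_ACCESS_SHARED,
    SKETCH_MEM_ACCESS_PRIVATE
} sketch_mem_access;

/* Simplified: succeed only if the domain-wide access mode is 'shared'.
 * On failure, write an error modelled on the expected .err file and
 * return -1. */
int sketch_require_shared_memory(sketch_mem_access access, const char *name,
                                 char *err, size_t errlen)
{
    if (access == SKETCH_MEM_ACCESS_SHARED)
        return 0;

    snprintf(err, errlen,
             "unsupported configuration: '%s' requires shared memory", name);
    return -1;
}

/* Apply the requirement only to vhostuser interfaces with the passt
 * backend, mirroring the condition added in qemu_validate.c. */
int sketch_validate_net(bool is_vhostuser, bool is_passt,
                        sketch_mem_access access, char *err, size_t errlen)
{
    if (is_vhostuser && is_passt) {
        return sketch_require_shared_memory(
            access, "interface type=\"vhostuser\" backend type=\"passt\"",
            err, errlen);
    }
    return 0;
}
```

As the commit message says, a traditional vhost-user interface is not covered by this condition yet, so in the sketch it passes regardless of the access mode.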

This reorganizes the section about <interface type='user'> and
describes the differences in behavior between SLIRP and passt.

Resolves: https://issues.redhat.com/browse/RHEL-46601
Signed-off-by: Laine Stump <laine@redhat.com>
---
 docs/formatdomain.rst | 116 ++++++++++++++++++++++++++++--------------
 1 file changed, 78 insertions(+), 38 deletions(-)

diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
index 83aeaa32c2..9f703f8e3c 100644
--- a/docs/formatdomain.rst
+++ b/docs/formatdomain.rst
@@ -5089,25 +5089,34 @@ to the interface.
    </devices>
    ...
 
-Userspace (SLIRP or passt) connection
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Userspace connection using SLIRP
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-The ``user`` type connects the guest interface to the outside via a
+The ``user`` interface type connects the guest interface to the outside via a
 transparent userspace proxy that doesn't require any special system
 privileges, making it usable in cases when libvirt itself is running
 with no privileges (e.g. libvirt's "session mode" daemon, or when
 libvirt is run inside an unprivileged container).
 
-By default, this user proxy is done with QEMU's internal SLIRP driver
-which has DHCP & DNS services that give the guest IP addresses
-starting from ``10.0.2.15``, a default route of ``10.0.2.2`` and DNS
-server of ``10.0.2.3``. :since:`Since 3.8.0` it is possible to override
-the default network address by including an ``ip`` element specifying
-an IPv4 address in its one mandatory attribute,
-``address``. Optionally, a second ``ip`` element with a ``family``
-attribute set to "ipv6" can be specified to add an IPv6 address to the
-interface. ``address``. Optionally, address ``prefix`` can be
-specified.
+By default, this user proxy is done with QEMU's SLIRP driver, a
+userspace proxy built into QEMU that has DHCP & DNS services that give
+the guest an IP address of ``10.0.2.15``, a default route of
+``10.0.2.2`` and DNS server at ``10.0.2.3``.
+
+:since:`Since 3.8.0` it is possible to override the guest's default
+network address by including an ``ip`` element specifying an IPv4
+address in its one mandatory attribute, ``address``. Optionally, a
+second ``ip`` element with a ``family`` attribute set to "ipv6" can be
+specified to add an IPv6 address to the interface.
+Optionally, an address ``prefix`` can be specified. These settings are
+surprisingly **not** used by SLIRP to set the exact IP address;
+instead they are used to determine what network/subnet the guest's IP
+address should be on, and the guest will be given an address in that
+subnet, but the host portion of the address will still be "2.15". In
+the example below, the guest will be given the IP address
+172.17.2.15 (**note that the '1.1' in the host portion of the address
+has been ignored**), default route of 172.17.2.2, and DNS server
+172.17.2.3.
 
 ::
 
@@ -5117,34 +5126,65 @@ specified.
      ...
      <interface type='user'>
        <mac address="00:11:22:33:44:55"/>
-       <ip family='ipv4' address='172.17.2.0' prefix='24'/>
-       <ip family='ipv6' address='2001:db8:ac10:fd01::' prefix='64'/>
+       <ip family='ipv4' address='172.17.1.1' prefix='16'/>
+       <ip family='ipv6' address='2001:db8:ac10:fd01::1' prefix='64'/>
      </interface>
    </devices>
    ...
 
-:since:`Since 9.0.0` an alternate backend implementation of the
-``user`` interface type can be selected by setting the interface's
-``<backend>`` subelement ``type`` attribute to ``passt``. In this
-case, the passt transport (https://passt.top) is used. Similar to
-SLIRP, passt has an internal DHCP server that provides a requesting
-guest with one ipv4 and one ipv6 address; it then uses userspace
-proxies and a separate network namespace to provide outgoing
-UDP/TCP/ICMP sessions, and optionally redirect incoming traffic
-destined for the host toward the guest instead.
-
-When the passt backend is used, the ``<backend>`` attribute
-``logFile`` can be used to tell the passt process for this interface
-where to write its message log, and the ``<source>`` attribute ``dev``
-can tell it to use a particular host interface to derive the routes
-given to the guest for forwarding traffic upstream. Due to the design
-decisions of passt, if using SELinux, the log file is recommended to
-reside in the runtime directory of a user under which the passt
-process will run, most probably ``/run/user/$UID`` where ``$UID`` is
-the UID of the user, e.g. ``qemu``. Beware that libvirt does not
-create this directory if it does not already exist to avoid possible,
-however unlikely, issues, especially since this logfile attribute is
-meant mostly for debugging.
+Userspace connection using passt
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+:since:`Since 9.0.0 (QEMU and KVM only)` an alternate backend
+implementation of the ``user`` interface type can be selected by
+setting the interface's ``<backend>`` subelement ``type`` attribute to
+``passt``. In this case, the passt transport `(details here)
+<https://passt.top>`__ is used. passt is run as a separate process
+from QEMU - the passt process handles the details of forwarding
+network traffic back and forth to the physical network (using
+userspace proxies and a separate network namespace to provide outgoing
+UDP/TCP/ICMP sessions, and optionally redirecting incoming traffic
+destined for the host toward the guest instead), and a socket between
+passt and QEMU forwards that traffic on to the guest (and back out,
+of course).
+
+Similar to SLIRP, passt has an internal DHCP server that provides a
+requesting guest with one ipv4 and one ipv6 address. There are default
+values for both of these, or you can use the ``<ip>`` element
+(described above, with behavioral differences as outlined below) to
+configure one IPv4 and one IPv6 address that passt's DHCP server can
+provide to the guest.
+
+Unlike SLIRP, when no ``<ip>`` address is specified, passt will by
+default provide the guest with an IP address, DNS server, etc. that
+are identical to those settings on the host itself (through the magic
+of the proxies and a separate network namespace, this doesn't create
+any conflict).
+
+Also different from SLIRP's behavior: if you do specify IP
+address(es), the exact address and netmask/prefix you specify will be
+provided to the guest (i.e. passt doesn't interpret the <ip> settings
+as a network address like SLIRP does, but as a host address). In
+the example given above, the guest IP would be set to exactly 172.17.1.1.
+
+Just as with SLIRP, though, once traffic from the guest leaves the
+host towards the rest of the network, it will always appear as if it
+came from the host's IP.
+
+There are a few other options that are configurable only for the passt
+backend. For example, the ``<backend>`` attribute ``logFile`` can be
+used to tell the passt process for this interface where to write its
+message log, and the ``<source>`` attribute ``dev`` can tell it a
+particular host interface to use when deriving the routes given to the
+guest for forwarding traffic upstream. Due to the design decisions of
+passt, when using SELinux on the host, it is recommended that the log
+file reside in the runtime directory of the user under which the passt
+process will run, most probably ``/run/user/$UID`` (where ``$UID`` is
+the UID of that user), e.g. ``/run/user/1000``. Be aware that libvirt
+does not create this directory if it does not already exist to avoid
+possible, however unlikely, issues with orphaned directories or
+permissions, etc. The logfile attribute is meant mostly for debugging,
+so it shouldn't be set under normal circumstances.
 
 Additionally, when passt is used, multiple ``<portForward>`` elements
 can be added to forward incoming network traffic for the host to this
@@ -5181,7 +5221,7 @@ ports **with the exception of some subset**.
        <backend type='passt' logFile='/run/user/$UID/passt-domain.log'/>
        <mac address="00:11:22:33:44:55"/>
        <source dev='eth0'/>
-       <ip family='ipv4' address='172.17.2.4' prefix='24'/>
+       <ip family='ipv4' address='172.17.5.4' prefix='24'/>
        <ip family='ipv6' address='2001:db8:ac10:fd01::20'/>
        <portForward proto='tcp'>
          <range start='2022' to='22'/>
-- 
2.47.1
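The difference in how the two backends treat an ``<ip>`` element can be sketched in a few lines of Python. This is only an illustration, not libvirt or QEMU code: the function names are hypothetical, and the masking formula (placing the suffixes 2.15, 2.2 and 2.3 into the host bits of the configured subnet) is an assumption chosen so that it reproduces the addresses quoted in the documentation text of this patch.

```python
import ipaddress


def slirp_addresses(address: str, prefix: int) -> dict:
    """Sketch: addresses SLIRP would hand out for a given <ip> element.

    Assumption: only the network part of the configured address is used;
    the host part is replaced by fixed suffixes (2.15, 2.2, 2.3).
    """
    net = ipaddress.ip_network(f"{address}/{prefix}", strict=False)
    base = int(net.network_address)
    hostmask = int(net.hostmask)

    def in_subnet(suffix: int) -> str:
        # Keep only the suffix bits that fit into the host portion.
        return str(ipaddress.ip_address(base | (suffix & hostmask)))

    return {
        "guest": in_subnet(0x020F),    # host portion 2.15
        "gateway": in_subnet(0x0202),  # host portion 2.2
        "dns": in_subnet(0x0203),      # host portion 2.3
    }


def passt_guest_address(address: str, prefix: int) -> str:
    """Sketch: passt gives the guest exactly the configured address."""
    return address


print(slirp_addresses("172.17.1.1", 16))
print(passt_guest_address("172.17.1.1", 16))
```

With the ``<ip>`` element from the SLIRP example (``172.17.1.1`` with prefix 16), the first call yields the 172.17.2.15 / 172.17.2.2 / 172.17.2.3 triple described in the text, while the passt helper returns 172.17.1.1 unchanged.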

On a Saturday in 2025, Laine Stump wrote:
This reorganizes the section about <interface type='user'> and describes the differences in behavior between SLIRP and passt.
Resolves: https://issues.redhat.com/browse/RHEL-46601 Signed-off-by: Laine Stump <laine@redhat.com> --- docs/formatdomain.rst | 116 ++++++++++++++++++++++++++++-------------- 1 file changed, 78 insertions(+), 38 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

Almost everything is already there (in the section for using passt
with type='user'), so we just need to point to that from the
type='vhostuser' section (and vice versa), and add a bit of glue.

Also updated a few related details that have changed (e.g. default
model type for vhostuser is now 'virtio', and source type/mode are now
optional), and changed "vhost-user interface" to "vhost-user
connection" because the interface is a virtio interface, and
vhost-user is being used to connect that interface to the outside.

Signed-off-by: Laine Stump <laine@redhat.com>
---
 docs/formatdomain.rst | 73 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 62 insertions(+), 11 deletions(-)

diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
index 9f703f8e3c..381bf84f67 100644
--- a/docs/formatdomain.rst
+++ b/docs/formatdomain.rst
@@ -5148,6 +5148,15 @@ destined for the host toward the guest instead), and a socket between
 passt and QEMU forwards that traffic on to the guest (and back out,
 of course).
 
+*(:since:`Since 11.1.0 (QEMU and KVM only)` you may prefer to use the
+passt backend with the more efficient and performant type='vhostuser'
+rather than type='user'. All the options related to passt in the
+paragraphs below here also apply when using the passt backend with
+type='vhostuser'; any other details specific to vhostuser are
+described* `here
+<formatdomain.html#vhost-user-connection-with-passt-backend>`__.)
+
+
 Similar to SLIRP, passt has an internal DHCP server that provides a
 requesting guest with one ipv4 and one ipv6 address. There are default
 values for both of these, or you can use the ``<ip>`` element
@@ -5840,7 +5849,7 @@ following attributes are available for the ``virtio`` NIC driver:
    The optional ``queues`` attribute controls the number of queues to be used
    for either `Multiqueue virtio-net
    <https://www.linux-kvm.org/page/Multiqueue>`__ or vhost-user (See
-   `vhost-user interface`_) network interfaces. Use of multiple packet
+   `vhost-user connection`_) network interfaces. Use of multiple packet
    processing queues requires the interface having the
    ``<model type='virtio'/>`` element. Each queue will potentially be handled
    by a different processor, resulting in much higher throughput.
@@ -6285,8 +6294,8 @@ similarly named elements used to configure the guest side of the
 interface (described above).
 
 
-vhost-user interface
-^^^^^^^^^^^^^^^^^^^^
+vhost-user connection
+^^^^^^^^^^^^^^^^^^^^^
 
 :since:`Since 1.2.7` the vhost-user enables the communication between a QEMU
 virtual machine and other userspace process using the Virtio transport protocol.
@@ -6313,16 +6322,58 @@ plane is based on shared memory.
    </devices>
    ...
 
-The ``<source>`` element has to be specified along with the type of char device.
-Currently, only type='unix' is supported, where the path (the directory path of
-the socket) and mode attributes are required. Both ``mode='server'`` and
-``mode='client'`` are supported. vhost-user requires the virtio model type, thus
-the ``<model>`` element is mandatory. :since:`Since 4.1.0` the element has an
-optional child element ``reconnect`` which configures reconnect timeout if the
-connection is lost. It has two attributes ``enabled`` (which accepts ``yes`` and
-``no``) and ``timeout`` which specifies the amount of seconds after which
+The ``<source>`` element has to be specified along with the type of
+char device. Currently, only type='unix' is supported, where the path
+(the directory path of the socket) and mode attributes are
+required. Both ``mode='server'`` and ``mode='client'`` are
+supported. (:since:`Since 11.1.0` the default source type for
+vhostuser interfaces is 'unix' and default mode is 'client', so those
+two attributes are now optional).
+
+The vhost-user protocol only works with the virtio guest driver, so
+the ``<model>`` element ``type`` attribute is mandatory (:since:`Since
+11.1.0` the default model type for vhostuser interfaces is now
+'virtio' so ``<model>`` is no longer mandatory). :since:`Since 4.1.0`
+the ``<source>`` element has an optional child element ``reconnect``
+which configures reconnect timeout if the connection is lost. It has
+two attributes ``enabled`` (which accepts ``yes`` and ``no``) and
+``timeout`` which specifies the amount of seconds after which
 hypervisor tries to reconnect.
 
+
+vhost-user connection with passt backend
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+:since:`Since 11.1.0 (QEMU and KVM only)` passt can be used as the
+other end of the vhost-user connection. This is a compelling
+alternative, because passt provides all of its network connectivity
+without requiring any elevated privileges or capabilities, and
+vhost-user uses shared memory to make this unprivileged connection
+very high performance as well. You can set a type='vhostuser'
+interface to use passt as the backend by adding ``<backend
+type='passt'/>``. When passt is the backend, only a single driver
+queue is supported, and the ``<source>`` path/type/mode are all
+implied to be "matching the passt process" so **must not** be
+specified. All of the passt options `described here
+<formatdomain.html#userspace-connection-using-passt>`__, are also
+supported for ``type='vhostuser'`` with the passt backend, e.g.
+setting guest-side IP addresses with ``<ip>`` and port forwarding with
+``<portForward>``.
+
+::
+
+   ...
+   <devices>
+     <interface type='vhostuser'>
+       <backend type='passt'/>
+       <mac address='52:54:00:3b:83:1a'/>
+       <source dev='enp1s0'/>
+       <ip address='10.30.0.5' prefix='24'/>
+     </interface>
+   </devices>
+   ...
+
+
 Traffic filtering with NWFilter
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-- 
2.47.1
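The constraints stated in the new documentation (single driver queue, no ``<source>`` path/type/mode, and, per the cover letter, shared memory enabled) could be checked against a domain XML roughly as follows. This is a sketch only: ``check_vhostuser_passt`` is a hypothetical helper, not a libvirt function, and looking only at ``<memoryBacking><access mode='shared'/>`` is a simplifying assumption about how shared memory is requested.

```python
import xml.etree.ElementTree as ET


def check_vhostuser_passt(domain_xml: str) -> list:
    """Return a list of problems found (hypothetical helper, not libvirt code)."""
    root = ET.fromstring(domain_xml)
    problems = []
    # Simplification: only this one way of enabling shared memory is checked.
    shared = root.find("./memoryBacking/access[@mode='shared']") is not None

    for iface in root.findall("./devices/interface[@type='vhostuser']"):
        backend = iface.find("backend")
        if backend is None or backend.get("type") != "passt":
            continue  # not a passt-backed vhostuser interface
        if not shared:
            problems.append("shared memory is not enabled")
        source = iface.find("source")
        if source is not None:
            for attr in ("path", "type", "mode"):
                if source.get(attr) is not None:
                    problems.append(f"<source> {attr} must not be set")
        driver = iface.find("driver")
        if driver is not None and int(driver.get("queues", "1")) > 1:
            problems.append("only a single driver queue is supported")
    return problems


EXAMPLE = """
<domain type='kvm'>
  <memoryBacking><access mode='shared'/></memoryBacking>
  <devices>
    <interface type='vhostuser'>
      <backend type='passt'/>
      <mac address='52:54:00:3b:83:1a'/>
      <source dev='enp1s0'/>
      <ip address='10.30.0.5' prefix='24'/>
    </interface>
  </devices>
</domain>
"""

print(check_vhostuser_passt(EXAMPLE))
```

The example interface is the one from the documentation hunk, wrapped in a minimal domain element for the purpose of the sketch; it produces an empty problem list, while adding ``queues='5'`` on a ``<driver>`` element or a ``path`` on ``<source>`` would be reported.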

On a Saturday in 2025, Laine Stump wrote:
Almost everything is already there (in the section for using passt with type='user'), so we just need to point to that from the type='vhostuser' section (and vice versa), and add a bit of glue.
Also updated a few related details that have changed (e.g. default model type for vhostuser is now 'virtio', and source type/mode are now optional), and changed "vhost-user interface" to "vhost-user connection" because the interface is a virtio interface, and vhost-user is being used to connect that interface to the outside.
Signed-off-by: Laine Stump <laine@redhat.com> --- docs/formatdomain.rst | 73 ++++++++++++++++++++++++++++++++++++------- 1 file changed, 62 insertions(+), 11 deletions(-)
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

On Sat, Feb 15, 2025 at 12:20:17AM -0500, Laine Stump wrote:
+vhost-user connection with passt backend +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +:since:`Since 11.1.0 (QEMU and KVM only)` passt can be used as the +other end of the vhost-user connection. This is a compelling +alternative, because passt provides all of its network connectivity +without requiring any elevated privileges or capabilities, and +vhost-user uses shared memory to make this unprivileged connection +very high performance as well. You can set a type='vhostuser' +interface to use passt as the backend by adding ``<backend +type='passt'/>``. When passt is the backend, only a single driver +queue is supported,
This should be added to the validation step. Otherwise:

error: internal error: QEMU unexpectedly closed the monitor (vm='test'):
qemu-system-x86_64: -netdev {"type":"vhost-user","chardev":"charnet0","queues":5,"id":"hostnet0"}: Failed to read msg header. Read -1 instead of 12. Original request 1.
qemu-system-x86_64: -netdev {"type":"vhost-user","chardev":"charnet0","queues":5,"id":"hostnet0"}: vhost_backend_init failed: Protocol error
qemu-system-x86_64: -netdev {"type":"vhost-user","chardev":"charnet0","queues":5,"id":"hostnet0"}: failed to init vhost_net for queue 0
qemu-system-x86_64: -netdev {"type":"vhost-user","chardev":"charnet0","queues":5,"id":"hostnet0"}: Failed to connect to '/run/libvirt/qemu/passt/1-test-net0.socket': Connection refused
qemu-system-x86_64: -netdev {"type":"vhost-user","chardev":"charnet0","queues":5,"id":"hostnet0"}: Device 'vhost-user' could not be initialized
and the ``<source>`` path/type/mode are all +implied to be "matching the passt process" so **must not** be +specified.
These seem to simply be ignored if specified, which I guess is better than accepting values that we know aren't going to work. Explicitly rejecting them might be better.

Incidentally, dumpxml doesn't report the actual values, which I kinda expected to happen. Not sure how other vhost-user devices behave in this sense.

-- 
Andrea Bolognani / Red Hat / Virtualization

On 2/17/25 9:13 AM, Andrea Bolognani wrote:
On Sat, Feb 15, 2025 at 12:20:17AM -0500, Laine Stump wrote:
+vhost-user connection with passt backend +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +:since:`Since 11.1.0 (QEMU and KVM only)` passt can be used as the +other end of the vhost-user connection. This is a compelling +alternative, because passt provides all of its network connectivity +without requiring any elevated privileges or capabilities, and +vhost-user uses shared memory to make this unprivileged connection +very high performance as well. You can set a type='vhostuser' +interface to use passt as the backend by adding ``<backend +type='passt'/>``. When passt is the backend, only a single driver +queue is supported,
This should be added to the validation step.
Good point. I figured it wouldn't work, so I wasn't surprised that it failed when I tried it, but I didn't think to add validation to prevent it. I'll make a patch for that.
Otherwise:
error: internal error: QEMU unexpectedly closed the monitor (vm='test'):
qemu-system-x86_64: -netdev {"type":"vhost-user","chardev":"charnet0","queues":5,"id":"hostnet0"}: Failed to read msg header. Read -1 instead of 12. Original request 1.
qemu-system-x86_64: -netdev {"type":"vhost-user","chardev":"charnet0","queues":5,"id":"hostnet0"}: vhost_backend_init failed: Protocol error
qemu-system-x86_64: -netdev {"type":"vhost-user","chardev":"charnet0","queues":5,"id":"hostnet0"}: failed to init vhost_net for queue 0
qemu-system-x86_64: -netdev {"type":"vhost-user","chardev":"charnet0","queues":5,"id":"hostnet0"}: Failed to connect to '/run/libvirt/qemu/passt/1-test-net0.socket': Connection refused
qemu-system-x86_64: -netdev {"type":"vhost-user","chardev":"charnet0","queues":5,"id":"hostnet0"}: Device 'vhost-user' could not be initialized
and the ``<source>`` path/type/mode are all +implied to be "matching the passt process" so **must not** be +specified.
These seem to simply be ignored if specified, which I guess is better than accepting values that we know aren't going to work. Explicitly rejecting them might be better.
Yeah, you're again correct.
Incidentally, dumpxml doesn't report the actual values, which I kinda expected to happen.
Since it's an internal implementation detail, and there's really nothing you can do with those values if you have them, I think they should just completely not exist in the XML output. (for type='user' passt the socket path is never seen outside of the passt commandline (iirc libvirt passes it to QEMU as an open fd))

When I had gotten close to the end of the implementation, I realized that this all could probably be much cleaner and (in some ways) more straightforward if the relevant XML attributes were stored in a simpler struct in the virDomainNetDef (rather than a full virDomainChrSourceDef), and we then constructed a virDomainChrSourceDef in the NetDef's privateData object at runtime and used *that* to open the socket.

But without looking at the other uses of the chardev code at all, it seems like that could require quite a lot of changes to other code of functionalities not related to vhostuser network interfaces, leading to a higher likelihood of regressions in places that wouldn't be caught until full regression tests done at a later time. I still think it should be done, but at a later time.
Not sure how other vhost-user devices behave in this sense.
To be truthful, the code surrounding the socket path (and ifname) in the case of normal vhostuser was convoluted enough that I didn't try too hard to fully understand it. I think I remember that you're required to set <target dev='blah'/> even though that name has no functional use (in the case that you're hooking up to an OVS switch) (although possibly I was just witnessing odd behavior of an interim state of the code), and may even be replaced by a value returned from a call to OVS anyway (or something like that - I think it is the name of the port on the OVS switch), but in the case where you have two QEMUs connected directly to each other it is completely meaningless, yet you still have to have it.

Again a more general cleanup is in order.

On 2/17/25 10:48 AM, Laine Stump wrote:
On 2/17/25 9:13 AM, Andrea Bolognani wrote:
On Sat, Feb 15, 2025 at 12:20:17AM -0500, Laine Stump wrote:
+vhost-user connection with passt backend +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +:since:`Since 11.1.0 (QEMU and KVM only)` passt can be used as the +other end of the vhost-user connection. This is a compelling +alternative, because passt provides all of its network connectivity +without requiring any elevated privileges or capabilities, and +vhost-user uses shared memory to make this unprivileged connection +very high performance as well. You can set a type='vhostuser' +interface to use passt as the backend by adding ``<backend +type='passt'/>``. When passt is the backend, only a single driver +queue is supported,
This should be added to the validation step.
Good point. I figured it wouldn't work, so I wasn't surprised that it failed when I tried it, but I didn't think to add validation to prevent it. I'll make a patch for that.
Otherwise:
error: internal error: QEMU unexpectedly closed the monitor (vm='test'):
qemu-system-x86_64: -netdev {"type":"vhost-user","chardev":"charnet0","queues":5,"id":"hostnet0"}: Failed to read msg header. Read -1 instead of 12. Original request 1.
qemu-system-x86_64: -netdev {"type":"vhost-user","chardev":"charnet0","queues":5,"id":"hostnet0"}: vhost_backend_init failed: Protocol error
qemu-system-x86_64: -netdev {"type":"vhost-user","chardev":"charnet0","queues":5,"id":"hostnet0"}: failed to init vhost_net for queue 0
qemu-system-x86_64: -netdev {"type":"vhost-user","chardev":"charnet0","queues":5,"id":"hostnet0"}: Failed to connect to '/run/libvirt/qemu/passt/1-test-net0.socket': Connection refused
qemu-system-x86_64: -netdev {"type":"vhost-user","chardev":"charnet0","queues":5,"id":"hostnet0"}: Device 'vhost-user' could not be initialized
and the ``<source>`` path/type/mode are all +implied to be "matching the passt process" so **must not** be +specified.
These seem to simply be ignored if specified, which I guess is better than accepting values that we know aren't going to work. Explicitly rejecting them might be better.
Yeah, you're again correct.
Now that I've had more time to drink coffee and retrace the past events (pun not intended), I remember the reason I didn't validate that the type & mode settings are empty:

1) The "mode" setting is just directly saved into a bool ("listen") in the struct (virDomainChrDef) by the parser (ie. only 2 possible values (t/f), not 3 (t/f/unspecified)), so by the time you get to any validation function you no longer have any way of knowing if that bool is false because mode was unspecified, or if it's false because the input had "mode='client'". I could have used the hack of adding a "mode_specified" bool to virDomainChrDef, but that struct has several different users in different contexts, and a "mode_specified" would be nonsensical and confusing for most of them (especially since the variable in the struct is "listen", not "mode"), and the last thing I want to do is add *even more* to the confusion.

2) The 'type' setting is an enum in the virDomainChrDef, and that enum doesn't have an "unspecified" or "default" value as 0; it instead has VIR_DOMAIN_CHR_TYPE_NULL. I haven't tried to see what happens if I enter "type='null'" for some other type of chrdev, but that value is listed for VIR_ENUM_IMPL(virDomainChr), and every place that virDomainChrTypeFromString() is called, only a return of < 0 is considered an error (rather than <= 0) (both the name choice and all of the < 0 return value checks implying that there could be/is a place where setting "type='null'" has some valid functionality). Anyway the end of this is that it seemed "risky" to begin interpreting VIR_DOMAIN_CHR_TYPE_NULL to mean "type not specified", and anyway that would require that we add special handling at each place type is examined (there are many).

   Alternatively I could add a *new* 0 value VIR_DOMAIN_CHR_TYPE_UNSPECIFIED (or ..._NONE or something), but that would still require changing all of the checks on virDomainChrTypeFromString() to <= 0, and then changing every switch(type) to equate ..._NONE with ..._UNIX (but then in the future, one of the other uses for that enum could want _NONE to be equivalent to something else rather than _UNIX). So adding a new enum value was also increasing the fingers of change for this functionality way beyond just the vhostuser interface code, and I preferred to not have to regression-test every single usage of virDomainChrDef just for this one new feature.

3) Definitely I can/should check that they haven't tried to specify the path when backend type='passt' - that one has none of the complications of type/mode
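The information loss described in point 1 can be shown with a small Python stand-in. This is not libvirt's C parser; both function names are hypothetical and exist only to contrast a two-state parse with a three-state one.

```python
from typing import Optional


def parse_mode_as_bool(mode: Optional[str]) -> bool:
    """Two-state parse: 'server' gives True, anything else (or absent) gives False."""
    return mode == "server"


def parse_mode_tristate(mode: Optional[str]) -> Optional[bool]:
    """Three-state parse: None means the attribute was not given at all."""
    if mode is None:
        return None
    return mode == "server"


# After the two-state parse, an absent mode and mode='client' look the same,
# so a later validation step cannot reject "mode was specified".
print(parse_mode_as_bool(None), parse_mode_as_bool("client"))

# The three-state parse keeps them apart.
print(parse_mode_tristate(None), parse_mode_tristate("client"))
```

The same reasoning applies to the enum in point 2: without a dedicated "unspecified" value, validation code running after the parser has nothing to test against.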
Incidentally, dumpxml doesn't report the actual values, which I kinda expected to happen.
Since it's an internal implementation detail, and there's really nothing you can do with those values if you have them, I think they should just completely not exist in the XML output. (for type='user' passt the socket path is never seen outside of the passt commandline (iirc libvirt passes it to QEMU as an open fd))
When I had gotten close to the end of the implementation, I realized that this all could probably be much cleaner and (in some ways) more straightforward if the relevant XML attributes were stored in a simpler struct in the virDomainNetDef (rather than a full virDomainChrSourceDef), and we then constructed a virDomainChrSourceDef in the NetDef's privateData object at runtime and used *that* to open the socket.
Just wanted to re-mention the previous paragraph, because I think we really should do that (just not today/yesterday).
But without looking at the other uses of the chardev code at all, it seems like that could require quite a lot of changes to other code of functionalities not related to vhostuser network interfaces, leading to a higher likelihood of regressions in places that wouldn't be caught until full regression tests done at a later time. I still think it should be done, but at a later time.
Not sure how other vhost-user devices behave in this sense.
To be truthful, the code surrounding the socket path (and ifname) in the case of normal vhostuser was convoluted enough that I didn't try too hard to fully understand it. I think I remember that you're required to set <target dev='blah'/> even though that name has no functional use (in the case that you're hooking up to an OVS switch) (although possibly I was just witnessing odd behavior of an interim state of the code), and may even be replaced by a value returned from a call to OVS anyway (or something like that - I think it is the name of the port on the OVS switch), but in the case where you have two QEMUs connected directly to each other it is completely meaningless, yet you still have to have it.
Again a more general cleanup is in order.

Signed-off-by: Laine Stump <laine@redhat.com>
---
 NEWS.rst | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/NEWS.rst b/NEWS.rst
index 4fc8a3bba0..7984f358f3 100644
--- a/NEWS.rst
+++ b/NEWS.rst
@@ -54,6 +54,14 @@ v11.1.0 (unreleased)
     The virtio-mem model of ``<memory/>`` device can now be used with
     s390 guests.
 
+  * Support using passt as the backend for interface type='vhostuser'
+
+    The combination of vhostuser transport with passt as the backend
+    provides high performance, fully featured networking without the
+    need for libvirt or QEMU to have any elevated privileges or
+    capabilities. Configuration and features are identical to the
+    configuration for type='user' with the passt backend.
+
 * **Improvements**
 
   * qemu: I/O error messages can be queried via ``virDomainGetMessages()``
-- 
2.47.1

On a Saturday in 2025, Laine Stump wrote:
Signed-off-by: Laine Stump <laine@redhat.com> --- NEWS.rst | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/NEWS.rst b/NEWS.rst index 4fc8a3bba0..7984f358f3 100644 --- a/NEWS.rst +++ b/NEWS.rst @@ -54,6 +54,14 @@ v11.1.0 (unreleased) The virtio-mem model of ``<memory/>`` device can now be used with s390 guests.
+ * Support using passt as the backend for interface type='vhostuser' + + The combination of vhostuser transport with passt as the backend + provides high performance, fully featured networking without the + need for libvirt or QEMU to have any elevated privileges or + capabilities. Configuration and features are identical to the + configuration for type='user' with the passt backend. +
It also makes your teeth whiter.
* **Improvements**
* qemu: I/O error messages can be queried via ``virDomainGetMessages()``
Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano

Oops - sorry, I had recreated my branch so git-publish didn't recognize the series as a v2, so all the subject lines are missing v2 :-/. Hopefully this won't cause too much confusion. On 2/15/25 12:20 AM, Laine Stump wrote:
==== Changes from V1:
* fixed missing change to error log message pointed out by abologna
* added a validation check to assure that shared memory is enabled if there is a type='vhostuser' interface in the domain definition
* included a patch documenting differences between type='user' SLIRP and passt behaviors (because I had to do it anyway, and the reorganization made documenting type='vhostuser' passt slightly easier).
* added documentation for type='vhostuser' backend type='passt'
=====
passt (https://passt.top) provides a method of connecting QEMU virtual machines to the external network without requiring special privileges or capabilities of any participating processes - even libvirt itself can run unprivileged and create an instance of passt (which *always* runs unprivileged) that is then connected to the qemu process (and thus the virtual machine) with a unix socket.
Originally passt used its own protocol for this socket, sending both control messages and data packets over the socket. This works, and is already much more efficient than the previously only-unprivileged-networking-solution slirp.
But recently passt added support for using the vhost-user protocol for communication between the passt process (which is connected to the external network) and the QEMU process (and thus the VM). vhost-user also uses a unix socket, but only for control plane messages - all data packets are "sent" between the VM and passt process via a shared memory region. This is unsurprisingly much more efficient.
From the point of view of QEMU, the passt process looks identical to any normal vhost-user backend, so we can run QEMU with exactly the same interface commandline options as normal vhost-user. Also, the passt process supports all of the same options as it does when used in its "traditional" mode, so really in the end all we need to do is twist libvirt around so that when <backend type='passt'/> is specified for an <interface type='vhostuser'>, it will run passt just as before (except with the added "--vhost-user" option so that passt will know to use that), and then force feed the vhost-user code in libvirt with the same socket path used by passt.
This series does that, while also switching up a few bits of code prior to adding in the new functionality.
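The flow described above (start passt with "--vhost-user", then hand the same socket path to QEMU's ordinary vhost-user options) can be sketched as two argument lists. This is an illustration under stated assumptions, not the command lines libvirt actually generates: the helper names are hypothetical, the ``--socket`` option is assumed to be how passt is given its socket path, and the chardev/netdev IDs and the socket path are borrowed from the QEMU error output quoted elsewhere in this thread.

```python
import json


def passt_argv(socket_path: str, vhost_user: bool) -> list:
    """Sketch of a passt invocation; real invocations carry more options."""
    argv = ["passt", "--socket", socket_path]  # --socket: assumed passt option
    if vhost_user:
        argv.append("--vhost-user")  # the one extra option named in the cover letter
    return argv


def qemu_vhost_user_args(socket_path: str) -> list:
    """Sketch of the QEMU side: a socket chardev plus a vhost-user netdev."""
    netdev = {"type": "vhost-user", "chardev": "charnet0", "id": "hostnet0"}
    return [
        "-chardev", f"socket,id=charnet0,path={socket_path}",
        "-netdev", json.dumps(netdev, separators=(",", ":")),
    ]


path = "/run/libvirt/qemu/passt/1-test-net0.socket"
print(passt_argv(path, vhost_user=True))
print(qemu_vhost_user_args(path))
```

The point of the sketch is only that both sides receive the same socket path; everything else about the interface is configured as for any other vhost-user backend.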
So far this has been tested both unprivileged and privileged on Fedora 40 (with latest passt package) and selinux enabled (there are a couple of selinux policy tweaks that still need to be pushed to passt-selinux) as well as unprivileged on Debian (I *think* with AppArmor enabled) and everything seems to work.
(I haven't gotten to testing hotplug, but it *should* work, and I'll be testing it while (hopefully) someone is reviewing these patches.)
To test, you will need the latest (20250121) passt package and the aforementioned upstream passt-selinux patch if you're using selinux.
This Resolves: https://issues.redhat.com/browse/RHEL-69455
Laine Stump (12):
  conf: change virDomainHostdevInsert() to return void
  qemu: fix qemu validation to forbid guest-side IP address for type='vdpa'
  qemu: validate that model is virtio for vhostuser and vdpa interfaces in the same place
  qemu: automatically set model type='virtio' for interface type='vhostuser'
  qemu: do all vhostuser attribute validation in qemu driver
  conf/qemu: make <source> element *almost* optional for type=vhostuser
  qemu: use switch instead of if in qemuProcessPrepareDomainNetwork()
  qemu: make qemuPasstCreateSocketPath() public
  qemu: complete vhostuser + passt support
  qemu: fail validation if a domain def has vhostuser/passt but no shared mem
  docs: improve type='user' docs to higlight differences between SLIRP and passt
  docs: document using passt backend with <interface type='vhostuser'>
 docs/formatdomain.rst | 189 +++++++++++++-----
 src/conf/domain_conf.c | 107 +++++-----
 src/conf/domain_conf.h | 2 +-
 src/conf/domain_validate.c | 85 +++-----
 src/conf/schemas/domaincommon.rng | 32 ++-
 src/libxl/libxl_domain.c | 5 +-
 src/libxl/libxl_driver.c | 3 +-
 src/lxc/lxc_driver.c | 3 +-
 src/qemu/qemu_command.c | 7 +-
 src/qemu/qemu_driver.c | 3 +-
 src/qemu/qemu_extdevice.c | 6 +-
 src/qemu/qemu_hotplug.c | 21 +-
 src/qemu/qemu_passt.c | 5 +-
 src/qemu/qemu_passt.h | 3 +
 src/qemu/qemu_postparse.c | 3 +-
 src/qemu/qemu_process.c | 85 +++++---
 src/qemu/qemu_validate.c | 65 ++++--
 ...t-user-slirp-portforward.x86_64-latest.err | 2 +-
 ...vhostuser-passt-no-shmem.x86_64-latest.err | 1 +
 .../net-vhostuser-passt-no-shmem.xml | 70 +++++++
 .../net-vhostuser-passt.x86_64-latest.args | 42 ++++
 .../net-vhostuser-passt.x86_64-latest.xml | 75 +++++++
 tests/qemuxmlconfdata/net-vhostuser-passt.xml | 73 +++++++
 tests/qemuxmlconftest.c | 2 +
 24 files changed, 657 insertions(+), 232 deletions(-)
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt-no-shmem.x86_64-latest.err
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt-no-shmem.xml
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.args
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.xml
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.xml

On a Saturday in 2025, Laine Stump wrote:
Oops - sorry, I had recreated my branch so git-publish didn't recognize the series as a v2, so all the subject lines are missing v2 :-/. Hopefully this won't cause too much confusion.
On 2/15/25 12:20 AM, Laine Stump wrote: [...]
Laine Stump (12):
  conf: change virDomainHostdevInsert() to return void
  qemu: fix qemu validation to forbid guest-side IP address for type='vdpa'
  qemu: validate that model is virtio for vhostuser and vdpa interfaces in the same place
  qemu: automatically set model type='virtio' for interface type='vhostuser'
  qemu: do all vhostuser attribute validation in qemu driver
  conf/qemu: make <source> element *almost* optional for type=vhostuser
  qemu: use switch instead of if in qemuProcessPrepareDomainNetwork()
  qemu: make qemuPasstCreateSocketPath() public
Patches 1 through 8 are missing my R-b tag from v1
Jano
  qemu: complete vhostuser + passt support
  qemu: fail validation if a domain def has vhostuser/passt but no shared mem
  docs: improve type='user' docs to higlight differences between SLIRP and passt
  docs: document using passt backend with <interface type='vhostuser'>

On 2/15/25 11:57 PM, Ján Tomko wrote:
On a Saturday in 2025, Laine Stump wrote:
Oops - sorry, I had recreated my branch so git-publish didn't recognize the series as a v2, so all the subject lines are missing v2 :-/. Hopefully this won't cause too much confusion.
On 2/15/25 12:20 AM, Laine Stump wrote: [...]
Laine Stump (12):
  conf: change virDomainHostdevInsert() to return void
  qemu: fix qemu validation to forbid guest-side IP address for type='vdpa'
  qemu: validate that model is virtio for vhostuser and vdpa interfaces in the same place
  qemu: automatically set model type='virtio' for interface type='vhostuser'
  qemu: do all vhostuser attribute validation in qemu driver
  conf/qemu: make <source> element *almost* optional for type=vhostuser
  qemu: use switch instead of if in qemuProcessPrepareDomainNetwork()
  qemu: make qemuPasstCreateSocketPath() public
Patches 1 through 8 are missing my R-b tag from v1
Ooops! I'll be sure to add them all in before pushing. Thanks!
Jano
  qemu: complete vhostuser + passt support
  qemu: fail validation if a domain def has vhostuser/passt but no shared mem
  docs: improve type='user' docs to higlight differences between SLIRP and passt
  docs: document using passt backend with <interface type='vhostuser'>
participants (4)
- Andrea Bolognani
- Ján Tomko
- Laine Stump
- Laine Stump