[libvirt PATCH v5 0/6] Add support for vDPA network devices

vDPA network devices allow high-performance networking in a virtual
machine by providing a wire-speed data path. These devices require a
vendor-specific host driver but the data path follows the virtio
specification.

The support for vDPA devices was recently added to qemu. This allows
libvirt to support these devices. This patchset requires that the device
is configured on the host with the appropriate vendor-specific driver.
This will create a chardev on the host at e.g. /dev/vhost-vdpa-0. That
chardev path can then be used to define a new interface with
type='vdpa'.

Note that in order for hot-unplug to work properly, you may need to apply
a qemu patch[1] for now. Without the patch, qemu will not close the fd
properly and any subsequent attempts to use the vdpa chardev will fail
like this:

virsh # attach-device guest1 vdpa.xml
error: Failed to attach device from vdpa.xml
error: Unable to open '/dev/vhost-vdpa-0' for vdpa device: Device or resource busy

[1] https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg06374.html

Changes in v5:
 - rebased to latest master
 - fixed a case where qemuDomainObjExitMonitor() was not called on an error
   path
 - Improved the nodedev xml. It now includes the path to the chardev in /dev
   - also updated the nodedev xml schema
 - added sample nodedev-dumpxml output to the commit message of patch #6

Jonathon Jongsma (6):
  conf: Add support for vDPA network devices
  qemu: add vhost-vdpa capability
  qemu: add vdpa support
  qemu: add monitor functions for handling file descriptors
  qemu: support hotplug of vdpa devices
  Include vdpa devices in node device list

 docs/formatdomain.rst | 24 +++
 docs/formatnode.html.in | 9 +
 docs/schemas/domaincommon.rng | 15 ++
 docs/schemas/nodedev.rng | 10 +
 include/libvirt/libvirt-nodedev.h | 1 +
 src/conf/domain_conf.c | 31 ++++
 src/conf/domain_conf.h | 4 +
 src/conf/netdev_bandwidth_conf.c | 1 +
 src/conf/node_device_conf.c | 14 ++
 src/conf/node_device_conf.h | 11 +-
 src/conf/virnodedeviceobj.c | 4 +-
 src/libxl/libxl_conf.c | 1 +
 src/libxl/xen_common.c | 1 +
 src/lxc/lxc_controller.c | 1 +
 src/lxc/lxc_driver.c | 3 +
 src/lxc/lxc_process.c | 1 +
 src/node_device/node_device_udev.c | 53 ++++++
 src/qemu/qemu_capabilities.c | 2 +
 src/qemu/qemu_capabilities.h | 1 +
 src/qemu/qemu_command.c | 36 +++-
 src/qemu/qemu_command.h | 3 +-
 src/qemu/qemu_domain.c | 6 +-
 src/qemu/qemu_hotplug.c | 75 +++++++-
 src/qemu/qemu_interface.c | 25 +++
 src/qemu/qemu_interface.h | 2 +
 src/qemu/qemu_migration.c | 10 +-
 src/qemu/qemu_monitor.c | 93 ++++++++++
 src/qemu/qemu_monitor.h | 41 +++++
 src/qemu/qemu_monitor_json.c | 173 ++++++++++++++++++
 src/qemu/qemu_monitor_json.h | 12 ++
 src/qemu/qemu_process.c | 2 +
 src/qemu/qemu_validate.c | 15 ++
 src/vmx/vmx.c | 1 +
 .../caps_5.1.0.x86_64.xml | 1 +
 .../caps_5.2.0.x86_64.xml | 1 +
 tests/qemuhotplugmock.c | 9 +
 tests/qemuhotplugtest.c | 16 ++
 .../qemuhotplug-interface-vdpa.xml | 4 +
 .../qemuhotplug-base-live+interface-vdpa.xml | 57 ++++++
 .../net-vdpa.x86_64-latest.args | 38 ++++
 tests/qemuxml2argvdata/net-vdpa.xml | 28 +++
 tests/qemuxml2argvmock.c | 11 +-
 tests/qemuxml2argvtest.c | 1 +
 tests/qemuxml2xmloutdata/net-vdpa.xml | 34 ++++
 tests/qemuxml2xmltest.c | 1 +
 tools/virsh-domain.c | 1 +
 tools/virsh-nodedev.c | 3 +
 47 files changed, 870 insertions(+), 16 deletions(-)
 create mode 100644 tests/qemuhotplugtestdevices/qemuhotplug-interface-vdpa.xml
 create mode 100644 tests/qemuhotplugtestdomains/qemuhotplug-base-live+interface-vdpa.xml
 create mode 100644 tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
 create mode 100644 tests/qemuxml2argvdata/net-vdpa.xml
 create mode 100644 tests/qemuxml2xmloutdata/net-vdpa.xml

-- 
2.26.2
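
The "Device or resource busy" failure described in the cover letter can be checked on the host before attaching a device. The following is only a minimal sketch, assuming the chardev path from the example above; `probe_vdpa_chardev` is a hypothetical helper that is not part of libvirt or of this series. It opens the chardev read/write, as libvirt does, and reports the errno on failure.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not part of libvirt: returns 0 if the vDPA chardev
 * can be opened read/write, or a negative errno value if it cannot. */
int
probe_vdpa_chardev(const char *path)
{
    int fd = open(path, O_RDWR);

    if (fd < 0) {
        int saved = errno; /* keep errno before calling anything else */

        fprintf(stderr, "Unable to open '%s' for vdpa device: %s\n",
                path, strerror(saved));
        return -saved;
    }

    /* The probe only tests that open() succeeds; release the fd again. */
    close(fd);
    return 0;
}
```

A return value of -EBUSY from such a probe would match the hot-unplug symptom described above, where qemu still holds the fd.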

This patch adds new schema and adds support for parsing and formatting
domain configurations that include vdpa devices.

vDPA network devices allow high-performance networking in a virtual
machine by providing a wire-speed data path. These devices require a
vendor-specific host driver but the data path follows the virtio
specification.

When a device on the host is bound to an appropriate vendor-specific
driver, it will create a chardev on the host at e.g. /dev/vhost-vdpa-0.
That chardev path can then be used to define a new interface with
type='vdpa'.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
---
 docs/formatdomain.rst | 24 ++++++++++++++++++++++++
 docs/schemas/domaincommon.rng | 15 +++++++++++++++
 src/conf/domain_conf.c | 31 +++++++++++++++++++++++++++++++
 src/conf/domain_conf.h | 4 ++++
 src/conf/netdev_bandwidth_conf.c | 1 +
 src/libxl/libxl_conf.c | 1 +
 src/libxl/xen_common.c | 1 +
 src/lxc/lxc_controller.c | 1 +
 src/lxc/lxc_driver.c | 3 +++
 src/lxc/lxc_process.c | 1 +
 src/qemu/qemu_command.c | 3 +++
 src/qemu/qemu_domain.c | 4 +++-
 src/qemu/qemu_hotplug.c | 3 +++
 src/qemu/qemu_interface.c | 2 ++
 src/qemu/qemu_process.c | 2 ++
 src/qemu/qemu_validate.c | 1 +
 src/vmx/vmx.c | 1 +
 tools/virsh-domain.c | 1 +
 18 files changed, 98 insertions(+), 1 deletion(-)

diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
index 83dec62f30..f4e4bf7fe7 100644
--- a/docs/formatdomain.rst
+++ b/docs/formatdomain.rst
@@ -4644,6 +4644,30 @@ or stopping the guest.
    </devices>
    ...
 
+:anchor:`<a id="elementsNICSVDPA"/>`
+
+vDPA devices
+^^^^^^^^^^^^
+
+A vDPA network device can be used to provide wire speed network performance
+within a domain. A vDPA device is a specialized type of network device that
+uses a datapath that complies with the virtio specification but has a
+vendor-specific control path. To use such a device with libvirt, the host
+device must already be bound to the appropriate device-specific vDPA driver.
+This creates a vDPA char device (e.g. /dev/vhost-vdpa-0) that can be used to
+assign the device to a libvirt domain. :since:`Since 6.9.0 (QEMU only,
+requires QEMU 5.1.0 or newer)`
+
+::
+
+   ...
+   <devices>
+     <interface type='vdpa'>
+       <source dev='/dev/vhost-vdpa-0'/>
+     </interface>
+   </devices>
+   ...
+
 :anchor:`<a id="elementsTeaming"/>`
 
 Teaming a virtio/hostdev NIC pair
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 0a0f0ed8a8..45193feb68 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -3117,6 +3117,21 @@
             <ref name="interface-options"/>
           </interleave>
         </group>
+
+        <group>
+          <attribute name="type">
+            <value>vdpa</value>
+          </attribute>
+          <interleave>
+            <element name="source">
+              <attribute name="dev">
+                <ref name="deviceName"/>
+              </attribute>
+            </element>
+            <ref name="interface-options"/>
+          </interleave>
+        </group>
+
       </choice>
       <optional>
         <attribute name="trustGuestRxFilters">
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 1d3661c21f..518c9ca1c2 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -554,6 +554,7 @@ VIR_ENUM_IMPL(virDomainNet,
               "direct",
               "hostdev",
               "udp",
+              "vdpa",
 );
 
 VIR_ENUM_IMPL(virDomainNetModel,
@@ -2495,6 +2496,10 @@ virDomainNetDefFree(virDomainNetDefPtr def)
         def->data.vhostuser = NULL;
         break;
 
+    case VIR_DOMAIN_NET_TYPE_VDPA:
+        VIR_FREE(def->data.vdpa.devicepath);
+        break;
+
     case VIR_DOMAIN_NET_TYPE_SERVER:
     case VIR_DOMAIN_NET_TYPE_CLIENT:
     case VIR_DOMAIN_NET_TYPE_MCAST:
@@ -12095,6 +12100,10 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
                 if (virDomainChrSourceReconnectDefParseXML(&reconnect,
                                                            cur, ctxt) < 0)
                     goto error;
+            } else if (!dev
+                       && def->type == VIR_DOMAIN_NET_TYPE_VDPA
+                       && virXMLNodeNameEqual(cur, "source")) {
+                dev = virXMLPropString(cur, "dev");
             } else if (!def->virtPortProfile
                        && virXMLNodeNameEqual(cur, "virtualport")) {
                 if (def->type == VIR_DOMAIN_NET_TYPE_NETWORK) {
@@ -12352,6 +12361,16 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
         }
         break;
 
+    case VIR_DOMAIN_NET_TYPE_VDPA:
+        if (dev == NULL) {
+            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                           _("No <source> 'dev' attribute "
+                             "specified with <interface type='vdpa'/>"));
+            goto error;
+        }
+        def->data.vdpa.devicepath = g_steal_pointer(&dev);
+        break;
+
     case VIR_DOMAIN_NET_TYPE_BRIDGE:
         if (bridge == NULL) {
             virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
@@ -12741,6 +12760,7 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
     case VIR_DOMAIN_NET_TYPE_DIRECT:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
         break;
     case VIR_DOMAIN_NET_TYPE_LAST:
     default:
@@ -26974,6 +26994,14 @@ virDomainNetDefFormat(virBufferPtr buf,
             }
             break;
 
+        case VIR_DOMAIN_NET_TYPE_VDPA:
+            if (def->data.vdpa.devicepath) {
+                virBufferEscapeString(buf, "<source dev='%s'",
+                                      def->data.vdpa.devicepath);
+                sourceLines++;
+            }
+            break;
+
         case VIR_DOMAIN_NET_TYPE_USER:
         case VIR_DOMAIN_NET_TYPE_LAST:
             break;
@@ -31191,6 +31219,7 @@ virDomainNetGetActualVirtPortProfile(const virDomainNetDef *iface)
     case VIR_DOMAIN_NET_TYPE_MCAST:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
     default:
         return NULL;
@@ -32022,6 +32051,7 @@ virDomainNetTypeSharesHostView(const virDomainNetDef *net)
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         break;
     }
@@ -32283,6 +32313,7 @@ virDomainNetDefActualToNetworkPort(virDomainDefPtr dom,
     case VIR_DOMAIN_NET_TYPE_UDP:
     case VIR_DOMAIN_NET_TYPE_USER:
     case VIR_DOMAIN_NET_TYPE_VHOSTUSER:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                        _("Unexpected network port type %s"),
                        virDomainNetTypeToString(virDomainNetGetActualType(iface)));
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 902dd58112..8b663d7623 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -883,6 +883,7 @@ typedef enum {
     VIR_DOMAIN_NET_TYPE_DIRECT,
     VIR_DOMAIN_NET_TYPE_HOSTDEV,
     VIR_DOMAIN_NET_TYPE_UDP,
+    VIR_DOMAIN_NET_TYPE_VDPA,
 
     VIR_DOMAIN_NET_TYPE_LAST
 } virDomainNetType;
@@ -1056,6 +1057,9 @@ struct _virDomainNetDef {
              */
             virDomainActualNetDefPtr actual;
         } network;
+        struct {
+            char *devicepath;
+        } vdpa;
         struct {
             char *brname;
         } bridge;
diff --git a/src/conf/netdev_bandwidth_conf.c b/src/conf/netdev_bandwidth_conf.c
index 831ee036ac..4fb7aa4e3d 100644
--- a/src/conf/netdev_bandwidth_conf.c
+++ b/src/conf/netdev_bandwidth_conf.c
@@ -312,6 +312,7 @@ bool virNetDevSupportsBandwidth(virDomainNetType type)
     case VIR_DOMAIN_NET_TYPE_UDP:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         break;
     }
diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
index 03ec37d6c5..43d23565f1 100644
--- a/src/libxl/libxl_conf.c
+++ b/src/libxl/libxl_conf.c
@@ -1386,6 +1386,7 @@ libxlMakeNic(virDomainDefPtr def,
         case VIR_DOMAIN_NET_TYPE_INTERNAL:
         case VIR_DOMAIN_NET_TYPE_DIRECT:
         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+        case VIR_DOMAIN_NET_TYPE_VDPA:
         case VIR_DOMAIN_NET_TYPE_LAST:
             virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                     _("unsupported interface type %s"),
diff --git a/src/libxl/xen_common.c b/src/libxl/xen_common.c
index 7b6a7b6e9f..c82e487d80 100644
--- a/src/libxl/xen_common.c
+++ b/src/libxl/xen_common.c
@@ -1759,6 +1759,7 @@ xenFormatNet(virConnectPtr conn,
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
     case VIR_DOMAIN_NET_TYPE_UDP:
     case VIR_DOMAIN_NET_TYPE_USER:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("Unsupported net type '%s'"),
                        virDomainNetTypeToString(net->type));
         return -1;
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c
index e6dee85ec7..4f77a6ace8 100644
--- a/src/lxc/lxc_controller.c
+++ b/src/lxc/lxc_controller.c
@@ -422,6 +422,7 @@ static int virLXCControllerGetNICIndexes(virLXCControllerPtr ctrl)
         case VIR_DOMAIN_NET_TYPE_UDP:
         case VIR_DOMAIN_NET_TYPE_INTERNAL:
         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+        case VIR_DOMAIN_NET_TYPE_VDPA:
             virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                            _("Unsupported net type %s"),
                            virDomainNetTypeToString(actualType));
diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c
index ec3cb60a78..a6905b5a54 100644
--- a/src/lxc/lxc_driver.c
+++ b/src/lxc/lxc_driver.c
@@ -3504,6 +3504,7 @@ lxcDomainAttachDeviceNetLive(virLXCDriverPtr driver,
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                        _("Network device type is not supported"));
         goto cleanup;
@@ -3558,6 +3559,7 @@ lxcDomainAttachDeviceNetLive(virLXCDriverPtr driver,
         case VIR_DOMAIN_NET_TYPE_INTERNAL:
         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
         case VIR_DOMAIN_NET_TYPE_UDP:
+        case VIR_DOMAIN_NET_TYPE_VDPA:
         case VIR_DOMAIN_NET_TYPE_LAST:
         default:
             /* no-op */
@@ -3999,6 +4001,7 @@ lxcDomainDetachDeviceNetLive(virDomainObjPtr vm,
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                        _("Only bridged veth devices can be detached"));
         goto cleanup;
diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c
index e392d98f5d..c5a710fc3f 100644
--- a/src/lxc/lxc_process.c
+++ b/src/lxc/lxc_process.c
@@ -607,6 +607,7 @@ virLXCProcessSetupInterfaces(virLXCDriverPtr driver,
         case VIR_DOMAIN_NET_TYPE_INTERNAL:
         case VIR_DOMAIN_NET_TYPE_LAST:
         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+        case VIR_DOMAIN_NET_TYPE_VDPA:
             virReportError(VIR_ERR_INTERNAL_ERROR,
                            _("Unsupported network type %s"),
                            virDomainNetTypeToString(type));
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 697a2db62b..91fff432a1 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -3699,6 +3699,7 @@ qemuBuildHostNetStr(virDomainNetDefPtr net,
             return NULL;
         break;
 
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
         /* Should have been handled earlier via PCI/USB hotplug code. */
     case VIR_DOMAIN_NET_TYPE_LAST:
@@ -8208,6 +8209,7 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
     case VIR_DOMAIN_NET_TYPE_MCAST:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         /* nada */
         break;
@@ -8244,6 +8246,7 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
     case VIR_DOMAIN_NET_TYPE_UDP:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         /* These types don't use a network device on the host, but
          * instead use some other type of connection to the emulated
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 5e603284be..0ad8007962 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -5120,7 +5120,8 @@ qemuDomainDeviceNetDefPostParse(virDomainNetDefPtr net,
                                 const virDomainDef *def,
                                 virQEMUCapsPtr qemuCaps)
 {
-    if (net->type != VIR_DOMAIN_NET_TYPE_HOSTDEV &&
+    if (net->type != VIR_DOMAIN_NET_TYPE_VDPA &&
+        net->type != VIR_DOMAIN_NET_TYPE_HOSTDEV &&
         !virDomainNetGetModelString(net) &&
         virDomainNetResolveActualType(net) != VIR_DOMAIN_NET_TYPE_HOSTDEV)
         net->model = qemuDomainDefaultNetModel(def, qemuCaps);
@@ -9313,6 +9314,7 @@ qemuDomainNetSupportsMTU(virDomainNetType type)
     case VIR_DOMAIN_NET_TYPE_DIRECT:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         break;
     }
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index 2c184b9ba0..7bbf28ea6a 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -1339,6 +1339,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver,
     case VIR_DOMAIN_NET_TYPE_MCAST:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
                        _("hotplug of interface type of %s is not implemented yet"),
@@ -3390,6 +3391,7 @@ qemuDomainChangeNetFilter(virDomainObjPtr vm,
     case VIR_DOMAIN_NET_TYPE_DIRECT:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                        _("filters not supported on interfaces of type %s"),
                        virDomainNetTypeToString(virDomainNetGetActualType(newdev)));
@@ -3727,6 +3729,7 @@ qemuDomainChangeNet(virQEMUDriverPtr driver,
 
         case VIR_DOMAIN_NET_TYPE_VHOSTUSER:
         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+        case VIR_DOMAIN_NET_TYPE_VDPA:
             virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
                            _("unable to change config on '%s' network type"),
                            virDomainNetTypeToString(newdev->type));
diff --git a/src/qemu/qemu_interface.c b/src/qemu/qemu_interface.c
index cbf3d99981..b24f9060a9 100644
--- a/src/qemu/qemu_interface.c
+++ b/src/qemu/qemu_interface.c
@@ -118,6 +118,7 @@ qemuInterfaceStartDevice(virDomainNetDefPtr net)
     case VIR_DOMAIN_NET_TYPE_UDP:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         /* these types all require no action */
         break;
@@ -203,6 +204,7 @@ qemuInterfaceStopDevice(virDomainNetDefPtr net)
     case VIR_DOMAIN_NET_TYPE_UDP:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         /* these types all require no action */
         break;
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 5bc76a75e3..423e1ffa60 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3347,6 +3347,7 @@ qemuProcessNotifyNets(virDomainDefPtr def)
         case VIR_DOMAIN_NET_TYPE_INTERNAL:
         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
         case VIR_DOMAIN_NET_TYPE_UDP:
+        case VIR_DOMAIN_NET_TYPE_VDPA:
         case VIR_DOMAIN_NET_TYPE_LAST:
             break;
         }
@@ -7578,6 +7579,7 @@ void qemuProcessStop(virQEMUDriverPtr driver,
         case VIR_DOMAIN_NET_TYPE_INTERNAL:
         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
         case VIR_DOMAIN_NET_TYPE_UDP:
+        case VIR_DOMAIN_NET_TYPE_VDPA:
         case VIR_DOMAIN_NET_TYPE_LAST:
             /* No special cleanup procedure for these types. */
             break;
diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c
index bc3043bb3f..f5c07f1521 100644
--- a/src/qemu/qemu_validate.c
+++ b/src/qemu/qemu_validate.c
@@ -1145,6 +1145,7 @@ qemuValidateNetSupportsCoalesce(virDomainNetType type)
     case VIR_DOMAIN_NET_TYPE_MCAST:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         break;
     }
diff --git a/src/vmx/vmx.c b/src/vmx/vmx.c
index e0777a9ddd..b7794540fe 100644
--- a/src/vmx/vmx.c
+++ b/src/vmx/vmx.c
@@ -3810,6 +3810,7 @@ virVMXFormatEthernet(virDomainNetDefPtr def, int controller,
     case VIR_DOMAIN_NET_TYPE_DIRECT:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("Unsupported net type '%s'"),
                        virDomainNetTypeToString(def->type));
         return -1;
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
index 8f11393197..01b4cfda4e 100644
--- a/tools/virsh-domain.c
+++ b/tools/virsh-domain.c
@@ -1007,6 +1007,7 @@ cmdAttachInterface(vshControl *ctl, const vshCmd *cmd)
     case VIR_DOMAIN_NET_TYPE_CLIENT:
     case VIR_DOMAIN_NET_TYPE_MCAST:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_LAST:
         vshError(ctl, _("No support for %s in command 'attach-interface'"),
-- 
2.26.2
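
The parse and format behaviour added to domain_conf.c above can be illustrated outside of libvirt. This is only a sketch under stated assumptions: `struct vdpa_net_def` and both functions are hypothetical stand-ins, plain libc replaces virXMLPropString() and virBufferEscapeString(), and no XML escaping is performed. Unlike the patch, which emits the opening of the element and lets the formatter close it later, the sketch writes the whole `<source>` element at once.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical stand-in for the 'vdpa' member of
 * virDomainNetDef: the only state is the chardev path. */
struct vdpa_net_def {
    char *devicepath;
};

/* Mirrors the parse-time rule: a missing 'dev' attribute is an error.
 * Returns 0 on success, -1 on a missing attribute or allocation failure. */
int
vdpa_net_def_set_source(struct vdpa_net_def *def, const char *dev)
{
    size_t len;
    char *copy;

    if (dev == NULL)
        return -1;

    len = strlen(dev) + 1;
    if (!(copy = malloc(len)))
        return -1;
    memcpy(copy, dev, len);

    free(def->devicepath);
    def->devicepath = copy;
    return 0;
}

/* Formats the <source> element into buf. Returns the number of characters
 * written (0 if no path is set), or -1 if buf is too small. */
int
vdpa_net_def_format_source(const struct vdpa_net_def *def,
                           char *buf, size_t buflen)
{
    int n;

    if (!def->devicepath) {
        if (buflen > 0)
            buf[0] = '\0';
        return 0;
    }

    n = snprintf(buf, buflen, "<source dev='%s'/>", def->devicepath);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}
```

Setting the source to the path from the documentation example and formatting it again should reproduce the `<source>` line shown in docs/formatdomain.rst.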

Recent versions of qemu added the -netdev vhost-vdpa device. This
capability allows libvirt to know whether this is supported.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
---
 src/qemu/qemu_capabilities.c | 2 ++
 src/qemu/qemu_capabilities.h | 1 +
 tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml | 1 +
 tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml | 1 +
 4 files changed, 5 insertions(+)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 81d9ecd886..66ceb8c868 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -601,6 +601,7 @@ VIR_ENUM_IMPL(virQEMUCaps,
               /* 380 */
               "usb-host.hostdevice",
               "virtio-balloon.free-page-reporting",
+              "netdev.vhost-vdpa",
     );
 
 
@@ -1533,6 +1534,7 @@ static struct virQEMUCapsStringFlags virQEMUCapsQMPSchemaQueries[] = {
     { "migrate-set-parameters/arg-type/downtime-limit", QEMU_CAPS_MIGRATION_PARAM_DOWNTIME },
     { "migrate-set-parameters/arg-type/xbzrle-cache-size", QEMU_CAPS_MIGRATION_PARAM_XBZRLE_CACHE_SIZE },
     { "set-numa-node/arg-type/+hmat-lb", QEMU_CAPS_NUMA_HMAT },
+    { "netdev_add/arg-type/+vhost-vdpa", QEMU_CAPS_NETDEV_VHOST_VDPA },
 };
 
 typedef struct _virQEMUCapsObjectTypeProps virQEMUCapsObjectTypeProps;
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index 44c45589f0..ad558ff3cb 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -581,6 +581,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */
     /* 380 */
     QEMU_CAPS_USB_HOST_HOSTDEVICE, /* -device usb-host.hostdevice */
     QEMU_CAPS_VIRTIO_BALLOON_FREE_PAGE_REPORTING, /*virtio balloon free-page-reporting */
+    QEMU_CAPS_NETDEV_VHOST_VDPA, /* -netdev vhost-vdpa*/
 
     QEMU_CAPS_LAST /* this must always be the last item */
 } virQEMUCapsFlags;
diff --git a/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml
index 9ebd7ea582..ac9e258b25 100644
--- a/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml
@@ -244,6 +244,7 @@
   <flag name='blockdev-hostdev-scsi'/>
   <flag name='usb-host.hostdevice'/>
   <flag name='virtio-balloon.free-page-reporting'/>
+  <flag name='netdev.vhost-vdpa'/>
   <version>5001000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>43100242</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml
index 52b6a47004..4396898bc1 100644
--- a/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml
@@ -244,6 +244,7 @@
   <flag name='blockdev-hostdev-scsi'/>
   <flag name='usb-host.hostdevice'/>
   <flag name='virtio-balloon.free-page-reporting'/>
+  <flag name='netdev.vhost-vdpa'/>
   <version>5001050</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>43100243</microcodeVersion>
-- 
2.26.2
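
The table entry added above pairs a QMP schema query string with a capability flag. The following sketch shows that lookup pattern in isolation; it is not libvirt's implementation. All `sketch_` names are hypothetical, a plain bitmask stands in for the capability bitmap, and the schema is replaced by a fake predicate that only knows about vhost-vdpa.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced capability set for illustration only. */
enum sketch_caps {
    SKETCH_CAPS_NUMA_HMAT,
    SKETCH_CAPS_NETDEV_VHOST_VDPA,
    SKETCH_CAPS_LAST
};

struct sketch_caps_string_flag {
    const char *query;      /* schema query string, as in the patch */
    enum sketch_caps flag;  /* capability set when the query matches */
};

static const struct sketch_caps_string_flag sketch_schema_queries[] = {
    { "set-numa-node/arg-type/+hmat-lb", SKETCH_CAPS_NUMA_HMAT },
    { "netdev_add/arg-type/+vhost-vdpa", SKETCH_CAPS_NETDEV_VHOST_VDPA },
};

/* Fake schema: pretends the emulator only advertises vhost-vdpa. */
int
sketch_fake_schema_has(const char *query)
{
    return strcmp(query, "netdev_add/arg-type/+vhost-vdpa") == 0;
}

/* Walks the table and sets one bit per query the schema reports. */
unsigned int
sketch_probe_caps(int (*schema_has)(const char *query))
{
    unsigned int caps = 0;
    size_t i;

    for (i = 0; i < sizeof(sketch_schema_queries) / sizeof(sketch_schema_queries[0]); i++) {
        if (schema_has(sketch_schema_queries[i].query))
            caps |= 1u << sketch_schema_queries[i].flag;
    }
    return caps;
}

/* Returns 1 if the capability bit is set, 0 otherwise. */
int
sketch_caps_get(unsigned int caps, enum sketch_caps flag)
{
    return (caps >> flag) & 1u;
}
```

Driving the feature from a schema query rather than a version number means a backported or disabled vhost-vdpa netdev is detected correctly, which is presumably why the patch adds a table entry instead of a version check.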

Enable <interface type='vdpa'> for qemu domains. This provides basic support and does not support hotplug or migration. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/qemu/qemu_command.c | 35 +++++++++++++++-- src/qemu/qemu_command.h | 3 +- src/qemu/qemu_domain.c | 6 ++- src/qemu/qemu_hotplug.c | 14 ++++--- src/qemu/qemu_interface.c | 23 +++++++++++ src/qemu/qemu_interface.h | 2 + src/qemu/qemu_migration.c | 10 ++++- src/qemu/qemu_validate.c | 14 +++++++ .../net-vdpa.x86_64-latest.args | 38 +++++++++++++++++++ tests/qemuxml2argvdata/net-vdpa.xml | 28 ++++++++++++++ tests/qemuxml2argvmock.c | 11 +++++- tests/qemuxml2argvtest.c | 1 + tests/qemuxml2xmloutdata/net-vdpa.xml | 34 +++++++++++++++++ tests/qemuxml2xmltest.c | 1 + 14 files changed, 206 insertions(+), 14 deletions(-) create mode 100644 tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args create mode 100644 tests/qemuxml2argvdata/net-vdpa.xml create mode 100644 tests/qemuxml2xmloutdata/net-vdpa.xml diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 91fff432a1..d15080f6de 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3561,7 +3561,8 @@ qemuBuildHostNetStr(virDomainNetDefPtr net, size_t tapfdSize, char **vhostfd, size_t vhostfdSize, - const char *slirpfd) + const char *slirpfd, + const char *vdpadev) { bool is_tap = false; virDomainNetType netType = virDomainNetGetActualType(net); @@ -3700,6 +3701,12 @@ qemuBuildHostNetStr(virDomainNetDefPtr net, break; case VIR_DOMAIN_NET_TYPE_VDPA: + /* Caller will pass the fd to qemu with add-fd */ + if (virJSONValueObjectCreate(&netprops, "s:type", "vhost-vdpa", NULL) < 0 || + virJSONValueObjectAppendString(netprops, "vhostdev", vdpadev) < 0) + return NULL; + break; + case VIR_DOMAIN_NET_TYPE_HOSTDEV: /* Should have been handled earlier via PCI/USB hotplug code. 
*/ case VIR_DOMAIN_NET_TYPE_LAST: @@ -8121,6 +8128,8 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver, char **tapfdName = NULL; char **vhostfdName = NULL; g_autofree char *slirpfdName = NULL; + g_autofree char *vdpafdName = NULL; + int vdpafd = -1; virDomainNetType actualType = virDomainNetGetActualType(net); const virNetDevBandwidth *actualBandwidth; bool requireNicdev = false; @@ -8203,13 +8212,17 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver, break; + case VIR_DOMAIN_NET_TYPE_VDPA: + if ((vdpafd = qemuInterfaceVDPAConnect(net)) < 0) + goto cleanup; + break; + case VIR_DOMAIN_NET_TYPE_USER: case VIR_DOMAIN_NET_TYPE_SERVER: case VIR_DOMAIN_NET_TYPE_CLIENT: case VIR_DOMAIN_NET_TYPE_MCAST: case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_UDP: - case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: /* nada */ break; @@ -8327,13 +8340,29 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver, vhostfd[i] = -1; } + if (vdpafd > 0) { + g_autofree char *fdset = NULL; + g_autofree char *addfdarg = NULL; + + virCommandPassFD(cmd, vdpafd, VIR_COMMAND_PASS_FD_CLOSE_PARENT); + fdset = qemuVirCommandGetFDSet(cmd, vdpafd); + if (!fdset) + goto cleanup; + vdpafdName = qemuVirCommandGetDevSet(cmd, vdpafd); + /* set opaque to the devicepath so that we can look up the fdset later + * if necessary */ + addfdarg = g_strdup_printf("%s,opaque=%s", fdset, + net->data.vdpa.devicepath); + virCommandAddArgList(cmd, "-add-fd", addfdarg, NULL); + } + if (chardev) virCommandAddArgList(cmd, "-chardev", chardev, NULL); if (!(hostnetprops = qemuBuildHostNetStr(net, tapfdName, tapfdSize, vhostfdName, vhostfdSize, - slirpfdName))) + slirpfdName, vdpafdName))) goto cleanup; if (!(host = virQEMUBuildNetdevCommandlineFromJSON(hostnetprops, diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index 8a30f2852c..cabfedd6ba 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -99,7 +99,8 @@ virJSONValuePtr 
qemuBuildHostNetStr(virDomainNetDefPtr net, size_t tapfdSize, char **vhostfd, size_t vhostfdSize, - const char *slirpfd); + const char *slirpfd, + const char *vdpadev); /* Current, best practice */ char *qemuBuildNicDevStr(virDomainDefPtr def, diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 0ad8007962..f2a32e97d4 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -5120,8 +5120,10 @@ qemuDomainDeviceNetDefPostParse(virDomainNetDefPtr net, const virDomainDef *def, virQEMUCapsPtr qemuCaps) { - if (net->type != VIR_DOMAIN_NET_TYPE_VDPA && - net->type != VIR_DOMAIN_NET_TYPE_HOSTDEV && + if (net->type == VIR_DOMAIN_NET_TYPE_VDPA && + !virDomainNetGetModelString(net)) + net->model = VIR_DOMAIN_NET_MODEL_VIRTIO; + else if (net->type != VIR_DOMAIN_NET_TYPE_HOSTDEV && !virDomainNetGetModelString(net) && virDomainNetResolveActualType(net) != VIR_DOMAIN_NET_TYPE_HOSTDEV) net->model = qemuDomainDefaultNetModel(def, qemuCaps); diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 7bbf28ea6a..6864e8b47a 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -1389,7 +1389,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, if (!(netprops = qemuBuildHostNetStr(net, tapfdName, tapfdSize, vhostfdName, vhostfdSize, - slirpfdName))) + slirpfdName, NULL))) goto cleanup; qemuDomainObjEnterMonitor(driver, vm); @@ -3485,10 +3485,11 @@ qemuDomainChangeNet(virQEMUDriverPtr driver, olddev = *devslot; oldType = virDomainNetGetActualType(olddev); - if (oldType == VIR_DOMAIN_NET_TYPE_HOSTDEV) { - /* no changes are possible to a type='hostdev' interface */ + if (oldType == VIR_DOMAIN_NET_TYPE_HOSTDEV || + oldType == VIR_DOMAIN_NET_TYPE_VDPA) { + /* no changes are possible to a type='hostdev' or type='vdpa' interface */ virReportError(VIR_ERR_OPERATION_UNSUPPORTED, - _("cannot change config of '%s' network type"), + _("cannot change config of '%s' network interface type"), virDomainNetTypeToString(oldType)); goto cleanup; } @@ 
-3673,8 +3674,9 @@ qemuDomainChangeNet(virQEMUDriverPtr driver, newType = virDomainNetGetActualType(newdev); - if (newType == VIR_DOMAIN_NET_TYPE_HOSTDEV) { - /* can't turn it into a type='hostdev' interface */ + if (newType == VIR_DOMAIN_NET_TYPE_HOSTDEV || + newType == VIR_DOMAIN_NET_TYPE_VDPA) { + /* can't turn it into a type='hostdev' or type='vdpa' interface */ virReportError(VIR_ERR_OPERATION_UNSUPPORTED, _("cannot change network interface type to '%s'"), virDomainNetTypeToString(newType)); diff --git a/src/qemu/qemu_interface.c b/src/qemu/qemu_interface.c index b24f9060a9..3714828fe1 100644 --- a/src/qemu/qemu_interface.c +++ b/src/qemu/qemu_interface.c @@ -638,6 +638,29 @@ qemuInterfaceBridgeConnect(virDomainDefPtr def, } +/* qemuInterfaceVDPAConnect: + * @net: pointer to the VM's interface description + * + * returns: file descriptor of the vdpa device + * + * Called *only* called if actualType is VIR_DOMAIN_NET_TYPE_VDPA + */ +int +qemuInterfaceVDPAConnect(virDomainNetDefPtr net) +{ + int fd; + + if ((fd = open(net->data.vdpa.devicepath, O_RDWR)) < 0) { + virReportSystemError(errno, + _("Unable to open '%s' for vdpa device"), + net->data.vdpa.devicepath); + return -1; + } + + return fd; +} + + qemuSlirpPtr qemuInterfacePrepareSlirp(virQEMUDriverPtr driver, virDomainNetDefPtr net) diff --git a/src/qemu/qemu_interface.h b/src/qemu/qemu_interface.h index 3dcefc6a12..1ba24f0a6f 100644 --- a/src/qemu/qemu_interface.h +++ b/src/qemu/qemu_interface.h @@ -58,3 +58,5 @@ int qemuInterfaceOpenVhostNet(virDomainDefPtr def, qemuSlirpPtr qemuInterfacePrepareSlirp(virQEMUDriverPtr driver, virDomainNetDefPtr net); + +int qemuInterfaceVDPAConnect(virDomainNetDefPtr net) G_GNUC_NO_INLINE; diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 4e959abebf..b5d99ad79d 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1377,7 +1377,15 @@ qemuMigrationSrcIsAllowed(virQEMUDriverPtr driver, for (i = 0; i < vm->def->nnets; i++) { 
virDomainNetDefPtr net = vm->def->nets[i]; - qemuSlirpPtr slirp = QEMU_DOMAIN_NETWORK_PRIVATE(net)->slirp; + qemuSlirpPtr slirp; + + if (net->type == VIR_DOMAIN_NET_TYPE_VDPA) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("vDPA devices cannot be migrated")); + return false; + } + + slirp = QEMU_DOMAIN_NETWORK_PRIVATE(net)->slirp; if (slirp && !qemuSlirpHasFeature(slirp, QEMU_SLIRP_FEATURE_MIGRATE)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c index f5c07f1521..1eb01714eb 100644 --- a/src/qemu/qemu_validate.c +++ b/src/qemu/qemu_validate.c @@ -1245,6 +1245,20 @@ qemuValidateDomainDeviceDefNetwork(const virDomainNetDef *net, } } } + } else if (net->type == VIR_DOMAIN_NET_TYPE_VDPA) { + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_NETDEV_VHOST_VDPA)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("vDPA devices are not supported with this QEMU binary")); + return -1; + } + + if (net->model != VIR_DOMAIN_NET_MODEL_VIRTIO) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("invalid model for interface of type '%s': '%s'"), + virDomainNetTypeToString(net->type), + virDomainNetModelTypeToString(net->model)); + return -1; + } } else if (net->guestIP.nroutes || net->guestIP.nips) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Invalid attempt to set network interface " diff --git a/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args new file mode 100644 index 0000000000..002ec498a0 --- /dev/null +++ b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args @@ -0,0 +1,38 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/tmp/lib/domain--1-QEMUGuest1 \ +USER=test \ +LOGNAME=test \ +XDG_DATA_HOME=/tmp/lib/domain--1-QEMUGuest1/.local/share \ +XDG_CACHE_HOME=/tmp/lib/domain--1-QEMUGuest1/.cache \ +XDG_CONFIG_HOME=/tmp/lib/domain--1-QEMUGuest1/.config \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-i386 \ +-name guest=QEMUGuest1,debug-threads=on \ +-S \ +-object 
secret,id=masterKey0,format=raw,\ +file=/tmp/lib/domain--1-QEMUGuest1/master-key.aes \ +-machine pc,accel=tcg,usb=off,dump-guest-core=off,memory-backend=pc.ram \ +-cpu qemu64 \ +-m 214 \ +-object memory-backend-ram,id=pc.ram,size=224395264 \ +-overcommit mem-lock=off \ +-smp 1,sockets=1,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,fd=1729,server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-no-acpi \ +-boot strict=on \ +-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \ +-add-fd set=0,fd=1732,opaque=/dev/vhost-vdpa-0 \ +-netdev vhost-vdpa,vhostdev=/dev/fdset/0,id=hostnet0 \ +-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:95:db:c0,bus=pci.0,\ +addr=0x2 \ +-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,\ +resourcecontrol=deny \ +-msg timestamp=on diff --git a/tests/qemuxml2argvdata/net-vdpa.xml b/tests/qemuxml2argvdata/net-vdpa.xml new file mode 100644 index 0000000000..30cca7eb6e --- /dev/null +++ b/tests/qemuxml2argvdata/net-vdpa.xml @@ -0,0 +1,28 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i386</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <interface type='vdpa'> + <mac address='52:54:00:95:db:c0'/> + <source dev='/dev/vhost-vdpa-0'/> + </interface> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='none'/> + </devices> 
+</domain> diff --git a/tests/qemuxml2argvmock.c b/tests/qemuxml2argvmock.c index b9322f4f2a..6ea1db8e2f 100644 --- a/tests/qemuxml2argvmock.c +++ b/tests/qemuxml2argvmock.c @@ -208,7 +208,7 @@ virHostGetDRMRenderNode(void) static void (*real_virCommandPassFD)(virCommandPtr cmd, int fd, unsigned int flags); -static const int testCommandPassSafeFDs[] = { 1730, 1731 }; +static const int testCommandPassSafeFDs[] = { 1730, 1731, 1732 }; void virCommandPassFD(virCommandPtr cmd, @@ -294,3 +294,12 @@ virNetDevSetRootQDisc(const char *ifname G_GNUC_UNUSED, { return 0; } + + +int +qemuInterfaceVDPAConnect(virDomainNetDefPtr net G_GNUC_UNUSED) +{ + if (fcntl(1732, F_GETFD) != -1) + abort(); + return 1732; +} diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 8aa791d9f7..23d89718af 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1469,6 +1469,7 @@ mymain(void) QEMU_CAPS_DEVICE_VFIO_PCI); DO_TEST_FAILURE("net-hostdev-fail", QEMU_CAPS_DEVICE_VFIO_PCI); + DO_TEST_CAPS_LATEST("net-vdpa"); DO_TEST("hostdev-pci-multifunction", QEMU_CAPS_KVM, diff --git a/tests/qemuxml2xmloutdata/net-vdpa.xml b/tests/qemuxml2xmloutdata/net-vdpa.xml new file mode 100644 index 0000000000..b362405c14 --- /dev/null +++ b/tests/qemuxml2xmloutdata/net-vdpa.xml @@ -0,0 +1,34 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i386</emulator> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='ide' index='0'> + <address type='pci' domain='0x0000' 
bus='0x00' slot='0x01' function='0x1'/> + </controller> + <controller type='pci' index='0' model='pci-root'/> + <interface type='vdpa'> + <mac address='52:54:00:95:db:c0'/> + <source dev='/dev/vhost-vdpa-0'/> + <model type='virtio'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </interface> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 2bf8dd5b14..1ce21f0519 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -497,6 +497,7 @@ mymain(void) DO_TEST("net-mtu", NONE); DO_TEST("net-coalesce", NONE); DO_TEST("net-many-models", NONE); + DO_TEST("net-vdpa", QEMU_CAPS_NETDEV_VHOST_VDPA); DO_TEST("serial-tcp-tlsx509-chardev", NONE); DO_TEST("serial-tcp-tlsx509-chardev-notls", NONE); -- 2.26.2
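
For readers comparing the test XML with the expected .args output above: the vDPA chardev fd is not passed to QEMU directly. It is registered in an fdset with -add-fd and then referenced by path in -netdev. A minimal sketch of how the two argument values relate, assuming hypothetical helper names (format_add_fd_arg and format_vdpa_netdev_arg are illustrative only, not libvirt functions):

```c
#include <stdio.h>

/* Hypothetical helper: builds the value passed to "-add-fd".
 * The opaque field records the device path so that the fdset can be
 * identified again later. Returns 0 on success, -1 if buf is too small. */
static int
format_add_fd_arg(char *buf, size_t len, int fdset, int fd, const char *devpath)
{
    int n = snprintf(buf, len, "set=%d,fd=%d,opaque=%s", fdset, fd, devpath);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: builds the value passed to "-netdev".
 * vhost-vdpa takes a path, so the already-open fd is referenced
 * through /dev/fdset/<id>. Returns 0 on success, -1 if buf is too small. */
static int
format_vdpa_netdev_arg(char *buf, size_t len, int fdset, const char *id)
{
    int n = snprintf(buf, len, "vhost-vdpa,vhostdev=/dev/fdset/%d,id=%s",
                     fdset, id);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With fdset 0, fd 1732, device path /dev/vhost-vdpa-0 and id hostnet0, these produce the same two values that appear in net-vdpa.x86_64-latest.args.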

On 10/14/20 1:08 PM, Jonathon Jongsma wrote:
Enable <interface type='vdpa'> for qemu domains. This provides basic support and does not support hotplug or migration.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/qemu/qemu_command.c | 35 +++++++++++++++-- src/qemu/qemu_command.h | 3 +- src/qemu/qemu_domain.c | 6 ++- src/qemu/qemu_hotplug.c | 14 ++++--- src/qemu/qemu_interface.c | 23 +++++++++++ src/qemu/qemu_interface.h | 2 + src/qemu/qemu_migration.c | 10 ++++- src/qemu/qemu_validate.c | 14 +++++++ .../net-vdpa.x86_64-latest.args | 38 +++++++++++++++++++ tests/qemuxml2argvdata/net-vdpa.xml | 28 ++++++++++++++ tests/qemuxml2argvmock.c | 11 +++++- tests/qemuxml2argvtest.c | 1 + tests/qemuxml2xmloutdata/net-vdpa.xml | 34 +++++++++++++++++ tests/qemuxml2xmltest.c | 1 + 14 files changed, 206 insertions(+), 14 deletions(-) create mode 100644 tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args create mode 100644 tests/qemuxml2argvdata/net-vdpa.xml create mode 100644 tests/qemuxml2xmloutdata/net-vdpa.xml
Coverity indicated a possible RESOURCE_LEAK [...]
@@ -8203,13 +8212,17 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
break;
+    case VIR_DOMAIN_NET_TYPE_VDPA:
+        if ((vdpafd = qemuInterfaceVDPAConnect(net)) < 0)
+            goto cleanup;
+        break;
+
Between here and where vdpafd gets used/consumed, it's possible to jump to cleanup. Whether any of those jumps can actually be taken while the fd is open, given the various checks made, I'm not 100% sure. Either way, the cleanup code would need to account for it with a VIR_FORCE_CLOSE(vdpafd) if (vdpafd >= 0)...
     case VIR_DOMAIN_NET_TYPE_USER:
     case VIR_DOMAIN_NET_TYPE_SERVER:
     case VIR_DOMAIN_NET_TYPE_CLIENT:
     case VIR_DOMAIN_NET_TYPE_MCAST:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_UDP:
-    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         /* nada */
         break;
@@ -8327,13 +8340,29 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
         vhostfd[i] = -1;
     }
+    if (vdpafd > 0) {
+        g_autofree char *fdset = NULL;
+        g_autofree char *addfdarg = NULL;
+
+        virCommandPassFD(cmd, vdpafd, VIR_COMMAND_PASS_FD_CLOSE_PARENT);
+        fdset = qemuVirCommandGetFDSet(cmd, vdpafd);
+        if (!fdset)
+            goto cleanup;
+        vdpafdName = qemuVirCommandGetDevSet(cmd, vdpafd);
+        /* set opaque to the devicepath so that we can look up the fdset later
+         * if necessary */
+        addfdarg = g_strdup_printf("%s,opaque=%s", fdset,
+                                   net->data.vdpa.devicepath);
+        virCommandAddArgList(cmd, "-add-fd", addfdarg, NULL);
+    }
+
As long as the above code consumes vdpafd, then just set it to -1 right after consumption to avoid double cleanup when it is really closed.

John
if (chardev) virCommandAddArgList(cmd, "-chardev", chardev, NULL);
     if (!(hostnetprops = qemuBuildHostNetStr(net, tapfdName, tapfdSize, vhostfdName, vhostfdSize,
-                                             slirpfdName)))
+                                             slirpfdName, vdpafdName)))
         goto cleanup;
if (!(host = virQEMUBuildNetdevCommandlineFromJSON(hostnetprops,
[...]
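
The two review points above (close the fd in cleanup, and reset it to -1 once it has been consumed) can be sketched in isolation. This is a minimal illustration using plain POSIX calls, not the libvirt code: consume_fd and open_and_hand_over are hypothetical names, and consume_fd merely stands in for a call that takes ownership of the descriptor.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical consumer: takes ownership of fd and closes it itself,
 * standing in for a call that hands the descriptor to another owner. */
static int
consume_fd(int fd)
{
    return close(fd);
}

/* Opens path, optionally bails out before the hand-over, and makes sure
 * the descriptor is closed exactly once on every path.
 * Returns 0 on success, -1 on failure. */
static int
open_and_hand_over(const char *path, int fail_before_consume)
{
    int ret = -1;
    int fd = -1;

    if ((fd = open(path, O_RDONLY)) < 0)
        goto cleanup;

    if (fail_before_consume)
        goto cleanup;       /* fd is still ours, so cleanup must close it */

    ret = consume_fd(fd);
    fd = -1;                /* ownership transferred: never close it again */

 cleanup:
    if (fd >= 0)            /* 0 is a valid descriptor, hence >= rather than > */
        close(fd);
    return ret;
}
```

Resetting the variable right after the hand-over keeps the cleanup label unconditional, so later error paths cannot close a descriptor that another owner may already have reused.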

add-fd, remove-fd, and query-fdsets provide functionality that can be used for passing fds to qemu and closing fdsets that are no longer necessary. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/qemu/qemu_monitor.c | 93 +++++++++++++++++++ src/qemu/qemu_monitor.h | 41 +++++++++ src/qemu/qemu_monitor_json.c | 173 +++++++++++++++++++++++++++++++++++ src/qemu/qemu_monitor_json.h | 12 +++ 4 files changed, 319 insertions(+) diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index 8c991fefbb..594d701c48 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -2649,6 +2649,99 @@ qemuMonitorGraphicsRelocate(qemuMonitorPtr mon, } +/** + * qemuMonitorAddFileHandleToSet: + * @mon: monitor object + * @fd: file descriptor to pass to qemu + * @fdset: the fdset to register this fd with, -1 to create a new fdset + * @opaque: opaque data to associate with this fd + * @info: structure that will be updated with the fd and fdset returned by qemu + * + * Attempts to register a file descriptor with qemu that can then be referenced + * via the file path /dev/fdset/$FDSETID + * Returns 0 if ok, and -1 on failure */ +int +qemuMonitorAddFileHandleToSet(qemuMonitorPtr mon, + int fd, + int fdset, + const char *opaque, + qemuMonitorAddFdInfoPtr info) +{ + VIR_DEBUG("fd=%d,fdset=%i,opaque=%s", fd, fdset, opaque); + + QEMU_CHECK_MONITOR(mon); + + if (fd < 0) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("fd must be valid")); + return -1; + } + + return qemuMonitorJSONAddFileHandleToSet(mon, fd, fdset, opaque, info); +} + + +/** + * qemuMonitorRemoveFdset: + * @mon: monitor object + * @fdset: the fdset to remove + * + * Attempts to remove a fdset from qemu and close associated file descriptors + * Returns 0 if ok, and -1 on failure */ +int +qemuMonitorRemoveFdset(qemuMonitorPtr mon, + int fdset) +{ + VIR_DEBUG("fdset=%d", fdset); + + QEMU_CHECK_MONITOR(mon); + + if (fdset < 0) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("fdset must be valid"));
+ return -1; + } + + return qemuMonitorJSONRemoveFdset(mon, fdset); +} + + +void qemuMonitorFdsetsFree(qemuMonitorFdsetsPtr fdsets) +{ + size_t i; + + for (i = 0; i < fdsets->nfdsets; i++) { + size_t j; + qemuMonitorFdsetInfoPtr set = &fdsets->fdsets[i]; + + for (j = 0; j < set->nfds; j++) + g_free(set->fds[j].opaque); + } + g_free(fdsets->fdsets); + g_free(fdsets); +} + + +/** + * qemuMonitorQueryFdsets: + * @mon: monitor object + * @fdsets: a pointer that is filled with a new qemuMonitorFdsets struct + * + * Queries qemu for the fdsets that are registered with that instance, and + * returns a structure describing those fdsets. The returned struct should be + * freed with qemuMonitorFdsetsFree(); + * + * Returns 0 if ok, and -1 on failure */ +int +qemuMonitorQueryFdsets(qemuMonitorPtr mon, + qemuMonitorFdsetsPtr *fdsets) +{ + QEMU_CHECK_MONITOR(mon); + + return qemuMonitorJSONQueryFdsets(mon, fdsets); +} + + int qemuMonitorSendFileHandle(qemuMonitorPtr mon, const char *fdname, diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index a744c8975b..a372ba1eda 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -880,6 +880,47 @@ int qemuMonitorGraphicsRelocate(qemuMonitorPtr mon, int tlsPort, const char *tlsSubject); +typedef struct _qemuMonitorAddFdInfo qemuMonitorAddFdInfo; +typedef qemuMonitorAddFdInfo *qemuMonitorAddFdInfoPtr; +struct _qemuMonitorAddFdInfo { + int fd; + int fdset; +}; +int +qemuMonitorAddFileHandleToSet(qemuMonitorPtr mon, + int fd, + int fdset, + const char *opaque, + qemuMonitorAddFdInfoPtr info); + +int +qemuMonitorRemoveFdset(qemuMonitorPtr mon, + int fdset); + +typedef struct _qemuMonitorFdsetFdInfo qemuMonitorFdsetFdInfo; +typedef qemuMonitorFdsetFdInfo *qemuMonitorFdsetFdInfoPtr; +struct _qemuMonitorFdsetFdInfo { + int fd; + char *opaque; +}; +typedef struct _qemuMonitorFdsetInfo qemuMonitorFdsetInfo; +typedef qemuMonitorFdsetInfo *qemuMonitorFdsetInfoPtr; +struct _qemuMonitorFdsetInfo { + int id; + 
qemuMonitorFdsetFdInfoPtr fds; + int nfds; +}; +typedef struct _qemuMonitorFdsets qemuMonitorFdsets; +typedef qemuMonitorFdsets *qemuMonitorFdsetsPtr; +struct _qemuMonitorFdsets { + qemuMonitorFdsetInfoPtr fdsets; + int nfdsets; +}; +void qemuMonitorFdsetsFree(qemuMonitorFdsetsPtr fdsets); +G_DEFINE_AUTOPTR_CLEANUP_FUNC(qemuMonitorFdsets, qemuMonitorFdsetsFree); +int qemuMonitorQueryFdsets(qemuMonitorPtr mon, + qemuMonitorFdsetsPtr *fdsets); + int qemuMonitorSendFileHandle(qemuMonitorPtr mon, const char *fdname, int fd); diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index 26ac499fc5..1b77adfbcd 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -3929,6 +3929,179 @@ int qemuMonitorJSONGraphicsRelocate(qemuMonitorPtr mon, } +static int +qemuAddfdInfoParse(virJSONValuePtr msg, + qemuMonitorAddFdInfoPtr fdinfo) +{ + virJSONValuePtr returnObj; + + if (!(returnObj = virJSONValueObjectGetObject(msg, "return"))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing or invalid return data in add-fd response")); + return -1; + } + + if (virJSONValueObjectGetNumberInt(returnObj, "fd", &fdinfo->fd) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing or invalid fd in add-fd response")); + return -1; + } + + if (virJSONValueObjectGetNumberInt(returnObj, "fdset-id", &fdinfo->fdset) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing or invalid fdset-id in add-fd response")); + return -1; + } + + return 0; +} + + +/* if fdset is negative, qemu will create a new fdset and add the fd to that */ +int qemuMonitorJSONAddFileHandleToSet(qemuMonitorPtr mon, + int fd, + int fdset, + const char *opaque, + qemuMonitorAddFdInfoPtr fdinfo) +{ + virJSONValuePtr args = NULL; + g_autoptr(virJSONValue) reply = NULL; + g_autoptr(virJSONValue) cmd = NULL; + + if (virJSONValueObjectCreate(&args, "S:opaque", opaque, NULL) < 0) + return -1; + + if (fdset >= 0) + if (virJSONValueObjectAdd(args, "j:fdset-id", 
fdset, NULL) < 0) + return -1; + + if (!(cmd = qemuMonitorJSONMakeCommandInternal("add-fd", args))) + return -1; + + if (qemuMonitorJSONCommandWithFd(mon, cmd, fd, &reply) < 0) + return -1; + + if (qemuMonitorJSONCheckError(cmd, reply) < 0) + return -1; + + if (qemuAddfdInfoParse(reply, fdinfo) < 0) + return -1; + + return 0; +} + +static int +qemuMonitorJSONQueryFdsetsParse(virJSONValuePtr msg, + qemuMonitorFdsetsPtr *fdsets) +{ + virJSONValuePtr returnArray, entry; + size_t i; + g_autoptr(qemuMonitorFdsets) sets = g_new0(qemuMonitorFdsets, 1); + int ninfo; + + returnArray = virJSONValueObjectGetArray(msg, "return"); + + ninfo = virJSONValueArraySize(returnArray); + if (ninfo > 0) + sets->fdsets = g_new0(qemuMonitorFdsetInfo, ninfo); + sets->nfdsets = ninfo; + + for (i = 0; i < ninfo; i++) { + size_t j; + const char *tmp; + virJSONValuePtr fdarray; + qemuMonitorFdsetInfoPtr fdsetinfo = &sets->fdsets[i]; + + if (!(entry = virJSONValueArrayGet(returnArray, i))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("query-fdsets return data missing fdset array element")); + return -1; + } + + if (virJSONValueObjectGetNumberInt(entry, "fdset-id", &fdsetinfo->id) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("query-fdsets reply was missing 'fdset-id'")); + return -1; + + } + + fdarray = virJSONValueObjectGetArray(entry, "fds"); + fdsetinfo->nfds = virJSONValueArraySize(fdarray); + if (fdsetinfo->nfds > 0) + fdsetinfo->fds = g_new0(qemuMonitorFdsetFdInfo, fdsetinfo->nfds); + + for (j = 0; j < fdsetinfo->nfds; j++) { + qemuMonitorFdsetFdInfoPtr fdinfo = &fdsetinfo->fds[j]; + virJSONValuePtr fdentry; + + if (!(fdentry = virJSONValueArrayGet(fdarray, j))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("query-fdsets return data missing fd array element")); + return -1; + } + + if (virJSONValueObjectGetNumberInt(fdentry, "fd", &fdinfo->fd) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("query-fdsets return data missing 'fd'")); + return -1; + } + 
+ /* opaque is optional and may be missing */ + tmp = virJSONValueObjectGetString(fdentry, "opaque"); + if (tmp) + fdinfo->opaque = g_strdup(tmp); + } + } + + *fdsets = g_steal_pointer(&sets); + return 0; +} + + +int qemuMonitorJSONQueryFdsets(qemuMonitorPtr mon, + qemuMonitorFdsetsPtr *fdsets) +{ + g_autoptr(virJSONValue) reply = NULL; + g_autoptr(virJSONValue) cmd = qemuMonitorJSONMakeCommand("query-fdsets", + NULL); + + if (!cmd) + return -1; + + if (qemuMonitorJSONCommand(mon, cmd, &reply) < 0) + return -1; + + if (qemuMonitorJSONCheckError(cmd, reply) < 0) + return -1; + + if (qemuMonitorJSONQueryFdsetsParse(reply, fdsets) < 0) + return -1; + + return 0; +} + + +int qemuMonitorJSONRemoveFdset(qemuMonitorPtr mon, + int fdset) +{ + g_autoptr(virJSONValue) reply = NULL; + g_autoptr(virJSONValue) cmd = qemuMonitorJSONMakeCommand("remove-fd", + "i:fdset-id", fdset, + NULL); + + if (!cmd) + return -1; + + if (qemuMonitorJSONCommand(mon, cmd, &reply) < 0) + return -1; + + if (qemuMonitorJSONCheckError(cmd, reply) < 0) + return -1; + + return 0; +} + + int qemuMonitorJSONSendFileHandle(qemuMonitorPtr mon, const char *fdname, int fd) diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h index 098ab857be..2b9a42efe0 100644 --- a/src/qemu/qemu_monitor_json.h +++ b/src/qemu/qemu_monitor_json.h @@ -202,6 +202,18 @@ int qemuMonitorJSONAddPCINetwork(qemuMonitorPtr mon, int qemuMonitorJSONRemovePCIDevice(qemuMonitorPtr mon, virPCIDeviceAddress *guestAddr); +int qemuMonitorJSONAddFileHandleToSet(qemuMonitorPtr mon, + int fd, + int fdset, + const char *opaque, + qemuMonitorAddFdInfoPtr info); + +int qemuMonitorJSONRemoveFdset(qemuMonitorPtr mon, + int fdset); + +int qemuMonitorJSONQueryFdsets(qemuMonitorPtr mon, + qemuMonitorFdsetsPtr *fdsets); + int qemuMonitorJSONSendFileHandle(qemuMonitorPtr mon, const char *fdname, int fd); -- 2.26.2
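
One detail that may be worth double-checking in qemuMonitorFdsetsFree() above: it frees each opaque string and the outer fdsets array, but the per-set fds arrays allocated in qemuMonitorJSONQueryFdsetsParse() do not appear to be released. A minimal sketch of a free function that walks every level, using simplified stand-in types (FdInfo, FdsetInfo, Fdsets, xcalloc, xfree and the live_allocs counter are all hypothetical, not libvirt code):

```c
#include <stdlib.h>

/* Simplified stand-ins for the three nested structures in the patch. */
typedef struct { int fd; char *opaque; } FdInfo;
typedef struct { int id; FdInfo *fds; size_t nfds; } FdsetInfo;
typedef struct { FdsetInfo *fdsets; size_t nfdsets; } Fdsets;

/* Allocation counter so that a leak shows up as a non-zero balance. */
static long live_allocs = 0;

static void *
xcalloc(size_t n, size_t size)
{
    void *p = calloc(n, size);

    if (p)
        live_allocs++;
    return p;
}

static void
xfree(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/* Frees innermost data first: opaque strings, then each set's fd array,
 * then the array of sets, then the container itself. */
static void
fdsets_free(Fdsets *sets)
{
    size_t i, j;

    if (!sets)
        return;

    for (i = 0; i < sets->nfdsets; i++) {
        FdsetInfo *set = &sets->fdsets[i];

        for (j = 0; j < set->nfds; j++)
            xfree(set->fds[j].opaque);
        xfree(set->fds);
    }
    xfree(sets->fdsets);
    xfree(sets);
}
```

The only difference from the structure of the function in the patch is the extra release of each set's fds array after its inner loop.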

On 10/14/20 1:08 PM, Jonathon Jongsma wrote:
add-fd, remove-fd, and query-fdsets provide functionality that can be used for passing fds to qemu and closing fdsets that are no longer necessary.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/qemu/qemu_monitor.c | 93 +++++++++++++++++++ src/qemu/qemu_monitor.h | 41 +++++++++ src/qemu/qemu_monitor_json.c | 173 +++++++++++++++++++++++++++++++++++ src/qemu/qemu_monitor_json.h | 12 +++ 4 files changed, 319 insertions(+)
Coverity indicated a possible RESOURCE_LEAK
+/* if fdset is negative, qemu will create a new fdset and add the fd to that */
+int qemuMonitorJSONAddFileHandleToSet(qemuMonitorPtr mon,
+                                      int fd,
+                                      int fdset,
+                                      const char *opaque,
+                                      qemuMonitorAddFdInfoPtr fdinfo)
+{
+    virJSONValuePtr args = NULL;
+    g_autoptr(virJSONValue) reply = NULL;
+    g_autoptr(virJSONValue) cmd = NULL;
+
+    if (virJSONValueObjectCreate(&args, "S:opaque", opaque, NULL) < 0)
+        return -1;
+
+    if (fdset >= 0)
+        if (virJSONValueObjectAdd(args, "j:fdset-id", fdset, NULL) < 0)
Leaks @args
+ return -1;
I'm surprised the code style gremlins didn't complain about not having { } or combining the conditions
+
+    if (!(cmd = qemuMonitorJSONMakeCommandInternal("add-fd", args)))
+        return -1;
I think at this point @args is consumed within @cmd ... which really confuses Coverity, but I have a bunch of hacks to handle that...

John
+
+    if (qemuMonitorJSONCommandWithFd(mon, cmd, fd, &reply) < 0)
+        return -1;
+
+    if (qemuMonitorJSONCheckError(cmd, reply) < 0)
+        return -1;
+
+    if (qemuAddfdInfoParse(reply, fdinfo) < 0)
+        return -1;
+
+    return 0;
+}
+
[...]

On 10/21/20 4:54 PM, John Ferlan wrote:
On 10/14/20 1:08 PM, Jonathon Jongsma wrote:
add-fd, remove-fd, and query-fdsets provide functionality that can be used for passing fds to qemu and closing fdsets that are no longer necessary.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/qemu/qemu_monitor.c | 93 +++++++++++++++++++ src/qemu/qemu_monitor.h | 41 +++++++++ src/qemu/qemu_monitor_json.c | 173 +++++++++++++++++++++++++++++++++++ src/qemu/qemu_monitor_json.h | 12 +++ 4 files changed, 319 insertions(+)
Coverity indicated a possible RESOURCE_LEAK
+/* if fdset is negative, qemu will create a new fdset and add the fd to that */
+int qemuMonitorJSONAddFileHandleToSet(qemuMonitorPtr mon,
+                                      int fd,
+                                      int fdset,
+                                      const char *opaque,
+                                      qemuMonitorAddFdInfoPtr fdinfo)
+{
+    virJSONValuePtr args = NULL;
+    g_autoptr(virJSONValue) reply = NULL;
+    g_autoptr(virJSONValue) cmd = NULL;
+
+    if (virJSONValueObjectCreate(&args, "S:opaque", opaque, NULL) < 0)
+        return -1;
+
+    if (fdset >= 0)
+        if (virJSONValueObjectAdd(args, "j:fdset-id", fdset, NULL) < 0)
Leaks @args
Yeah, I think args needs to be g_autoptr(virJSONValue)...
+            return -1;
I'm surprised the code style gremlins didn't complain about not having { } or combining the conditions
(Wasn't watching close enough. I have to admit my eyes glazed over a bit :-P)
+
+    if (!(cmd = qemuMonitorJSONMakeCommandInternal("add-fd", args)))
+        return -1;
... and then here it can be passed as g_steal_pointer(&args) - qemuMonitorJSONMakeCommandInternal() will free it no matter what the outcome.
I think at this point @args is consumed within @cmd ... which really confuses Coverity, but I have a bunch of hacks to handle that...
John
+
+    if (qemuMonitorJSONCommandWithFd(mon, cmd, fd, &reply) < 0)
+        return -1;
+
+    if (qemuMonitorJSONCheckError(cmd, reply) < 0)
+        return -1;
+
+    if (qemuAddfdInfoParse(reply, fdinfo) < 0)
+        return -1;
+
+    return 0;
+}
+
[...]
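
The ownership pattern suggested in this reply (an auto-freed local that is "stolen" at the point where another function takes it over) can be sketched without GLib. This is only an illustration: it relies on the GCC/Clang cleanup attribute, which is the mechanism g_autoptr builds on, and every name in it (Args, args_new, args_steal, make_command, build_command, live_objects) is hypothetical. make_command is assumed to free its argument whether it succeeds or fails, mirroring what is said above about qemuMonitorJSONMakeCommandInternal().

```c
#include <stdlib.h>

/* Counts live Args objects so leaks and double frees become visible. */
static int live_objects = 0;

typedef struct { int fdset; } Args;

static Args *
args_new(int fdset)
{
    Args *a = malloc(sizeof(*a));

    if (a) {
        a->fdset = fdset;
        live_objects++;
    }
    return a;
}

static void
args_free(Args *a)
{
    if (!a)
        return;
    free(a);
    live_objects--;
}

/* Cleanup hook invoked automatically when the annotated variable
 * goes out of scope. */
static void
args_autofree(Args **p)
{
    args_free(*p);
}

/* Returns the pointer and clears the caller's copy, so the automatic
 * cleanup becomes a no-op for it. */
static Args *
args_steal(Args **p)
{
    Args *a = *p;

    *p = NULL;
    return a;
}

/* Assumed to consume args on both success and failure. */
static int
make_command(Args *args, int fail)
{
    args_free(args);
    return fail ? -1 : 0;
}

static int
build_command(int fail_early, int fail_make)
{
    __attribute__((cleanup(args_autofree))) Args *args = args_new(1);

    if (!args)
        return -1;

    if (fail_early)
        return -1;      /* args is freed automatically on this path */

    if (make_command(args_steal(&args), fail_make) < 0)
        return -1;      /* args is NULL here, so there is no double free */

    return 0;
}
```

Every return path leaves the object count at zero: early failures rely on the automatic cleanup, and after the hand-over the local is NULL.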

By using the new qemu monitor functions to handle passing and removing file descriptors, we can support hotplug of vdpa devices. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/qemu/qemu_hotplug.c | 62 +++++++++++++++++-- tests/qemuhotplugmock.c | 9 +++ tests/qemuhotplugtest.c | 16 +++++ .../qemuhotplug-interface-vdpa.xml | 4 ++ .../qemuhotplug-base-live+interface-vdpa.xml | 57 +++++++++++++++++ 5 files changed, 144 insertions(+), 4 deletions(-) create mode 100644 tests/qemuhotplugtestdevices/qemuhotplug-interface-vdpa.xml create mode 100644 tests/qemuhotplugtestdomains/qemuhotplug-base-live+interface-vdpa.xml diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 6864e8b47a..f999f5cd07 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -1157,6 +1157,8 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, virErrorPtr originalError = NULL; g_autofree char *slirpfdName = NULL; int slirpfd = -1; + g_autofree char *vdpafdName = NULL; + int vdpafd = -1; char **tapfdName = NULL; int *tapfd = NULL; size_t tapfdSize = 0; @@ -1334,12 +1336,16 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, /* hostdev interfaces were handled earlier in this function */ break; + case VIR_DOMAIN_NET_TYPE_VDPA: + if ((vdpafd = qemuInterfaceVDPAConnect(net)) < 0) + goto cleanup; + break; + case VIR_DOMAIN_NET_TYPE_SERVER: case VIR_DOMAIN_NET_TYPE_CLIENT: case VIR_DOMAIN_NET_TYPE_MCAST: case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_UDP: - case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: virReportError(VIR_ERR_OPERATION_UNSUPPORTED, _("hotplug of interface type of %s is not implemented yet"), @@ -1386,13 +1392,29 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, for (i = 0; i < vhostfdSize; i++) vhostfdName[i] = g_strdup_printf("vhostfd-%s%zu", net->info.alias, i); + qemuDomainObjEnterMonitor(driver, vm); + + if (vdpafd > 0) { + /* vhost-vdpa only accepts a filename. 
We can pass an open fd by + * filename if we add the fd to an fdset and then pass a filename of + * /dev/fdset/$FDSETID. */ + qemuMonitorAddFdInfo fdinfo; + if (qemuMonitorAddFileHandleToSet(priv->mon, vdpafd, -1, + net->data.vdpa.devicepath, + &fdinfo) < 0) { + ignore_value(qemuDomainObjExitMonitor(driver, vm)); + goto cleanup; + } + vdpafdName = g_strdup_printf("/dev/fdset/%d", fdinfo.fdset); + } + if (!(netprops = qemuBuildHostNetStr(net, tapfdName, tapfdSize, vhostfdName, vhostfdSize, - slirpfdName, NULL))) + slirpfdName, vdpafdName))) { + ignore_value(qemuDomainObjExitMonitor(driver, vm)); goto cleanup; - - qemuDomainObjEnterMonitor(driver, vm); + } if (actualType == VIR_DOMAIN_NET_TYPE_VHOSTUSER) { if (qemuMonitorAttachCharDev(priv->mon, charDevAlias, net->data.vhostuser) < 0) { @@ -1518,6 +1540,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, VIR_FREE(vhostfdName); virDomainCCWAddressSetFree(ccwaddrs); VIR_FORCE_CLOSE(slirpfd); + VIR_FORCE_CLOSE(vdpafd); return ret; @@ -4595,8 +4618,39 @@ qemuDomainRemoveNetDevice(virQEMUDriverPtr driver, * to just ignore the error and carry on. */ } + } else if (actualType == VIR_DOMAIN_NET_TYPE_VDPA) { + int vdpafdset = -1; + g_autoptr(qemuMonitorFdsets) fdsets = NULL; + + /* query qemu for which fdset is associated with the fd that we passed + * to qemu via 'add-fd' for this vdpa device. If we don't remove the + * fd, qemu will keep it open */ + if (qemuMonitorQueryFdsets(priv->mon, &fdsets) == 0) { + for (i = 0; i < fdsets->nfdsets && vdpafdset < 0; i++) { + size_t j; + qemuMonitorFdsetInfoPtr set = &fdsets->fdsets[i]; + + for (j = 0; j < set->nfds; j++) { + qemuMonitorFdsetFdInfoPtr fdinfo = &set->fds[j]; + if (STREQ_NULLABLE(fdinfo->opaque, net->data.vdpa.devicepath)) { + vdpafdset = set->id; + break; + } + } + } + } + + if (vdpafdset < 0) { + VIR_WARN("Cannot determine fdset for vdpa device"); + } else { + if (qemuMonitorRemoveFdset(priv->mon, vdpafdset) < 0) { + /* if it fails, there's not much we can do... 
just carry on */ + VIR_WARN("failed to close vdpa device"); + } + } } + if (qemuDomainObjExitMonitor(driver, vm) < 0) return -1; diff --git a/tests/qemuhotplugmock.c b/tests/qemuhotplugmock.c index 29fac8a598..d2e32ecf7e 100644 --- a/tests/qemuhotplugmock.c +++ b/tests/qemuhotplugmock.c @@ -19,11 +19,13 @@ #include <config.h> #include "qemu/qemu_hotplug.h" +#include "qemu/qemu_interface.h" #include "qemu/qemu_process.h" #include "conf/domain_conf.h" #include "virdevmapper.h" #include "virutil.h" #include "virmock.h" +#include <fcntl.h> static int (*real_virGetDeviceID)(const char *path, int *maj, int *min); static bool (*real_virFileExists)(const char *path); @@ -106,3 +108,10 @@ void qemuProcessKillManagedPRDaemon(virDomainObjPtr vm G_GNUC_UNUSED) { } + +int +qemuInterfaceVDPAConnect(virDomainNetDefPtr net G_GNUC_UNUSED) +{ + /* need a valid fd or sendmsg won't work. Just open /dev/null */ + return open("/dev/null", O_RDONLY); +} diff --git a/tests/qemuhotplugtest.c b/tests/qemuhotplugtest.c index 2d12cacf28..b7cebfc0e7 100644 --- a/tests/qemuhotplugtest.c +++ b/tests/qemuhotplugtest.c @@ -89,6 +89,7 @@ qemuHotplugCreateObjects(virDomainXMLOptionPtr xmlopt, virQEMUCapsSet(priv->qemuCaps, QEMU_CAPS_SPICE_FILE_XFER_DISABLE); virQEMUCapsSet(priv->qemuCaps, QEMU_CAPS_PR_MANAGER_HELPER); virQEMUCapsSet(priv->qemuCaps, QEMU_CAPS_SCSI_BLOCK); + virQEMUCapsSet(priv->qemuCaps, QEMU_CAPS_NETDEV_VHOST_VDPA); if (qemuTestCapsCacheInsert(driver.qemuCapsCache, priv->qemuCaps) < 0) return -1; @@ -140,6 +141,9 @@ testQemuHotplugAttach(virDomainObjPtr vm, case VIR_DOMAIN_DEVICE_HOSTDEV: ret = qemuDomainAttachHostDevice(&driver, vm, dev->data.hostdev); break; + case VIR_DOMAIN_DEVICE_NET: + ret = qemuDomainAttachNetDevice(&driver, vm, dev->data.net); + break; default: VIR_TEST_VERBOSE("device type '%s' cannot be attached", virDomainDeviceTypeToString(dev->type)); @@ -162,6 +166,7 @@ testQemuHotplugDetach(virDomainObjPtr vm, case VIR_DOMAIN_DEVICE_SHMEM: case 
VIR_DOMAIN_DEVICE_WATCHDOG: case VIR_DOMAIN_DEVICE_HOSTDEV: + case VIR_DOMAIN_DEVICE_NET: ret = qemuDomainDetachDeviceLive(vm, dev, &driver, async); break; default: @@ -823,6 +828,17 @@ mymain(void) DO_TEST_DETACH("pseries-base-live", "hostdev-pci", false, false, "device_del", QMP_DEVICE_DELETED("hostdev0") QMP_OK); + DO_TEST_ATTACH("base-live", "interface-vdpa", false, true, + "add-fd", "{ \"return\": { \"fdset-id\": 1, \"fd\": 95 }}", + "netdev_add", QMP_OK, "device_add", QMP_OK); + DO_TEST_DETACH("base-live", "interface-vdpa", false, false, + "device_del", QMP_DEVICE_DELETED("net0") QMP_OK, + "netdev_del", QMP_OK, + "query-fdsets", + "{ \"return\": [{\"fds\": [{\"fd\": 95, \"opaque\": \"/dev/vhost-vdpa-0\"}], \"fdset-id\": 1}]}", + "remove-fd", QMP_OK + ); + DO_TEST_ATTACH("base-live", "watchdog", false, true, "watchdog-set-action", QMP_OK, "device_add", QMP_OK); diff --git a/tests/qemuhotplugtestdevices/qemuhotplug-interface-vdpa.xml b/tests/qemuhotplugtestdevices/qemuhotplug-interface-vdpa.xml new file mode 100644 index 0000000000..e42ca08d31 --- /dev/null +++ b/tests/qemuhotplugtestdevices/qemuhotplug-interface-vdpa.xml @@ -0,0 +1,4 @@ +<interface type='vdpa'> + <mac address='52:54:00:39:5f:04'/> + <source dev='/dev/vhost-vdpa-0'/> +</interface> diff --git a/tests/qemuhotplugtestdomains/qemuhotplug-base-live+interface-vdpa.xml b/tests/qemuhotplugtestdomains/qemuhotplug-base-live+interface-vdpa.xml new file mode 100644 index 0000000000..066180bb3c --- /dev/null +++ b/tests/qemuhotplugtestdomains/qemuhotplug-base-live+interface-vdpa.xml @@ -0,0 +1,57 @@ +<domain type='kvm' id='7'> + <name>hotplug</name> + <uuid>d091ea82-29e6-2e34-3005-f02617b36e87</uuid> + <memory unit='KiB'>4194304</memory> + <currentMemory unit='KiB'>4194304</currentMemory> + <vcpu placement='static'>4</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <features> + <acpi/> + <apic/> + <pae/> + </features> + <clock offset='utc'/> + 
<on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>restart</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-x86_64</emulator> + <controller type='usb' index='0'> + <alias name='usb'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='ide' index='0'> + <alias name='ide'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <controller type='scsi' index='0' model='virtio-scsi'> + <alias name='scsi0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </controller> + <controller type='pci' index='0' model='pci-root'> + <alias name='pci'/> + </controller> + <controller type='virtio-serial' index='0'> + <alias name='virtio-serial0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </controller> + <interface type='vdpa'> + <mac address='52:54:00:39:5f:04'/> + <source dev='/dev/vhost-vdpa-0'/> + <model type='virtio'/> + <alias name='net0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> + </interface> + <input type='mouse' bus='ps2'> + <alias name='input0'/> + </input> + <input type='keyboard' bus='ps2'> + <alias name='input1'/> + </input> + <memballoon model='none'/> + </devices> + <seclabel type='none' model='none'/> +</domain> -- 2.26.2
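
The detach path in this patch finds the fdset to remove by matching the opaque string recorded at add-fd time against the device path. A minimal sketch of that lookup on simplified stand-in types (FdInfo, FdsetInfo and find_fdset_by_opaque are hypothetical names, not libvirt code):

```c
#include <stddef.h>
#include <string.h>

/* Simplified view of a query-fdsets reply. */
typedef struct { int fd; const char *opaque; } FdInfo;
typedef struct { int id; const FdInfo *fds; size_t nfds; } FdsetInfo;

/* Returns the id of the first fdset containing an fd whose opaque string
 * equals devpath, or -1 if there is none. Unlike STREQ_NULLABLE in the
 * patch, a missing opaque string never matches here. */
static int
find_fdset_by_opaque(const FdsetInfo *sets, size_t nsets, const char *devpath)
{
    size_t i, j;

    if (!devpath)
        return -1;

    for (i = 0; i < nsets; i++) {
        for (j = 0; j < sets[i].nfds; j++) {
            const char *opaque = sets[i].fds[j].opaque;

            if (opaque && strcmp(opaque, devpath) == 0)
                return sets[i].id;
        }
    }
    return -1;
}
```

Returning from inside the nested loops replaces the combination of an inner break and the extra outer loop condition used in the patch.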

The current udev node device driver ignores all events related to vdpa devices. Since libvirt now supports vDPA network devices, include these devices in the device list.

Example output:

virsh # nodedev-list
[...omitted long list of nodedevs...]
vdpa_vdpa0

virsh # nodedev-dumpxml vdpa_vdpa0
<device>
  <name>vdpa_vdpa0</name>
  <path>/sys/devices/vdpa0</path>
  <parent>computer</parent>
  <driver>
    <name>vhost_vdpa</name>
  </driver>
  <capability type='vdpa'>
    <chardev>/dev/vhost-vdpa-0</chardev>
  </capability>
</device>

NOTE: normally the 'parent' would be a PCI device instead of 'computer', but this example output is from the vdpa_sim kernel module, so it doesn't have a normal parent device.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
---
 docs/formatnode.html.in | 9 +++++
 docs/schemas/nodedev.rng | 10 ++++++
 include/libvirt/libvirt-nodedev.h | 1 +
 src/conf/node_device_conf.c | 14 ++++++++
 src/conf/node_device_conf.h | 11 ++++++-
 src/conf/virnodedeviceobj.c | 4 ++-
 src/node_device/node_device_udev.c | 53 ++++++++++++++++++++++++++++++
 tools/virsh-nodedev.c | 3 ++
 8 files changed, 103 insertions(+), 2 deletions(-)

diff --git a/docs/formatnode.html.in b/docs/formatnode.html.in index 594427468b..6928bdd69c 100644 --- a/docs/formatnode.html.in +++ b/docs/formatnode.html.in @@ -432,6 +432,15 @@ <dd>The device number.</dd> </dl> </dd> + <dt><code>vdpa</code></dt> + <dd>Describes a virtual datapath acceleration (vDPA) network device. + <span class="since">Since 6.9.0</span>.
Sub-elements include: + <dl> + <dt><code>chardev</code></dt> + <dd>The path to the character device that is used to access the + device.</dd> + </dl> + </dd> </dl> </dd> </dl> diff --git a/docs/schemas/nodedev.rng b/docs/schemas/nodedev.rng index 166e278cf8..0456ddbe93 100644 --- a/docs/schemas/nodedev.rng +++ b/docs/schemas/nodedev.rng @@ -86,6 +86,7 @@ <ref name="capmdev"/> <ref name="capccwdev"/> <ref name="capcssdev"/> + <ref name="capvdpa"/> </choice> </element> </define> @@ -675,6 +676,15 @@ </element> </define> + <define name="capvdpa"> + <attribute name="type"> + <value>vdpa</value> + </attribute> + <element name="chardev"> + <ref name="path"/> + </element> + </define> + <define name="address"> <element name="address"> <attribute name="domain"><ref name="hexuint"/></attribute> diff --git a/include/libvirt/libvirt-nodedev.h b/include/libvirt/libvirt-nodedev.h index dd2ffd5782..b73b076f14 100644 --- a/include/libvirt/libvirt-nodedev.h +++ b/include/libvirt/libvirt-nodedev.h @@ -82,6 +82,7 @@ typedef enum { VIR_CONNECT_LIST_NODE_DEVICES_CAP_MDEV = 1 << 14, /* Mediated device */ VIR_CONNECT_LIST_NODE_DEVICES_CAP_CCW_DEV = 1 << 15, /* CCW device */ VIR_CONNECT_LIST_NODE_DEVICES_CAP_CSS_DEV = 1 << 16, /* CSS device */ + VIR_CONNECT_LIST_NODE_DEVICES_CAP_VDPA = 1 << 17, /* vDPA device */ } virConnectListAllNodeDeviceFlags; int virConnectListAllNodeDevices (virConnectPtr conn, diff --git a/src/conf/node_device_conf.c b/src/conf/node_device_conf.c index 4adfdef572..9e75f6f3a2 100644 --- a/src/conf/node_device_conf.c +++ b/src/conf/node_device_conf.c @@ -66,6 +66,7 @@ VIR_ENUM_IMPL(virNodeDevCap, "mdev", "ccw", "css", + "vdpa", ); VIR_ENUM_IMPL(virNodeDevNetCap, @@ -518,6 +519,13 @@ virNodeDeviceCapMdevDefFormat(virBufferPtr buf, } } +static void +virNodeDeviceCapVDPADefFormat(virBufferPtr buf, + const virNodeDevCapData *data) +{ + virBufferEscapeString(buf, "<chardev>%s</chardev>\n", data->vdpa.chardev); +} + char * virNodeDeviceDefFormat(const virNodeDeviceDef 
*def) { @@ -611,6 +619,9 @@ virNodeDeviceDefFormat(const virNodeDeviceDef *def) virBufferAsprintf(&buf, "<devno>0x%04x</devno>\n", data->ccw_dev.devno); break; + case VIR_NODE_DEV_CAP_VDPA: + virNodeDeviceCapVDPADefFormat(&buf, data); + break; case VIR_NODE_DEV_CAP_MDEV_TYPES: case VIR_NODE_DEV_CAP_FC_HOST: case VIR_NODE_DEV_CAP_VPORTS: @@ -1902,6 +1913,7 @@ virNodeDevCapsDefParseXML(xmlXPathContextPtr ctxt, case VIR_NODE_DEV_CAP_FC_HOST: case VIR_NODE_DEV_CAP_VPORTS: case VIR_NODE_DEV_CAP_SCSI_GENERIC: + case VIR_NODE_DEV_CAP_VDPA: case VIR_NODE_DEV_CAP_LAST: virReportError(VIR_ERR_INTERNAL_ERROR, _("unknown capability type '%d' for '%s'"), @@ -2219,6 +2231,7 @@ virNodeDevCapsDefFree(virNodeDevCapsDefPtr caps) case VIR_NODE_DEV_CAP_VPORTS: case VIR_NODE_DEV_CAP_CCW_DEV: case VIR_NODE_DEV_CAP_CSS_DEV: + case VIR_NODE_DEV_CAP_VDPA: case VIR_NODE_DEV_CAP_LAST: /* This case is here to shutup the compiler */ break; @@ -2273,6 +2286,7 @@ virNodeDeviceUpdateCaps(virNodeDeviceDefPtr def) case VIR_NODE_DEV_CAP_MDEV: case VIR_NODE_DEV_CAP_CCW_DEV: case VIR_NODE_DEV_CAP_CSS_DEV: + case VIR_NODE_DEV_CAP_VDPA: case VIR_NODE_DEV_CAP_LAST: break; } diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h index 5484bc340f..3057c728a0 100644 --- a/src/conf/node_device_conf.h +++ b/src/conf/node_device_conf.h @@ -65,6 +65,7 @@ typedef enum { VIR_NODE_DEV_CAP_MDEV, /* Mediated device */ VIR_NODE_DEV_CAP_CCW_DEV, /* s390 CCW device */ VIR_NODE_DEV_CAP_CSS_DEV, /* s390 channel subsystem device */ + VIR_NODE_DEV_CAP_VDPA, /* vDPA device */ VIR_NODE_DEV_CAP_LAST } virNodeDevCapType; @@ -275,6 +276,12 @@ struct _virNodeDevCapCCW { unsigned int devno; }; +typedef struct _virNodeDevCapVDPA virNodeDevCapVDPA; +typedef virNodeDevCapVDPA *virNodeDevCapVDPAPtr; +struct _virNodeDevCapVDPA { + char *chardev; +}; + typedef struct _virNodeDevCapData virNodeDevCapData; typedef virNodeDevCapData *virNodeDevCapDataPtr; struct _virNodeDevCapData { @@ -293,6 +300,7 @@ struct 
_virNodeDevCapData { virNodeDevCapDRM drm; virNodeDevCapMdev mdev; virNodeDevCapCCW ccw_dev; + virNodeDevCapVDPA vdpa; }; }; @@ -369,7 +377,8 @@ virNodeDevCapsDefFree(virNodeDevCapsDefPtr caps); VIR_CONNECT_LIST_NODE_DEVICES_CAP_MDEV_TYPES | \ VIR_CONNECT_LIST_NODE_DEVICES_CAP_MDEV | \ VIR_CONNECT_LIST_NODE_DEVICES_CAP_CCW_DEV | \ - VIR_CONNECT_LIST_NODE_DEVICES_CAP_CSS_DEV) + VIR_CONNECT_LIST_NODE_DEVICES_CAP_CSS_DEV | \ + VIR_CONNECT_LIST_NODE_DEVICES_CAP_VDPA) int virNodeDeviceGetSCSIHostCaps(virNodeDevCapSCSIHostPtr scsi_host); diff --git a/src/conf/virnodedeviceobj.c b/src/conf/virnodedeviceobj.c index 9af80b8036..6331d1a981 100644 --- a/src/conf/virnodedeviceobj.c +++ b/src/conf/virnodedeviceobj.c @@ -711,6 +711,7 @@ virNodeDeviceObjHasCap(const virNodeDeviceObj *obj, case VIR_NODE_DEV_CAP_MDEV: case VIR_NODE_DEV_CAP_CCW_DEV: case VIR_NODE_DEV_CAP_CSS_DEV: + case VIR_NODE_DEV_CAP_VDPA: case VIR_NODE_DEV_CAP_LAST: break; } @@ -862,7 +863,8 @@ virNodeDeviceObjMatch(virNodeDeviceObjPtr obj, MATCH(MDEV_TYPES) || MATCH(MDEV) || MATCH(CCW_DEV) || - MATCH(CSS_DEV))) + MATCH(CSS_DEV) || + MATCH(VDPA))) return false; } diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 29a7eaa07c..b1b8427c05 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1142,6 +1142,55 @@ udevProcessCSS(struct udev_device *device, return 0; } + +static int +udevGetVDPACharDev(const char *sysfs_path, + virNodeDevCapDataPtr data) +{ + struct dirent *entry; + DIR *dir = NULL; + int direrr; + + if (virDirOpenIfExists(&dir, sysfs_path) <= 0) + return -1; + + while ((direrr = virDirRead(dir, &entry, NULL)) > 0) { + if (g_str_has_prefix(entry->d_name, "vhost-vdpa")) { + g_autofree char *chardev = g_strdup_printf("/dev/%s", entry->d_name); + + if (!virFileExists(chardev)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("vDPA chardev path '%s' does not exist"), + chardev); + return -1; + } + VIR_DEBUG("vDPA chardev is at 
'%s'", chardev); + + data->vdpa.chardev = g_steal_pointer(&chardev); + break; + } + } + + if (direrr < 0) + return -1; + + return 0; +} + +static int +udevProcessVDPA(struct udev_device *device, + virNodeDeviceDefPtr def) +{ + if (udevGenerateDeviceName(device, def, NULL) != 0) + return -1; + + if (udevGetVDPACharDev(def->sysfs_path, &def->caps->data) < 0) + return -1; + + return 0; +} + + static int udevGetDeviceNodes(struct udev_device *device, virNodeDeviceDefPtr def) @@ -1221,6 +1270,8 @@ udevGetDeviceType(struct udev_device *device, *type = VIR_NODE_DEV_CAP_CCW_DEV; else if (STREQ_NULLABLE(subsystem, "css")) *type = VIR_NODE_DEV_CAP_CSS_DEV; + else if (STREQ_NULLABLE(subsystem, "vdpa")) + *type = VIR_NODE_DEV_CAP_VDPA; VIR_FREE(subsystem); } @@ -1267,6 +1318,8 @@ udevGetDeviceDetails(struct udev_device *device, return udevProcessCCW(device, def); case VIR_NODE_DEV_CAP_CSS_DEV: return udevProcessCSS(device, def); + case VIR_NODE_DEV_CAP_VDPA: + return udevProcessVDPA(device, def); case VIR_NODE_DEV_CAP_MDEV_TYPES: case VIR_NODE_DEV_CAP_SYSTEM: case VIR_NODE_DEV_CAP_FC_HOST: diff --git a/tools/virsh-nodedev.c b/tools/virsh-nodedev.c index 483e36bd53..527bf49fc3 100644 --- a/tools/virsh-nodedev.c +++ b/tools/virsh-nodedev.c @@ -464,6 +464,9 @@ cmdNodeListDevices(vshControl *ctl, const vshCmd *cmd G_GNUC_UNUSED) case VIR_NODE_DEV_CAP_CSS_DEV: flags |= VIR_CONNECT_LIST_NODE_DEVICES_CAP_CSS_DEV; break; + case VIR_NODE_DEV_CAP_VDPA: + flags |= VIR_CONNECT_LIST_NODE_DEVICES_CAP_VDPA; + break; case VIR_NODE_DEV_CAP_LAST: break; } -- 2.26.2

On 10/14/20 1:08 PM, Jonathon Jongsma wrote:
The current udev node device driver ignores all events related to vdpa devices. Since libvirt now supports vDPA network devices, include these devices in the device list.
Example output:
virsh # nodedev-list
[...omitted long list of nodedevs...]
vdpa_vdpa0
virsh # nodedev-dumpxml vdpa_vdpa0
<device>
  <name>vdpa_vdpa0</name>
  <path>/sys/devices/vdpa0</path>
  <parent>computer</parent>
  <driver>
    <name>vhost_vdpa</name>
  </driver>
  <capability type='vdpa'>
    <chardev>/dev/vhost-vdpa-0</chardev>
  </capability>
</device>
NOTE: normally the 'parent' would be a PCI device instead of 'computer', but this example output is from the vdpa_sim kernel module, so it doesn't have a normal parent device.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>

(I had left this patch in limbo in case anyone had issues with the particular element names, and then forgot about it for the last week. Seeing that nobody had an issue, I'm pushing it so it gets into the same release as the rest of the series.)

On 10/14/20 1:08 PM, Jonathon Jongsma wrote:
The current udev node device driver ignores all events related to vdpa devices. Since libvirt now supports vDPA network devices, include these devices in the device list.
Example output:
virsh # nodedev-list
[...omitted long list of nodedevs...]
vdpa_vdpa0
virsh # nodedev-dumpxml vdpa_vdpa0
<device>
  <name>vdpa_vdpa0</name>
  <path>/sys/devices/vdpa0</path>
  <parent>computer</parent>
  <driver>
    <name>vhost_vdpa</name>
  </driver>
  <capability type='vdpa'>
    <chardev>/dev/vhost-vdpa-0</chardev>
  </capability>
</device>
NOTE: normally the 'parent' would be a PCI device instead of 'computer', but this example output is from the vdpa_sim kernel module, so it doesn't have a normal parent device.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
---
 docs/formatnode.html.in            |  9 +++++
 docs/schemas/nodedev.rng           | 10 ++++++
 include/libvirt/libvirt-nodedev.h  |  1 +
 src/conf/node_device_conf.c        | 14 ++++++++
 src/conf/node_device_conf.h        | 11 ++++++-
 src/conf/virnodedeviceobj.c        |  4 ++-
 src/node_device/node_device_udev.c | 53 ++++++++++++++++++++++++++++++
 tools/virsh-nodedev.c              |  3 ++
 8 files changed, 103 insertions(+), 2 deletions(-)
Coverity notes a RESOURCE_LEAK
diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
index 29a7eaa07c..b1b8427c05 100644
--- a/src/node_device/node_device_udev.c
+++ b/src/node_device/node_device_udev.c
@@ -1142,6 +1142,55 @@ udevProcessCSS(struct udev_device *device,
     return 0;
 }

+
+static int
+udevGetVDPACharDev(const char *sysfs_path,
+                   virNodeDevCapDataPtr data)
+{
+    struct dirent *entry;
+    DIR *dir = NULL;
+    int direrr;
+
+    if (virDirOpenIfExists(&dir, sysfs_path) <= 0)
+        return -1;

Any return after this leaks @dir - need a VIR_CLOSE_DIR(dir)

+
+    while ((direrr = virDirRead(dir, &entry, NULL)) > 0) {
+        if (g_str_has_prefix(entry->d_name, "vhost-vdpa")) {
+            g_autofree char *chardev = g_strdup_printf("/dev/%s", entry->d_name);
+
+            if (!virFileExists(chardev)) {
+                virReportError(VIR_ERR_INTERNAL_ERROR,
+                               _("vDPA chardev path '%s' does not exist"),
+                               chardev);
+                return -1;
+            }
+            VIR_DEBUG("vDPA chardev is at '%s'", chardev);
+
+            data->vdpa.chardev = g_steal_pointer(&chardev);
+            break;
+        }
+    }
+
+    if (direrr < 0)
+        return -1;
+
+    return 0;
+}
+
John [...]
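For illustration, the leak pattern flagged above and one conventional fix can be sketched with plain POSIX calls. This is a minimal stand-in, not the libvirt code and not necessarily the fix that was committed: `find_vdpa_chardev()` and its `dev_root` parameter are hypothetical names, and `opendir()`/`readdir()`/`closedir()` stand in for libvirt's `virDirOpenIfExists()`/`virDirRead()` wrappers. The point is only that every exit taken after a successful open passes through one cleanup label.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for udevGetVDPACharDev(): scan `sysfs_path` for an
 * entry whose name starts with "vhost-vdpa" and store "<dev_root>/<name>"
 * in *chardev (the caller frees it).
 * Returns 0 on success (also when no entry matches, leaving *chardev NULL)
 * and -1 on error. */
static int
find_vdpa_chardev(const char *sysfs_path, const char *dev_root, char **chardev)
{
    const char *prefix = "vhost-vdpa";
    struct dirent *entry;
    DIR *dir;
    int ret = -1;

    *chardev = NULL;

    dir = opendir(sysfs_path);
    if (!dir)
        return -1;              /* nothing was opened, nothing to close */

    while ((entry = readdir(dir)) != NULL) {
        size_t len;
        char *path;

        if (strncmp(entry->d_name, prefix, strlen(prefix)) != 0)
            continue;

        len = strlen(dev_root) + 1 + strlen(entry->d_name) + 1;
        path = malloc(len);
        if (!path)
            goto cleanup;
        snprintf(path, len, "%s/%s", dev_root, entry->d_name);

        if (access(path, F_OK) != 0) {
            free(path);
            goto cleanup;       /* the error path still reaches closedir() */
        }

        *chardev = path;
        break;
    }

    ret = 0;

 cleanup:
    closedir(dir);              /* single exit point once dir is open */
    return ret;
}
```

Passing the same directory as both `sysfs_path` and `dev_root` is enough to exercise the function without real vDPA hardware.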

On 10/26/20 6:53 AM, John Ferlan wrote:
On 10/14/20 1:08 PM, Jonathon Jongsma wrote:
The current udev node device driver ignores all events related to vdpa devices. Since libvirt now supports vDPA network devices, include these devices in the device list.
Example output:
virsh # nodedev-list
[...omitted long list of nodedevs...]
vdpa_vdpa0
virsh # nodedev-dumpxml vdpa_vdpa0
<device>
  <name>vdpa_vdpa0</name>
  <path>/sys/devices/vdpa0</path>
  <parent>computer</parent>
  <driver>
    <name>vhost_vdpa</name>
  </driver>
  <capability type='vdpa'>
    <chardev>/dev/vhost-vdpa-0</chardev>
  </capability>
</device>
NOTE: normally the 'parent' would be a PCI device instead of 'computer', but this example output is from the vdpa_sim kernel module, so it doesn't have a normal parent device.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
---
 docs/formatnode.html.in            |  9 +++++
 docs/schemas/nodedev.rng           | 10 ++++++
 include/libvirt/libvirt-nodedev.h  |  1 +
 src/conf/node_device_conf.c        | 14 ++++++++
 src/conf/node_device_conf.h        | 11 ++++++-
 src/conf/virnodedeviceobj.c        |  4 ++-
 src/node_device/node_device_udev.c | 53 ++++++++++++++++++++++++++++++
 tools/virsh-nodedev.c              |  3 ++
 8 files changed, 103 insertions(+), 2 deletions(-)
Coverity notes a RESOURCE_LEAK
diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
index 29a7eaa07c..b1b8427c05 100644
--- a/src/node_device/node_device_udev.c
+++ b/src/node_device/node_device_udev.c
@@ -1142,6 +1142,55 @@ udevProcessCSS(struct udev_device *device,
     return 0;
 }

+
+static int
+udevGetVDPACharDev(const char *sysfs_path,
+                   virNodeDevCapDataPtr data)
+{
+    struct dirent *entry;
+    DIR *dir = NULL;
+    int direrr;
+
+    if (virDirOpenIfExists(&dir, sysfs_path) <= 0)
+        return -1;

Any return after this leaks @dir - need a VIR_CLOSE_DIR(dir)

Sigh. I have a nice patch series that converts all DIR*'s to g_autoptr(DIR). You'd think I would have seen this one in review :-/

+
+    while ((direrr = virDirRead(dir, &entry, NULL)) > 0) {
+        if (g_str_has_prefix(entry->d_name, "vhost-vdpa")) {
+            g_autofree char *chardev = g_strdup_printf("/dev/%s", entry->d_name);
+
+            if (!virFileExists(chardev)) {
+                virReportError(VIR_ERR_INTERNAL_ERROR,
+                               _("vDPA chardev path '%s' does not exist"),
+                               chardev);
+                return -1;
+            }
+            VIR_DEBUG("vDPA chardev is at '%s'", chardev);
+
+            data->vdpa.chardev = g_steal_pointer(&chardev);
+            break;
+        }
+    }
+
+    if (direrr < 0)
+        return -1;
+
+    return 0;
+}
+

John
[...]
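The g_autoptr(DIR) conversion mentioned in this reply relies on compiler-driven scope cleanup. Below is a minimal sketch of that mechanism, assuming GCC or Clang and using the `cleanup` variable attribute directly instead of GLib. `AUTO_DIR`, `close_dirp()` and `dir_has_prefix()` are hypothetical names chosen for this example; they are not libvirt or GLib identifiers, and the code is not taken from the patch series referred to above.

```c
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Cleanup callback: the compiler calls it with a pointer to the annotated
 * variable whenever that variable goes out of scope. */
static void
close_dirp(DIR **dirp)
{
    if (*dirp)
        closedir(*dirp);
}

/* Hypothetical macro playing the role that g_autoptr(DIR) plays with GLib. */
#define AUTO_DIR __attribute__((cleanup(close_dirp))) DIR *

/* Returns 1 if `path` contains an entry whose name starts with `prefix`,
 * 0 if it does not, and -1 if the directory cannot be opened. */
static int
dir_has_prefix(const char *path, const char *prefix)
{
    AUTO_DIR dir = opendir(path);
    struct dirent *entry;

    if (!dir)
        return -1;      /* close_dirp() runs but sees NULL and does nothing */

    while ((entry = readdir(dir)) != NULL) {
        if (strncmp(entry->d_name, prefix, strlen(prefix)) == 0)
            return 1;   /* early return: close_dirp() still closes dir */
    }

    return 0;
}
```

With this style an early `return -1` inside the loop, like the one Coverity flagged, cannot leak the handle, because the close is tied to the variable's scope and not to each exit path.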

On 10/14/20 1:08 PM, Jonathon Jongsma wrote:
vDPA network devices allow high-performance networking in a virtual machine by providing a wire-speed data path. These devices require a vendor-specific host driver but the data path follows the virtio specification.
The support for vDPA devices was recently added to qemu. This allows libvirt to support these devices. This patchset requires that the device is configured on the host with the appropriate vendor-specific driver. This will create a chardev on the host at e.g. /dev/vhost-vdpa-0. That chardev path can then be used to define a new interface with type='vdpa'.
Note that in order for hot-unplug to work properly, you may need to apply a qemu patch[1] for now. Without the patch, qemu will not close the fd properly and any subsequent attempts to use the vdpa chardev will fail like this:
virsh # attach-device guest1 vdpa.xml
error: Failed to attach device from vdpa.xml
error: Unable to open '/dev/vhost-vdpa-0' for vdpa device: Device or resource busy
[1] https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg06374.html
Changes in v5:
 - rebased to latest master
 - fixed a case where qemuDomainObjExitMonitor() was not called on an error path
 - Improved the nodedev xml. It now includes the path to the chardev in /dev
 - also updated the nodedev xml schema
 - added sample nodedev-dumpxml output to the commit message of patch #6
Jonathon Jongsma (6):
  conf: Add support for vDPA network devices
  qemu: add vhost-vdpa capability
  qemu: add vdpa support
  qemu: add monitor functions for handling file descriptors
  qemu: support hotplug of vdpa devices
  Include vdpa devices in node device list
Reviewed-by: Laine Stump <laine@redhat.com> for 1-5

For patch 6 (the nodedev XML additions) I'm holding off on that in case anyone has an opinion on changes that should be made there (you had expressed some concern about it in IRC) (HINT HINT!!! ANY TAKERS?)

I've done CI testing with all the patches, and am pushing 1-5 now, which will hopefully encourage wider testing among people with real hardware that's capable of VDPA.
 docs/formatdomain.rst                        |  24 +++
 docs/formatnode.html.in                      |   9 +
 docs/schemas/domaincommon.rng                |  15 ++
 docs/schemas/nodedev.rng                     |  10 +
 include/libvirt/libvirt-nodedev.h            |   1 +
 src/conf/domain_conf.c                       |  31 ++++
 src/conf/domain_conf.h                       |   4 +
 src/conf/netdev_bandwidth_conf.c             |   1 +
 src/conf/node_device_conf.c                  |  14 ++
 src/conf/node_device_conf.h                  |  11 +-
 src/conf/virnodedeviceobj.c                  |   4 +-
 src/libxl/libxl_conf.c                       |   1 +
 src/libxl/xen_common.c                       |   1 +
 src/lxc/lxc_controller.c                     |   1 +
 src/lxc/lxc_driver.c                         |   3 +
 src/lxc/lxc_process.c                        |   1 +
 src/node_device/node_device_udev.c           |  53 ++++++
 src/qemu/qemu_capabilities.c                 |   2 +
 src/qemu/qemu_capabilities.h                 |   1 +
 src/qemu/qemu_command.c                      |  36 +++-
 src/qemu/qemu_command.h                      |   3 +-
 src/qemu/qemu_domain.c                       |   6 +-
 src/qemu/qemu_hotplug.c                      |  75 +++++++-
 src/qemu/qemu_interface.c                    |  25 +++
 src/qemu/qemu_interface.h                    |   2 +
 src/qemu/qemu_migration.c                    |  10 +-
 src/qemu/qemu_monitor.c                      |  93 ++++++++++
 src/qemu/qemu_monitor.h                      |  41 +++++
 src/qemu/qemu_monitor_json.c                 | 173 ++++++++++++++++++
 src/qemu/qemu_monitor_json.h                 |  12 ++
 src/qemu/qemu_process.c                      |   2 +
 src/qemu/qemu_validate.c                     |  15 ++
 src/vmx/vmx.c                                |   1 +
 .../caps_5.1.0.x86_64.xml                    |   1 +
 .../caps_5.2.0.x86_64.xml                    |   1 +
 tests/qemuhotplugmock.c                      |   9 +
 tests/qemuhotplugtest.c                      |  16 ++
 .../qemuhotplug-interface-vdpa.xml           |   4 +
 .../qemuhotplug-base-live+interface-vdpa.xml |  57 ++++++
 .../net-vdpa.x86_64-latest.args              |  38 ++++
 tests/qemuxml2argvdata/net-vdpa.xml          |  28 +++
 tests/qemuxml2argvmock.c                     |  11 +-
 tests/qemuxml2argvtest.c                     |   1 +
 tests/qemuxml2xmloutdata/net-vdpa.xml        |  34 ++++
 tests/qemuxml2xmltest.c                      |   1 +
 tools/virsh-domain.c                         |   1 +
 tools/virsh-nodedev.c                        |   3 +
 47 files changed, 870 insertions(+), 16 deletions(-)
 create mode 100644 tests/qemuhotplugtestdevices/qemuhotplug-interface-vdpa.xml
 create mode 100644 tests/qemuhotplugtestdomains/qemuhotplug-base-live+interface-vdpa.xml
 create mode 100644 tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
 create mode 100644 tests/qemuxml2argvdata/net-vdpa.xml
 create mode 100644 tests/qemuxml2xmloutdata/net-vdpa.xml
-- 
2.26.2

On Tue, 20 Oct 2020 15:16:48 -0400 Laine Stump <laine@redhat.com> wrote:
On 10/14/20 1:08 PM, Jonathon Jongsma wrote:
vDPA network devices allow high-performance networking in a virtual machine by providing a wire-speed data path. These devices require a vendor-specific host driver but the data path follows the virtio specification.
The support for vDPA devices was recently added to qemu. This allows libvirt to support these devices. This patchset requires that the device is configured on the host with the appropriate vendor-specific driver. This will create a chardev on the host at e.g. /dev/vhost-vdpa-0. That chardev path can then be used to define a new interface with type='vdpa'.
Note that in order for hot-unplug to work properly, you may need to apply a qemu patch[1] for now. Without the patch, qemu will not close the fd properly and any subsequent attempts to use the vdpa chardev will fail like this:
virsh # attach-device guest1 vdpa.xml
error: Failed to attach device from vdpa.xml
error: Unable to open '/dev/vhost-vdpa-0' for vdpa device: Device or resource busy
[1] https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg06374.html
Changes in v5:
 - rebased to latest master
 - fixed a case where qemuDomainObjExitMonitor() was not called on an error path
 - Improved the nodedev xml. It now includes the path to the chardev in /dev
 - also updated the nodedev xml schema
 - added sample nodedev-dumpxml output to the commit message of patch #6
Jonathon Jongsma (6):
  conf: Add support for vDPA network devices
  qemu: add vhost-vdpa capability
  qemu: add vdpa support
  qemu: add monitor functions for handling file descriptors
  qemu: support hotplug of vdpa devices
  Include vdpa devices in node device list
Reviewed-by: Laine Stump <laine@redhat.com> for 1-5
For patch 6 (the nodedev XML additions) I'm holding off on that in case anyone has an opinion on changes that should be made there (you had expressed some concern about it in IRC) (HINT HINT!!! ANY TAKERS?)
I wouldn't say "concern", necessarily. I just wanted to make sure that there wasn't a different element name that might be more consistent with other parts of the XML schema.
I've done CI testing with all the patches, and am pushing 1-5 now, which will hopefully encourage wider testing among people with real hardware that's capable of VDPA.
participants (4)
- John Ferlan
- Jonathon Jongsma
- Laine Stump
- Laine Stump