[libvirt PATCH v4 0/6] Add support for vDPA network devices

vDPA network devices allow high-performance networking in a virtual machine
by providing a wire-speed data path. These devices require a vendor-specific
host driver but the data path follows the virtio specification.

The support for vDPA devices was recently added to qemu. This allows
libvirt to support these devices.

This patchset requires that the device is configured on the host with the
appropriate vendor-specific driver. This will create a chardev on the host
at e.g. /dev/vhost-vdpa-0. That chardev path can then be used to define a
new interface with type='vdpa'.

Changes in v4:
 - rebased to latest master
 - added hotplug support
 - report vdpa devices in node device list

Jonathon Jongsma (6):
  conf: Add support for vDPA network devices
  qemu: add vhost-vdpa capability
  qemu: add vdpa support
  qemu: add monitor functions for handling file descriptors
  qemu: support hotplug of vdpa devices
  Include vdpa devices in node device list

 docs/formatdomain.rst | 24 +++
 docs/schemas/domaincommon.rng | 15 ++
 include/libvirt/libvirt-nodedev.h | 1 +
 src/conf/domain_conf.c | 31 ++++
 src/conf/domain_conf.h | 4 +
 src/conf/netdev_bandwidth_conf.c | 1 +
 src/conf/node_device_conf.c | 5 +
 src/conf/node_device_conf.h | 4 +-
 src/conf/virnodedeviceobj.c | 4 +-
 src/libxl/libxl_conf.c | 1 +
 src/libxl/xen_common.c | 1 +
 src/lxc/lxc_controller.c | 1 +
 src/lxc/lxc_driver.c | 3 +
 src/lxc/lxc_process.c | 1 +
 src/node_device/node_device_udev.c | 16 ++
 src/qemu/qemu_capabilities.c | 4 +
 src/qemu/qemu_capabilities.h | 3 +
 src/qemu/qemu_command.c | 36 +++-
 src/qemu/qemu_command.h | 3 +-
 src/qemu/qemu_domain.c | 6 +-
 src/qemu/qemu_hotplug.c | 73 +++++++-
 src/qemu/qemu_interface.c | 25 +++
 src/qemu/qemu_interface.h | 2 +
 src/qemu/qemu_migration.c | 10 +-
 src/qemu/qemu_monitor.c | 93 ++++++++++
 src/qemu/qemu_monitor.h | 41 +++++
 src/qemu/qemu_monitor_json.c | 173 ++++++++++++++++++
 src/qemu/qemu_monitor_json.h | 12 ++
 src/qemu/qemu_process.c | 2 +
 src/qemu/qemu_validate.c | 15 ++
 src/vmx/vmx.c | 1 +
 .../caps_5.1.0.x86_64.xml | 1 +
 .../caps_5.2.0.x86_64.xml | 1 +
 tests/qemuhotplugmock.c | 9 +
 tests/qemuhotplugtest.c | 16 ++
 .../qemuhotplug-interface-vdpa.xml | 4 +
 .../qemuhotplug-base-live+interface-vdpa.xml | 57 ++++++
 .../net-vdpa.x86_64-latest.args | 37 ++++
 tests/qemuxml2argvdata/net-vdpa.xml | 28 +++
 tests/qemuxml2argvmock.c | 11 +-
 tests/qemuxml2argvtest.c | 1 +
 tests/qemuxml2xmloutdata/net-vdpa.xml | 34 ++++
 tests/qemuxml2xmltest.c | 1 +
 tools/virsh-domain.c | 1 +
 tools/virsh-nodedev.c | 3 +
 45 files changed, 799 insertions(+), 16 deletions(-)
 create mode 100644 tests/qemuhotplugtestdevices/qemuhotplug-interface-vdpa.xml
 create mode 100644 tests/qemuhotplugtestdomains/qemuhotplug-base-live+interface-vdpa.xml
 create mode 100644 tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
 create mode 100644 tests/qemuxml2argvdata/net-vdpa.xml
 create mode 100644 tests/qemuxml2xmloutdata/net-vdpa.xml

-- 
2.26.2
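
For readers unfamiliar with the host side of this, the cover letter's workflow depends on the vendor driver exposing a character device such as /dev/vhost-vdpa-0. The following standalone C sketch opens such a path read-write and checks that it really is a character device. The helper name open_vdpa_chardev() and the S_ISCHR check are assumptions made for illustration only; the series itself just calls open() with O_RDWR in qemuInterfaceVDPAConnect().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not part of the series): open a vhost-vdpa
 * character device and return its file descriptor, or -1 with errno
 * set on failure. */
int
open_vdpa_chardev(const char *path)
{
    struct stat sb;
    int fd;

    assert(path != NULL);

    /* O_RDWR matches the flags the series uses to open the device. */
    if ((fd = open(path, O_RDWR)) < 0)
        return -1;

    if (fstat(fd, &sb) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    /* Extra sanity check, an assumption of this sketch: reject
     * anything that is not a character device. */
    if (!S_ISCHR(sb.st_mode)) {
        close(fd);
        errno = ENOTTY;
        return -1;
    }

    return fd;
}
```

A caller would pass the same path that appears in <source dev='...'/> and hand the resulting descriptor on to the emulator; a missing or non-chardev path yields -1.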

This patch adds new schema and adds support for parsing and formatting domain configurations that include vdpa devices. vDPA network devices allow high-performance networking in a virtual machine by providing a wire-speed data path. These devices require a vendor-specific host driver but the data path follows the virtio specification. When a device on the host is bound to an appropriate vendor-specific driver, it will create a chardev on the host at e.g. /dev/vhost-vdpa-0. That chardev path can then be used to define a new interface with type='vdpa'. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- docs/formatdomain.rst | 24 ++++++++++++++++++++++++ docs/schemas/domaincommon.rng | 15 +++++++++++++++ src/conf/domain_conf.c | 31 +++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 4 ++++ src/conf/netdev_bandwidth_conf.c | 1 + src/libxl/libxl_conf.c | 1 + src/libxl/xen_common.c | 1 + src/lxc/lxc_controller.c | 1 + src/lxc/lxc_driver.c | 3 +++ src/lxc/lxc_process.c | 1 + src/qemu/qemu_command.c | 3 +++ src/qemu/qemu_domain.c | 4 +++- src/qemu/qemu_hotplug.c | 3 +++ src/qemu/qemu_interface.c | 2 ++ src/qemu/qemu_process.c | 2 ++ src/qemu/qemu_validate.c | 1 + src/vmx/vmx.c | 1 + tools/virsh-domain.c | 1 + 18 files changed, 98 insertions(+), 1 deletion(-) diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 888db5ea29..b0c2b43dc2 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -4643,6 +4643,30 @@ or stopping the guest. </devices> ... +:anchor:`<a id="elementsNICSVDPA"/>` + +vDPA devices +^^^^^^^^^^^^ + +A vDPA network device can be used to provide wire speed network performance +within a domain. A vDPA device is a specialized type of network device that +uses a datapath that complies with the virtio specification but has a +vendor-specific control path. To use such a device with libvirt, the host +device must already be bound to the appropriate device-specific vDPA driver. +This creates a vDPA char device (e.g. 
/dev/vhost-vdpa-0) that can be used to +assign the device to a libvirt domain. :since:`Since 6.8.0 (QEMU only, +requires QEMU 5.1.0 or newer)` + +:: + + ... + <devices> + <interface type='vdpa'> + <source dev='/dev/vhost-vdpa-0'/> + </interface> + </devices> + ... + :anchor:`<a id="elementsTeaming"/>` Teaming a virtio/hostdev NIC pair diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 4b7e460148..4ce485a762 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -3117,6 +3117,21 @@ <ref name="interface-options"/> </interleave> </group> + + <group> + <attribute name="type"> + <value>vdpa</value> + </attribute> + <interleave> + <element name="source"> + <attribute name="dev"> + <ref name="deviceName"/> + </attribute> + </element> + <ref name="interface-options"/> + </interleave> + </group> + </choice> <optional> <attribute name="trustGuestRxFilters"> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a91dbd4aa9..0b78ff7c70 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -554,6 +554,7 @@ VIR_ENUM_IMPL(virDomainNet, "direct", "hostdev", "udp", + "vdpa", ); VIR_ENUM_IMPL(virDomainNetModel, @@ -2505,6 +2506,10 @@ virDomainNetDefClear(virDomainNetDefPtr def) def->data.vhostuser = NULL; break; + case VIR_DOMAIN_NET_TYPE_VDPA: + VIR_FREE(def->data.vdpa.devicepath); + break; + case VIR_DOMAIN_NET_TYPE_SERVER: case VIR_DOMAIN_NET_TYPE_CLIENT: case VIR_DOMAIN_NET_TYPE_MCAST: @@ -12133,6 +12138,10 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt, if (virDomainChrSourceReconnectDefParseXML(&reconnect, cur, ctxt) < 0) goto error; + } else if (!dev + && def->type == VIR_DOMAIN_NET_TYPE_VDPA + && virXMLNodeNameEqual(cur, "source")) { + dev = virXMLPropString(cur, "dev"); } else if (!def->virtPortProfile && virXMLNodeNameEqual(cur, "virtualport")) { if (def->type == VIR_DOMAIN_NET_TYPE_NETWORK) { @@ -12390,6 +12399,16 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt, } break; + 
case VIR_DOMAIN_NET_TYPE_VDPA: + if (dev == NULL) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("No <source> 'dev' attribute " + "specified with <interface type='vdpa'/>")); + goto error; + } + def->data.vdpa.devicepath = g_steal_pointer(&dev); + break; + case VIR_DOMAIN_NET_TYPE_BRIDGE: if (bridge == NULL) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", @@ -12779,6 +12798,7 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt, case VIR_DOMAIN_NET_TYPE_DIRECT: case VIR_DOMAIN_NET_TYPE_HOSTDEV: case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: break; case VIR_DOMAIN_NET_TYPE_LAST: default: @@ -26947,6 +26967,14 @@ virDomainNetDefFormat(virBufferPtr buf, } break; + case VIR_DOMAIN_NET_TYPE_VDPA: + if (def->data.vdpa.devicepath) { + virBufferEscapeString(buf, "<source dev='%s'", + def->data.vdpa.devicepath); + sourceLines++; + } + break; + case VIR_DOMAIN_NET_TYPE_USER: case VIR_DOMAIN_NET_TYPE_LAST: break; @@ -31160,6 +31188,7 @@ virDomainNetGetActualVirtPortProfile(const virDomainNetDef *iface) case VIR_DOMAIN_NET_TYPE_MCAST: case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: default: return NULL; @@ -31992,6 +32021,7 @@ virDomainNetTypeSharesHostView(const virDomainNetDef *net) case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_HOSTDEV: case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: break; } @@ -32256,6 +32286,7 @@ virDomainNetDefActualToNetworkPort(virDomainDefPtr dom, case VIR_DOMAIN_NET_TYPE_UDP: case VIR_DOMAIN_NET_TYPE_USER: case VIR_DOMAIN_NET_TYPE_VHOSTUSER: + case VIR_DOMAIN_NET_TYPE_VDPA: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("Unexpected network port type %s"), virDomainNetTypeToString(virDomainNetGetActualType(iface))); diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 8f1662aae0..da35aa6c2d 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -883,6 +883,7 @@ 
typedef enum { VIR_DOMAIN_NET_TYPE_DIRECT, VIR_DOMAIN_NET_TYPE_HOSTDEV, VIR_DOMAIN_NET_TYPE_UDP, + VIR_DOMAIN_NET_TYPE_VDPA, VIR_DOMAIN_NET_TYPE_LAST } virDomainNetType; @@ -1056,6 +1057,9 @@ struct _virDomainNetDef { */ virDomainActualNetDefPtr actual; } network; + struct { + char *devicepath; + } vdpa; struct { char *brname; } bridge; diff --git a/src/conf/netdev_bandwidth_conf.c b/src/conf/netdev_bandwidth_conf.c index 396ac62019..4eb12e2951 100644 --- a/src/conf/netdev_bandwidth_conf.c +++ b/src/conf/netdev_bandwidth_conf.c @@ -315,6 +315,7 @@ bool virNetDevSupportsBandwidth(virDomainNetType type) case VIR_DOMAIN_NET_TYPE_UDP: case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_HOSTDEV: + case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: break; } diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c index 5b729a019a..b570a5a167 100644 --- a/src/libxl/libxl_conf.c +++ b/src/libxl/libxl_conf.c @@ -1391,6 +1391,7 @@ libxlMakeNic(virDomainDefPtr def, case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_DIRECT: case VIR_DOMAIN_NET_TYPE_HOSTDEV: + case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("unsupported interface type %s"), diff --git a/src/libxl/xen_common.c b/src/libxl/xen_common.c index 56702a2a76..183b09671a 100644 --- a/src/libxl/xen_common.c +++ b/src/libxl/xen_common.c @@ -1792,6 +1792,7 @@ xenFormatNet(virConnectPtr conn, case VIR_DOMAIN_NET_TYPE_HOSTDEV: case VIR_DOMAIN_NET_TYPE_UDP: case VIR_DOMAIN_NET_TYPE_USER: + case VIR_DOMAIN_NET_TYPE_VDPA: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("Unsupported net type '%s'"), virDomainNetTypeToString(net->type)); return -1; diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index c3cf485e2c..9f944dec74 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -422,6 +422,7 @@ static int virLXCControllerGetNICIndexes(virLXCControllerPtr ctrl) case VIR_DOMAIN_NET_TYPE_UDP: case 
VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_HOSTDEV: + case VIR_DOMAIN_NET_TYPE_VDPA: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("Unsupported net type %s"), virDomainNetTypeToString(actualType)); diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index a530488dd2..4d3c5d9f63 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ -3504,6 +3504,7 @@ lxcDomainAttachDeviceNetLive(virLXCDriverPtr driver, case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_HOSTDEV: case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Network device type is not supported")); goto cleanup; @@ -3558,6 +3559,7 @@ lxcDomainAttachDeviceNetLive(virLXCDriverPtr driver, case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_HOSTDEV: case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: default: /* no-op */ @@ -3999,6 +4001,7 @@ lxcDomainDetachDeviceNetLive(virDomainObjPtr vm, case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_HOSTDEV: case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Only bridged veth devices can be detached")); goto cleanup; diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c index 16969dbf33..d103ec6666 100644 --- a/src/lxc/lxc_process.c +++ b/src/lxc/lxc_process.c @@ -606,6 +606,7 @@ virLXCProcessSetupInterfaces(virLXCDriverPtr driver, case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_LAST: case VIR_DOMAIN_NET_TYPE_HOSTDEV: + case VIR_DOMAIN_NET_TYPE_VDPA: virReportError(VIR_ERR_INTERNAL_ERROR, _("Unsupported network type %s"), virDomainNetTypeToString(type)); diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 91b59538aa..a9367f7758 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3692,6 +3692,7 @@ qemuBuildHostNetStr(virDomainNetDefPtr net, return NULL; break; + case VIR_DOMAIN_NET_TYPE_VDPA: case 
VIR_DOMAIN_NET_TYPE_HOSTDEV: /* Should have been handled earlier via PCI/USB hotplug code. */ case VIR_DOMAIN_NET_TYPE_LAST: @@ -8132,6 +8133,7 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver, case VIR_DOMAIN_NET_TYPE_MCAST: case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: /* nada */ break; @@ -8168,6 +8170,7 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver, case VIR_DOMAIN_NET_TYPE_UDP: case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_HOSTDEV: + case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: /* These types don't use a network device on the host, but * instead use some other type of connection to the emulated diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 279de2997d..fe80fa9aed 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -5129,7 +5129,8 @@ qemuDomainDeviceNetDefPostParse(virDomainNetDefPtr net, const virDomainDef *def, virQEMUCapsPtr qemuCaps) { - if (net->type != VIR_DOMAIN_NET_TYPE_HOSTDEV && + if (net->type != VIR_DOMAIN_NET_TYPE_VDPA && + net->type != VIR_DOMAIN_NET_TYPE_HOSTDEV && !virDomainNetGetModelString(net) && virDomainNetResolveActualType(net) != VIR_DOMAIN_NET_TYPE_HOSTDEV) net->model = qemuDomainDefaultNetModel(def, qemuCaps); @@ -9323,6 +9324,7 @@ qemuDomainNetSupportsMTU(virDomainNetType type) case VIR_DOMAIN_NET_TYPE_DIRECT: case VIR_DOMAIN_NET_TYPE_HOSTDEV: case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: break; } diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index fc0866c011..ade77330ff 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -1340,6 +1340,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, case VIR_DOMAIN_NET_TYPE_MCAST: case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: 
virReportError(VIR_ERR_OPERATION_UNSUPPORTED, _("hotplug of interface type of %s is not implemented yet"), @@ -3390,6 +3391,7 @@ qemuDomainChangeNetFilter(virDomainObjPtr vm, case VIR_DOMAIN_NET_TYPE_DIRECT: case VIR_DOMAIN_NET_TYPE_HOSTDEV: case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("filters not supported on interfaces of type %s"), virDomainNetTypeToString(virDomainNetGetActualType(newdev))); @@ -3727,6 +3729,7 @@ qemuDomainChangeNet(virQEMUDriverPtr driver, case VIR_DOMAIN_NET_TYPE_VHOSTUSER: case VIR_DOMAIN_NET_TYPE_HOSTDEV: + case VIR_DOMAIN_NET_TYPE_VDPA: virReportError(VIR_ERR_OPERATION_UNSUPPORTED, _("unable to change config on '%s' network type"), virDomainNetTypeToString(newdev->type)); diff --git a/src/qemu/qemu_interface.c b/src/qemu/qemu_interface.c index cbf3d99981..b24f9060a9 100644 --- a/src/qemu/qemu_interface.c +++ b/src/qemu/qemu_interface.c @@ -118,6 +118,7 @@ qemuInterfaceStartDevice(virDomainNetDefPtr net) case VIR_DOMAIN_NET_TYPE_UDP: case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_HOSTDEV: + case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: /* these types all require no action */ break; @@ -203,6 +204,7 @@ qemuInterfaceStopDevice(virDomainNetDefPtr net) case VIR_DOMAIN_NET_TYPE_UDP: case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_HOSTDEV: + case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: /* these types all require no action */ break; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index f21b8f1585..b32b9fe20a 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -3363,6 +3363,7 @@ qemuProcessNotifyNets(virDomainDefPtr def) case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_HOSTDEV: case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: break; } @@ -7591,6 +7592,7 @@ void qemuProcessStop(virQEMUDriverPtr driver, case VIR_DOMAIN_NET_TYPE_INTERNAL: case 
VIR_DOMAIN_NET_TYPE_HOSTDEV: case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: /* No special cleanup procedure for these types. */ break; diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c index 3ed4039cdf..1c803c98f7 100644 --- a/src/qemu/qemu_validate.c +++ b/src/qemu/qemu_validate.c @@ -1145,6 +1145,7 @@ qemuValidateNetSupportsCoalesce(virDomainNetType type) case VIR_DOMAIN_NET_TYPE_MCAST: case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: break; } diff --git a/src/vmx/vmx.c b/src/vmx/vmx.c index 4b1b04c6e1..6e0fd61f60 100644 --- a/src/vmx/vmx.c +++ b/src/vmx/vmx.c @@ -3833,6 +3833,7 @@ virVMXFormatEthernet(virDomainNetDefPtr def, int controller, case VIR_DOMAIN_NET_TYPE_DIRECT: case VIR_DOMAIN_NET_TYPE_HOSTDEV: case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("Unsupported net type '%s'"), virDomainNetTypeToString(def->type)); return -1; diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index 32edfc0398..ef2a353aab 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -1007,6 +1007,7 @@ cmdAttachInterface(vshControl *ctl, const vshCmd *cmd) case VIR_DOMAIN_NET_TYPE_CLIENT: case VIR_DOMAIN_NET_TYPE_MCAST: case VIR_DOMAIN_NET_TYPE_UDP: + case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_LAST: vshError(ctl, _("No support for %s in command 'attach-interface'"), -- 2.26.2
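
As a rough illustration of what this patch does in domain_conf.c, the sketch below mimics the three pieces outside of libvirt: an enum value with a matching string, a parse step that rejects a vdpa interface without a 'dev' attribute, and a format step that writes the <source> element back out. All names (net_type, net_type_from_string, vdpa_parse_source, vdpa_format_source) are made up for this example, the enum lists only the tail of the real type list, and unlike virBufferEscapeString() the formatter does no XML escaping and emits the whole element in one go.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reduced stand-in for virDomainNetType; only the last few entries. */
enum net_type {
    NET_TYPE_DIRECT,
    NET_TYPE_HOSTDEV,
    NET_TYPE_UDP,
    NET_TYPE_VDPA,
    NET_TYPE_LAST
};

/* String table kept in the same order as the enum. */
static const char *const net_type_names[NET_TYPE_LAST] = {
    "direct", "hostdev", "udp", "vdpa",
};

const char *
net_type_to_string(int type)
{
    if (type < 0 || type >= NET_TYPE_LAST)
        return NULL;
    return net_type_names[type];
}

int
net_type_from_string(const char *name)
{
    for (int i = 0; i < NET_TYPE_LAST; i++) {
        if (strcmp(net_type_names[i], name) == 0)
            return i;
    }
    return -1;
}

/* Parse step: a vdpa interface without a 'dev' attribute is an error;
 * otherwise store a heap copy of the path. Returns 0 or -1. */
int
vdpa_parse_source(const char *dev, char **devicepath)
{
    size_t len;

    assert(devicepath != NULL);
    if (dev == NULL)
        return -1;
    len = strlen(dev) + 1;
    if ((*devicepath = malloc(len)) == NULL)
        return -1;
    memcpy(*devicepath, dev, len);
    return 0;
}

/* Format step: write the <source> element into buf. Returns 0 on
 * success, -1 if the buffer is too small. No XML escaping is done. */
int
vdpa_format_source(char *buf, size_t buflen, const char *devicepath)
{
    int n = snprintf(buf, buflen, "<source dev='%s'/>", devicepath);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Parsing "/dev/vhost-vdpa-0" and formatting it again should give back the same <source> element that the documentation hunk shows.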

On Thu, Sep 24, 2020 at 16:45:11 -0500, Jonathon Jongsma wrote:
> This patch adds new schema and adds support for parsing and formatting
> domain configurations that include vdpa devices.
>
> vDPA network devices allow high-performance networking in a virtual
> machine by providing a wire-speed data path. These devices require a
> vendor-specific host driver but the data path follows the virtio
> specification.
>
> When a device on the host is bound to an appropriate vendor-specific
> driver, it will create a chardev on the host at e.g. /dev/vhost-vdpa-0.
> That chardev path can then be used to define a new interface with
> type='vdpa'.
>
> Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
> ---

[...]

> diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
> index 888db5ea29..b0c2b43dc2 100644
> --- a/docs/formatdomain.rst
> +++ b/docs/formatdomain.rst
> @@ -4643,6 +4643,30 @@ or stopping the guest.
>     </devices>
>     ...
>
> +:anchor:`<a id="elementsNICSVDPA"/>`

These are not needed for new entries. The link in the header is actually ...

> +
> +vDPA devices
> +^^^^^^^^^^^^

... generated from the title here. I've added the anchors just to keep old
links working.

> +
> +A vDPA network device can be used to provide wire speed network performance

Recent versions of qemu added the -netdev vhost-vdpa device. This
capability allows libvirt to know whether this is supported.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
---
 src/qemu/qemu_capabilities.c | 4 ++++
 src/qemu/qemu_capabilities.h | 3 +++
 tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml | 1 +
 tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml | 1 +
 4 files changed, 9 insertions(+)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 2cc0c61588..b19928f68d 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -597,6 +597,9 @@ VIR_ENUM_IMPL(virQEMUCaps,
               "spapr-tpm-proxy",
               "numa.hmat",
               "blockdev-hostdev-scsi",
+
+              /* 380 */
+              "netdev.vhost-vdpa",
     );
 
 
@@ -1526,6 +1529,7 @@ static struct virQEMUCapsStringFlags virQEMUCapsQMPSchemaQueries[] = {
     { "migrate-set-parameters/arg-type/downtime-limit", QEMU_CAPS_MIGRATION_PARAM_DOWNTIME },
     { "migrate-set-parameters/arg-type/xbzrle-cache-size", QEMU_CAPS_MIGRATION_PARAM_XBZRLE_CACHE_SIZE },
     { "set-numa-node/arg-type/+hmat-lb", QEMU_CAPS_NUMA_HMAT },
+    { "netdev_add/arg-type/+vhost-vdpa", QEMU_CAPS_NETDEV_VHOST_VDPA },
 };
 
 typedef struct _virQEMUCapsObjectTypeProps virQEMUCapsObjectTypeProps;
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index 5d08941538..b6110f1c34 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -578,6 +578,9 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */
     QEMU_CAPS_NUMA_HMAT, /* -numa hmat */
     QEMU_CAPS_BLOCKDEV_HOSTDEV_SCSI, /* -blockdev used for (i)SCSI hostdevs */
 
+    /* 380 */
+    QEMU_CAPS_NETDEV_VHOST_VDPA, /* -netdev vhost-vdpa*/
+
     QEMU_CAPS_LAST /* this must always be the last item */
 } virQEMUCapsFlags;
 
diff --git a/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml
index 7496ff1379..0fd2f3b816 100644
--- a/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_5.1.0.x86_64.xml
@@ -242,6 +242,7 @@
   <flag name='intel-iommu.aw-bits'/>
   <flag name='numa.hmat'/>
   <flag name='blockdev-hostdev-scsi'/>
+  <flag name='netdev.vhost-vdpa'/>
   <version>5001000</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>43100242</microcodeVersion>
diff --git a/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml
index 151bd18137..e4686000a9 100644
--- a/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml
+++ b/tests/qemucapabilitiesdata/caps_5.2.0.x86_64.xml
@@ -242,6 +242,7 @@
   <flag name='intel-iommu.aw-bits'/>
   <flag name='numa.hmat'/>
   <flag name='blockdev-hostdev-scsi'/>
+  <flag name='netdev.vhost-vdpa'/>
   <version>5001050</version>
   <kvmVersion>0</kvmVersion>
   <microcodeVersion>43100243</microcodeVersion>
-- 
2.26.2
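
The mechanism this patch plugs into is essentially a table lookup: each QMP schema query string maps to one capability flag, and the flag is set when the QEMU binary's schema contains that entry. A minimal sketch of that idea follows. The names (cap_flag, caps_probe, caps_get) are hypothetical, and a plain string comparison against a list stands in for libvirt's real QMP schema walk; only the two query strings are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced capability list. */
enum cap_flag {
    CAP_NUMA_HMAT,
    CAP_NETDEV_VHOST_VDPA,
    CAP_LAST
};

struct cap_query {
    const char *query;   /* schema query string */
    enum cap_flag flag;  /* capability it proves */
};

static const struct cap_query cap_queries[] = {
    { "set-numa-node/arg-type/+hmat-lb", CAP_NUMA_HMAT },
    { "netdev_add/arg-type/+vhost-vdpa", CAP_NETDEV_VHOST_VDPA },
};

/* Return a bitmask with one bit set per capability whose query string
 * appears in 'schema', an array of 'n' strings standing in for the
 * entries a QEMU binary reports. */
uint64_t
caps_probe(const char *const *schema, size_t n)
{
    uint64_t caps = 0;
    size_t nqueries = sizeof(cap_queries) / sizeof(cap_queries[0]);

    for (size_t i = 0; i < nqueries; i++) {
        for (size_t j = 0; j < n; j++) {
            if (strcmp(cap_queries[i].query, schema[j]) == 0) {
                caps |= UINT64_C(1) << cap_queries[i].flag;
                break;
            }
        }
    }
    return caps;
}

/* Test a single capability bit. */
bool
caps_get(uint64_t caps, enum cap_flag flag)
{
    assert(flag < CAP_LAST);
    return (caps & (UINT64_C(1) << flag)) != 0;
}
```

With this shape, a binary whose schema lacks the netdev_add vhost-vdpa variant simply never gets the bit set, and later code only has to test the flag.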

Enable <interface type='vdpa'> for qemu domains. This provides basic support and does not support hotplug or migration. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/qemu/qemu_command.c | 35 ++++++++++++++++-- src/qemu/qemu_command.h | 3 +- src/qemu/qemu_domain.c | 6 ++- src/qemu/qemu_hotplug.c | 14 ++++--- src/qemu/qemu_interface.c | 23 ++++++++++++ src/qemu/qemu_interface.h | 2 + src/qemu/qemu_migration.c | 10 ++++- src/qemu/qemu_validate.c | 14 +++++++ .../net-vdpa.x86_64-latest.args | 37 +++++++++++++++++++ tests/qemuxml2argvdata/net-vdpa.xml | 28 ++++++++++++++ tests/qemuxml2argvmock.c | 11 +++++- tests/qemuxml2argvtest.c | 1 + tests/qemuxml2xmloutdata/net-vdpa.xml | 34 +++++++++++++++++ tests/qemuxml2xmltest.c | 1 + 14 files changed, 205 insertions(+), 14 deletions(-) create mode 100644 tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args create mode 100644 tests/qemuxml2argvdata/net-vdpa.xml create mode 100644 tests/qemuxml2xmloutdata/net-vdpa.xml diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index a9367f7758..bc1d4bfd90 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3554,7 +3554,8 @@ qemuBuildHostNetStr(virDomainNetDefPtr net, size_t tapfdSize, char **vhostfd, size_t vhostfdSize, - const char *slirpfd) + const char *slirpfd, + const char *vdpadev) { bool is_tap = false; virDomainNetType netType = virDomainNetGetActualType(net); @@ -3693,6 +3694,12 @@ qemuBuildHostNetStr(virDomainNetDefPtr net, break; case VIR_DOMAIN_NET_TYPE_VDPA: + /* Caller will pass the fd to qemu with add-fd */ + if (virJSONValueObjectCreate(&netprops, "s:type", "vhost-vdpa", NULL) < 0 || + virJSONValueObjectAppendString(netprops, "vhostdev", vdpadev) < 0) + return NULL; + break; + case VIR_DOMAIN_NET_TYPE_HOSTDEV: /* Should have been handled earlier via PCI/USB hotplug code. 
*/ case VIR_DOMAIN_NET_TYPE_LAST: @@ -8042,6 +8049,8 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver, char **tapfdName = NULL; char **vhostfdName = NULL; g_autofree char *slirpfdName = NULL; + g_autofree char *vdpafdName = NULL; + int vdpafd = -1; virDomainNetType actualType = virDomainNetGetActualType(net); const virNetDevBandwidth *actualBandwidth; bool requireNicdev = false; @@ -8127,13 +8136,17 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver, break; + case VIR_DOMAIN_NET_TYPE_VDPA: + if ((vdpafd = qemuInterfaceVDPAConnect(net)) < 0) + goto cleanup; + break; + case VIR_DOMAIN_NET_TYPE_USER: case VIR_DOMAIN_NET_TYPE_SERVER: case VIR_DOMAIN_NET_TYPE_CLIENT: case VIR_DOMAIN_NET_TYPE_MCAST: case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_UDP: - case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: /* nada */ break; @@ -8250,13 +8263,29 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver, vhostfd[i] = -1; } + if (vdpafd > 0) { + g_autofree char *fdset = NULL; + g_autofree char *addfdarg = NULL; + + virCommandPassFD(cmd, vdpafd, VIR_COMMAND_PASS_FD_CLOSE_PARENT); + fdset = qemuVirCommandGetFDSet(cmd, vdpafd); + if (!fdset) + goto cleanup; + vdpafdName = qemuVirCommandGetDevSet(cmd, vdpafd); + /* set opaque to the devicepath so that we can look up the fdset later + * if necessary */ + addfdarg = g_strdup_printf("%s,opaque=%s", fdset, + net->data.vdpa.devicepath); + virCommandAddArgList(cmd, "-add-fd", addfdarg, NULL); + } + if (chardev) virCommandAddArgList(cmd, "-chardev", chardev, NULL); if (!(hostnetprops = qemuBuildHostNetStr(net, tapfdName, tapfdSize, vhostfdName, vhostfdSize, - slirpfdName))) + slirpfdName, vdpafdName))) goto cleanup; if (!(host = virQEMUBuildNetdevCommandlineFromJSON(hostnetprops, diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index 89d99b111f..8db51f93b1 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -99,7 +99,8 @@ virJSONValuePtr 
qemuBuildHostNetStr(virDomainNetDefPtr net, size_t tapfdSize, char **vhostfd, size_t vhostfdSize, - const char *slirpfd); + const char *slirpfd, + const char *vdpadev); /* Current, best practice */ char *qemuBuildNicDevStr(virDomainDefPtr def, diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index fe80fa9aed..28a303b3e2 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -5129,8 +5129,10 @@ qemuDomainDeviceNetDefPostParse(virDomainNetDefPtr net, const virDomainDef *def, virQEMUCapsPtr qemuCaps) { - if (net->type != VIR_DOMAIN_NET_TYPE_VDPA && - net->type != VIR_DOMAIN_NET_TYPE_HOSTDEV && + if (net->type == VIR_DOMAIN_NET_TYPE_VDPA && + !virDomainNetGetModelString(net)) + net->model = VIR_DOMAIN_NET_MODEL_VIRTIO; + else if (net->type != VIR_DOMAIN_NET_TYPE_HOSTDEV && !virDomainNetGetModelString(net) && virDomainNetResolveActualType(net) != VIR_DOMAIN_NET_TYPE_HOSTDEV) net->model = qemuDomainDefaultNetModel(def, qemuCaps); diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index ade77330ff..0582b78f97 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -1389,7 +1389,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, if (!(netprops = qemuBuildHostNetStr(net, tapfdName, tapfdSize, vhostfdName, vhostfdSize, - slirpfdName))) + slirpfdName, NULL))) goto cleanup; qemuDomainObjEnterMonitor(driver, vm); @@ -3485,10 +3485,11 @@ qemuDomainChangeNet(virQEMUDriverPtr driver, olddev = *devslot; oldType = virDomainNetGetActualType(olddev); - if (oldType == VIR_DOMAIN_NET_TYPE_HOSTDEV) { - /* no changes are possible to a type='hostdev' interface */ + if (oldType == VIR_DOMAIN_NET_TYPE_HOSTDEV || + oldType == VIR_DOMAIN_NET_TYPE_VDPA) { + /* no changes are possible to a type='hostdev' or type='vdpa' interface */ virReportError(VIR_ERR_OPERATION_UNSUPPORTED, - _("cannot change config of '%s' network type"), + _("cannot change config of '%s' network interface type"), virDomainNetTypeToString(oldType)); goto cleanup; } @@ 
-3673,8 +3674,9 @@ qemuDomainChangeNet(virQEMUDriverPtr driver, newType = virDomainNetGetActualType(newdev); - if (newType == VIR_DOMAIN_NET_TYPE_HOSTDEV) { - /* can't turn it into a type='hostdev' interface */ + if (newType == VIR_DOMAIN_NET_TYPE_HOSTDEV || + newType == VIR_DOMAIN_NET_TYPE_VDPA) { + /* can't turn it into a type='hostdev' or type='vdpa' interface */ virReportError(VIR_ERR_OPERATION_UNSUPPORTED, _("cannot change network interface type to '%s'"), virDomainNetTypeToString(newType)); diff --git a/src/qemu/qemu_interface.c b/src/qemu/qemu_interface.c index b24f9060a9..3714828fe1 100644 --- a/src/qemu/qemu_interface.c +++ b/src/qemu/qemu_interface.c @@ -638,6 +638,29 @@ qemuInterfaceBridgeConnect(virDomainDefPtr def, } +/* qemuInterfaceVDPAConnect: + * @net: pointer to the VM's interface description + * + * returns: file descriptor of the vdpa device + * + * Called *only* called if actualType is VIR_DOMAIN_NET_TYPE_VDPA + */ +int +qemuInterfaceVDPAConnect(virDomainNetDefPtr net) +{ + int fd; + + if ((fd = open(net->data.vdpa.devicepath, O_RDWR)) < 0) { + virReportSystemError(errno, + _("Unable to open '%s' for vdpa device"), + net->data.vdpa.devicepath); + return -1; + } + + return fd; +} + + qemuSlirpPtr qemuInterfacePrepareSlirp(virQEMUDriverPtr driver, virDomainNetDefPtr net) diff --git a/src/qemu/qemu_interface.h b/src/qemu/qemu_interface.h index 3dcefc6a12..1ba24f0a6f 100644 --- a/src/qemu/qemu_interface.h +++ b/src/qemu/qemu_interface.h @@ -58,3 +58,5 @@ int qemuInterfaceOpenVhostNet(virDomainDefPtr def, qemuSlirpPtr qemuInterfacePrepareSlirp(virQEMUDriverPtr driver, virDomainNetDefPtr net); + +int qemuInterfaceVDPAConnect(virDomainNetDefPtr net) G_GNUC_NO_INLINE; diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 5708334d2f..9e7d793c6f 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1369,7 +1369,15 @@ qemuMigrationSrcIsAllowed(virQEMUDriverPtr driver, for (i = 0; i < vm->def->nnets; i++) { 
virDomainNetDefPtr net = vm->def->nets[i]; - qemuSlirpPtr slirp = QEMU_DOMAIN_NETWORK_PRIVATE(net)->slirp; + qemuSlirpPtr slirp; + + if (net->type == VIR_DOMAIN_NET_TYPE_VDPA) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("vDPA devices cannot be migrated")); + return false; + } + + slirp = QEMU_DOMAIN_NETWORK_PRIVATE(net)->slirp; if (slirp && !qemuSlirpHasFeature(slirp, QEMU_SLIRP_FEATURE_MIGRATE)) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c index 1c803c98f7..d9dca1ec5c 100644 --- a/src/qemu/qemu_validate.c +++ b/src/qemu/qemu_validate.c @@ -1245,6 +1245,20 @@ qemuValidateDomainDeviceDefNetwork(const virDomainNetDef *net, } } } + } else if (net->type == VIR_DOMAIN_NET_TYPE_VDPA) { + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_NETDEV_VHOST_VDPA)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("vDPA devices are not supported with this QEMU binary")); + return -1; + } + + if (net->model != VIR_DOMAIN_NET_MODEL_VIRTIO) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("invalid model for interface of type '%s': '%s'"), + virDomainNetTypeToString(net->type), + virDomainNetModelTypeToString(net->model)); + return -1; + } } else if (net->guestIP.nroutes || net->guestIP.nips) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Invalid attempt to set network interface " diff --git a/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args new file mode 100644 index 0000000000..3a9667de48 --- /dev/null +++ b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args @@ -0,0 +1,37 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/tmp/lib/domain--1-QEMUGuest1 \ +USER=test \ +LOGNAME=test \ +XDG_DATA_HOME=/tmp/lib/domain--1-QEMUGuest1/.local/share \ +XDG_CACHE_HOME=/tmp/lib/domain--1-QEMUGuest1/.cache \ +XDG_CONFIG_HOME=/tmp/lib/domain--1-QEMUGuest1/.config \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-i386 \ +-name guest=QEMUGuest1,debug-threads=on \ +-S \ +-object 
secret,id=masterKey0,format=raw,\ +file=/tmp/lib/domain--1-QEMUGuest1/master-key.aes \ +-machine pc,accel=tcg,usb=off,dump-guest-core=off \ +-cpu qemu64 \ +-m 214 \ +-overcommit mem-lock=off \ +-smp 1,sockets=1,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,fd=1729,server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-no-acpi \ +-boot strict=on \ +-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \ +-add-fd set=0,fd=1732,opaque=/dev/vhost-vdpa-0 \ +-netdev vhost-vdpa,vhostdev=/dev/fdset/0,id=hostnet0 \ +-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:95:db:c0,bus=pci.0,\ +addr=0x2 \ +-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,\ +resourcecontrol=deny \ +-msg timestamp=on diff --git a/tests/qemuxml2argvdata/net-vdpa.xml b/tests/qemuxml2argvdata/net-vdpa.xml new file mode 100644 index 0000000000..30cca7eb6e --- /dev/null +++ b/tests/qemuxml2argvdata/net-vdpa.xml @@ -0,0 +1,28 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i386</emulator> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <interface type='vdpa'> + <mac address='52:54:00:95:db:c0'/> + <source dev='/dev/vhost-vdpa-0'/> + </interface> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2argvmock.c b/tests/qemuxml2argvmock.c index 
9bf4357b66..ea2e74178f 100644 --- a/tests/qemuxml2argvmock.c +++ b/tests/qemuxml2argvmock.c @@ -208,7 +208,7 @@ virHostGetDRMRenderNode(void) static void (*real_virCommandPassFD)(virCommandPtr cmd, int fd, unsigned int flags); -static const int testCommandPassSafeFDs[] = { 1730, 1731 }; +static const int testCommandPassSafeFDs[] = { 1730, 1731, 1732 }; void virCommandPassFD(virCommandPtr cmd, @@ -286,3 +286,12 @@ qemuBuildTPMOpenBackendFDs(const char *tpmdev G_GNUC_UNUSED, *cancelfd = 1731; return 0; } + + +int +qemuInterfaceVDPAConnect(virDomainNetDefPtr net G_GNUC_UNUSED) +{ + if (fcntl(1732, F_GETFD) != -1) + abort(); + return 1732; +} diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 2b97eb80a4..2173c3479f 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1467,6 +1467,7 @@ mymain(void) QEMU_CAPS_DEVICE_VFIO_PCI); DO_TEST_FAILURE("net-hostdev-fail", QEMU_CAPS_DEVICE_VFIO_PCI); + DO_TEST_CAPS_LATEST("net-vdpa"); DO_TEST("hostdev-pci-multifunction", QEMU_CAPS_KVM, diff --git a/tests/qemuxml2xmloutdata/net-vdpa.xml b/tests/qemuxml2xmloutdata/net-vdpa.xml new file mode 100644 index 0000000000..b362405c14 --- /dev/null +++ b/tests/qemuxml2xmloutdata/net-vdpa.xml @@ -0,0 +1,34 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-i386</emulator> + <controller type='usb' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='ide' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + 
<controller type='pci' index='0' model='pci-root'/> + <interface type='vdpa'> + <mac address='52:54:00:95:db:c0'/> + <source dev='/dev/vhost-vdpa-0'/> + <model type='virtio'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </interface> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 2bf8dd5b14..1ce21f0519 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -497,6 +497,7 @@ mymain(void) DO_TEST("net-mtu", NONE); DO_TEST("net-coalesce", NONE); DO_TEST("net-many-models", NONE); + DO_TEST("net-vdpa", QEMU_CAPS_NETDEV_VHOST_VDPA); DO_TEST("serial-tcp-tlsx509-chardev", NONE); DO_TEST("serial-tcp-tlsx509-chardev-notls", NONE); -- 2.26.2

add-fd, remove-fd, and query-fdsets provide functionality that can be used for passing fds to qemu and closing fdsets that are no longer necessary. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/qemu/qemu_monitor.c | 93 +++++++++++++++++++ src/qemu/qemu_monitor.h | 41 +++++++++ src/qemu/qemu_monitor_json.c | 173 +++++++++++++++++++++++++++++++++++ src/qemu/qemu_monitor_json.h | 12 +++ 4 files changed, 319 insertions(+) diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index ab3bcc761e..b33f2eec0a 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -2651,6 +2651,99 @@ qemuMonitorGraphicsRelocate(qemuMonitorPtr mon, } +/** + * qemuMonitorAddFileHandleToSet: + * @mon: monitor object + * @fd: file descriptor to pass to qemu + * @fdset: the fdset to register this fd with, -1 to create a new fdset + * @opaque: opaque data to associated with this fd + * @info: structure that will be updated with the fd and fdset returned by qemu + * + * Attempts to register a file descriptor with qemu that can then be referenced + * via the file path /dev/fdset/$FDSETID + * Returns 0 if ok, and -1 on failure */ +int +qemuMonitorAddFileHandleToSet(qemuMonitorPtr mon, + int fd, + int fdset, + const char *opaque, + qemuMonitorAddFdInfoPtr info) +{ + VIR_DEBUG("fd=%d,fdset=%i,opaque=%s", fd, fdset, opaque); + + QEMU_CHECK_MONITOR(mon); + + if (fd < 0) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("fd must be valid")); + return -1; + } + + return qemuMonitorJSONAddFileHandleToSet(mon, fd, fdset, opaque, info); +} + + +/** + * qemuMonitorRemoveFdset: + * @mon: monitor object + * @fdset: the fdset to remove + * + * Attempts to remove a fdset from qemu and close associated file descriptors + * Returns 0 if ok, and -1 on failure */ +int +qemuMonitorRemoveFdset(qemuMonitorPtr mon, + int fdset) +{ + VIR_DEBUG("fdset=%d", fdset); + + QEMU_CHECK_MONITOR(mon); + + if (fdset < 0) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("fdset must be valid")); 
+ return -1; + } + + return qemuMonitorJSONRemoveFdset(mon, fdset); +} + + +void qemuMonitorFdsetsFree(qemuMonitorFdsetsPtr fdsets) +{ + size_t i; + + for (i = 0; i < fdsets->nfdsets; i++) { + size_t j; + qemuMonitorFdsetInfoPtr set = &fdsets->fdsets[i]; + + for (j = 0; j < set->nfds; j++) + g_free(set->fds[j].opaque); + } + g_free(fdsets->fdsets); + g_free(fdsets); +} + + +/** + * qemuMonitorQueryFdsets: + * @mon: monitor object + * @fdsets: a pointer that is filled with a new qemuMonitorFdsets struct + * + * Queries qemu for the fdsets that are registered with that instance, and + * returns a structure describing those fdsets. The returned struct should be + * freed with qemuMonitorFdsetsFree(); + * + * Returns 0 if ok, and -1 on failure */ +int +qemuMonitorQueryFdsets(qemuMonitorPtr mon, + qemuMonitorFdsetsPtr *fdsets) +{ + QEMU_CHECK_MONITOR(mon); + + return qemuMonitorJSONQueryFdsets(mon, fdsets); +} + + int qemuMonitorSendFileHandle(qemuMonitorPtr mon, const char *fdname, diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index d3f7797085..ac97aedf8a 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -880,6 +880,47 @@ int qemuMonitorGraphicsRelocate(qemuMonitorPtr mon, int tlsPort, const char *tlsSubject); +typedef struct _qemuMonitorAddFdInfo qemuMonitorAddFdInfo; +typedef qemuMonitorAddFdInfo *qemuMonitorAddFdInfoPtr; +struct _qemuMonitorAddFdInfo { + int fd; + int fdset; +}; +int +qemuMonitorAddFileHandleToSet(qemuMonitorPtr mon, + int fd, + int fdset, + const char *opaque, + qemuMonitorAddFdInfoPtr info); + +int +qemuMonitorRemoveFdset(qemuMonitorPtr mon, + int fdset); + +typedef struct _qemuMonitorFdsetFdInfo qemuMonitorFdsetFdInfo; +typedef qemuMonitorFdsetFdInfo *qemuMonitorFdsetFdInfoPtr; +struct _qemuMonitorFdsetFdInfo { + int fd; + char *opaque; +}; +typedef struct _qemuMonitorFdsetInfo qemuMonitorFdsetInfo; +typedef qemuMonitorFdsetInfo *qemuMonitorFdsetInfoPtr; +struct _qemuMonitorFdsetInfo { + int id; + 
qemuMonitorFdsetFdInfoPtr fds; + int nfds; +}; +typedef struct _qemuMonitorFdsets qemuMonitorFdsets; +typedef qemuMonitorFdsets *qemuMonitorFdsetsPtr; +struct _qemuMonitorFdsets { + qemuMonitorFdsetInfoPtr fdsets; + int nfdsets; +}; +void qemuMonitorFdsetsFree(qemuMonitorFdsetsPtr fdsets); +G_DEFINE_AUTOPTR_CLEANUP_FUNC(qemuMonitorFdsets, qemuMonitorFdsetsFree); +int qemuMonitorQueryFdsets(qemuMonitorPtr mon, + qemuMonitorFdsetsPtr *fdsets); + int qemuMonitorSendFileHandle(qemuMonitorPtr mon, const char *fdname, int fd); diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c index e6d2e7d4db..caa281dffa 100644 --- a/src/qemu/qemu_monitor_json.c +++ b/src/qemu/qemu_monitor_json.c @@ -3936,6 +3936,179 @@ int qemuMonitorJSONGraphicsRelocate(qemuMonitorPtr mon, } +static int +qemuAddfdInfoParse(virJSONValuePtr msg, + qemuMonitorAddFdInfoPtr fdinfo) +{ + virJSONValuePtr returnObj; + + if (!(returnObj = virJSONValueObjectGetObject(msg, "return"))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing or invalid return data in add-fd response")); + return -1; + } + + if (virJSONValueObjectGetNumberInt(returnObj, "fd", &fdinfo->fd) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing or invalid fd in add-fd response")); + return -1; + } + + if (virJSONValueObjectGetNumberInt(returnObj, "fdset-id", &fdinfo->fdset) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing or invalid fdset-id in add-fd response")); + return -1; + } + + return 0; +} + + +/* if fdset is negative, qemu will create a new fdset and add the fd to that */ +int qemuMonitorJSONAddFileHandleToSet(qemuMonitorPtr mon, + int fd, + int fdset, + const char *opaque, + qemuMonitorAddFdInfoPtr fdinfo) +{ + virJSONValuePtr args = NULL; + g_autoptr(virJSONValue) reply = NULL; + g_autoptr(virJSONValue) cmd = NULL; + + if (virJSONValueObjectCreate(&args, "S:opaque", opaque, NULL) < 0) + return -1; + + if (fdset >= 0) + if (virJSONValueObjectAdd(args, "j:fdset-id", 
fdset, NULL) < 0) + return -1; + + if (!(cmd = qemuMonitorJSONMakeCommandInternal("add-fd", args))) + return -1; + + if (qemuMonitorJSONCommandWithFd(mon, cmd, fd, &reply) < 0) + return -1; + + if (qemuMonitorJSONCheckError(cmd, reply) < 0) + return -1; + + if (qemuAddfdInfoParse(reply, fdinfo) < 0) + return -1; + + return 0; +} + +static int +qemuMonitorJSONQueryFdsetsParse(virJSONValuePtr msg, + qemuMonitorFdsetsPtr *fdsets) +{ + virJSONValuePtr returnArray, entry; + size_t i; + g_autoptr(qemuMonitorFdsets) sets = g_new0(qemuMonitorFdsets, 1); + int ninfo; + + returnArray = virJSONValueObjectGetArray(msg, "return"); + + ninfo = virJSONValueArraySize(returnArray); + if (ninfo > 0) + sets->fdsets = g_new0(qemuMonitorFdsetInfo, ninfo); + sets->nfdsets = ninfo; + + for (i = 0; i < ninfo; i++) { + size_t j; + const char *tmp; + virJSONValuePtr fdarray; + qemuMonitorFdsetInfoPtr fdsetinfo = &sets->fdsets[i]; + + if (!(entry = virJSONValueArrayGet(returnArray, i))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("query-fdsets return data missing fdset array element")); + return -1; + } + + if (virJSONValueObjectGetNumberInt(entry, "fdset-id", &fdsetinfo->id) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("query-fdsets reply was missing 'fdset-id'")); + return -1; + + } + + fdarray = virJSONValueObjectGetArray(entry, "fds"); + fdsetinfo->nfds = virJSONValueArraySize(fdarray); + if (fdsetinfo->nfds > 0) + fdsetinfo->fds = g_new0(qemuMonitorFdsetFdInfo, fdsetinfo->nfds); + + for (j = 0; j < fdsetinfo->nfds; j++) { + qemuMonitorFdsetFdInfoPtr fdinfo = &fdsetinfo->fds[j]; + virJSONValuePtr fdentry; + + if (!(fdentry = virJSONValueArrayGet(fdarray, j))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("query-fdsets return data missing fd array element")); + return -1; + } + + if (virJSONValueObjectGetNumberInt(fdentry, "fd", &fdinfo->fd) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("query-fdsets return data missing 'fd'")); + return -1; + } + 
+ /* opaque is optional and may be missing */ + tmp = virJSONValueObjectGetString(fdentry, "opaque"); + if (tmp) + fdinfo->opaque = g_strdup(tmp); + } + } + + *fdsets = g_steal_pointer(&sets); + return 0; +} + + +int qemuMonitorJSONQueryFdsets(qemuMonitorPtr mon, + qemuMonitorFdsetsPtr *fdsets) +{ + g_autoptr(virJSONValue) reply = NULL; + g_autoptr(virJSONValue) cmd = qemuMonitorJSONMakeCommand("query-fdsets", + NULL); + + if (!cmd) + return -1; + + if (qemuMonitorJSONCommand(mon, cmd, &reply) < 0) + return -1; + + if (qemuMonitorJSONCheckError(cmd, reply) < 0) + return -1; + + if (qemuMonitorJSONQueryFdsetsParse(reply, fdsets) < 0) + return -1; + + return 0; +} + + +int qemuMonitorJSONRemoveFdset(qemuMonitorPtr mon, + int fdset) +{ + g_autoptr(virJSONValue) reply = NULL; + g_autoptr(virJSONValue) cmd = qemuMonitorJSONMakeCommand("remove-fd", + "i:fdset-id", fdset, + NULL); + + if (!cmd) + return -1; + + if (qemuMonitorJSONCommand(mon, cmd, &reply) < 0) + return -1; + + if (qemuMonitorJSONCheckError(cmd, reply) < 0) + return -1; + + return 0; +} + + int qemuMonitorJSONSendFileHandle(qemuMonitorPtr mon, const char *fdname, int fd) diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h index 098ab857be..2b9a42efe0 100644 --- a/src/qemu/qemu_monitor_json.h +++ b/src/qemu/qemu_monitor_json.h @@ -202,6 +202,18 @@ int qemuMonitorJSONAddPCINetwork(qemuMonitorPtr mon, int qemuMonitorJSONRemovePCIDevice(qemuMonitorPtr mon, virPCIDeviceAddress *guestAddr); +int qemuMonitorJSONAddFileHandleToSet(qemuMonitorPtr mon, + int fd, + int fdset, + const char *opaque, + qemuMonitorAddFdInfoPtr info); + +int qemuMonitorJSONRemoveFdset(qemuMonitorPtr mon, + int fdset); + +int qemuMonitorJSONQueryFdsets(qemuMonitorPtr mon, + qemuMonitorFdsetsPtr *fdsets); + int qemuMonitorJSONSendFileHandle(qemuMonitorPtr mon, const char *fdname, int fd); -- 2.26.2
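For readers following the series, here is a minimal sketch of how a caller might use a query-fdsets result to find the fdset tagged with a given opaque string. It is not part of the patch: `FdInfo`, `FdsetInfo` and `find_fdset_by_opaque()` are hypothetical, simplified stand-ins for the qemuMonitorFdsets structures added above, and the only behaviour assumed from the patch is that `opaque` is optional and may be NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the structs added to
 * qemu_monitor.h in this patch; not the real libvirt types. */
typedef struct {
    int fd;
    const char *opaque;   /* may be NULL: opaque is optional in the reply */
} FdInfo;

typedef struct {
    int id;
    const FdInfo *fds;
    size_t nfds;
} FdsetInfo;

/* Return the id of the first fdset holding an fd tagged with @opaque,
 * or -1 if no fdset matches. */
static int
find_fdset_by_opaque(const FdsetInfo *sets, size_t nsets, const char *opaque)
{
    size_t i, j;

    assert(opaque != NULL);

    for (i = 0; i < nsets; i++) {
        for (j = 0; j < sets[i].nfds; j++) {
            const char *tag = sets[i].fds[j].opaque;

            /* skip fds that were registered without an opaque string */
            if (tag && strcmp(tag, opaque) == 0)
                return sets[i].id;
        }
    }
    return -1;
}
```

Tagging the fd with the device path at add-fd time is what makes this lookup possible later, since the fdset id itself is not stored anywhere in this sketch.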

By using the new qemu monitor functions to handle passing and removing file descriptors, we can support hotplug of vdpa devices. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/qemu/qemu_hotplug.c | 60 +++++++++++++++++-- tests/qemuhotplugmock.c | 9 +++ tests/qemuhotplugtest.c | 16 +++++ .../qemuhotplug-interface-vdpa.xml | 4 ++ .../qemuhotplug-base-live+interface-vdpa.xml | 57 ++++++++++++++++++ 5 files changed, 142 insertions(+), 4 deletions(-) create mode 100644 tests/qemuhotplugtestdevices/qemuhotplug-interface-vdpa.xml create mode 100644 tests/qemuhotplugtestdomains/qemuhotplug-base-live+interface-vdpa.xml diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 0582b78f97..3a2aff607c 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -1152,6 +1152,8 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, virErrorPtr originalError = NULL; g_autofree char *slirpfdName = NULL; int slirpfd = -1; + g_autofree char *vdpafdName = NULL; + int vdpafd = -1; char **tapfdName = NULL; int *tapfd = NULL; size_t tapfdSize = 0; @@ -1335,12 +1337,16 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, /* hostdev interfaces were handled earlier in this function */ break; + case VIR_DOMAIN_NET_TYPE_VDPA: + if ((vdpafd = qemuInterfaceVDPAConnect(net)) < 0) + goto cleanup; + break; + case VIR_DOMAIN_NET_TYPE_SERVER: case VIR_DOMAIN_NET_TYPE_CLIENT: case VIR_DOMAIN_NET_TYPE_MCAST: case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_UDP: - case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: virReportError(VIR_ERR_OPERATION_UNSUPPORTED, _("hotplug of interface type of %s is not implemented yet"), @@ -1386,14 +1392,28 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, for (i = 0; i < vhostfdSize; i++) vhostfdName[i] = g_strdup_printf("vhostfd-%s%zu", net->info.alias, i); + qemuDomainObjEnterMonitor(driver, vm); + + if (vdpafd > 0) { + /* vhost-vdpa only takes a filename for the dev, but we want to pass an + * open fd to 
qemu. Passing -1 as the fdset-id will create a new fdset + * and return the id of that fdset */ + qemuMonitorAddFdInfo fdinfo; + if (qemuMonitorAddFileHandleToSet(priv->mon, vdpafd, -1, + net->data.vdpa.devicepath, + &fdinfo) < 0) { + ignore_value(qemuDomainObjExitMonitor(driver, vm)); + goto cleanup; + } + vdpafdName = g_strdup_printf("/dev/fdset/%d", fdinfo.fdset); + } + if (!(netprops = qemuBuildHostNetStr(net, tapfdName, tapfdSize, vhostfdName, vhostfdSize, - slirpfdName, NULL))) + slirpfdName, vdpafdName))) goto cleanup; - qemuDomainObjEnterMonitor(driver, vm); - if (actualType == VIR_DOMAIN_NET_TYPE_VHOSTUSER) { if (qemuMonitorAttachCharDev(priv->mon, charDevAlias, net->data.vhostuser) < 0) { ignore_value(qemuDomainObjExitMonitor(driver, vm)); @@ -1518,6 +1538,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, VIR_FREE(vhostfdName); virDomainCCWAddressSetFree(ccwaddrs); VIR_FORCE_CLOSE(slirpfd); + VIR_FORCE_CLOSE(vdpafd); return ret; @@ -4586,8 +4607,39 @@ qemuDomainRemoveNetDevice(virQEMUDriverPtr driver, * to just ignore the error and carry on. */ } + } else if (actualType == VIR_DOMAIN_NET_TYPE_VDPA) { + int vdpafdset = -1; + g_autoptr(qemuMonitorFdsets) fdsets = NULL; + + /* query qemu for which fdset is associated with the fd that we passed + * to qemu via 'add-fd' for this vdpa device. If we don't remove the + * fd, qemu will keep it open */ + if (qemuMonitorQueryFdsets(priv->mon, &fdsets) == 0) { + for (i = 0; i < fdsets->nfdsets && vdpafdset < 0; i++) { + size_t j; + qemuMonitorFdsetInfoPtr set = &fdsets->fdsets[i]; + + for (j = 0; j < set->nfds; j++) { + qemuMonitorFdsetFdInfoPtr fdinfo = &set->fds[j]; + if (STREQ_NULLABLE(fdinfo->opaque, net->data.vdpa.devicepath)) { + vdpafdset = set->id; + break; + } + } + } + } + + if (vdpafdset < 0) { + VIR_WARN("Cannot determine fdset for vdpa device"); + } else { + if (qemuMonitorRemoveFdset(priv->mon, vdpafdset) < 0) { + /* if it fails, there's not much we can do... 
just carry on */ + VIR_WARN("failed to close vdpa device"); + } + } } + if (qemuDomainObjExitMonitor(driver, vm) < 0) return -1; diff --git a/tests/qemuhotplugmock.c b/tests/qemuhotplugmock.c index 29fac8a598..d2e32ecf7e 100644 --- a/tests/qemuhotplugmock.c +++ b/tests/qemuhotplugmock.c @@ -19,11 +19,13 @@ #include <config.h> #include "qemu/qemu_hotplug.h" +#include "qemu/qemu_interface.h" #include "qemu/qemu_process.h" #include "conf/domain_conf.h" #include "virdevmapper.h" #include "virutil.h" #include "virmock.h" +#include <fcntl.h> static int (*real_virGetDeviceID)(const char *path, int *maj, int *min); static bool (*real_virFileExists)(const char *path); @@ -106,3 +108,10 @@ void qemuProcessKillManagedPRDaemon(virDomainObjPtr vm G_GNUC_UNUSED) { } + +int +qemuInterfaceVDPAConnect(virDomainNetDefPtr net G_GNUC_UNUSED) +{ + /* need a valid fd or sendmsg won't work. Just open /dev/null */ + return open("/dev/null", O_RDONLY); +} diff --git a/tests/qemuhotplugtest.c b/tests/qemuhotplugtest.c index 2d12cacf28..b7cebfc0e7 100644 --- a/tests/qemuhotplugtest.c +++ b/tests/qemuhotplugtest.c @@ -89,6 +89,7 @@ qemuHotplugCreateObjects(virDomainXMLOptionPtr xmlopt, virQEMUCapsSet(priv->qemuCaps, QEMU_CAPS_SPICE_FILE_XFER_DISABLE); virQEMUCapsSet(priv->qemuCaps, QEMU_CAPS_PR_MANAGER_HELPER); virQEMUCapsSet(priv->qemuCaps, QEMU_CAPS_SCSI_BLOCK); + virQEMUCapsSet(priv->qemuCaps, QEMU_CAPS_NETDEV_VHOST_VDPA); if (qemuTestCapsCacheInsert(driver.qemuCapsCache, priv->qemuCaps) < 0) return -1; @@ -140,6 +141,9 @@ testQemuHotplugAttach(virDomainObjPtr vm, case VIR_DOMAIN_DEVICE_HOSTDEV: ret = qemuDomainAttachHostDevice(&driver, vm, dev->data.hostdev); break; + case VIR_DOMAIN_DEVICE_NET: + ret = qemuDomainAttachNetDevice(&driver, vm, dev->data.net); + break; default: VIR_TEST_VERBOSE("device type '%s' cannot be attached", virDomainDeviceTypeToString(dev->type)); @@ -162,6 +166,7 @@ testQemuHotplugDetach(virDomainObjPtr vm, case VIR_DOMAIN_DEVICE_SHMEM: case 
VIR_DOMAIN_DEVICE_WATCHDOG: case VIR_DOMAIN_DEVICE_HOSTDEV: + case VIR_DOMAIN_DEVICE_NET: ret = qemuDomainDetachDeviceLive(vm, dev, &driver, async); break; default: @@ -823,6 +828,17 @@ mymain(void) DO_TEST_DETACH("pseries-base-live", "hostdev-pci", false, false, "device_del", QMP_DEVICE_DELETED("hostdev0") QMP_OK); + DO_TEST_ATTACH("base-live", "interface-vdpa", false, true, + "add-fd", "{ \"return\": { \"fdset-id\": 1, \"fd\": 95 }}", + "netdev_add", QMP_OK, "device_add", QMP_OK); + DO_TEST_DETACH("base-live", "interface-vdpa", false, false, + "device_del", QMP_DEVICE_DELETED("net0") QMP_OK, + "netdev_del", QMP_OK, + "query-fdsets", + "{ \"return\": [{\"fds\": [{\"fd\": 95, \"opaque\": \"/dev/vhost-vdpa-0\"}], \"fdset-id\": 1}]}", + "remove-fd", QMP_OK + ); + DO_TEST_ATTACH("base-live", "watchdog", false, true, "watchdog-set-action", QMP_OK, "device_add", QMP_OK); diff --git a/tests/qemuhotplugtestdevices/qemuhotplug-interface-vdpa.xml b/tests/qemuhotplugtestdevices/qemuhotplug-interface-vdpa.xml new file mode 100644 index 0000000000..e42ca08d31 --- /dev/null +++ b/tests/qemuhotplugtestdevices/qemuhotplug-interface-vdpa.xml @@ -0,0 +1,4 @@ +<interface type='vdpa'> + <mac address='52:54:00:39:5f:04'/> + <source dev='/dev/vhost-vdpa-0'/> +</interface> diff --git a/tests/qemuhotplugtestdomains/qemuhotplug-base-live+interface-vdpa.xml b/tests/qemuhotplugtestdomains/qemuhotplug-base-live+interface-vdpa.xml new file mode 100644 index 0000000000..066180bb3c --- /dev/null +++ b/tests/qemuhotplugtestdomains/qemuhotplug-base-live+interface-vdpa.xml @@ -0,0 +1,57 @@ +<domain type='kvm' id='7'> + <name>hotplug</name> + <uuid>d091ea82-29e6-2e34-3005-f02617b36e87</uuid> + <memory unit='KiB'>4194304</memory> + <currentMemory unit='KiB'>4194304</currentMemory> + <vcpu placement='static'>4</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <features> + <acpi/> + <apic/> + <pae/> + </features> + <clock offset='utc'/> + 
<on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>restart</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-x86_64</emulator> + <controller type='usb' index='0'> + <alias name='usb'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='ide' index='0'> + <alias name='ide'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <controller type='scsi' index='0' model='virtio-scsi'> + <alias name='scsi0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> + </controller> + <controller type='pci' index='0' model='pci-root'> + <alias name='pci'/> + </controller> + <controller type='virtio-serial' index='0'> + <alias name='virtio-serial0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </controller> + <interface type='vdpa'> + <mac address='52:54:00:39:5f:04'/> + <source dev='/dev/vhost-vdpa-0'/> + <model type='virtio'/> + <alias name='net0'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> + </interface> + <input type='mouse' bus='ps2'> + <alias name='input0'/> + </input> + <input type='keyboard' bus='ps2'> + <alias name='input1'/> + </input> + <memballoon model='none'/> + </devices> + <seclabel type='none' model='none'/> +</domain> -- 2.26.2

On 9/24/20 5:45 PM, Jonathon Jongsma wrote:
By using the new qemu monitor functions to handle passing and removing file descriptors, we can support hotplug of vdpa devices.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> --- src/qemu/qemu_hotplug.c | 60 +++++++++++++++++-- tests/qemuhotplugmock.c | 9 +++ tests/qemuhotplugtest.c | 16 +++++ .../qemuhotplug-interface-vdpa.xml | 4 ++ .../qemuhotplug-base-live+interface-vdpa.xml | 57 ++++++++++++++++++ 5 files changed, 142 insertions(+), 4 deletions(-) create mode 100644 tests/qemuhotplugtestdevices/qemuhotplug-interface-vdpa.xml create mode 100644 tests/qemuhotplugtestdomains/qemuhotplug-base-live+interface-vdpa.xml
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 0582b78f97..3a2aff607c 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -1152,6 +1152,8 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, virErrorPtr originalError = NULL; g_autofree char *slirpfdName = NULL; int slirpfd = -1; + g_autofree char *vdpafdName = NULL; + int vdpafd = -1; char **tapfdName = NULL; int *tapfd = NULL; size_t tapfdSize = 0; @@ -1335,12 +1337,16 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, /* hostdev interfaces were handled earlier in this function */ break;
+ case VIR_DOMAIN_NET_TYPE_VDPA: + if ((vdpafd = qemuInterfaceVDPAConnect(net)) < 0) + goto cleanup; + break; + case VIR_DOMAIN_NET_TYPE_SERVER: case VIR_DOMAIN_NET_TYPE_CLIENT: case VIR_DOMAIN_NET_TYPE_MCAST: case VIR_DOMAIN_NET_TYPE_INTERNAL: case VIR_DOMAIN_NET_TYPE_UDP: - case VIR_DOMAIN_NET_TYPE_VDPA: case VIR_DOMAIN_NET_TYPE_LAST: virReportError(VIR_ERR_OPERATION_UNSUPPORTED, _("hotplug of interface type of %s is not implemented yet"), @@ -1386,14 +1392,28 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, for (i = 0; i < vhostfdSize; i++) vhostfdName[i] = g_strdup_printf("vhostfd-%s%zu", net->info.alias, i);
+ qemuDomainObjEnterMonitor(driver, vm);
So this was moved up ahead of the call to qemuBuildHostNetStr()...
+ + if (vdpafd > 0) { + /* vhost-vdpa only takes a filename for the dev, but we want to pass an + * open fd to qemu. Passing -1 as the fdset-id will create a new fdset + * and return the id of that fdset */ + qemuMonitorAddFdInfo fdinfo; + if (qemuMonitorAddFileHandleToSet(priv->mon, vdpafd, -1, + net->data.vdpa.devicepath, + &fdinfo) < 0) { + ignore_value(qemuDomainObjExitMonitor(driver, vm)); + goto cleanup;
... and here you ExitMonitor prior to goto cleanup on failure...
+ } + vdpafdName = g_strdup_printf("/dev/fdset/%d", fdinfo.fdset); + } + if (!(netprops = qemuBuildHostNetStr(net, tapfdName, tapfdSize, vhostfdName, vhostfdSize, - slirpfdName, NULL))) + slirpfdName, vdpafdName))) goto cleanup;
...but here you don't (and should). (NB: this change does put qemuBuildHostNetStr() inside the monitor section, but it just does a bit of string shuffling/creation, so that's not a big deal.)
- qemuDomainObjEnterMonitor(driver, vm); -
(^^ old location of qemuDomainObjEnterMonitor())
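The pairing rule behind the comments above can be sketched in isolation. This is only an illustration, not the libvirt code: `enter_monitor()`, `exit_monitor()`, `build_netdev_props()` and `attach_netdev()` are hypothetical stand-ins for qemuDomainObjEnterMonitor(), qemuDomainObjExitMonitor(), qemuBuildHostNetStr() and the attach path, and a simple counter stands in for the monitor state. The point is that once the monitor is entered earlier in the function, every error path after that point has to exit it before bailing out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for qemuDomainObjEnterMonitor() and
 * qemuDomainObjExitMonitor(); a counter models the monitor state. */
static int monitor_depth;

static void enter_monitor(void)
{
    monitor_depth++;
}

static void exit_monitor(void)
{
    assert(monitor_depth > 0);
    monitor_depth--;
}

/* Stand-in for qemuBuildHostNetStr(): returns NULL on failure. */
static const char *
build_netdev_props(bool fail)
{
    return fail ? NULL : "vhost-vdpa,vhostdev=/dev/fdset/0,id=hostnet0";
}

/* Returns 0 on success and -1 on failure. Either way the monitor has
 * been exited by the time the function returns. */
static int
attach_netdev(bool build_fails)
{
    const char *props;

    enter_monitor();

    if (!(props = build_netdev_props(build_fails))) {
        /* the exit that the review asks for on this error path */
        exit_monitor();
        return -1;
    }

    /* ... netdev_add / device_add would be issued here ... */

    exit_monitor();
    return 0;
}
```

An alternative would be to build the properties string before entering the monitor, but in the patch the string depends on the fdset id returned by add-fd, so the exit has to be added to the error path instead.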
if (actualType == VIR_DOMAIN_NET_TYPE_VHOSTUSER) { if (qemuMonitorAttachCharDev(priv->mon, charDevAlias, net->data.vhostuser) < 0) { ignore_value(qemuDomainObjExitMonitor(driver, vm)); @@ -1518,6 +1538,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver, VIR_FREE(vhostfdName); virDomainCCWAddressSetFree(ccwaddrs); VIR_FORCE_CLOSE(slirpfd); + VIR_FORCE_CLOSE(vdpafd);
return ret;
@@ -4586,8 +4607,39 @@ qemuDomainRemoveNetDevice(virQEMUDriverPtr driver, * to just ignore the error and carry on. */ } + } else if (actualType == VIR_DOMAIN_NET_TYPE_VDPA) { + int vdpafdset = -1; + g_autoptr(qemuMonitorFdsets) fdsets = NULL; + + /* query qemu for which fdset is associated with the fd that we passed + * to qemu via 'add-fd' for this vdpa device. If we don't remove the + * fd, qemu will keep it open */ + if (qemuMonitorQueryFdsets(priv->mon, &fdsets) == 0) { + for (i = 0; i < fdsets->nfdsets && vdpafdset < 0; i++) { + size_t j; + qemuMonitorFdsetInfoPtr set = &fdsets->fdsets[i]; + + for (j = 0; j < set->nfds; j++) { + qemuMonitorFdsetFdInfoPtr fdinfo = &set->fds[j]; + if (STREQ_NULLABLE(fdinfo->opaque, net->data.vdpa.devicepath)) { + vdpafdset = set->id; + break; + } + } + } + } + + if (vdpafdset < 0) { + VIR_WARN("Cannot determine fdset for vdpa device"); + } else { + if (qemuMonitorRemoveFdset(priv->mon, vdpafdset) < 0) { + /* if it fails, there's not much we can do... just carry on */ + VIR_WARN("failed to close vdpa device"); + } + }
I agree there's not much we can do here to make the situation better, but is it really going to be okay to inform the user that the device is now free, since it apparently isn't? If we go ahead and send the DEVICE_DELETED event up to the application, then it will think that the same vdpa device is now available to be re-used elsewhere. Do you have any idea what the odds are of that being true? (I don't, that's why I'm asking :-)). It may be safer to return failure, so the device is just stuck shown as in-use by this guest; that would be bad, but maybe not as bad as if it were still actually being used by this guest somehow (possible, since the fd couldn't be deleted), and a 2nd guest started using it too. (I really don't know what the consequences of any of this might be; just trying to inject my sunny disposition into the mix; truthfully I'd be willing to accept either way, just wanted to make sure it's considered).
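To make the two options concrete, here is a small sketch of the choice being discussed. It is an assumption-laden illustration rather than a proposed patch: `cleanup_policy` and `finish_detach()` are hypothetical names, and `remove_fd_result` stands in for the return value of qemuMonitorRemoveFdset().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical policy switch for the two behaviours under discussion. */
typedef enum {
    CLEANUP_WARN_ONLY,  /* what the patch does: warn and carry on */
    CLEANUP_FAIL        /* the alternative: report the detach as failed */
} cleanup_policy;

/* @remove_fd_result stands in for the result of qemuMonitorRemoveFdset().
 * Returns 0 if the device should be reported as removed, -1 otherwise. */
static int
finish_detach(int remove_fd_result, cleanup_policy policy)
{
    assert(policy == CLEANUP_WARN_ONLY || policy == CLEANUP_FAIL);

    if (remove_fd_result < 0) {
        fprintf(stderr, "warning: failed to close vdpa device\n");
        if (policy == CLEANUP_FAIL)
            return -1;  /* device stays shown as in use by this guest */
    }

    return 0;  /* device is reported as removed */
}
```

With CLEANUP_WARN_ONLY the application is told the device is gone even though qemu may still hold the fd; with CLEANUP_FAIL the device remains listed as in use, which is the trade-off described above.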
}
+ if (qemuDomainObjExitMonitor(driver, vm) < 0) return -1;
diff --git a/tests/qemuhotplugmock.c b/tests/qemuhotplugmock.c index 29fac8a598..d2e32ecf7e 100644 --- a/tests/qemuhotplugmock.c +++ b/tests/qemuhotplugmock.c @@ -19,11 +19,13 @@ #include <config.h>
#include "qemu/qemu_hotplug.h" +#include "qemu/qemu_interface.h" #include "qemu/qemu_process.h" #include "conf/domain_conf.h" #include "virdevmapper.h" #include "virutil.h" #include "virmock.h" +#include <fcntl.h>
static int (*real_virGetDeviceID)(const char *path, int *maj, int *min); static bool (*real_virFileExists)(const char *path); @@ -106,3 +108,10 @@ void qemuProcessKillManagedPRDaemon(virDomainObjPtr vm G_GNUC_UNUSED) { } + +int +qemuInterfaceVDPAConnect(virDomainNetDefPtr net G_GNUC_UNUSED) +{ + /* need a valid fd or sendmsg won't work. Just open /dev/null */ + return open("/dev/null", O_RDONLY); +} diff --git a/tests/qemuhotplugtest.c b/tests/qemuhotplugtest.c index 2d12cacf28..b7cebfc0e7 100644 --- a/tests/qemuhotplugtest.c +++ b/tests/qemuhotplugtest.c @@ -89,6 +89,7 @@ qemuHotplugCreateObjects(virDomainXMLOptionPtr xmlopt, virQEMUCapsSet(priv->qemuCaps, QEMU_CAPS_SPICE_FILE_XFER_DISABLE); virQEMUCapsSet(priv->qemuCaps, QEMU_CAPS_PR_MANAGER_HELPER); virQEMUCapsSet(priv->qemuCaps, QEMU_CAPS_SCSI_BLOCK); + virQEMUCapsSet(priv->qemuCaps, QEMU_CAPS_NETDEV_VHOST_VDPA);
if (qemuTestCapsCacheInsert(driver.qemuCapsCache, priv->qemuCaps) < 0) return -1; @@ -140,6 +141,9 @@ testQemuHotplugAttach(virDomainObjPtr vm, case VIR_DOMAIN_DEVICE_HOSTDEV: ret = qemuDomainAttachHostDevice(&driver, vm, dev->data.hostdev); break; + case VIR_DOMAIN_DEVICE_NET: + ret = qemuDomainAttachNetDevice(&driver, vm, dev->data.net); + break;
Nice attention to detail - nobody before you has bothered with a hotplug test for a network device :-)
     default:
         VIR_TEST_VERBOSE("device type '%s' cannot be attached",
                          virDomainDeviceTypeToString(dev->type));
@@ -162,6 +166,7 @@ testQemuHotplugDetach(virDomainObjPtr vm,
     case VIR_DOMAIN_DEVICE_SHMEM:
     case VIR_DOMAIN_DEVICE_WATCHDOG:
     case VIR_DOMAIN_DEVICE_HOSTDEV:
+    case VIR_DOMAIN_DEVICE_NET:
         ret = qemuDomainDetachDeviceLive(vm, dev, &driver, async);
         break;
     default:
@@ -823,6 +828,17 @@ mymain(void)
     DO_TEST_DETACH("pseries-base-live", "hostdev-pci", false, false,
                    "device_del", QMP_DEVICE_DELETED("hostdev0") QMP_OK);

+    DO_TEST_ATTACH("base-live", "interface-vdpa", false, true,
+                   "add-fd", "{ \"return\": { \"fdset-id\": 1, \"fd\": 95 }}",
+                   "netdev_add", QMP_OK, "device_add", QMP_OK);
+    DO_TEST_DETACH("base-live", "interface-vdpa", false, false,
+                   "device_del", QMP_DEVICE_DELETED("net0") QMP_OK,
+                   "netdev_del", QMP_OK,
+                   "query-fdsets",
+                   "{ \"return\": [{\"fds\": [{\"fd\": 95, \"opaque\": \"/dev/vhost-vdpa-0\"}], \"fdset-id\": 1}]}",
+                   "remove-fd", QMP_OK
+                   );
+
     DO_TEST_ATTACH("base-live", "watchdog", false, true,
                    "watchdog-set-action", QMP_OK, "device_add", QMP_OK);
diff --git a/tests/qemuhotplugtestdevices/qemuhotplug-interface-vdpa.xml b/tests/qemuhotplugtestdevices/qemuhotplug-interface-vdpa.xml
new file mode 100644
index 0000000000..e42ca08d31
--- /dev/null
+++ b/tests/qemuhotplugtestdevices/qemuhotplug-interface-vdpa.xml
@@ -0,0 +1,4 @@
+<interface type='vdpa'>
+  <mac address='52:54:00:39:5f:04'/>
+  <source dev='/dev/vhost-vdpa-0'/>
+</interface>
diff --git a/tests/qemuhotplugtestdomains/qemuhotplug-base-live+interface-vdpa.xml b/tests/qemuhotplugtestdomains/qemuhotplug-base-live+interface-vdpa.xml
new file mode 100644
index 0000000000..066180bb3c
--- /dev/null
+++ b/tests/qemuhotplugtestdomains/qemuhotplug-base-live+interface-vdpa.xml
@@ -0,0 +1,57 @@
+<domain type='kvm' id='7'>
+  <name>hotplug</name>
+  <uuid>d091ea82-29e6-2e34-3005-f02617b36e87</uuid>
+  <memory unit='KiB'>4194304</memory>
+  <currentMemory unit='KiB'>4194304</currentMemory>
+  <vcpu placement='static'>4</vcpu>
+  <os>
+    <type arch='x86_64' machine='pc'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <features>
+    <acpi/>
+    <apic/>
+    <pae/>
+  </features>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>restart</on_crash>
+  <devices>
+    <emulator>/usr/bin/qemu-system-x86_64</emulator>
+    <controller type='usb' index='0'>
+      <alias name='usb'/>
+      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
+    </controller>
+    <controller type='ide' index='0'>
+      <alias name='ide'/>
+      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
+    </controller>
+    <controller type='scsi' index='0' model='virtio-scsi'>
+      <alias name='scsi0'/>
+      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
+    </controller>
+    <controller type='pci' index='0' model='pci-root'>
+      <alias name='pci'/>
+    </controller>
+    <controller type='virtio-serial' index='0'>
+      <alias name='virtio-serial0'/>
+      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
+    </controller>
+    <interface type='vdpa'>
+      <mac address='52:54:00:39:5f:04'/>
+      <source dev='/dev/vhost-vdpa-0'/>
+      <model type='virtio'/>
+      <alias name='net0'/>
+      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
+    </interface>
+    <input type='mouse' bus='ps2'>
+      <alias name='input0'/>
+    </input>
+    <input type='keyboard' bus='ps2'>
+      <alias name='input1'/>
+    </input>
+    <memballoon model='none'/>
+  </devices>
+  <seclabel type='none' model='none'/>
+</domain>
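The detach expectations in the test above suggest a lookup step: the "query-fdsets" reply is searched for the fd whose "opaque" string is the device path, and the owning "fdset-id" is what "remove-fd" then operates on. A minimal sketch of that lookup over already-parsed data might look like the following. The FdInfo and FdSet structs and find_fdset_for_path() are hypothetical names used only for illustration; they are not libvirt's types, and the real code works on the JSON reply.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one fd inside a parsed
 * "query-fdsets" entry. */
typedef struct {
    int fd;
    const char *opaque;   /* may be NULL if no opaque string was reported */
} FdInfo;

/* Hypothetical, simplified view of one fdset. */
typedef struct {
    int id;               /* the "fdset-id" value */
    const FdInfo *fds;
    size_t nfds;
} FdSet;

/* Return the id of the first fdset holding an fd tagged with `path`,
 * or -1 if no fdset matches. */
static int
find_fdset_for_path(const FdSet *sets, size_t nsets, const char *path)
{
    for (size_t i = 0; i < nsets; i++) {
        for (size_t j = 0; j < sets[i].nfds; j++) {
            const char *opaque = sets[i].fds[j].opaque;

            if (opaque && strcmp(opaque, path) == 0)
                return sets[i].id;
        }
    }
    return -1;
}
```

With the reply used in the test (fd 95, opaque "/dev/vhost-vdpa-0", fdset-id 1), a lookup for "/dev/vhost-vdpa-0" would return 1, and a negative result corresponds to the "cannot determine fdset" case.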
With the ExitMonitor() added where indicated, consideration of possibly failing if the fd can't be deleted, and (as with the rest of the series) as long as it's been possible to test with real hardware: Reviewed-by: Laine Stump <laine@redhat.com>

On Tue, 29 Sep 2020 15:53:39 -0400 Laine Stump <laine@redhat.com> wrote:
+ + if (vdpafdset < 0) { + VIR_WARN("Cannot determine fdset for vdpa device"); + } else { + if (qemuMonitorRemoveFdset(priv->mon, vdpafdset) < 0) { + /* if it fails, there's not much we can do... just carry on */ + VIR_WARN("failed to close vdpa device"); + } + }
I agree there's not much we can do here to make the situation better, but is it really going to be okay to inform the user that the device is now free, since it apparently isn't? If we go ahead and send the DEVICE_DELETED event up to the application, then it will think that the same vdpa device is now available to be re-used elsewhere. Do you have an idea what are the odds on that being true? (I don't, that's why I'm asking :-)).
I don't either ;)
It may be safer to return failure, so the device is just stuck shown as in-use by this guest; that would be bad, but maybe not as bad as if it was still actually being used by this guest somehow (possible, since the fd couldn't be deleted), and a 2nd guest started using it too. (I really don't know what the consequences of any of this might be; just trying to inject my sunny disposition into the mix; truthfully I'd be willing to accept either way, just wanted to make sure it's considered).
Well, that's a good point. The reason that I printed a warning rather than returning an error is because I was influenced by some of the nearby code.

In order to remove a network device, this function has to do a couple of things (depending on the type of network device). First, it removes the netdev (netdev_del), and then it may need to do some additional cleanup. For TYPE_VHOSTUSER, it needs to detach a chardev. For TYPE_VDPA, it needs to close the fd that we passed to qemu. So what do we do if 'netdev_del' succeeds, but 'remove-fd' does not?

If we return an error from this function, the caller will interpret that as if we failed to remove the network device. But qemu has already removed the netdev. So things are in an inconsistent state.

TYPE_VHOSTUSER just carries on without even printing a warning if the chardev can't be removed. So I did something similar here for vDPA, but added a warning. I'm not sure that there's really a "good" solution here.

Regarding the possibility of a second guest attempting to use the vdpa device that was unsuccessfully removed: I have only tested with the vdpa_sim kernel module, but if the fd is not closed, attempting to re-use it with a different guest fails like this:

error: Failed to attach device from vdpa.xml
error: Unable to open '/dev/vhost-vdpa-0' for vdpa device: Device or resource busy

Jonathon
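The ordering problem described in this message can be sketched in a few lines of C. This is only an illustration of the warn-and-continue policy under discussion, not libvirt code: detach_vdpa_netdev(), the step_fn callback type, and the step_ok/step_fail stubs are all hypothetical stand-ins for the real monitor calls.

```c
#include <stdio.h>

/* Hypothetical callback type standing in for one monitor command;
 * returns 0 on success and -1 on failure. */
typedef int (*step_fn)(void);

/* Stubs used only to exercise the sketch. */
static int step_ok(void)   { return 0; }
static int step_fail(void) { return -1; }

/* Returns -1 only if the netdev itself could not be removed. Once
 * netdev_del has succeeded the device is already gone from qemu's point
 * of view, so a failed fd cleanup is logged and the detach still
 * reports success. */
static int
detach_vdpa_netdev(step_fn netdev_del, step_fn remove_fd)
{
    if (netdev_del() < 0)
        return -1;

    if (remove_fd() < 0)
        fprintf(stderr, "warning: failed to close vdpa device\n");

    return 0;
}
```

Calling detach_vdpa_netdev(step_ok, step_fail) returns 0 and prints the warning, which is the case debated here; the alternative raised in review would return -1 at that point and leave the caller believing the netdev is still attached.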

On 10/1/20 5:08 PM, Jonathon Jongsma wrote:
Okay, based on that explanation, I think your solution is as good as, or better than, any other.

The current udev node device driver ignores all events related to vdpa devices. Since libvirt now supports vDPA network devices, include these devices in the device list.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
---
 include/libvirt/libvirt-nodedev.h  |  1 +
 src/conf/node_device_conf.c        |  5 +++++
 src/conf/node_device_conf.h        |  4 +++-
 src/conf/virnodedeviceobj.c        |  4 +++-
 src/node_device/node_device_udev.c | 16 ++++++++++++++++
 tools/virsh-nodedev.c              |  3 +++
 6 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/include/libvirt/libvirt-nodedev.h b/include/libvirt/libvirt-nodedev.h
index dd2ffd5782..b73b076f14 100644
--- a/include/libvirt/libvirt-nodedev.h
+++ b/include/libvirt/libvirt-nodedev.h
@@ -82,6 +82,7 @@ typedef enum {
     VIR_CONNECT_LIST_NODE_DEVICES_CAP_MDEV = 1 << 14, /* Mediated device */
     VIR_CONNECT_LIST_NODE_DEVICES_CAP_CCW_DEV = 1 << 15, /* CCW device */
     VIR_CONNECT_LIST_NODE_DEVICES_CAP_CSS_DEV = 1 << 16, /* CSS device */
+    VIR_CONNECT_LIST_NODE_DEVICES_CAP_VDPA = 1 << 17, /* vDPA device */
 } virConnectListAllNodeDeviceFlags;

 int virConnectListAllNodeDevices (virConnectPtr conn,
diff --git a/src/conf/node_device_conf.c b/src/conf/node_device_conf.c
index a9a03ad6c2..3eab1cda75 100644
--- a/src/conf/node_device_conf.c
+++ b/src/conf/node_device_conf.c
@@ -66,6 +66,7 @@ VIR_ENUM_IMPL(virNodeDevCap,
               "mdev",
               "ccw",
               "css",
+              "vdpa",
 );

 VIR_ENUM_IMPL(virNodeDevNetCap,
@@ -614,6 +615,7 @@ virNodeDeviceDefFormat(const virNodeDeviceDef *def)
         case VIR_NODE_DEV_CAP_MDEV_TYPES:
         case VIR_NODE_DEV_CAP_FC_HOST:
         case VIR_NODE_DEV_CAP_VPORTS:
+        case VIR_NODE_DEV_CAP_VDPA:
         case VIR_NODE_DEV_CAP_LAST:
             break;
         }
@@ -1913,6 +1915,7 @@ virNodeDevCapsDefParseXML(xmlXPathContextPtr ctxt,
     case VIR_NODE_DEV_CAP_FC_HOST:
     case VIR_NODE_DEV_CAP_VPORTS:
     case VIR_NODE_DEV_CAP_SCSI_GENERIC:
+    case VIR_NODE_DEV_CAP_VDPA:
     case VIR_NODE_DEV_CAP_LAST:
         virReportError(VIR_ERR_INTERNAL_ERROR,
                        _("unknown capability type '%d' for '%s'"),
@@ -2232,6 +2235,7 @@ virNodeDevCapsDefFree(virNodeDevCapsDefPtr caps)
     case VIR_NODE_DEV_CAP_VPORTS:
     case VIR_NODE_DEV_CAP_CCW_DEV:
     case VIR_NODE_DEV_CAP_CSS_DEV:
+    case VIR_NODE_DEV_CAP_VDPA:
     case VIR_NODE_DEV_CAP_LAST:
         /* This case is here to shutup the compiler */
         break;
@@ -2286,6 +2290,7 @@ virNodeDeviceUpdateCaps(virNodeDeviceDefPtr def)
         case VIR_NODE_DEV_CAP_MDEV:
         case VIR_NODE_DEV_CAP_CCW_DEV:
         case VIR_NODE_DEV_CAP_CSS_DEV:
+        case VIR_NODE_DEV_CAP_VDPA:
         case VIR_NODE_DEV_CAP_LAST:
             break;
         }
diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h
index 5484bc340f..4f8e47a068 100644
--- a/src/conf/node_device_conf.h
+++ b/src/conf/node_device_conf.h
@@ -65,6 +65,7 @@ typedef enum {
     VIR_NODE_DEV_CAP_MDEV, /* Mediated device */
     VIR_NODE_DEV_CAP_CCW_DEV, /* s390 CCW device */
     VIR_NODE_DEV_CAP_CSS_DEV, /* s390 channel subsystem device */
+    VIR_NODE_DEV_CAP_VDPA, /* vDPA device */

     VIR_NODE_DEV_CAP_LAST
 } virNodeDevCapType;
@@ -369,7 +370,8 @@ virNodeDevCapsDefFree(virNodeDevCapsDefPtr caps);
                  VIR_CONNECT_LIST_NODE_DEVICES_CAP_MDEV_TYPES | \
                  VIR_CONNECT_LIST_NODE_DEVICES_CAP_MDEV | \
                  VIR_CONNECT_LIST_NODE_DEVICES_CAP_CCW_DEV | \
-                 VIR_CONNECT_LIST_NODE_DEVICES_CAP_CSS_DEV)
+                 VIR_CONNECT_LIST_NODE_DEVICES_CAP_CSS_DEV | \
+                 VIR_CONNECT_LIST_NODE_DEVICES_CAP_VDPA)

 int
 virNodeDeviceGetSCSIHostCaps(virNodeDevCapSCSIHostPtr scsi_host);
diff --git a/src/conf/virnodedeviceobj.c b/src/conf/virnodedeviceobj.c
index 8aefd15e94..83c58ebe91 100644
--- a/src/conf/virnodedeviceobj.c
+++ b/src/conf/virnodedeviceobj.c
@@ -711,6 +711,7 @@ virNodeDeviceObjHasCap(const virNodeDeviceObj *obj,
         case VIR_NODE_DEV_CAP_MDEV:
         case VIR_NODE_DEV_CAP_CCW_DEV:
         case VIR_NODE_DEV_CAP_CSS_DEV:
+        case VIR_NODE_DEV_CAP_VDPA:
         case VIR_NODE_DEV_CAP_LAST:
             break;
         }
@@ -862,7 +863,8 @@ virNodeDeviceObjMatch(virNodeDeviceObjPtr obj,
               MATCH(MDEV_TYPES) ||
               MATCH(MDEV) ||
               MATCH(CCW_DEV) ||
-              MATCH(CSS_DEV)))
+              MATCH(CSS_DEV) ||
+              MATCH(VDPA)))
             return false;
     }

diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
index 12e3f30bad..fda72f9071 100644
--- a/src/node_device/node_device_udev.c
+++ b/src/node_device/node_device_udev.c
@@ -1144,6 +1144,18 @@ udevProcessCSS(struct udev_device *device,
     return 0;
 }

+
+static int
+udevProcessVDPA(struct udev_device *device,
+                virNodeDeviceDefPtr def)
+{
+    if (udevGenerateDeviceName(device, def, NULL) != 0)
+        return -1;
+
+    return 0;
+}
+
+
 static int
 udevGetDeviceNodes(struct udev_device *device,
                    virNodeDeviceDefPtr def)
@@ -1224,6 +1236,8 @@ udevGetDeviceType(struct udev_device *device,
             *type = VIR_NODE_DEV_CAP_CCW_DEV;
         else if (STREQ_NULLABLE(subsystem, "css"))
             *type = VIR_NODE_DEV_CAP_CSS_DEV;
+        else if (STREQ_NULLABLE(subsystem, "vdpa"))
+            *type = VIR_NODE_DEV_CAP_VDPA;

         VIR_FREE(subsystem);
     }
@@ -1270,6 +1284,8 @@ udevGetDeviceDetails(struct udev_device *device,
         return udevProcessCCW(device, def);
     case VIR_NODE_DEV_CAP_CSS_DEV:
         return udevProcessCSS(device, def);
+    case VIR_NODE_DEV_CAP_VDPA:
+        return udevProcessVDPA(device, def);
     case VIR_NODE_DEV_CAP_MDEV_TYPES:
     case VIR_NODE_DEV_CAP_SYSTEM:
     case VIR_NODE_DEV_CAP_FC_HOST:
diff --git a/tools/virsh-nodedev.c b/tools/virsh-nodedev.c
index 2edd403a64..19f0c17b4f 100644
--- a/tools/virsh-nodedev.c
+++ b/tools/virsh-nodedev.c
@@ -464,6 +464,9 @@ cmdNodeListDevices(vshControl *ctl, const vshCmd *cmd G_GNUC_UNUSED)
             case VIR_NODE_DEV_CAP_CSS_DEV:
                 flags |= VIR_CONNECT_LIST_NODE_DEVICES_CAP_CSS_DEV;
                 break;
+            case VIR_NODE_DEV_CAP_VDPA:
+                flags |= VIR_CONNECT_LIST_NODE_DEVICES_CAP_VDPA;
+                break;
             case VIR_NODE_DEV_CAP_LAST:
                 break;
             }
--
2.26.2
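The udevGetDeviceType() hunk in this patch extends a chain of subsystem-name comparisons with a "vdpa" case. As a standalone illustration of that mapping, the sketch below uses a small lookup table instead of the if/else chain. The NodeDevCap enum and cap_from_subsystem() are hypothetical names invented for this example and cover only the three subsystems visible in the hunk; this is not the libvirt implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of capability types, for illustration only. */
typedef enum {
    CAP_UNKNOWN = 0,
    CAP_CCW_DEV,
    CAP_CSS_DEV,
    CAP_VDPA,
} NodeDevCap;

/* Map a udev subsystem name to a capability type. A NULL or
 * unrecognised name yields CAP_UNKNOWN. */
static NodeDevCap
cap_from_subsystem(const char *subsystem)
{
    static const struct {
        const char *name;
        NodeDevCap cap;
    } table[] = {
        { "ccw",  CAP_CCW_DEV },
        { "css",  CAP_CSS_DEV },
        { "vdpa", CAP_VDPA },
    };

    if (!subsystem)
        return CAP_UNKNOWN;

    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strcmp(subsystem, table[i].name) == 0)
            return table[i].cap;
    }
    return CAP_UNKNOWN;
}
```

Before this patch, an event from the "vdpa" subsystem would fall through such a mapping unrecognised, which matches the commit message's statement that those events were ignored.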

On 9/24/20 5:45 PM, Jonathon Jongsma wrote:
The current udev node device driver ignores all events related to vdpa devices. Since libvirt now supports vDPA network devices, include these devices in the device list.
Can you provide an example in the commit log of what the output xml looks like for nodedev-list and nodedev-dumpxml?

Adding Cindy and Jason to cc for input on the path issue below On Tue, 29 Sep 2020 15:56:13 -0400 Laine Stump <laine@redhat.com> wrote:
On 9/24/20 5:45 PM, Jonathon Jongsma wrote:
The current udev node device driver ignores all events related to vdpa devices. Since libvirt now supports vDPA network devices, include these devices in the device list.
Can you provide an example in the commit log of what the output xml looks like for nodedev-list and nodedev-dumpxml?
So, this is what the xml looks like for the vdpa node device:

<device>
  <name>vdpa_vdpa0</name>
  <path>/sys/devices/vdpa0</path>
  <parent>computer</parent>
  <driver>
    <name>vhost_vdpa</name>
  </driver>
  <capability type='vdpa'>
  </capability>
</device>

The /sys/devices/ path is provided by udev, but unfortunately that path is not particularly useful for using the device with libvirt. Ideally, the node device xml should include the path to the chardev (e.g. /dev/vhost-vdpa-0) which is used to actually connect to the device. I don't see an obvious way (aside from string manipulation, which doesn't seem very reliable) to get from this sysfs path to the /dev/ path.

Jason or Cindy, do you have any input here?

On Wed, 30 Sep 2020 10:02:04 -0500 Jonathon Jongsma <jjongsma@redhat.com> wrote:
So after a little more thinking and poking, it looks like it may not be too hard to connect the sysfs path to the chardev after all. I don't have any vdpa hardware at the moment, but for vdpa_sim, the sysfs path /sys/devices/vdpa0 has a subdirectory named vhost-vdpa-0 that matches the name under /dev/ (/dev/vhost-vdpa-0). So perhaps I can use that to generate the device path. Something like this pseudo-code:

for file in syspath {
    if file.name begins with "vhost-vdpa" {
        return "/dev/" + file.name;
    }
}

Are these assumptions safe? Will the file always begin with "vhost-vdpa"? Will it always match the /dev file? Jason?

In that case, I can adjust the node device XML to be something like:

<device>
  <name>vdpa_vdpa0</name>
  <path>/sys/devices/vdpa0</path>
  <parent>computer</parent>
  <driver>
    <name>vhost_vdpa</name>
  </driver>
  <capability type='vdpa'>
    <chardev>/dev/vhost-vdpa-0</chardev>
  </capability>
</device>

(not totally sure about the naming of the xml element).
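The directory scan proposed in this message could be written in C roughly as follows. This is a sketch under the same unconfirmed assumptions the message asks about (that the entry name always begins with "vhost-vdpa" and always matches the name under /dev/); find_vdpa_chardev() is a hypothetical helper name, not an existing libvirt function.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Scan `syspath` for an entry whose name begins with "vhost-vdpa" and
 * write "/dev/<name>" into `out`. Returns 0 on success, or -1 if the
 * directory cannot be read, no entry matches, or `out` is too small. */
static int
find_vdpa_chardev(const char *syspath, char *out, size_t outlen)
{
    static const char prefix[] = "vhost-vdpa";
    DIR *dir = opendir(syspath);
    struct dirent *ent;
    int ret = -1;

    if (!dir)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        if (strncmp(ent->d_name, prefix, sizeof(prefix) - 1) == 0) {
            int n = snprintf(out, outlen, "/dev/%s", ent->d_name);

            /* Treat a truncated result as a failure. */
            if (n >= 0 && (size_t)n < outlen)
                ret = 0;
            break;
        }
    }

    closedir(dir);
    return ret;
}
```

The sketch stops at the first matching entry. Whether a device can ever expose more than one such entry is one of the open questions for the kernel developers, so that choice is an assumption as well.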

On 9/30/20 4:29 PM, Jonathon Jongsma wrote:
So after a little more thinking and poking, it looks like it may not be too hard to connect the sysfs path to the chardev after all. I don't have any vdpa hardware at the moment, but for vdpa_sim, the sysfs path /sys/devices/vdpa0 has a subdirectory named vhost-vdpa-0 that matches the name under /dev/ (/dev/vhost-vdpa-0).
That looks more like what I was expecting to be available!! (assuming you get confirmation from Jason or Cindy)

On 9/24/20 5:45 PM, Jonathon Jongsma wrote:
vDPA network devices allow high-performance networking in a virtual machine by providing a wire-speed data path. These devices require a vendor-specific host driver but the data path follows the virtio specification.
The support for vDPA devices was recently added to qemu. This allows libvirt to support these devices. This patchset requires that the device is configured on the host with the appropriate vendor-specific driver. This will create a chardev on the host at e.g. /dev/vhost-vdpa-0. That chardev path can then be used to define a new interface with type='vdpa'.

Changes in v4:
 - rebased to latest master
 - added hotplug support
 - report vdpa devices in node device list

Jonathon Jongsma (6):
  conf: Add support for vDPA network devices
  qemu: add vhost-vdpa capability
  qemu: add vdpa support
  qemu: add monitor functions for handling file descriptors
  qemu: support hotplug of vdpa devices
  Include vdpa devices in node device list

You can re-use my ACKs for the previously posted patches (assuming an ability to test IRL).