[libvirt] [PATCH RESEND v2 0/5] Implementation of QEMU vhost-scsi

[Resending after the release of 2.2.0; no changes other than a rebase to current master and the associated tweaking to the capabilities patch] This patch series provides a libvirt implementation of the vhost-scsi interface in QEMU. As near as I can see, this was discussed upstream in July 2014[1], and ended in a desire to replace a vhost-scsi controller in favor of a hostdev element instead[2]. There is no capability check in this series for vhost-scsi in the underlying QEMU. Using a recent QEMU built with --disable-vhost-scsi fails with "not a valid device model name." Host Filesystem Example: # ls /sys/kernel/config/target/vhost/ discovery_auth naa.5001405df3e54061 version # ls /sys/kernel/config/target/vhost/naa.5001405df3e54061/tpgt_1/lun/ lun_0 QEMU Example (snippet): -device vhost-scsi-ccw,wwpn=naa.5001405df3e54061,devno=fe.0.1000 Libvirt Example (snippet): <hostdev mode='subsystem' type='scsi_host'> <source protocol='vhost' wwpn='naa.5001405df3e54061'/> <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x1000'/> </hostdev> Guest Viewpoint: # lsscsi [1:0:1:0] disk LIO-ORG disk0 4.0 /dev/sda # dmesg | grep 1: [ 6.065735] scsi host1: Virtio SCSI HBA [ 6.093892] scsi 1:0:1:0: Direct-Access LIO-ORG disk0 4.0 PQ: 0 ANSI: 5 [ 6.313615] sd 1:0:1:0: Attached scsi generic sg0 type 0 [ 6.314981] sd 1:0:1:0: [sda] 29360128 512-byte logical blocks: (15.0 GB/14.0 GiB) [ 6.317290] sd 1:0:1:0: [sda] Write Protect is off [ 6.317566] sd 1:0:1:0: [sda] Mode Sense: 43 00 10 08 [ 6.317853] sd 1:0:1:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA [ 6.352722] sd 1:0:1:0: [sda] Attached SCSI disk Changelog: v2->v2.1: - Rebased to current master (6 September) v1->v2: https://www.redhat.com/archives/libvir-list/2016-August/msg01028.html - Rebase - Applies to current master (20 August) - Added a capability check for QEMU 2.7 - Fixed the qemuxml2argv tests as the -smp options had changed - Reworked ccwaddrs parameter in virDomainCCWAddressAssign call - Comments - Squashed documentation, XML schema, XML parsing, and infrastructure patches into a single patch - Switched from "hostdev type='scsi'" to "hostdev type='scsi_host'"; this removes the refactoring patches since we're not shoe-horning a new SCSI protocol into the existing code - Reworked the handling of the fd's such that we send them to qemu after any possible errors could occur and cause us to back out - s/qemuBuildSCSIVhostHostdevDevStr/qemuBuildHostHostdevDevStr/ - Added virBufferCheckError, and an error message for vhostfdSize > 1, in qemuBuildHostHostdevDevStr - Added qemuBuildDeviceAddressStr in qemuBuildHostHostdevDevStr, thus superceding the last patch in the v1 series - Other - Simplified the vhostfd logic to just be a single int, rather than an alloc'd array (left the vhostfdSize described above as an identifier for if QEMU ever supports multiple vhostfds) - Replaced "qemuMonitorAddDevice" with "qemuMonitorAddDeviceWithFd" in hotplug routine v1: https://www.redhat.com/archives/libvir-list/2016-July/msg01004.html [1] http://www.redhat.com/archives/libvir-list/2014-July/msg01235.html [2] http://www.redhat.com/archives/libvir-list/2014-July/msg01390.html Eric Farman (5): Introduce a "scsi_host" hostdev type qemu: Introduce vhost-scsi capability qemu: Add vhost-scsi string for -device parameter qemu: Allow hotplug of vhost-scsi device tests: Introduce basic vhost-scsi test docs/formatdomain.html.in | 24 ++++ docs/schemas/domaincommon.rng | 23 ++++ src/conf/domain_audit.c | 2 + src/conf/domain_conf.c | 62 ++++++++- src/conf/domain_conf.h | 17 +++ src/libvirt_private.syms | 1 + src/qemu/qemu_capabilities.c | 3 + src/qemu/qemu_capabilities.h | 3 + src/qemu/qemu_command.c | 80 +++++++++++ src/qemu/qemu_command.h | 6 + src/qemu/qemu_domain_address.c | 10 ++ src/qemu/qemu_hotplug.c | 149 +++++++++++++++++++++ src/security/security_dac.c | 2 + src/util/virscsi.c | 26 ++++ src/util/virscsi.h | 1 + tests/domaincapsschemadata/full.xml | 1 + tests/qemucapabilitiesdata/caps_1.5.3.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_1.6.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_1.7.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_2.1.1.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_2.4.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_2.5.0.x86_64.xml | 1 + .../caps_2.6.0-gicv2.aarch64.xml | 1 + .../caps_2.6.0-gicv3.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_2.6.0.ppc64le.xml | 1 + tests/qemucapabilitiesdata/caps_2.6.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_2.7.0.x86_64.xml | 1 + .../qemuxml2argv-hostdev-scsi-vhost-scsi.args | 24 ++++ .../qemuxml2argv-hostdev-scsi-vhost-scsi.xml | 33 +++++ tests/qemuxml2argvmock.c | 12 ++ tests/qemuxml2argvtest.c | 3 + 31 files changed, 491 insertions(+), 2 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.xml -- 1.9.1

We already have a "scsi" hostdev type, which refers to a single LUN that is passed through to a guest. But what of things where multiple LUNs are passed through via a single SCSI HBA, such as with the vhost-scsi target? Create a new hostdev type that will carry this, and its associated documentation and XML schema information. Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> --- docs/formatdomain.html.in | 24 ++++++++++++++++ docs/schemas/domaincommon.rng | 23 +++++++++++++++ src/conf/domain_audit.c | 2 ++ src/conf/domain_conf.c | 56 +++++++++++++++++++++++++++++++++++-- src/conf/domain_conf.h | 17 +++++++++++ src/qemu/qemu_domain_address.c | 10 +++++++ src/qemu/qemu_hotplug.c | 1 + src/security/security_dac.c | 2 ++ tests/domaincapsschemadata/full.xml | 1 + 9 files changed, 134 insertions(+), 2 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index feaeaa2..b79da95 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3655,6 +3655,17 @@ </devices> ...</pre> + <p>or:</p> + +<pre> + ... + <devices> + <hostdev mode='subsystem' type='scsi_host'> + <source protocol='vhost' wwpn='naa.50014057667280d8'/> + </hostdev> + </devices> + ...</pre> + <dl> <dt><code>hostdev</code></dt> <dd>The <code>hostdev</code> element is the main container for describing @@ -3693,6 +3704,12 @@ If a disk lun in the domain already has the rawio capability, then this setting not required. </dd> + <dt><code>scsi_host</code></dt> + <dd><span class="since">since 2.2.0</span>For SCSI devices, user + is responsible to make sure the device is not used by host. This + <code>type</code> passes all LUNs presented by a single HBA to + the guest. + </dd> </dl> <p> Note: The <code>managed</code> attribute is only used with PCI devices @@ -3756,6 +3773,13 @@ credentials to the iSCSI server. </p> </dd> + <dt><code>scsi_host</code></dt> + <dd><span class="since">Since 2.2.0</span>, multiple LUNs behind a + single SCSI HBI are described by a <code>protocol</code> + attribute set to "vhost" and a <code>wwpn</code> attribute that + is the vhost_scsi wwpn (16 hexadecimal digits with a prefix of + "naa.") established in the host configfs. + </dd> </dl> </dd> <dt><code>vendor</code>, <code>product</code></dt> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index af32df8..7fd6bd2 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -3938,6 +3938,7 @@ <ref name="hostdevsubsyspci"/> <ref name="hostdevsubsysusb"/> <ref name="hostdevsubsysscsi"/> + <ref name="hostdevsubsyshost"/> </choice> </define> @@ -4066,6 +4067,28 @@ </element> </define> + <define name="hostdevsubsyshost"> + <attribute name="type"> + <value>scsi_host</value> + </attribute> + <element name="source"> + <choice> + <group> + <attribute name="protocol"> + <choice> + <value>vhost</value> <!-- vhost, required --> + </choice> + </attribute> + <attribute name="wwpn"> + <data type="string"> + <param name="pattern">(naa\.)[0-9a-fA-F]{16}</param> + </data> + </attribute> + </group> + </choice> + </element> + </define> + <define name="hostdevcapsstorage"> <attribute name="type"> <value>storage</value> diff --git a/src/conf/domain_audit.c b/src/conf/domain_audit.c index 53a58ac..406dd8f 100644 --- a/src/conf/domain_audit.c +++ b/src/conf/domain_audit.c @@ -443,6 +443,8 @@ virDomainAuditHostdev(virDomainObjPtr vm, virDomainHostdevDefPtr hostdev, } break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: + break; default: VIR_WARN("Unexpected hostdev type while encoding audit message: %d", hostdev->source.subsys.type); diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index c8c4f61..45b643b 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -646,7 +646,8 @@ VIR_ENUM_IMPL(virDomainHostdevMode, VIR_DOMAIN_HOSTDEV_MODE_LAST, VIR_ENUM_IMPL(virDomainHostdevSubsys, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST, "usb", "pci", - "scsi") + "scsi", + "scsi_host") VIR_ENUM_IMPL(virDomainHostdevSubsysPCIBackend, VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_LAST, @@ -660,6 +661,10 @@ VIR_ENUM_IMPL(virDomainHostdevSubsysSCSIProtocol, "adapter", "iscsi") +VIR_ENUM_IMPL(virDomainHostdevSubsysHostProtocol, + VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_LAST, + "vhost") + VIR_ENUM_IMPL(virDomainHostdevCaps, VIR_DOMAIN_HOSTDEV_CAPS_TYPE_LAST, "storage", "misc", @@ -2270,6 +2275,9 @@ void virDomainHostdevDefClear(virDomainHostdevDefPtr def) } else { VIR_FREE(scsisrc->u.host.adapter); } + } else if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST) { + virDomainHostdevSubsysHostPtr hostsrc = &def->source.subsys.u.host; + VIR_FREE(hostsrc->wwpn); } break; } @@ -5977,6 +5985,31 @@ virDomainHostdevSubsysSCSIDefParseXML(xmlNodePtr sourcenode, return ret; } +static int +virDomainHostdevSubsysHostDefParseXML(xmlNodePtr sourcenode, + virDomainHostdevDefPtr def) +{ + virDomainHostdevSubsysHostPtr hostsrc = &def->source.subsys.u.host; + + if (!(hostsrc->wwpn = virXMLPropString(sourcenode, "wwpn"))) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing vhost-scsi hostdev source path name")); + goto cleanup; + } + + if (!STRPREFIX(hostsrc->wwpn, "naa.") || + strlen(hostsrc->wwpn) != strlen("naa.") + 16) { + virReportError(VIR_ERR_XML_ERROR, "%s", _("malformed 'wwpn' value")); + goto cleanup; + } + + return 0; + + cleanup: + VIR_FREE(hostsrc->wwpn); + return -1; +} + static int virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, @@ -6101,6 +6134,11 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, goto error; break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: + if (virDomainHostdevSubsysHostDefParseXML(sourcenode, def) < 0) + goto error; + break; + default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("address type='%s' not supported in hostdev interfaces"), @@ -20521,9 +20559,11 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, unsigned int flags, bool includeTypeInAddr) { + bool closedSource = false; virDomainHostdevSubsysUSBPtr usbsrc = &def->source.subsys.u.usb; virDomainHostdevSubsysPCIPtr pcisrc = &def->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; + virDomainHostdevSubsysHostPtr hostsrc = &def->source.subsys.u.host; virDomainHostdevSubsysSCSIHostPtr scsihostsrc = &scsisrc->u.host; virDomainHostdevSubsysSCSIiSCSIPtr iscsisrc = &scsisrc->u.iscsi; @@ -20564,6 +20604,15 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, protocol, iscsisrc->path); } + if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST) { + const char *protocol = + virDomainHostdevSubsysHostProtocolTypeToString(hostsrc->protocol); + closedSource = true; + + virBufferAsprintf(buf, " protocol='%s' wwpn='%s'/", + protocol, hostsrc->wwpn); + } + virBufferAddLit(buf, ">\n"); virBufferAdjustIndent(buf, 2); @@ -20617,6 +20666,8 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, scsihostsrc->unit); } break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: + break; default: virReportError(VIR_ERR_INTERNAL_ERROR, _("unexpected hostdev type %d"), @@ -20632,7 +20683,8 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, } virBufferAdjustIndent(buf, -2); - virBufferAddLit(buf, "</source>\n"); + if (!closedSource) + virBufferAddLit(buf, "</source>\n"); return 0; } diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 0fe4154..781ea7b 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -294,6 +294,7 @@ typedef enum { VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI, + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST } virDomainHostdevSubsysType; @@ -368,6 +369,21 @@ struct _virDomainHostdevSubsysSCSI { } u; }; +typedef enum { + VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_VHOST, + + VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_LAST, +} virDomainHostdevHostProtocolType; + +VIR_ENUM_DECL(virDomainHostdevSubsysHostProtocol) + +typedef struct _virDomainHostdevSubsysHost virDomainHostdevSubsysHost; +typedef virDomainHostdevSubsysHost *virDomainHostdevSubsysHostPtr; +struct _virDomainHostdevSubsysHost { + int protocol; + char *wwpn; +}; + typedef struct _virDomainHostdevSubsys virDomainHostdevSubsys; typedef virDomainHostdevSubsys *virDomainHostdevSubsysPtr; struct _virDomainHostdevSubsys { @@ -376,6 +392,7 @@ struct _virDomainHostdevSubsys { virDomainHostdevSubsysUSB usb; virDomainHostdevSubsysPCI pci; virDomainHostdevSubsysSCSI scsi; + virDomainHostdevSubsysHost host; } u; }; diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index bb16738..8419e6a 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -301,6 +301,16 @@ qemuDomainPrimeVirtioDeviceAddresses(virDomainDefPtr def, def->controllers[i]->info.type = type; } + for (i = 0; i < def->nhostdevs; i++) { + if (def->hostdevs[i]->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + def->hostdevs[i]->source.subsys.type == + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST && + def->hostdevs[i]->source.subsys.u.host.protocol == + VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_VHOST && + def->hostdevs[i]->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) + def->hostdevs[i]->info->type = type; + } + if (def->memballoon && def->memballoon->model == VIR_DOMAIN_MEMBALLOON_MODEL_VIRTIO && def->memballoon->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index d13474a..e9140a9 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3238,6 +3238,7 @@ qemuDomainRemoveHostDevice(virQEMUDriverPtr driver, qemuDomainRemoveUSBHostDevice(driver, vm, hostdev); break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: qemuDomainRemoveSCSIHostDevice(driver, vm, hostdev); break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 442ce70..de2b921 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -676,6 +676,7 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -805,6 +806,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; diff --git a/tests/domaincapsschemadata/full.xml b/tests/domaincapsschemadata/full.xml index 2f529ff..41fb1fa 100644 --- a/tests/domaincapsschemadata/full.xml +++ b/tests/domaincapsschemadata/full.xml @@ -75,6 +75,7 @@ <value>usb</value> <value>pci</value> <value>scsi</value> + <value>scsi_host</value> </enum> <enum name='capsType'> <value>storage</value> -- 1.9.1

On 09/06/2016 08:58 AM, Eric Farman wrote:
We already have a "scsi" hostdev type, which refers to a single LUN that is passed through to a guest. But what of things where multiple LUNs are passed through via a single SCSI HBA, such as with the vhost-scsi target? Create a new hostdev type that will carry this, and its associated documentation and XML schema information.
I assume from the review of v1 this is both HBA and vHBA? That is a vHBA would be useful for the NPIV support to pass through the entire vHBA rather than individual LUN's (something recently asked for at KVM Forum). The vHBA is created when one has a vport capable scsi_host (see http://wiki.libvirt.org/page/NPIV_in_libvirt). The "result" is a 'scsi_hostM' using a vport capable parent scsi_hostN. Now if a vHBA cannot (or should not) be passed through, then we'll need to be sure to test and avoid it. Unfortunately we had a lab issue and the system I normally use for my vHBA/NPIV work was affected - trying to get it resurected remotely...
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> --- docs/formatdomain.html.in | 24 ++++++++++++++++ docs/schemas/domaincommon.rng | 23 +++++++++++++++ src/conf/domain_audit.c | 2 ++ src/conf/domain_conf.c | 56 +++++++++++++++++++++++++++++++++++-- src/conf/domain_conf.h | 17 +++++++++++ src/qemu/qemu_domain_address.c | 10 +++++++ src/qemu/qemu_hotplug.c | 1 + src/security/security_dac.c | 2 ++ tests/domaincapsschemadata/full.xml | 1 + 9 files changed, 134 insertions(+), 2 deletions(-)
I took a peek at the v1, but suffice to say didn't follow it that closely since there's a number of changes in v2 related to patch ordering. You could get 3 different opinions on that matter too! Typically I'll try to add the qemu code first (capabilities first, then command in one patch and hotplug in the next one). Once that's in place, then I add the domain bits, docs, etc. Since you're extending the domain XML, you will need to add an xml2xml test in this patch. Use the XML from your patch 5 as a basis/guide. You'll need a "tests/qemuxml2argvdata/" file and a "tests/qemuxml2xmloutdata/" file as well as a changed to qemuxml2xmltest.c. If the data file is the same as the outdata file, then you can create a soft link to the data file in the outdata file directory.
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index feaeaa2..b79da95 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3655,6 +3655,17 @@ </devices> ...</pre>
+ <p>or:</p> + +<pre> + ... + <devices> + <hostdev mode='subsystem' type='scsi_host'> + <source protocol='vhost' wwpn='naa.50014057667280d8'/> + </hostdev> + </devices> + ...</pre> +
A wwpn doesn't generally have the "naa." prefix on it. So, why not just add that to the command line instead? That is a wwpn is a 16 hex digit value. I'm assuming that in order to support this feature a "naa." is prefixed and sent to qemu. Does that necessarily mean some other hypervisor would choose to place the "naa." prefix or would then code need to be added to strip it off for those other hypervisors? Looking at the qemu code - it seems fairly specific.
<dl> <dt><code>hostdev</code></dt> <dd>The <code>hostdev</code> element is the main container for describing @@ -3693,6 +3704,12 @@ If a disk lun in the domain already has the rawio capability, then this setting not required. </dd> + <dt><code>scsi_host</code></dt> + <dd><span class="since">since 2.2.0</span>For SCSI devices, user + is responsible to make sure the device is not used by host. This + <code>type</code> passes all LUNs presented by a single HBA to + the guest. + </dd>
It'll be at least 2.3.0 I see from the qemu code there's support for attributes 'num_queues', 'max_sectors', and 'cmd_per_lun' - since this is passing the scsi_host through and not a 'scsi' LUN with a 'scsi' controller, should those attributes be added as well? Not "required" for this version, but I would hope we could add caps and attributes for that as well because I can almost guarantee someone will ask.
</dl> <p> Note: The <code>managed</code> attribute is only used with PCI devices @@ -3756,6 +3773,13 @@ credentials to the iSCSI server. </p> </dd> + <dt><code>scsi_host</code></dt> + <dd><span class="since">Since 2.2.0</span>, multiple LUNs behind a
2.3
+ single SCSI HBI are described by a <code>protocol</code>
s/HBI/HBA
+ attribute set to "vhost" and a <code>wwpn</code> attribute that + is the vhost_scsi wwpn (16 hexadecimal digits with a prefix of + "naa.") established in the host configfs. + </dd>
I don't think we mention the "naa." prefix... It would be nice to tell a user how to find that wwpn value via some command (virsh or otherwise) in order to fill in the wwpn attribute. Although there's always varying opinions on this since many times it's distro dependent. That's why I suggest virsh - at least if it's supportable we can display the wwpn there using nodedev-dumpxml. What/where is the 'vhost_scsi_wwpn'? IOW: It's not something defined yet - I'm assuming there's some output somewhere where that's pulled from. One final question - what is this exposed as on the guest? "Some" scsi_hostM? For a SCSI LUN it's some 'sgN' value. I see the qemu side seems to infer some amount of scsi_generic code (read_naa_id), so it's mostly curiosity/learning for me.
</dl> </dd> <dt><code>vendor</code>, <code>product</code></dt> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index af32df8..7fd6bd2 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -3938,6 +3938,7 @@ <ref name="hostdevsubsyspci"/> <ref name="hostdevsubsysusb"/> <ref name="hostdevsubsysscsi"/> + <ref name="hostdevsubsyshost"/> </choice> </define>
@@ -4066,6 +4067,28 @@ </element> </define>
+ <define name="hostdevsubsyshost"> + <attribute name="type"> + <value>scsi_host</value> + </attribute> + <element name="source"> + <choice> + <group> + <attribute name="protocol"> + <choice> + <value>vhost</value> <!-- vhost, required --> + </choice> + </attribute> + <attribute name="wwpn"> + <data type="string"> + <param name="pattern">(naa\.)[0-9a-fA-F]{16}</param> + </data>
If you go with the we'll add "naa." later, then you could just: <ref name='wwn'/> instead of <data ...> ... </data>
+ </attribute> + </group> + </choice> + </element> + </define> + <define name="hostdevcapsstorage"> <attribute name="type"> <value>storage</value> diff --git a/src/conf/domain_audit.c b/src/conf/domain_audit.c index 53a58ac..406dd8f 100644 --- a/src/conf/domain_audit.c +++ b/src/conf/domain_audit.c @@ -443,6 +443,8 @@ virDomainAuditHostdev(virDomainObjPtr vm, virDomainHostdevDefPtr hostdev, } break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST:
Can't we produce something here? Not sure seeing "?" in the output of an audit message would be useful. Perhaps the path to the device similar to how we end up in virDomainAuditGenericDev...
+ break; default: VIR_WARN("Unexpected hostdev type while encoding audit message: %d", hostdev->source.subsys.type); diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index c8c4f61..45b643b 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -646,7 +646,8 @@ VIR_ENUM_IMPL(virDomainHostdevMode, VIR_DOMAIN_HOSTDEV_MODE_LAST, VIR_ENUM_IMPL(virDomainHostdevSubsys, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST, "usb", "pci", - "scsi") + "scsi", + "scsi_host")
VIR_ENUM_IMPL(virDomainHostdevSubsysPCIBackend, VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_LAST, @@ -660,6 +661,10 @@ VIR_ENUM_IMPL(virDomainHostdevSubsysSCSIProtocol, "adapter", "iscsi")
+VIR_ENUM_IMPL(virDomainHostdevSubsysHostProtocol, + VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_LAST, + "vhost") +
Notice how the virDomainHostdevSubsysSCSIProtocol adds 'adapter' first - this should do something similar (although the preference is to use "none" - I cannot recall why that was different). Remember that all structs are "calloc"'d and thus making sure this setting to something other than 0 will ensure we haven't missed something.
VIR_ENUM_IMPL(virDomainHostdevCaps, VIR_DOMAIN_HOSTDEV_CAPS_TYPE_LAST, "storage", "misc", @@ -2270,6 +2275,9 @@ void virDomainHostdevDefClear(virDomainHostdevDefPtr def) } else { VIR_FREE(scsisrc->u.host.adapter); } + } else if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST) { + virDomainHostdevSubsysHostPtr hostsrc = &def->source.subsys.u.host; + VIR_FREE(hostsrc->wwpn); } break; } @@ -5977,6 +5985,31 @@ virDomainHostdevSubsysSCSIDefParseXML(xmlNodePtr sourcenode, return ret; }
+static int +virDomainHostdevSubsysHostDefParseXML(xmlNodePtr sourcenode, + virDomainHostdevDefPtr def) +{ + virDomainHostdevSubsysHostPtr hostsrc = &def->source.subsys.u.host; +
This code would do the "protocol" parsing. I believe the way it's written now protocol goes unchecked. You need to add code to use the virDomainHostdevSubsysHostProtocolTypeFromString in order to validate what was supplied in the XML is correct. See virDomainHostdevSubsysSCSIDefParseXML - although make your ->protocol comparison <= 0 since we don't want a "0" value. (Yes, there's a bug in the SCSIDefParse I think - although it's caught later on).
+ if (!(hostsrc->wwpn = virXMLPropString(sourcenode, "wwpn"))) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing vhost-scsi hostdev source path name")); + goto cleanup; + }
If we take the don't require a "naa." prefix, then virValidateWWN should be used to validate things. Even if naa. was kept, the same validation should be done for consistency.
+ + if (!STRPREFIX(hostsrc->wwpn, "naa.") || + strlen(hostsrc->wwpn) != strlen("naa.") + 16) { + virReportError(VIR_ERR_XML_ERROR, "%s", _("malformed 'wwpn' value")); + goto cleanup; + } + + return 0; + + cleanup: + VIR_FREE(hostsrc->wwpn); + return -1; +} +
static int virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, @@ -6101,6 +6134,11 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, goto error; break;
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: + if (virDomainHostdevSubsysHostDefParseXML(sourcenode, def) < 0) + goto error; + break; + default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("address type='%s' not supported in hostdev interfaces"), @@ -20521,9 +20559,11 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, unsigned int flags, bool includeTypeInAddr) { + bool closedSource = false; virDomainHostdevSubsysUSBPtr usbsrc = &def->source.subsys.u.usb; virDomainHostdevSubsysPCIPtr pcisrc = &def->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; + virDomainHostdevSubsysHostPtr hostsrc = &def->source.subsys.u.host; virDomainHostdevSubsysSCSIHostPtr scsihostsrc = &scsisrc->u.host; virDomainHostdevSubsysSCSIiSCSIPtr iscsisrc = &scsisrc->u.iscsi;
@@ -20564,6 +20604,15 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, protocol, iscsisrc->path); }
+ if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST) {
FWIW: If some day in the future type='scsi_host' added some other "protocol" (something other than vhost), then at that time this code would be extended to handle that.
+ const char *protocol = + virDomainHostdevSubsysHostProtocolTypeToString(hostsrc->protocol); + closedSource = true; + + virBufferAsprintf(buf, " protocol='%s' wwpn='%s'/", + protocol, hostsrc->wwpn); + } + virBufferAddLit(buf, ">\n");
virBufferAdjustIndent(buf, 2); @@ -20617,6 +20666,8 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, scsihostsrc->unit); } break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: + break; default: virReportError(VIR_ERR_INTERNAL_ERROR, _("unexpected hostdev type %d"), @@ -20632,7 +20683,8 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, }
virBufferAdjustIndent(buf, -2); - virBufferAddLit(buf, "</source>\n"); + if (!closedSource) + virBufferAddLit(buf, "</source>\n");
return 0; } diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 0fe4154..781ea7b 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -294,6 +294,7 @@ typedef enum { VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI, + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST,
VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST } virDomainHostdevSubsysType; @@ -368,6 +369,21 @@ struct _virDomainHostdevSubsysSCSI { } u; };
+typedef enum {
VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_NONE, Add the NONE here.
+ VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_VHOST, + + VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_LAST, +} virDomainHostdevHostProtocolType; + +VIR_ENUM_DECL(virDomainHostdevSubsysHostProtocol) + +typedef struct _virDomainHostdevSubsysHost virDomainHostdevSubsysHost; +typedef virDomainHostdevSubsysHost *virDomainHostdevSubsysHostPtr; +struct _virDomainHostdevSubsysHost { + int protocol; + char *wwpn; +}; + typedef struct _virDomainHostdevSubsys virDomainHostdevSubsys; typedef virDomainHostdevSubsys *virDomainHostdevSubsysPtr; struct _virDomainHostdevSubsys { @@ -376,6 +392,7 @@ struct _virDomainHostdevSubsys { virDomainHostdevSubsysUSB usb; virDomainHostdevSubsysPCI pci; virDomainHostdevSubsysSCSI scsi; + virDomainHostdevSubsysHost host; } u; };
diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index bb16738..8419e6a 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -301,6 +301,16 @@ qemuDomainPrimeVirtioDeviceAddresses(virDomainDefPtr def, def->controllers[i]->info.type = type; }
+ for (i = 0; i < def->nhostdevs; i++) { + if (def->hostdevs[i]->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + def->hostdevs[i]->source.subsys.type == + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST && + def->hostdevs[i]->source.subsys.u.host.protocol == + VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_VHOST && + def->hostdevs[i]->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) + def->hostdevs[i]->info->type = type; + } + if (def->memballoon && def->memballoon->model == VIR_DOMAIN_MEMBALLOON_MODEL_VIRTIO && def->memballoon->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index d13474a..e9140a9 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3238,6 +3238,7 @@ qemuDomainRemoveHostDevice(virQEMUDriverPtr driver, qemuDomainRemoveUSBHostDevice(driver, vm, hostdev); break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST:
hmmm... If you're going to change this here, then qemuDomainAttachHostDevice would be changed at the same time. Hence why this ordering stuff to patches is important. I cannot recall if removing would cause a build error in this case.
qemuDomainRemoveSCSIHostDevice(driver, vm, hostdev); break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 442ce70..de2b921 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -676,6 +676,7 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -805,6 +806,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break;
Why is only security_dac.c changed? Usually _selinux and _apparmor.c are also changed, which I do see happening in your v1. Although I wouldn't classify a "vhost" and an "iSCSI" host as the same thing. The latter is a networked resource; whereas, IIUC this new device is essentially a file descriptor that's being passed along to qemu. So there must be other examples of that in the security_*.c modules. This particular area I usually don't get right either, but what I'll do is follow the history of the changes and see if something is "similar" to what I'm creating and then just copy that model. Makes me wonder how this plays in a non-privileged environment (e.g. session mode). Is it even allowed to open/read vhost-scsi.
diff --git a/tests/domaincapsschemadata/full.xml b/tests/domaincapsschemadata/full.xml index 2f529ff..41fb1fa 100644 --- a/tests/domaincapsschemadata/full.xml +++ b/tests/domaincapsschemadata/full.xml @@ -75,6 +75,7 @@ <value>usb</value> <value>pci</value> <value>scsi</value> + <value>scsi_host</value> </enum> <enum name='capsType'> <value>storage</value>
FWIW: I usually use cscope to find all VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_ occurrances and be sure the _HOST option is covered. I see that nothing in cgroup is modified. I also wonder if we need to track these in virhostdev.c. John

On 09/12/2016 05:34 PM, John Ferlan wrote:
On 09/06/2016 08:58 AM, Eric Farman wrote:
We already have a "scsi" hostdev type, which refers to a single LUN that is passed through to a guest. But what of things where multiple LUNs are passed through via a single SCSI HBA, such as with the vhost-scsi target? Create a new hostdev type that will carry this, and its associated documentation and XML schema information.
I assume from the review of v1 this is both HBA and vHBA? That is a vHBA would be useful for the NPIV support to pass through the entire vHBA rather than individual LUN's (something recently asked for at KVM Forum).
I wasn't at KVM Forum, so I'm in the dark here. :(
The vHBA is created when one has a vport capable scsi_host (see http://wiki.libvirt.org/page/NPIV_in_libvirt). The "result" is a 'scsi_hostM' using a vport capable parent scsi_hostN. Now if a vHBA cannot (or should not) be passed through, then we'll need to be sure to test and avoid it. Unfortunately we had a lab issue and the system I normally use for my vHBA/NPIV work was affected - trying to get it resurected remotely...
My experience with NPIV has been on s390, so I'm gonna plead some ignorance on NPIV for other platforms. Considering the lack of controls on the qemu side, I would imagine that it should work.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> --- docs/formatdomain.html.in | 24 ++++++++++++++++ docs/schemas/domaincommon.rng | 23 +++++++++++++++ src/conf/domain_audit.c | 2 ++ src/conf/domain_conf.c | 56 +++++++++++++++++++++++++++++++++++-- src/conf/domain_conf.h | 17 +++++++++++ src/qemu/qemu_domain_address.c | 10 +++++++ src/qemu/qemu_hotplug.c | 1 + src/security/security_dac.c | 2 ++ tests/domaincapsschemadata/full.xml | 1 + 9 files changed, 134 insertions(+), 2 deletions(-)
I took a peek at the v1, but suffice to say didn't follow it that closely since there's a number of changes in v2 related to patch ordering. You could get 3 different opinions on that matter too!
Hopefully a consensus before too long. :)
Typically I'll try to add the qemu code first (capabilities first, then command in one patch and hotplug in the next one). Once that's in place, then I add the domain bits, docs, etc.
Since you're extending the domain XML, you will need to add an xml2xml test in this patch. Use the XML from your patch 5 as a basis/guide. You'll need a "tests/qemuxml2argvdata/" file and a "tests/qemuxml2xmloutdata/" file as well as a changed to qemuxml2xmltest.c. If the data file is the same as the outdata file, then you can create a soft link to the data file in the outdata file directory.
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index feaeaa2..b79da95 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3655,6 +3655,17 @@ </devices> ...</pre>
+ <p>or:</p> + +<pre> + ... + <devices> + <hostdev mode='subsystem' type='scsi_host'> + <source protocol='vhost' wwpn='naa.50014057667280d8'/> + </hostdev> + </devices> + ...</pre> + A wwpn doesn't generally have the "naa." prefix on it. So, why not just add that to the command line instead? That is a wwpn is a 16 hex digit value. I'm assuming that in order to support this feature a "naa." is prefixed and sent to qemu. Does that necessarily mean some other hypervisor would choose to place the "naa." prefix or would then code need to be added to strip it off for those other hypervisors? Looking at the qemu code - it seems fairly specific.
Fairly certain that it's more a vhost thing of which the hypervisor utilizes. Per a recommendation made on this list years ago, I used targetcli to do the configuration rather than direct manipulation of sysfs, and according to its man page: All WWNs are prefaced by their type, such as "iqn", "naa", or "ib", and may be given with or without colons. As to whether other hypervisors follow the same behavior, I do not know. I was just trying to conform to what it would allow in the case of vhost-scsi. Now, I suppose I'm confusing matters by saying "wwpn" here since as you say that's 16 hex digits, and should probably say just "wwn" ?
<dl> <dt><code>hostdev</code></dt> <dd>The <code>hostdev</code> element is the main container for describing @@ -3693,6 +3704,12 @@ If a disk lun in the domain already has the rawio capability, then this setting not required. </dd> + <dt><code>scsi_host</code></dt> + <dd><span class="since">since 2.2.0</span>For SCSI devices, user + is responsible to make sure the device is not used by host. This + <code>type</code> passes all LUNs presented by a single HBA to + the guest. + </dd>
It'll be at least 2.3.0
Ugh, yeah. When I rebased after 2.2's release I forgot about the doc hits. My apologies.
I see from the qemu code there's support for attributes 'num_queues', 'max_sectors', and 'cmd_per_lun' - since this is passing the scsi_host through and not a 'scsi' LUN with a 'scsi' controller, should those attributes be added as well? Not "required" for this version, but I would hope we could add caps and attributes for that as well because I can almost guarantee someone will ask.
</dl> <p> Note: The <code>managed</code> attribute is only used with PCI devices @@ -3756,6 +3773,13 @@ credentials to the iSCSI server. </p> </dd> + <dt><code>scsi_host</code></dt> + <dd><span class="since">Since 2.2.0</span>, multiple LUNs behind a
2.3
+ single SCSI HBI are described by a <code>protocol</code> s/HBI/HBA
+ attribute set to "vhost" and a <code>wwpn</code> attribute that + is the vhost_scsi wwpn (16 hexadecimal digits with a prefix of + "naa.") established in the host configfs. + </dd> I don't think we mention the "naa." prefix...
It would be nice to tell a user how to find that wwpn value via some command (virsh or otherwise) in order to fill in the wwpn attribute. Although there's always varying opinions on this since many times it's distro dependent. That's why I suggest virsh - at least if it's supportable we can display the wwpn there using nodedev-dumpxml.
Hrm... nodedev-dumpxml is new to me, neat! I'll go down that bunny trail, but it seems like it'd be a bit of work to add the smarts for this.
What/where is the 'vhost_scsi_wwpn'? IOW: It's not something defined yet - I'm assuming there's some output somewhere where that's pulled from.
As near as I can tell, it's not anything that can be autogenerated off the bat. Somebody somewhere needs to first establish the sysfs entries for the vhost target, which can then be stored and loaded on a next (host) boot. The end result looks something like this: [root@xxxxxxxx ~]# ls -l /sys/kernel/config/target/vhost/naa.50014056e4794195/tpgt_1/lun/lun_0/ total 0 lrwxrwxrwx 1 root root 0 Sep 13 15:44 752e150d11 -> ../../../../../../target/core/iblock_0/disk0 -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_gp -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_offline -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_status -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_write_md drwxr-xr-x 5 root root 0 Sep 13 15:44 statistics [root@xxxxxxxx ~]# cat /sys/kernel/config/target/core/iblock_0/disk0/info Status: ACTIVATED Max Queue Depth: 0 SectorSize: 512 HwMaxSectors: 4592 iBlock device: dm-0 UDEV PATH: /dev/mapper/mpathb readonly: 0 Major: 252 Minor: 0 CLAIMED: IBLOCK (Aside: note the folder name in /sys/kernel/config that contains the wwn/wwpn name with the "naa." prefix.)
One final question - what is this exposed as on the guest? "Some" scsi_hostM? For a SCSI LUN it's some 'sgN' value. I see the qemu side seems to infer some amount of scsi_generic code (read_naa_id), so it's mostly curiosity/learning for me.
Anything that's put behind the target gets seen to the guest. So that can be one or more sgN LUNs and their sdX equivalents, and as near as I can tell a scsi_hostM as you describe. A quick test with two LUNs behind one vhost-scsi target gives me the following in the guest: # ls -l /sys/bus/scsi/devices total 0 lrwxrwxrwx 1 root root 0 Sep 13 16:10 0:0:1:0 -> ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0/target0:0:1/0:0:1:0 lrwxrwxrwx 1 root root 0 Sep 13 16:10 0:0:1:1 -> ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0/target0:0:1/0:0:1:1 lrwxrwxrwx 1 root root 0 Sep 13 16:10 host0 -> ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0 lrwxrwxrwx 1 root root 0 Sep 13 16:11 target0:0:1 -> ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0/target0:0:1 # lsscsi [0:0:1:0] disk LIO-ORG disk0 4.0 /dev/sda [0:0:1:1] disk LIO-ORG disk1 4.0 /dev/sdb FYI, hotplugging/unplugging devices to a target works just fine, except there's no notification that a guest can pick up on to know that a device had been added. Or worse, taken away.
</dl> </dd> <dt><code>vendor</code>, <code>product</code></dt> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index af32df8..7fd6bd2 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -3938,6 +3938,7 @@ <ref name="hostdevsubsyspci"/> <ref name="hostdevsubsysusb"/> <ref name="hostdevsubsysscsi"/> + <ref name="hostdevsubsyshost"/> </choice> </define>
@@ -4066,6 +4067,28 @@ </element> </define>
+ <define name="hostdevsubsyshost"> + <attribute name="type"> + <value>scsi_host</value> + </attribute> + <element name="source"> + <choice> + <group> + <attribute name="protocol"> + <choice> + <value>vhost</value> <!-- vhost, required --> + </choice> + </attribute> + <attribute name="wwpn"> + <data type="string"> + <param name="pattern">(naa\.)[0-9a-fA-F]{16}</param> + </data>
If you go with the we'll add "naa." later, then you could just:
<ref name='wwn'/>
This rings a bell with me, as to why I had the label be "wwpn" instead of "wwn" above. My attempt at trying to distinguish libvirt internals with something user-facing, maybe?
instead of <data ...> ... </data>
+ </attribute> + </group> + </choice> + </element> + </define> + <define name="hostdevcapsstorage"> <attribute name="type"> <value>storage</value> diff --git a/src/conf/domain_audit.c b/src/conf/domain_audit.c index 53a58ac..406dd8f 100644 --- a/src/conf/domain_audit.c +++ b/src/conf/domain_audit.c @@ -443,6 +443,8 @@ virDomainAuditHostdev(virDomainObjPtr vm, virDomainHostdevDefPtr hostdev, } break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: Can't we produce something here? Not sure seeing "?" in the output of an audit message would be useful. Perhaps the path to the device similar to how we end up in virDomainAuditGenericDev...
Fair enough. Will go down that trail.
+ break; default: VIR_WARN("Unexpected hostdev type while encoding audit message: %d", hostdev->source.subsys.type); diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index c8c4f61..45b643b 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -646,7 +646,8 @@ VIR_ENUM_IMPL(virDomainHostdevMode, VIR_DOMAIN_HOSTDEV_MODE_LAST, VIR_ENUM_IMPL(virDomainHostdevSubsys, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST, "usb", "pci", - "scsi") + "scsi", + "scsi_host")
VIR_ENUM_IMPL(virDomainHostdevSubsysPCIBackend, VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_LAST, @@ -660,6 +661,10 @@ VIR_ENUM_IMPL(virDomainHostdevSubsysSCSIProtocol, "adapter", "iscsi")
+VIR_ENUM_IMPL(virDomainHostdevSubsysHostProtocol, + VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_LAST, + "vhost") + Notice how the virDomainHostdevSubsysSCSIProtocol adds 'adapter' first - this should do something similar (although the preference is to use "none" - I cannot recall why that was different).
Because "none" == "adapter" in the case of SCSI?
Remember that all structs are "calloc"'d and thus making sure this setting to something other than 0 will ensure we haven't missed something.
VIR_ENUM_IMPL(virDomainHostdevCaps, VIR_DOMAIN_HOSTDEV_CAPS_TYPE_LAST, "storage", "misc", @@ -2270,6 +2275,9 @@ void virDomainHostdevDefClear(virDomainHostdevDefPtr def) } else { VIR_FREE(scsisrc->u.host.adapter); } + } else if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST) { + virDomainHostdevSubsysHostPtr hostsrc = &def->source.subsys.u.host; + VIR_FREE(hostsrc->wwpn); } break; } @@ -5977,6 +5985,31 @@ virDomainHostdevSubsysSCSIDefParseXML(xmlNodePtr sourcenode, return ret; }
+static int +virDomainHostdevSubsysHostDefParseXML(xmlNodePtr sourcenode, + virDomainHostdevDefPtr def) +{ + virDomainHostdevSubsysHostPtr hostsrc = &def->source.subsys.u.host; + This code would do the "protocol" parsing. I believe the way it's written now protocol goes unchecked. You need to add code to use the virDomainHostdevSubsysHostProtocolTypeFromString in order to validate what was supplied in the XML is correct.
See virDomainHostdevSubsysSCSIDefParseXML - although make your ->protocol comparison <= 0 since we don't want a "0" value.
(Yes, there's a bug in the SCSIDefParse I think - although it's caught later on).
+ if (!(hostsrc->wwpn = virXMLPropString(sourcenode, "wwpn"))) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("missing vhost-scsi hostdev source path name")); + goto cleanup; + } If we take the don't require a "naa." prefix, then virValidateWWN should be used to validate things. Even if naa. was kept, the same validation should be done for consistency.
Ah, okay. virValidateWWN can be used regardless I suppose.
+ + if (!STRPREFIX(hostsrc->wwpn, "naa.") || + strlen(hostsrc->wwpn) != strlen("naa.") + 16) { + virReportError(VIR_ERR_XML_ERROR, "%s", _("malformed 'wwpn' value")); + goto cleanup; + } + + return 0; + + cleanup: + VIR_FREE(hostsrc->wwpn); + return -1; +} +
static int virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, @@ -6101,6 +6134,11 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node, goto error; break;
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: + if (virDomainHostdevSubsysHostDefParseXML(sourcenode, def) < 0) + goto error; + break; + default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("address type='%s' not supported in hostdev interfaces"), @@ -20521,9 +20559,11 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, unsigned int flags, bool includeTypeInAddr) { + bool closedSource = false; virDomainHostdevSubsysUSBPtr usbsrc = &def->source.subsys.u.usb; virDomainHostdevSubsysPCIPtr pcisrc = &def->source.subsys.u.pci; virDomainHostdevSubsysSCSIPtr scsisrc = &def->source.subsys.u.scsi; + virDomainHostdevSubsysHostPtr hostsrc = &def->source.subsys.u.host; virDomainHostdevSubsysSCSIHostPtr scsihostsrc = &scsisrc->u.host; virDomainHostdevSubsysSCSIiSCSIPtr iscsisrc = &scsisrc->u.iscsi;
@@ -20564,6 +20604,15 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, protocol, iscsisrc->path); }
+ if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST) { FWIW: If some day in the future type='scsi_host' added some other "protocol" (something other than vhost), then at that time this code would be extended to handle that.
+ const char *protocol = + virDomainHostdevSubsysHostProtocolTypeToString(hostsrc->protocol); + closedSource = true; + + virBufferAsprintf(buf, " protocol='%s' wwpn='%s'/", + protocol, hostsrc->wwpn); + } + virBufferAddLit(buf, ">\n");
virBufferAdjustIndent(buf, 2); @@ -20617,6 +20666,8 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, scsihostsrc->unit); } break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: + break; default: virReportError(VIR_ERR_INTERNAL_ERROR, _("unexpected hostdev type %d"), @@ -20632,7 +20683,8 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf, }
virBufferAdjustIndent(buf, -2); - virBufferAddLit(buf, "</source>\n"); + if (!closedSource) + virBufferAddLit(buf, "</source>\n");
return 0; } diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 0fe4154..781ea7b 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -294,6 +294,7 @@ typedef enum { VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI, VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI, + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST,
VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST } virDomainHostdevSubsysType; @@ -368,6 +369,21 @@ struct _virDomainHostdevSubsysSCSI { } u; };
+typedef enum { VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_NONE,
Add the NONE here.
+ VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_VHOST, + + VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_LAST, +} virDomainHostdevHostProtocolType; + +VIR_ENUM_DECL(virDomainHostdevSubsysHostProtocol) + +typedef struct _virDomainHostdevSubsysHost virDomainHostdevSubsysHost; +typedef virDomainHostdevSubsysHost *virDomainHostdevSubsysHostPtr; +struct _virDomainHostdevSubsysHost { + int protocol; + char *wwpn; +}; + typedef struct _virDomainHostdevSubsys virDomainHostdevSubsys; typedef virDomainHostdevSubsys *virDomainHostdevSubsysPtr; struct _virDomainHostdevSubsys { @@ -376,6 +392,7 @@ struct _virDomainHostdevSubsys { virDomainHostdevSubsysUSB usb; virDomainHostdevSubsysPCI pci; virDomainHostdevSubsysSCSI scsi; + virDomainHostdevSubsysHost host; } u; };
diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index bb16738..8419e6a 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -301,6 +301,16 @@ qemuDomainPrimeVirtioDeviceAddresses(virDomainDefPtr def, def->controllers[i]->info.type = type; }
+ for (i = 0; i < def->nhostdevs; i++) { + if (def->hostdevs[i]->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + def->hostdevs[i]->source.subsys.type == + VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST && + def->hostdevs[i]->source.subsys.u.host.protocol == + VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_VHOST && + def->hostdevs[i]->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) + def->hostdevs[i]->info->type = type; + } + if (def->memballoon && def->memballoon->model == VIR_DOMAIN_MEMBALLOON_MODEL_VIRTIO && def->memballoon->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index d13474a..e9140a9 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -3238,6 +3238,7 @@ qemuDomainRemoveHostDevice(virQEMUDriverPtr driver, qemuDomainRemoveUSBHostDevice(driver, vm, hostdev); break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI: + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: hmmm... If you're going to change this here, then qemuDomainAttachHostDevice would be changed at the same time. Hence why this ordering stuff to patches is important.
I cannot recall if removing would cause a build error in this case.
It does, because of the switch statement not having a "default" case.
qemuDomainRemoveSCSIHostDevice(driver, vm, hostdev); break; case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: diff --git a/src/security/security_dac.c b/src/security/security_dac.c index 442ce70..de2b921 100644 --- a/src/security/security_dac.c +++ b/src/security/security_dac.c @@ -676,6 +676,7 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break; @@ -805,6 +806,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr, break; }
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST: ret = 0; break;
Why is only security_dac.c changed? Usually _selinux and _apparmor.c are also changed, which I do see happening in your v1.
Oh darn, what happened. :( I'll go revisit my merging/rebasing.
Although I wouldn't classify a "vhost" and an "iSCSI" host as the same thing. The latter is a networked resource; whereas, IIUC this new device is essentially a file descriptor that's being passed along to qemu. So there must be other examples of that in the security_*.c modules.
This particular area I usually don't get right either, but what I'll do is follow the history of the changes and see if something is "similar" to what I'm creating and then just copy that model.
Makes me wonder how this plays in a non-privileged environment (e.g. session mode). Is it even allowed to open/read vhost-scsi.
Excellent question. I'll give it a try.
diff --git a/tests/domaincapsschemadata/full.xml b/tests/domaincapsschemadata/full.xml index 2f529ff..41fb1fa 100644 --- a/tests/domaincapsschemadata/full.xml +++ b/tests/domaincapsschemadata/full.xml @@ -75,6 +75,7 @@ <value>usb</value> <value>pci</value> <value>scsi</value> + <value>scsi_host</value> </enum> <enum name='capsType'> <value>storage</value>
FWIW: I usually use cscope to find all VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_ occurrances and be sure the _HOST option is covered. I see that nothing in cgroup is modified. I also wonder if we need to track these in virhostdev.c.
John
Thanks for the review. (Silent ACK on a lot of the above comments.) I'll try to get these all in place for a v3 with a little runway before the next freeze. - Eric

[...]
I took a peek at the v1, but suffice to say didn't follow it that closely since there's a number of changes in v2 related to patch ordering. You could get 3 different opinions on that matter too!
Hopefully a consensus before too long. :)
yeah - v1 was lots of little steps that you were asked to combine. OTOH there are those that like to combine multiple things in one patch which can make it really difficult to follow. The order here was "followable" - my suggestion was more to try and get the guts (qemu) done first, then the fluff (domain/xml). Consider your patch order to be making progress towards some sustainable feature/bug fix - each step along the way being able to pass "make check syntax-check" so that future attempts to git bisect can point the fickle finger of fate (at someone else!). If a patch adds something that will be external that a followup patch will use - then great. If it's testable, even better. For example, adding virSCSIOpenVhost as a separate patch with the tests/qemuxml2argvmock.c is fine. One more tidbit here - it's a very faint memory, but check out tests/commandtest.c and the lengthy comment in mymain() regarding "expected" fd values. I think you're "safe" in what you've done - although your mock should mimic whatever changes you make to the primary function. [...]
A wwpn doesn't generally have the "naa." prefix on it. So, why not just add that to the command line instead? That is a wwpn is a 16 hex digit value. I'm assuming that in order to support this feature a "naa." is prefixed and sent to qemu. Does that necessarily mean some other hypervisor would choose to place the "naa." prefix or would then code need to be added to strip it off for those other hypervisors? Looking at the qemu code - it seems fairly specific.
Fairly certain that it's more a vhost thing of which the hypervisor utilizes. Per a recommendation made on this list years ago, I used targetcli to do the configuration rather than direct manipulation of sysfs, and according to its man page:
All WWNs are prefaced by their type, such as "iqn", "naa", or "ib", and may be given with or without colons.
I don't use targetcli so I'm not familiar with it. Still not convinced using naa.# in the XML is the best way to go, but I'm not against it. I guess a lot depends on how the sysfs views things.
As to whether other hypervisors follow the same behavior, I do not know. I was just trying to conform to what it would allow in the case of vhost-scsi. Now, I suppose I'm confusing matters by saying "wwpn" here since as you say that's 16 hex digits, and should probably say just "wwn" ?
I guess I've always thought of a wwn/wwpn/wwnn as that string of bytes... Usually WWN/WWNN have been interchangeable, while WWPN is a WWN assigned to a port in a Fibre Channel fabric FWIW: virsh nodedev-dumpxml on a vport capable HBA and the created vHBA will list a wwnn, wwpn, and fabric_wwn. The fabric_wwn's will match. The wwnn/wwpn for the vport is what was used when running the HBA's/vport_create function. It's a tangled web. Wait until vGPU's show up (that's another black magic trick). [...]
It would be nice to tell a user how to find that wwpn value via some command (virsh or otherwise) in order to fill in the wwpn attribute. Although there's always varying opinions on this since many times it's distro dependent. That's why I suggest virsh - at least if it's supportable we can display the wwpn there using nodedev-dumpxml.
Hrm... nodedev-dumpxml is new to me, neat! I'll go down that bunny trail, but it seems like it'd be a bit of work to add the smarts for this.
Hours of all sorts of fun.... The 'nodedev-{list|dumpxml}' are essentially the libvirt mechanism to list/describe the various rabbit holes that is/are the sysfs tree and especially branches. The nodedev-list --tree will probably be of most interest, but so is virsh nodedev-list {scsi_host|pci|usb|scsi|scsi_generic|scs_target|net}
What/where is the 'vhost_scsi_wwpn'? IOW: It's not something defined yet - I'm assuming there's some output somewhere where that's pulled from.
As near as I can tell, it's not anything that can be autogenerated off the bat. Somebody somewhere needs to first establish the sysfs entries for the vhost target, which can then be stored and loaded on a next (host) boot. The end result looks something like this:
AFAIK systemd handles the sysfs creation (ok at least for my purposes). This is what I was concerned about - we really need a way to tell someone how to "easily" find the wwpn that's to be used. Although I don't have vhost-scsi on my laptop ;-) - I have to figure that by using 'ls -al' on the path from a 'virsh nodedev-dumpxml' of a scsi_hostN that is an HBA will show a <path> that path would then be a starting point... Googling "scsi_host HBA wwn" shows me some interesting results including ones that seem to indicate 'fc_host' is involved in some way even for the HBA (I guess that makes sense - where the vHBA is just an abstraction off of that). That said, a '/sys/class/fc_host/' is the starting point then to find things and virsh nodedev-list --cap vports would be the way to get capable HBA's.
[root@xxxxxxxx ~]# ls -l /sys/kernel/config/target/vhost/naa.50014056e4794195/tpgt_1/lun/lun_0/ total 0
Is the 'vhost' or 'vhost/naa.%s' linked from some /sys/class/{scsi|fc}_host/host%d directory? I think that becomes the key for traversal. Or maybe finding "5001...4195" in some other link from /sys/class/{scsi|fc}_host/....
lrwxrwxrwx 1 root root 0 Sep 13 15:44 752e150d11 -> ../../../../../../target/core/iblock_0/disk0 -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_gp -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_offline -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_status -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_write_md drwxr-xr-x 5 root root 0 Sep 13 15:44 statistics [root@xxxxxxxx ~]# cat /sys/kernel/config/target/core/iblock_0/disk0/info Status: ACTIVATED Max Queue Depth: 0 SectorSize: 512 HwMaxSectors: 4592 iBlock device: dm-0 UDEV PATH: /dev/mapper/mpathb readonly: 0 Major: 252 Minor: 0 CLAIMED: IBLOCK
(Aside: note the folder name in /sys/kernel/config that contains the wwn/wwpn name with the "naa." prefix.)
I see... Then again "/sys/kernel" isn't the path used by nodedev driver, hence my question above.
One final question - what is this exposed as on the guest? "Some" scsi_hostM? For a SCSI LUN it's some 'sgN' value. I see the qemu side seems to infer some amount of scsi_generic code (read_naa_id), so it's mostly curiosity/learning for me.
Anything that's put behind the target gets seen to the guest. So that can be one or more sgN LUNs and their sdX equivalents, and as near as I can tell a scsi_hostM as you describe. A quick test with two LUNs behind one vhost-scsi target gives me the following in the guest:
# ls -l /sys/bus/scsi/devices total 0 lrwxrwxrwx 1 root root 0 Sep 13 16:10 0:0:1:0 -> ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0/target0:0:1/0:0:1:0 lrwxrwxrwx 1 root root 0 Sep 13 16:10 0:0:1:1 -> ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0/target0:0:1/0:0:1:1 lrwxrwxrwx 1 root root 0 Sep 13 16:10 host0 -> ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0 lrwxrwxrwx 1 root root 0 Sep 13 16:11 target0:0:1 -> ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0/target0:0:1 # lsscsi [0:0:1:0] disk LIO-ORG disk0 4.0 /dev/sda [0:0:1:1] disk LIO-ORG disk1 4.0 /dev/sdb
FYI, hotplugging/unplugging devices to a target works just fine, except there's no notification that a guest can pick up on to know that a device had been added. Or worse, taken away.
Hmm. Maybe I'm just not asking the right question. I guess my perception was that if you expose the 'scsi_hostN' HBA then you are giving the guest the whole HBA to use as it sees fit. The host shouldn't be touching the luns afterwards. What you've shown is 2 LUNS in the /sys/bus/scsi/ tree; however, is there a /sys/bus/scsi_host/hostN that corresponds to the passed through HBA? [...]
+ <attribute name="protocol"> + <choice> + <value>vhost</value> <!-- vhost, required --> + </choice> + </attribute> + <attribute name="wwpn"> + <data type="string"> + <param name="pattern">(naa\.)[0-9a-fA-F]{16}</param> + </data> If you go with the we'll add "naa." later, then you could just:
<ref name='wwn'/>
This rings a bell with me, as to why I had the label be "wwpn" instead of "wwn" above. My attempt at trying to distinguish libvirt internals with something user-facing, maybe?
I think wwpn will still be technically correct (for some strange reason after skimming google results). The HBA will have a port which is how it's reached. I suppose the answer though is in how sysfs views things. [...]
"pci", - "scsi") + "scsi", + "scsi_host") VIR_ENUM_IMPL(virDomainHostdevSubsysPCIBackend, VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_LAST, @@ -660,6 +661,10 @@ VIR_ENUM_IMPL(virDomainHostdevSubsysSCSIProtocol, "adapter", "iscsi") +VIR_ENUM_IMPL(virDomainHostdevSubsysHostProtocol, + VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_LAST, + "vhost") +
Notice how the virDomainHostdevSubsysSCSIProtocol adds 'adapter' first - this should do something similar (although the preference is to use "none" - I cannot recall why that was different).
Because "none" == "adapter" in the case of SCSI?
Some sort of RNG magic in commit id '54ac483e6' - I forget the details (too long ago), but I think it had to do with being able to reuse the existing RNG without creating a whole new option. [...]
hmmm... If you're going to change this here, then qemuDomainAttachHostDevice would be changed at the same time. Hence why this ordering stuff to patches is important.
I cannot recall if removing would cause a build error in this case.
It does, because of the switch statement not having a "default" case.
Perhaps a different ordering of patches removes the need.... But I know it's tough with this new type... I guess it's odd the *RemoveHostDevice needed it, but the corollary Attach didn't... Ahhh - I see why - the switch in Attach doesn't have the typecast (virDomainHostdevSubsysType). I can send a patch for that or you can make the modification too... [...]
Thanks for the review. (Silent ACK on a lot of the above comments.) I'll try to get these all in place for a v3 with a little runway before the next freeze.
OK - hopefully I won't be neck deep in some other firefight and will be able to give them quicker attention. At the very least getting the capabilities changes in which is generally the most conflict... John

On 09/13/2016 04:49 PM, John Ferlan wrote:
[...]
I took a peek at the v1, but suffice to say didn't follow it that closely since there's a number of changes in v2 related to patch ordering. You could get 3 different opinions on that matter too! Hopefully a consensus before too long. :)
yeah - v1 was lots of little steps that you were asked to combine. OTOH there are those that like to combine multiple things in one patch which can make it really difficult to follow. The order here was "followable" - my suggestion was more to try and get the guts (qemu) done first, then the fluff (domain/xml).
Agreed. v1 was perhaps overkill in breaking things down so small, because I was picking up an idea with a bit of dust on it.
Consider your patch order to be making progress towards some sustainable feature/bug fix - each step along the way being able to pass "make check syntax-check" so that future attempts to git bisect can point the fickle finger of fate (at someone else!). If a patch adds something that will be external that a followup patch will use - then great. If it's testable, even better. For example, adding virSCSIOpenVhost as a separate patch with the tests/qemuxml2argvmock.c is fine. One more tidbit here - it's a very faint memory, but check out tests/commandtest.c and the lengthy comment in mymain() regarding "expected" fd values. I think you're "safe" in what you've done - although your mock should mimic whatever changes you make to the primary function.
[...]
A wwpn doesn't generally have the "naa." prefix on it. So, why not just add that to the command line instead? That is a wwpn is a 16 hex digit value. I'm assuming that in order to support this feature a "naa." is prefixed and sent to qemu. Does that necessarily mean some other hypervisor would choose to place the "naa." prefix or would then code need to be added to strip it off for those other hypervisors? Looking at the qemu code - it seems fairly specific. Fairly certain that it's more a vhost thing of which the hypervisor utilizes. Per a recommendation made on this list years ago, I used targetcli to do the configuration rather than direct manipulation of sysfs, and according to its man page:
All WWNs are prefaced by their type, such as "iqn", "naa", or "ib", and may be given with or without colons.
I don't use targetcli so I'm not familiar with it.
Still not convinced using naa.# in the XML is the best way to go, but I'm not against it. I guess a lot depends on how the sysfs views things.
As to whether other hypervisors follow the same behavior, I do not know. I was just trying to conform to what it would allow in the case of vhost-scsi. Now, I suppose I'm confusing matters by saying "wwpn" here since as you say that's 16 hex digits, and should probably say just "wwn" ?
I guess I've always thought of a wwn/wwpn/wwnn as that string of bytes...
I always thought WWPN and WWNN as types of WWN...
Usually WWN/WWNN have been interchangeable, while WWPN is a WWN assigned to a port in a Fibre Channel fabric
...but have seen it this way also. Either way makes sense to me.
FWIW: virsh nodedev-dumpxml on a vport capable HBA and the created vHBA will list a wwnn, wwpn, and fabric_wwn. The fabric_wwn's will match. The wwnn/wwpn for the vport is what was used when running the HBA's/vport_create function. It's a tangled web. Wait until vGPU's show up (that's another black magic trick).
[...]
It would be nice to tell a user how to find that wwpn value via some command (virsh or otherwise) in order to fill in the wwpn attribute. Although there's always varying opinions on this since many times it's distro dependent. That's why I suggest virsh - at least if it's supportable we can display the wwpn there using nodedev-dumpxml. Hrm... nodedev-dumpxml is new to me, neat! I'll go down that bunny trail, but it seems like it'd be a bit of work to add the smarts for this.
Hours of all sorts of fun.... The 'nodedev-{list|dumpxml}' are essentially the libvirt mechanism to list/describe the various rabbit holes that is/are the sysfs tree and especially branches. The nodedev-list --tree will probably be of most interest, but so is virsh nodedev-list {scsi_host|pci|usb|scsi|scsi_generic|scs_target|net}
What/where is the 'vhost_scsi_wwpn'? IOW: It's not something defined yet - I'm assuming there's some output somewhere where that's pulled from. As near as I can tell, it's not anything that can be autogenerated off the bat. Somebody somewhere needs to first establish the sysfs entries for the vhost target, which can then be stored and loaded on a next (host) boot. The end result looks something like this:
AFAIK systemd handles the sysfs creation (ok at least for my purposes).
This is what I was concerned about - we really need a way to tell someone how to "easily" find the wwpn that's to be used. Although I don't have vhost-scsi on my laptop ;-)
Me neither. Of course, I'm using an s390, so that's a little different too. :)
- I have to figure that by using 'ls -al' on the path from a 'virsh nodedev-dumpxml' of a scsi_hostN that is an HBA will show a <path> that path would then be a starting point...
Googling "scsi_host HBA wwn" shows me some interesting results including ones that seem to indicate 'fc_host' is involved in some way even for the HBA (I guess that makes sense - where the vHBA is just an abstraction off of that). That said, a '/sys/class/fc_host/' is the starting point then to find things and virsh nodedev-list --cap vports would be the way to get capable HBA's.
/sys/class/fc_host is just showing the individual SCSI disks, but nothing that looks to point to the vhost-scsi target that I'm passing to QEMU. /sys/class/fc_vports is empty. (Oh wait, is NPIV turned on here? Ugh, pick this up later.)
[root@xxxxxxxx ~]# ls -l /sys/kernel/config/target/vhost/naa.50014056e4794195/tpgt_1/lun/lun_0/ total 0 Is the 'vhost' or 'vhost/naa.%s' linked from some /sys/class/{scsi|fc}_host/host%d directory?
No, I can't see anybody linking it from anywhere except the couple cross links in /sys/kernel/config/ shown below.
I think that becomes the key for traversal. Or maybe finding "5001...4195" in some other link from /sys/class/{scsi|fc}_host/....
lrwxrwxrwx 1 root root 0 Sep 13 15:44 752e150d11 -> ../../../../../../target/core/iblock_0/disk0 -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_gp -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_offline -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_status -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_write_md drwxr-xr-x 5 root root 0 Sep 13 15:44 statistics [root@xxxxxxxx ~]# cat /sys/kernel/config/target/core/iblock_0/disk0/info Status: ACTIVATED Max Queue Depth: 0 SectorSize: 512 HwMaxSectors: 4592 iBlock device: dm-0 UDEV PATH: /dev/mapper/mpathb readonly: 0 Major: 252 Minor: 0 CLAIMED: IBLOCK
(Aside: note the folder name in /sys/kernel/config that contains the wwn/wwpn name with the "naa." prefix.)
I see... Then again "/sys/kernel" isn't the path used by nodedev driver, hence my question above.
Erg, so then the nodedev stuff becomes bigger still.
One final question - what is this exposed as on the guest? "Some" scsi_hostM? For a SCSI LUN it's some 'sgN' value. I see the qemu side seems to infer some amount of scsi_generic code (read_naa_id), so it's mostly curiosity/learning for me. Anything that's put behind the target gets seen to the guest. So that can be one or more sgN LUNs and their sdX equivalents, and as near as I can tell a scsi_hostM as you describe. A quick test with two LUNs behind one vhost-scsi target gives me the following in the guest:
# ls -l /sys/bus/scsi/devices total 0 lrwxrwxrwx 1 root root 0 Sep 13 16:10 0:0:1:0 -> ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0/target0:0:1/0:0:1:0 lrwxrwxrwx 1 root root 0 Sep 13 16:10 0:0:1:1 -> ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0/target0:0:1/0:0:1:1 lrwxrwxrwx 1 root root 0 Sep 13 16:10 host0 -> ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0 lrwxrwxrwx 1 root root 0 Sep 13 16:11 target0:0:1 -> ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0/target0:0:1 # lsscsi [0:0:1:0] disk LIO-ORG disk0 4.0 /dev/sda [0:0:1:1] disk LIO-ORG disk1 4.0 /dev/sdb
FYI, hotplugging/unplugging devices to a target works just fine, except there's no notification that a guest can pick up on to know that a device had been added. Or worse, taken away.
Hmm. Maybe I'm just not asking the right question.
I guess my perception was that if you expose the 'scsi_hostN' HBA then you are giving the guest the whole HBA to use as it sees fit. The host shouldn't be touching the luns afterwards.
Correct, it's just not a "scsi_hostN" HBA that exists as part of the normal fibre attachment, but a vHBA that's defined to the vhost-scsi driver (via targetcli or some other manual handshaking).
What you've shown is 2 LUNS in the /sys/bus/scsi/ tree; however, is there a /sys/bus/scsi_host/hostN that corresponds to the passed through HBA?
Nope... (s390 host) # ls -l /sys/bus/ total 0 drwxr-xr-x 4 root root 0 Sep 13 23:12 ccw drwxr-xr-x 4 root root 0 Sep 13 23:12 clockevents drwxr-xr-x 4 root root 0 Sep 13 23:12 clocksource drwxr-xr-x 4 root root 0 Sep 13 23:12 container drwxr-xr-x 4 root root 0 Sep 13 23:12 cpu drwxr-xr-x 4 root root 0 Sep 13 23:12 css drwxr-xr-x 4 root root 0 Sep 13 23:12 etr drwxr-xr-x 4 root root 0 Sep 13 23:12 event_source drwxr-xr-x 4 root root 0 Sep 13 23:12 memory drwxr-xr-x 5 root root 0 Sep 13 23:12 pci drwxr-xr-x 4 root root 0 Sep 13 23:12 platform drwxr-xr-x 4 root root 0 Sep 13 23:12 scm drwxr-xr-x 4 root root 0 Sep 13 23:12 scsi drwxr-xr-x 4 root root 0 Sep 13 23:12 stp drwxr-xr-x 4 root root 0 Sep 13 23:12 virtio drwxr-xr-x 4 root root 0 Sep 13 23:12 workqueue
[...]
+ <attribute name="protocol"> + <choice> + <value>vhost</value> <!-- vhost, required --> + </choice> + </attribute> + <attribute name="wwpn"> + <data type="string"> + <param name="pattern">(naa\.)[0-9a-fA-F]{16}</param> + </data> If you go with the we'll add "naa." later, then you could just:
<ref name='wwn'/> This rings a bell with me, as to why I had the label be "wwpn" instead of "wwn" above. My attempt at trying to distinguish libvirt internals with something user-facing, maybe?
I think wwpn will still be technically correct (for some strange reason after skimming google results). The HBA will have a port which is how it's reached. I suppose the answer though is in how sysfs views things.
[...]
"pci", - "scsi") + "scsi", + "scsi_host") VIR_ENUM_IMPL(virDomainHostdevSubsysPCIBackend, VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_LAST, @@ -660,6 +661,10 @@ VIR_ENUM_IMPL(virDomainHostdevSubsysSCSIProtocol, "adapter", "iscsi") +VIR_ENUM_IMPL(virDomainHostdevSubsysHostProtocol, + VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_LAST, + "vhost") +
Notice how the virDomainHostdevSubsysSCSIProtocol adds 'adapter' first - this should do something similar (although the preference is to use "none" - I cannot recall why that was different). Because "none" == "adapter" in the case of SCSI?
Some sort of RNG magic in commit id '54ac483e6' - I forget the details (too long ago), but I think it had to do with being able to reuse the existing RNG without creating a whole new option.
Fair enough.
[...]
hmmm... If you're going to change this here, then qemuDomainAttachHostDevice would be changed at the same time. Hence why this ordering stuff to patches is important.
I cannot recall if removing would cause a build error in this case. It does, because of the switch statement not having a "default" case.
Perhaps a different ordering of patches removes the need.... But I know it's tough with this new type... I guess it's odd the *RemoveHostDevice needed it, but the corollary Attach didn't... Ahhh - I see why - the switch in Attach doesn't have the typecast (virDomainHostdevSubsysType).
I can send a patch for that or you can make the modification too...
I have a patch started, but kept it separate from this series because I got distracted/hoped to get v2 out before the last freeze. I will dust that off.
[...]
Thanks for the review. (Silent ACK on a lot of the above comments.) I'll try to get these all in place for a v3 with a little runway before the next freeze.
OK - hopefully I won't be neck deep in some other firefight
Thanks for taking a few minutes away from those to look over this little pile. I appreciate the fires that pop up and get in the way of the usual todo list. :)
and will be able to give them quicker attention. At the very least getting the capabilities changes in which is generally the most conflict...
Apologies for that. I've already rebased/resolved the capability conflict that occurred between posting to the list and now, and moved that to the head of my series. Taking a day's holiday tomorrow, so will get back to the v3 series Thursday. - Eric

On 09/13/2016 05:48 PM, Eric Farman wrote:
On 09/13/2016 04:49 PM, John Ferlan wrote:
[...]
Thanks for the review. (Silent ACK on a lot of the above comments.) I'll try to get these all in place for a v3 with a little runway before the next freeze.
OK - hopefully I won't be neck deep in some other firefight
Thanks for taking a few minutes away from those to look over this little pile. I appreciate the fires that pop up and get in the way of the usual todo list. :)
John, I had hoped to get a v3 off to the list before the 2.3 freeze, but that obviously didn't happen. FYI, I'm off on another tangent for a couple of weeks. With any luck, I'll have this back to the list before 2.4.

Do all the stuff for the vhost-scsi capability in QEMU, so it's in place for our checks later. Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com> --- src/qemu/qemu_capabilities.c | 3 +++ src/qemu/qemu_capabilities.h | 3 +++ tests/qemucapabilitiesdata/caps_1.5.3.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_1.6.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_1.7.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_2.1.1.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_2.4.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_2.5.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_2.6.0-gicv2.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_2.6.0-gicv3.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_2.6.0.ppc64le.xml | 1 + tests/qemucapabilitiesdata/caps_2.6.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_2.7.0.x86_64.xml | 1 + 13 files changed, 17 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 2ca7803..cf1e468 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -342,6 +342,8 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "smm", "virtio-pci-disable-legacy", "query-hotpluggable-cpus", + + "vhost-scsi", /* 235 */ ); @@ -1566,6 +1568,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = { { "pxb-pcie", QEMU_CAPS_DEVICE_PXB_PCIE }, { "tls-creds-x509", QEMU_CAPS_OBJECT_TLS_CREDS_X509 }, { "intel-iommu", QEMU_CAPS_DEVICE_INTEL_IOMMU }, + { "vhost-scsi", QEMU_CAPS_DEVICE_VHOST_SCSI }, }; static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsVirtioBalloon[] = { diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index a74d39f..d8ddc8f 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -376,6 +376,9 @@ typedef enum { QEMU_CAPS_VIRTIO_PCI_DISABLE_LEGACY, /* virtio-*pci.disable-legacy */ QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS, /* qmp command query-hotpluggable-cpus */ + /* 235 */ + QEMU_CAPS_DEVICE_VHOST_SCSI, /* -device vhost-scsi-{ccw,pci} */ + QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/tests/qemucapabilitiesdata/caps_1.5.3.x86_64.xml b/tests/qemucapabilitiesdata/caps_1.5.3.x86_64.xml index 0d048da..eeaf3d5 100644 --- a/tests/qemucapabilitiesdata/caps_1.5.3.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_1.5.3.x86_64.xml @@ -143,6 +143,7 @@ <flag name='device-tray-moved-event'/> <flag name='nec-usb-xhci-ports'/> <flag name='display'/> + <flag name='vhost-scsi'/> <version>1005003</version> <kvmVersion>0</kvmVersion> <package></package> diff --git a/tests/qemucapabilitiesdata/caps_1.6.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_1.6.0.x86_64.xml index a6d4561..b72c4df 100644 --- a/tests/qemucapabilitiesdata/caps_1.6.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_1.6.0.x86_64.xml @@ -148,6 +148,7 @@ <flag name='device-tray-moved-event'/> <flag name='nec-usb-xhci-ports'/> <flag name='display'/> + <flag name='vhost-scsi'/> <version>1006000</version> <kvmVersion>0</kvmVersion> <package></package> diff --git a/tests/qemucapabilitiesdata/caps_1.7.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_1.7.0.x86_64.xml index f756a41..74a4292 100644 --- a/tests/qemucapabilitiesdata/caps_1.7.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_1.7.0.x86_64.xml @@ -150,6 +150,7 @@ <flag name='device-tray-moved-event'/> <flag name='nec-usb-xhci-ports'/> <flag name='display'/> + <flag name='vhost-scsi'/> <version>1007000</version> <kvmVersion>0</kvmVersion> <package></package> diff --git a/tests/qemucapabilitiesdata/caps_2.1.1.x86_64.xml b/tests/qemucapabilitiesdata/caps_2.1.1.x86_64.xml index a77ad9e..cc829da 100644 --- a/tests/qemucapabilitiesdata/caps_2.1.1.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_2.1.1.x86_64.xml @@ -165,6 +165,7 @@ <flag name='name-guest'/> <flag name='drive-detect-zeroes'/> <flag name='display'/> + <flag name='vhost-scsi'/> <version>2001001</version> <kvmVersion>0</kvmVersion> <package></package> diff --git a/tests/qemucapabilitiesdata/caps_2.4.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_2.4.0.x86_64.xml index db778ef..020ea44 100644 --- a/tests/qemucapabilitiesdata/caps_2.4.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_2.4.0.x86_64.xml @@ -185,6 +185,7 @@ <flag name='intel-iommu'/> <flag name='smm'/> <flag name='virtio-pci-disable-legacy'/> + <flag name='vhost-scsi'/> <version>2004000</version> <kvmVersion>0</kvmVersion> <package></package> diff --git a/tests/qemucapabilitiesdata/caps_2.5.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_2.5.0.x86_64.xml index fc915ad..730d755 100644 --- a/tests/qemucapabilitiesdata/caps_2.5.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_2.5.0.x86_64.xml @@ -190,6 +190,7 @@ <flag name='intel-iommu'/> <flag name='smm'/> <flag name='virtio-pci-disable-legacy'/> + <flag name='vhost-scsi'/> <version>2005000</version> <kvmVersion>0</kvmVersion> <package></package> diff --git a/tests/qemucapabilitiesdata/caps_2.6.0-gicv2.aarch64.xml b/tests/qemucapabilitiesdata/caps_2.6.0-gicv2.aarch64.xml index fd14665..73eb588 100644 --- a/tests/qemucapabilitiesdata/caps_2.6.0-gicv2.aarch64.xml +++ b/tests/qemucapabilitiesdata/caps_2.6.0-gicv2.aarch64.xml @@ -159,6 +159,7 @@ <flag name='display'/> <flag name='smm'/> <flag name='virtio-pci-disable-legacy'/> + <flag name='vhost-scsi'/> <version>2005094</version> <kvmVersion>0</kvmVersion> <package></package> diff --git a/tests/qemucapabilitiesdata/caps_2.6.0-gicv3.aarch64.xml b/tests/qemucapabilitiesdata/caps_2.6.0-gicv3.aarch64.xml index eb708f8..6143af3 100644 --- a/tests/qemucapabilitiesdata/caps_2.6.0-gicv3.aarch64.xml +++ b/tests/qemucapabilitiesdata/caps_2.6.0-gicv3.aarch64.xml @@ -159,6 +159,7 @@ <flag name='display'/> <flag name='smm'/> <flag name='virtio-pci-disable-legacy'/> + <flag name='vhost-scsi'/> <version>2005094</version> <kvmVersion>0</kvmVersion> <package></package> diff --git a/tests/qemucapabilitiesdata/caps_2.6.0.ppc64le.xml b/tests/qemucapabilitiesdata/caps_2.6.0.ppc64le.xml index 482b384..f8d9e70 100644 --- a/tests/qemucapabilitiesdata/caps_2.6.0.ppc64le.xml +++ b/tests/qemucapabilitiesdata/caps_2.6.0.ppc64le.xml @@ -153,6 +153,7 @@ <flag name='display'/> <flag name='smm'/> <flag name='virtio-pci-disable-legacy'/> + <flag name='vhost-scsi'/> <version>2005094</version> <kvmVersion>0</kvmVersion> <package></package> diff --git a/tests/qemucapabilitiesdata/caps_2.6.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_2.6.0.x86_64.xml index 60f1fcf..7cbfc1a 100644 --- a/tests/qemucapabilitiesdata/caps_2.6.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_2.6.0.x86_64.xml @@ -196,6 +196,7 @@ <flag name='intel-iommu'/> <flag name='smm'/> <flag name='virtio-pci-disable-legacy'/> + <flag name='vhost-scsi'/> <version>2006000</version> <kvmVersion>0</kvmVersion> <package></package> diff --git a/tests/qemucapabilitiesdata/caps_2.7.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_2.7.0.x86_64.xml index 7a54040..73a060a 100644 --- a/tests/qemucapabilitiesdata/caps_2.7.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_2.7.0.x86_64.xml @@ -194,6 +194,7 @@ <flag name='smm'/> <flag name='virtio-pci-disable-legacy'/> <flag name='query-hotpluggable-cpus'/> + <flag name='vhost-scsi'/> <version>2006091</version> <kvmVersion>0</kvmVersion> <package> (v2.7.0-rc1-52-g42e0d60)</package> -- 1.9.1

On 09/06/2016 08:58 AM, Eric Farman wrote:
Do all the stuff for the vhost-scsi capability in QEMU, so it's in place for our checks later.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com> --- src/qemu/qemu_capabilities.c | 3 +++ src/qemu/qemu_capabilities.h | 3 +++ tests/qemucapabilitiesdata/caps_1.5.3.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_1.6.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_1.7.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_2.1.1.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_2.4.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_2.5.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_2.6.0-gicv2.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_2.6.0-gicv3.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_2.6.0.ppc64le.xml | 1 + tests/qemucapabilitiesdata/caps_2.6.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_2.7.0.x86_64.xml | 1 + 13 files changed, 17 insertions(+)
There needs to be some fairly obvious merges here. I'm assuming they'll show up in a followup which I can merge more quickly to reduce future such merge needs. John

Open /dev/vhost-scsi, and record the resulting file descriptor, so that the guest has access to the host device outside of the libvirt daemon. Pass this information, along with data parsed from the XML file, to build a device string for the qemu command line. That device string will be for either a vhost-scsi-ccw device in the case of an s390 machine, or vhost-scsi-pci for any others. Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> --- src/libvirt_private.syms | 1 + src/qemu/qemu_command.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_command.h | 6 ++++ src/util/virscsi.c | 26 ++++++++++++++++ src/util/virscsi.h | 1 + 5 files changed, 114 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d62c74c..22fe14d 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2265,6 +2265,7 @@ virSCSIDeviceListNew; virSCSIDeviceListSteal; virSCSIDeviceNew; virSCSIDeviceSetUsedBy; +virSCSIOpenVhost; # util/virseclabel.h diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 982c33c..479dde2 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -4711,6 +4711,52 @@ qemuBuildSCSIiSCSIHostdevDrvStr(virDomainHostdevDefPtr dev) } char * +qemuBuildHostHostdevDevStr(const virDomainDef *def, + virDomainHostdevDefPtr dev, + virQEMUCapsPtr qemuCaps, + char *vhostfdName, + size_t vhostfdSize) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + virDomainHostdevSubsysHostPtr hostsrc = &dev->source.subsys.u.host; + + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_VHOST_SCSI)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("This QEMU doesn't support vhost-scsi devices")); + goto cleanup; + } + + if (ARCH_IS_S390(def->os.arch)) + virBufferAddLit(&buf, "vhost-scsi-ccw"); + else + virBufferAddLit(&buf, "vhost-scsi-pci"); + + virBufferAsprintf(&buf, ",wwpn=%s", hostsrc->wwpn); + + if (vhostfdSize == 1) { + virBufferAsprintf(&buf, ",vhostfd=%s", vhostfdName); + } else { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("QEMU supports a single vhost-scsi file descriptor")); + goto cleanup; + } + + virBufferAsprintf(&buf, ",id=%s", dev->info->alias); + + if (qemuBuildDeviceAddressStr(&buf, def, dev->info, qemuCaps) < 0) + goto cleanup; + + if (virBufferCheckError(&buf) < 0) + goto cleanup; + + return virBufferContentAndReset(&buf); + + cleanup: + virBufferFreeAndReset(&buf); + return NULL; +} + +char * qemuBuildSCSIHostdevDrvStr(virDomainHostdevDefPtr dev) { virBuffer buf = VIR_BUFFER_INITIALIZER; @@ -5156,6 +5202,40 @@ qemuBuildHostdevCommandLine(virCommandPtr cmd, return -1; } } + + /* SCSI_host */ + if (hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST) { + if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_SCSI_GENERIC)) { + if (hostdev->source.subsys.u.host.protocol == VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_VHOST) { + char *vhostfdName = NULL; + int vhostfd = -1; + size_t vhostfdSize = 1; + + if (virSCSIOpenVhost(&vhostfd, &vhostfdSize) < 0) + return -1; + + if (virAsprintf(&vhostfdName, "%d", vhostfd) < 0) { + VIR_FORCE_CLOSE(vhostfd); + return -1; + } + + virCommandPassFD(cmd, vhostfd, VIR_COMMAND_PASS_FD_CLOSE_PARENT); + + virCommandAddArg(cmd, "-device"); + if (!(devstr = qemuBuildHostHostdevDevStr(def, hostdev, qemuCaps, vhostfdName, vhostfdSize))) + return -1; + virCommandAddArg(cmd, devstr); + + VIR_FREE(vhostfdName); + VIR_FREE(devstr); + } + } else { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("SCSI passthrough is not supported by this version of qemu")); + return -1; + } + } } return 0; diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index 9b9ccb6..78179de 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -158,6 +158,12 @@ char *qemuBuildSCSIHostdevDrvStr(virDomainHostdevDefPtr dev); char *qemuBuildSCSIHostdevDevStr(const virDomainDef *def, virDomainHostdevDefPtr dev, virQEMUCapsPtr qemuCaps); +char * +qemuBuildHostHostdevDevStr(const virDomainDef *def, + virDomainHostdevDefPtr dev, + virQEMUCapsPtr qemuCaps, + char *vhostfdName, + size_t vhostfdSize); char *qemuBuildRedirdevDevStr(const virDomainDef *def, virDomainRedirdevDefPtr dev, diff --git a/src/util/virscsi.c b/src/util/virscsi.c index 4843367..290b692 100644 --- a/src/util/virscsi.c +++ b/src/util/virscsi.c @@ -105,6 +105,32 @@ virSCSIDeviceGetAdapterId(const char *adapter, return -1; } +int +virSCSIOpenVhost(int *vhostfd, + size_t *vhostfdSize) +{ + size_t i; + + for (i = 0; i < *vhostfdSize; i++) { + vhostfd[i] = open("/dev/vhost-scsi", O_RDWR); + + if (vhostfd[i] < 0) { + virReportSystemError(errno, "%s", + _("vhost-scsi was requested for an interface, " + "but is unavailable")); + goto error; + } + } + + return 0; + + error: + while (i--) + VIR_FORCE_CLOSE(vhostfd[i]); + + return -1; +} + char * virSCSIDeviceGetSgName(const char *sysfs_prefix, const char *adapter, diff --git a/src/util/virscsi.h b/src/util/virscsi.h index df40d7f..cb37da8 100644 --- a/src/util/virscsi.h +++ b/src/util/virscsi.h @@ -33,6 +33,7 @@ typedef virSCSIDevice *virSCSIDevicePtr; typedef struct _virSCSIDeviceList virSCSIDeviceList; typedef virSCSIDeviceList *virSCSIDeviceListPtr; +int virSCSIOpenVhost(int *vhostfd, size_t *vhostfdSize); char *virSCSIDeviceGetSgName(const char *sysfs_prefix, const char *adapter, unsigned int bus, -- 1.9.1

On 09/06/2016 08:58 AM, Eric Farman wrote:
Open /dev/vhost-scsi, and record the resulting file descriptor, so that the guest has access to the host device outside of the libvirt daemon. Pass this information, along with data parsed from the XML file, to build a device string for the qemu command line. That device string will be for either a vhost-scsi-ccw device in the case of an s390 machine, or vhost-scsi-pci for any others.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> --- src/libvirt_private.syms | 1 + src/qemu/qemu_command.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_command.h | 6 ++++ src/util/virscsi.c | 26 ++++++++++++++++ src/util/virscsi.h | 1 + 5 files changed, 114 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d62c74c..22fe14d 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2265,6 +2265,7 @@ virSCSIDeviceListNew; virSCSIDeviceListSteal; virSCSIDeviceNew; virSCSIDeviceSetUsedBy; +virSCSIOpenVhost;
# util/virseclabel.h diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 982c33c..479dde2 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -4711,6 +4711,52 @@ qemuBuildSCSIiSCSIHostdevDrvStr(virDomainHostdevDefPtr dev) }
char * +qemuBuildHostHostdevDevStr(const virDomainDef *def, + virDomainHostdevDefPtr dev, + virQEMUCapsPtr qemuCaps, + char *vhostfdName, + size_t vhostfdSize) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + virDomainHostdevSubsysHostPtr hostsrc = &dev->source.subsys.u.host; + + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_VHOST_SCSI)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("This QEMU doesn't support vhost-scsi devices")); + goto cleanup; + } + + if (ARCH_IS_S390(def->os.arch)) + virBufferAddLit(&buf, "vhost-scsi-ccw"); + else + virBufferAddLit(&buf, "vhost-scsi-pci"); + + virBufferAsprintf(&buf, ",wwpn=%s", hostsrc->wwpn);
This is where id add the "naa." prefix, e.g wwpn=naa.%s" - with a somewhat healthy comment behind it as to why it's being used.
+ + if (vhostfdSize == 1) { + virBufferAsprintf(&buf, ",vhostfd=%s", vhostfdName); + } else { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("QEMU supports a single vhost-scsi file descriptor")); + goto cleanup; + } + + virBufferAsprintf(&buf, ",id=%s", dev->info->alias); + + if (qemuBuildDeviceAddressStr(&buf, def, dev->info, qemuCaps) < 0) + goto cleanup;
I think this is what I would think auditing (during hotplug of course) would end up using as the address rather than what will happen in patch 1 resulting in "?" being displayed.
+ + if (virBufferCheckError(&buf) < 0) + goto cleanup; + + return virBufferContentAndReset(&buf); + + cleanup: + virBufferFreeAndReset(&buf); + return NULL; +} + +char * qemuBuildSCSIHostdevDrvStr(virDomainHostdevDefPtr dev) { virBuffer buf = VIR_BUFFER_INITIALIZER; @@ -5156,6 +5202,40 @@ qemuBuildHostdevCommandLine(virCommandPtr cmd, return -1; } } + + /* SCSI_host */ + if (hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST) { + if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_SCSI_GENERIC)) {
I'm conflicted if we should check QEMU_CAPS_DEVICE_VHOST_SCSI here as well... I know it's checked later, but we open vhost-scsi which I assume wouldn't exist if the cap wasn't there. Of course I see in hotplug things have to be a little different in order to add that fd and name via monitor rather than command.
+ if (hostdev->source.subsys.u.host.protocol == VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_VHOST) {
80 cols for the line
+ char *vhostfdName = NULL; + int vhostfd = -1; + size_t vhostfdSize = 1; + + if (virSCSIOpenVhost(&vhostfd, &vhostfdSize) < 0) + return -1; + + if (virAsprintf(&vhostfdName, "%d", vhostfd) < 0) { + VIR_FORCE_CLOSE(vhostfd); + return -1; + } + + virCommandPassFD(cmd, vhostfd, VIR_COMMAND_PASS_FD_CLOSE_PARENT);
80 cols for the line.
+ + virCommandAddArg(cmd, "-device"); + if (!(devstr = qemuBuildHostHostdevDevStr(def, hostdev, qemuCaps, vhostfdName, vhostfdSize)))
This is a really long line - try to keep at 80 cols.
+ return -1;
We'd leak vhostfdName on failure (I think the vhostfd will be reaped by Command cleanup. It might just be easier to extract the whole hunk into a function and do all the error processing there. Hey - BTW: I see this PCI configfd - I betcha a bit of tracking on that will help give you examples for security_*.c changes and whether they're necessary or not. I didn't go chase.
+ virCommandAddArg(cmd, devstr); + + VIR_FREE(vhostfdName); + VIR_FREE(devstr); + } + } else { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("SCSI passthrough is not supported by this version of qemu")); + return -1; + } + } }
return 0; diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index 9b9ccb6..78179de 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -158,6 +158,12 @@ char *qemuBuildSCSIHostdevDrvStr(virDomainHostdevDefPtr dev); char *qemuBuildSCSIHostdevDevStr(const virDomainDef *def, virDomainHostdevDefPtr dev, virQEMUCapsPtr qemuCaps); +char * +qemuBuildHostHostdevDevStr(const virDomainDef *def, + virDomainHostdevDefPtr dev, + virQEMUCapsPtr qemuCaps, + char *vhostfdName, + size_t vhostfdSize);
char *qemuBuildRedirdevDevStr(const virDomainDef *def, virDomainRedirdevDefPtr dev, diff --git a/src/util/virscsi.c b/src/util/virscsi.c index 4843367..290b692 100644 --- a/src/util/virscsi.c +++ b/src/util/virscsi.c @@ -105,6 +105,32 @@ virSCSIDeviceGetAdapterId(const char *adapter, return -1; }
+int +virSCSIOpenVhost(int *vhostfd, + size_t *vhostfdSize)
Well this is dangerous... I can pass "any" value value here and really cause some damage. Why the need for a loop? I think you just attempt to open the file (you could even to a virFileExists() if you want to care about checking if the /dev/vhost-scsi exists before opening ...
+{ + size_t i; + + for (i = 0; i < *vhostfdSize; i++) { + vhostfd[i] = open("/dev/vhost-scsi", O_RDWR); + + if (vhostfd[i] < 0) { + virReportSystemError(errno, "%s", + _("vhost-scsi was requested for an interface, " + "but is unavailable"));
Simplify your error _("failed to open /dev/vhost-scsi") John
+ goto error; + } + } + + return 0; + + error: + while (i--) + VIR_FORCE_CLOSE(vhostfd[i]); + + return -1; +} + char * virSCSIDeviceGetSgName(const char *sysfs_prefix, const char *adapter, diff --git a/src/util/virscsi.h b/src/util/virscsi.h index df40d7f..cb37da8 100644 --- a/src/util/virscsi.h +++ b/src/util/virscsi.h @@ -33,6 +33,7 @@ typedef virSCSIDevice *virSCSIDevicePtr; typedef struct _virSCSIDeviceList virSCSIDeviceList; typedef virSCSIDeviceList *virSCSIDeviceListPtr;
+int virSCSIOpenVhost(int *vhostfd, size_t *vhostfdSize); char *virSCSIDeviceGetSgName(const char *sysfs_prefix, const char *adapter, unsigned int bus,

On 09/12/2016 06:10 PM, John Ferlan wrote:
On 09/06/2016 08:58 AM, Eric Farman wrote:
Open /dev/vhost-scsi, and record the resulting file descriptor, so that the guest has access to the host device outside of the libvirt daemon. Pass this information, along with data parsed from the XML file, to build a device string for the qemu command line. That device string will be for either a vhost-scsi-ccw device in the case of an s390 machine, or vhost-scsi-pci for any others.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> --- src/libvirt_private.syms | 1 + src/qemu/qemu_command.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_command.h | 6 ++++ src/util/virscsi.c | 26 ++++++++++++++++ src/util/virscsi.h | 1 + 5 files changed, 114 insertions(+)
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d62c74c..22fe14d 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2265,6 +2265,7 @@ virSCSIDeviceListNew; virSCSIDeviceListSteal; virSCSIDeviceNew; virSCSIDeviceSetUsedBy; +virSCSIOpenVhost;
# util/virseclabel.h diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 982c33c..479dde2 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -4711,6 +4711,52 @@ qemuBuildSCSIiSCSIHostdevDrvStr(virDomainHostdevDefPtr dev) }
char * +qemuBuildHostHostdevDevStr(const virDomainDef *def, + virDomainHostdevDefPtr dev, + virQEMUCapsPtr qemuCaps, + char *vhostfdName, + size_t vhostfdSize) +{ + virBuffer buf = VIR_BUFFER_INITIALIZER; + virDomainHostdevSubsysHostPtr hostsrc = &dev->source.subsys.u.host; + + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_VHOST_SCSI)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("This QEMU doesn't support vhost-scsi devices")); + goto cleanup; + } + + if (ARCH_IS_S390(def->os.arch)) + virBufferAddLit(&buf, "vhost-scsi-ccw"); + else + virBufferAddLit(&buf, "vhost-scsi-pci"); + + virBufferAsprintf(&buf, ",wwpn=%s", hostsrc->wwpn); This is where id add the "naa." prefix, e.g wwpn=naa.%s" - with a somewhat healthy comment behind it as to why it's being used.
+ + if (vhostfdSize == 1) { + virBufferAsprintf(&buf, ",vhostfd=%s", vhostfdName); + } else { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("QEMU supports a single vhost-scsi file descriptor")); + goto cleanup; + } + + virBufferAsprintf(&buf, ",id=%s", dev->info->alias); + + if (qemuBuildDeviceAddressStr(&buf, def, dev->info, qemuCaps) < 0) + goto cleanup; I think this is what I would think auditing (during hotplug of course) would end up using as the address rather than what will happen in patch 1 resulting in "?" being displayed.
+ + if (virBufferCheckError(&buf) < 0) + goto cleanup; + + return virBufferContentAndReset(&buf); + + cleanup: + virBufferFreeAndReset(&buf); + return NULL; +} + +char * qemuBuildSCSIHostdevDrvStr(virDomainHostdevDefPtr dev) { virBuffer buf = VIR_BUFFER_INITIALIZER; @@ -5156,6 +5202,40 @@ qemuBuildHostdevCommandLine(virCommandPtr cmd, return -1; } } + + /* SCSI_host */ + if (hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && + subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST) { + if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_SCSI_GENERIC)) { I'm conflicted if we should check QEMU_CAPS_DEVICE_VHOST_SCSI here as well... I know it's checked later, but we open vhost-scsi which I assume wouldn't exist if the cap wasn't there.
Of course I see in hotplug things have to be a little different in order to add that fd and name via monitor rather than command.
+ if (hostdev->source.subsys.u.host.protocol == VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_VHOST) { 80 cols for the line + char *vhostfdName = NULL; + int vhostfd = -1; + size_t vhostfdSize = 1; + + if (virSCSIOpenVhost(&vhostfd, &vhostfdSize) < 0) + return -1; + + if (virAsprintf(&vhostfdName, "%d", vhostfd) < 0) { + VIR_FORCE_CLOSE(vhostfd); + return -1; + } + + virCommandPassFD(cmd, vhostfd, VIR_COMMAND_PASS_FD_CLOSE_PARENT); 80 cols for the line. + + virCommandAddArg(cmd, "-device"); + if (!(devstr = qemuBuildHostHostdevDevStr(def, hostdev, qemuCaps, vhostfdName, vhostfdSize))) This is a really long line - try to keep at 80 cols.
+ return -1; We'd leak vhostfdName on failure (I think the vhostfd will be reaped by Command cleanup.
My bad.
It might just be easier to extract the whole hunk into a function and do all the error processing there.
Hey - BTW: I see this PCI configfd - I betcha a bit of tracking on that will help give you examples for security_*.c changes and whether they're necessary or not. I didn't go chase.
Ooh, good lead. I said I'll chase down again, that's where I'll start.
+ virCommandAddArg(cmd, devstr); + + VIR_FREE(vhostfdName); + VIR_FREE(devstr); + } + } else { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("SCSI passthrough is not supported by this version of qemu")); + return -1; + } + } }
return 0; diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index 9b9ccb6..78179de 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -158,6 +158,12 @@ char *qemuBuildSCSIHostdevDrvStr(virDomainHostdevDefPtr dev); char *qemuBuildSCSIHostdevDevStr(const virDomainDef *def, virDomainHostdevDefPtr dev, virQEMUCapsPtr qemuCaps); +char * +qemuBuildHostHostdevDevStr(const virDomainDef *def, + virDomainHostdevDefPtr dev, + virQEMUCapsPtr qemuCaps, + char *vhostfdName, + size_t vhostfdSize);
char *qemuBuildRedirdevDevStr(const virDomainDef *def, virDomainRedirdevDefPtr dev, diff --git a/src/util/virscsi.c b/src/util/virscsi.c index 4843367..290b692 100644 --- a/src/util/virscsi.c +++ b/src/util/virscsi.c @@ -105,6 +105,32 @@ virSCSIDeviceGetAdapterId(const char *adapter, return -1; }
+int +virSCSIOpenVhost(int *vhostfd, + size_t *vhostfdSize) Well this is dangerous... I can pass "any" value value here and really cause some damage. Why the need for a loop?
Well, I guess I only half-cleaned it up. I'm not aware of any mechanism to pass multiple fd's for vhost-scsi-pci or vhost-scsi-ccw into QEMU (unlike the virtio-scsi-{ccw|pci} paths), so I started ripping the loops out. Looking at it now, I guess I didn't rip out nearly as much as I thought I did. So, more to remove I guess. Unless leaving it there for an "if the future ever arrives" case is a problem.
I think you just attempt to open the file (you could even to a virFileExists() if you want to care about checking if the /dev/vhost-scsi exists before opening ...
I like this.
+{ + size_t i; + + for (i = 0; i < *vhostfdSize; i++) { + vhostfd[i] = open("/dev/vhost-scsi", O_RDWR); + + if (vhostfd[i] < 0) { + virReportSystemError(errno, "%s", + _("vhost-scsi was requested for an interface, " + "but is unavailable")); Simplify your error _("failed to open /dev/vhost-scsi")
John
Many thanks again for the reviews. - Eric
+ goto error; + } + } + + return 0; + + error: + while (i--) + VIR_FORCE_CLOSE(vhostfd[i]); + + return -1; +} + char * virSCSIDeviceGetSgName(const char *sysfs_prefix, const char *adapter, diff --git a/src/util/virscsi.h b/src/util/virscsi.h index df40d7f..cb37da8 100644 --- a/src/util/virscsi.h +++ b/src/util/virscsi.h @@ -33,6 +33,7 @@ typedef virSCSIDevice *virSCSIDevicePtr; typedef struct _virSCSIDeviceList virSCSIDeviceList; typedef virSCSIDeviceList *virSCSIDeviceListPtr;
+int virSCSIOpenVhost(int *vhostfd, size_t *vhostfdSize); char *virSCSIDeviceGetSgName(const char *sysfs_prefix, const char *adapter, unsigned int bus,

[...]
diff --git a/src/util/virscsi.c b/src/util/virscsi.c index 4843367..290b692 100644 --- a/src/util/virscsi.c +++ b/src/util/virscsi.c @@ -105,6 +105,32 @@ virSCSIDeviceGetAdapterId(const char *adapter, return -1; } +int +virSCSIOpenVhost(int *vhostfd, + size_t *vhostfdSize) Well this is dangerous... I can pass "any" value value here and really cause some damage. Why the need for a loop?
Well, I guess I only half-cleaned it up. I'm not aware of any mechanism to pass multiple fd's for vhost-scsi-pci or vhost-scsi-ccw into QEMU (unlike the virtio-scsi-{ccw|pci} paths), so I started ripping the loops out. Looking at it now, I guess I didn't rip out nearly as much as I thought I did. So, more to remove I guess. Unless leaving it there for an "if the future ever arrives" case is a problem.
Doh - search on 'vhost-net' to find qemuInterfaceOpenVhostNet... That's a multiple open model and depending on virtio.queues which is something I believe I was asking about for an attribute num_queues... John

On 09/13/2016 04:55 PM, John Ferlan wrote:
[...]
diff --git a/src/util/virscsi.c b/src/util/virscsi.c index 4843367..290b692 100644 --- a/src/util/virscsi.c +++ b/src/util/virscsi.c @@ -105,6 +105,32 @@ virSCSIDeviceGetAdapterId(const char *adapter, return -1; } +int +virSCSIOpenVhost(int *vhostfd, + size_t *vhostfdSize) Well this is dangerous... I can pass "any" value value here and really cause some damage. Why the need for a loop? Well, I guess I only half-cleaned it up. I'm not aware of any mechanism to pass multiple fd's for vhost-scsi-pci or vhost-scsi-ccw into QEMU (unlike the virtio-scsi-{ccw|pci} paths), so I started ripping the loops out. Looking at it now, I guess I didn't rip out nearly as much as I thought I did. So, more to remove I guess. Unless leaving it there for an "if the future ever arrives" case is a problem.
Doh - search on 'vhost-net' to find qemuInterfaceOpenVhostNet... That's a multiple open model and depending on virtio.queues which is something I believe I was asking about for an attribute num_queues...
Yeah, that was the starting point. What I'd discovered, and why I started ripping this apart, is that QEMU itself doesn't have an option to pass multiple fd's for a vhost-scsi device. So while for vhost-net has fd= and (vhost)fds=, vhost-scsi only has vhostfd=. qemu-system-s390x: -device vhost-scsi-ccw,wwpn=naa.500140594524c195,vhostfds=5: Property '.vhostfds' not found The queues parameter operates in conjunction with the number of virtqueue rings that are instantiated for the guest process. - Eric

Adjust the device string that is built for vhost-scsi devices so that it can be invoked from hotplug.
From the QEMU command line, the file descriptors are expect to be numeric only. However, for hotplug, the file descriptors are expected to begin with at least one alphabetic character else this error occurs:
# virsh attach-device guest_0001 ~/vhost.xml error: Failed to attach device from /root/vhost.xml error: internal error: unable to execute QEMU command 'getfd': Parameter 'fdname' expects a name not starting with a digit We also close the file descriptor in this case, so that shutting down the guest cleans up the host cgroup entries and allows future guests to use vhost-scsi devices. (Otherwise the guest will silently end.) Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> --- src/conf/domain_conf.c | 6 ++ src/qemu/qemu_hotplug.c | 148 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 154 insertions(+) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 45b643b..123bbcb 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -13720,6 +13720,12 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, return virDomainHostdevMatchSubsysSCSIiSCSI(a, b); else return virDomainHostdevMatchSubsysSCSIHost(a, b); + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: + if (a->source.subsys.u.host.protocol != + b->source.subsys.u.host.protocol) + return 0; + if (STREQ(a->source.subsys.u.host.wwpn, b->source.subsys.u.host.wwpn)) + return 1; } return 0; } diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index e9140a9..5f8d411 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -2165,6 +2165,119 @@ qemuDomainAttachHostSCSIDevice(virConnectPtr conn, goto cleanup; } +static int +qemuDomainAttachHostSCSIHostDevice(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainHostdevDefPtr hostdev) +{ + int ret = -1; + qemuDomainObjPrivatePtr priv = vm->privateData; + virDomainCCWAddressSetPtr ccwaddrs = NULL; + char *vhostfdName = NULL; + int vhostfd = -1; + size_t vhostfdSize = 1; + char *devstr = NULL; + bool teardowncgroup = false; + bool teardownlabel = false; + bool releaseaddr = false; + + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DEVICE_SCSI_GENERIC)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("SCSI passthrough is not supported by this version of qemu")); + goto cleanup; + } + + if (qemuSetupHostdevCgroup(vm, hostdev) < 0) + goto cleanup; + teardowncgroup = true; + + if (virSecurityManagerSetHostdevLabel(driver->securityManager, + vm->def, hostdev, NULL) < 0) + goto cleanup; + teardownlabel = true; + + if (virSCSIOpenVhost(&vhostfd, &vhostfdSize) < 0) + goto cleanup; + + if (virAsprintf(&vhostfdName, "vhostfd-%d", vhostfd) < 0) + goto cleanup; + + if (hostdev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) { + if (qemuDomainMachineIsS390CCW(vm->def) && + virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_VIRTIO_CCW)) + hostdev->info->type = VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW; + } + + if (hostdev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE || + hostdev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI ) { + if (virDomainPCIAddressEnsureAddr(priv->pciaddrs, &hostdev->info) < 0) + goto cleanup; + } else if (hostdev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW) { + if (!(ccwaddrs = qemuDomainCCWAddrSetCreateFromDomain(vm->def))) + goto cleanup; + if (virDomainCCWAddressAssign(hostdev->info, ccwaddrs, + !hostdev->info->addr.ccw.assigned) < 0) + goto cleanup; + } + releaseaddr = true; + + if (qemuAssignDeviceHostdevAlias(vm->def, &hostdev->info->alias, -1) < 0) + goto cleanup; + + qemuDomainObjEnterMonitor(driver, vm); + + if (hostdev->source.subsys.u.host.protocol == VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_VHOST) { + if (!(devstr = qemuBuildHostHostdevDevStr(vm->def, + hostdev, + priv->qemuCaps, + vhostfdName, + vhostfdSize))) + goto exit_monitor; + + if ((ret = qemuMonitorAddDeviceWithFd(priv->mon, + devstr, + vhostfd, + vhostfdName)) < 0) { + goto exit_monitor; + } + } + + if (qemuDomainObjExitMonitor(driver, vm) < 0) + goto cleanup; + + virDomainAuditHostdev(vm, hostdev, "attach", true); + + if (VIR_REALLOC_N(vm->def->hostdevs, vm->def->nhostdevs + 1) < 0) + goto cleanup; + + vm->def->hostdevs[vm->def->nhostdevs++] = hostdev; + + virDomainCCWAddressSetFree(ccwaddrs); + + VIR_FORCE_CLOSE(vhostfd); + VIR_FREE(vhostfdName); + VIR_FREE(devstr); + return 0; + + exit_monitor: + ignore_value(qemuDomainObjExitMonitor(driver, vm)); + + cleanup: + if (teardowncgroup && qemuTeardownHostdevCgroup(vm, hostdev) < 0) + VIR_WARN("Unable to remove host device cgroup ACL on hotplug fail"); + if (teardownlabel && + virSecurityManagerRestoreHostdevLabel(driver->securityManager, + vm->def, hostdev, NULL) < 0) + VIR_WARN("Unable to restore host device labelling on hotplug fail"); + if (releaseaddr) + qemuDomainReleaseDeviceAddress(vm, hostdev->info, NULL); + + VIR_FORCE_CLOSE(vhostfd); + VIR_FREE(vhostfdName); + VIR_FREE(devstr); + return ret; +} + int qemuDomainAttachHostDevice(virConnectPtr conn, @@ -2198,6 +2311,11 @@ qemuDomainAttachHostDevice(virConnectPtr conn, goto error; break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: + if (qemuDomainAttachHostSCSIHostDevice(driver, vm, hostdev) < 0) + goto error; + break; + default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("hostdev subsys type '%s' not supported"), @@ -3960,6 +4078,31 @@ qemuDomainDetachHostSCSIDevice(virQEMUDriverPtr driver, } static int +qemuDomainDetachHostSCSIHostDevice(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainHostdevDefPtr detach) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + int ret = -1; + + if (!detach->info->alias) { + virReportError(VIR_ERR_OPERATION_FAILED, + "%s", _("device cannot be detached without a device alias")); + return -1; + } + + qemuDomainMarkDeviceForRemoval(vm, detach->info); + + qemuDomainObjEnterMonitor(driver, vm); + ret = qemuMonitorDelDevice(priv->mon, detach->info->alias); + + if (qemuDomainObjExitMonitor(driver, vm) < 0) + return -1; + + return ret; +} + +static int qemuDomainDetachThisHostDevice(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainHostdevDefPtr detach) @@ -3981,6 +4124,9 @@ qemuDomainDetachThisHostDevice(virQEMUDriverPtr driver, case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI: ret = qemuDomainDetachHostSCSIDevice(driver, vm, detach); break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: + ret = qemuDomainDetachHostSCSIHostDevice(driver, vm, detach); + break; default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("hostdev subsys type '%s' not supported"), @@ -4058,6 +4204,8 @@ int qemuDomainDetachHostDevice(virQEMUDriverPtr driver, } break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: + break; default: virReportError(VIR_ERR_INTERNAL_ERROR, _("unexpected hostdev type %d"), subsys->type); -- 1.9.1

On 09/06/2016 08:58 AM, Eric Farman wrote:
Adjust the device string that is built for vhost-scsi devices so that it can be invoked from hotplug.
From the QEMU command line, the file descriptors are expect to be numeric only. However, for hotplug, the file descriptors are expected to begin with at least one alphabetic character else this error occurs:
# virsh attach-device guest_0001 ~/vhost.xml error: Failed to attach device from /root/vhost.xml error: internal error: unable to execute QEMU command 'getfd': Parameter 'fdname' expects a name not starting with a digit
We also close the file descriptor in this case, so that shutting down the guest cleans up the host cgroup entries and allows future guests to use vhost-scsi devices. (Otherwise the guest will silently end.)
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> --- src/conf/domain_conf.c | 6 ++ src/qemu/qemu_hotplug.c | 148 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 154 insertions(+)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 45b643b..123bbcb 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -13720,6 +13720,12 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a, return virDomainHostdevMatchSubsysSCSIiSCSI(a, b); else return virDomainHostdevMatchSubsysSCSIHost(a, b); + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: + if (a->source.subsys.u.host.protocol != + b->source.subsys.u.host.protocol) + return 0; + if (STREQ(a->source.subsys.u.host.wwpn, b->source.subsys.u.host.wwpn)) + return 1; } return 0; }
This hunk seems to be out of place in this patch... Although I assume this is because of the remove device searches.
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index e9140a9..5f8d411 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -2165,6 +2165,119 @@ qemuDomainAttachHostSCSIDevice(virConnectPtr conn, goto cleanup; }
+static int +qemuDomainAttachHostSCSIHostDevice(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainHostdevDefPtr hostdev) +{ + int ret = -1; + qemuDomainObjPrivatePtr priv = vm->privateData; + virDomainCCWAddressSetPtr ccwaddrs = NULL; + char *vhostfdName = NULL; + int vhostfd = -1; + size_t vhostfdSize = 1; + char *devstr = NULL; + bool teardowncgroup = false; + bool teardownlabel = false; + bool releaseaddr = false; + + if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DEVICE_SCSI_GENERIC)) {
Here again the capabilities conundrum.
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("SCSI passthrough is not supported by this version of qemu")); + goto cleanup; + } + + if (qemuSetupHostdevCgroup(vm, hostdev) < 0) + goto cleanup;
Oh, um, didn't I comment on this in patch 1 - there's no VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST in qemuSetupHostdevCgroup
+ teardowncgroup = true; + + if (virSecurityManagerSetHostdevLabel(driver->securityManager, + vm->def, hostdev, NULL) < 0) + goto cleanup; + teardownlabel = true;
Likewise there's really not much going on...
+ + if (virSCSIOpenVhost(&vhostfd, &vhostfdSize) < 0) + goto cleanup; + + if (virAsprintf(&vhostfdName, "vhostfd-%d", vhostfd) < 0) + goto cleanup; + + if (hostdev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) { + if (qemuDomainMachineIsS390CCW(vm->def) && + virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_VIRTIO_CCW)) + hostdev->info->type = VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW; + } + + if (hostdev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE || + hostdev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI ) { + if (virDomainPCIAddressEnsureAddr(priv->pciaddrs, &hostdev->info) < 0)
'info' is already a virDomainDeviceInfoPtr (unlike where you cut-n-paste'd from), so no need for the '&' here (besides it induces a build failure).
+ goto cleanup; + } else if (hostdev->info->type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW) { + if (!(ccwaddrs = qemuDomainCCWAddrSetCreateFromDomain(vm->def))) + goto cleanup; + if (virDomainCCWAddressAssign(hostdev->info, ccwaddrs, + !hostdev->info->addr.ccw.assigned) < 0) + goto cleanup; + } + releaseaddr = true; + + if (qemuAssignDeviceHostdevAlias(vm->def, &hostdev->info->alias, -1) < 0) + goto cleanup; + + qemuDomainObjEnterMonitor(driver, vm); + + if (hostdev->source.subsys.u.host.protocol == VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_VHOST) {
Long line > 80 cols Ironically we do nothing if the protocol isn't VHOST...
+ if (!(devstr = qemuBuildHostHostdevDevStr(vm->def, + hostdev, + priv->qemuCaps, + vhostfdName, + vhostfdSize)))
This can be done outside holding the monitor lock
+ goto exit_monitor; + + if ((ret = qemuMonitorAddDeviceWithFd(priv->mon, + devstr, + vhostfd, + vhostfdName)) < 0) { + goto exit_monitor; + } + } + + if (qemuDomainObjExitMonitor(driver, vm) < 0) + goto cleanup; + + virDomainAuditHostdev(vm, hostdev, "attach", true);
s/true/(ret == 0)/ Since we've set up the addresses above - the auditing code should print more than "?" Here should be a "if (ret < 0) goto cleanup;" (on two lines).
+ + if (VIR_REALLOC_N(vm->def->hostdevs, vm->def->nhostdevs + 1) < 0) + goto cleanup; + + vm->def->hostdevs[vm->def->nhostdevs++] = hostdev;
Other code will increase the size of hostdevs before the monitor calls, then just insert.
+ + virDomainCCWAddressSetFree(ccwaddrs); + + VIR_FORCE_CLOSE(vhostfd);
This I would think would only be done on *failure* to AddDeviceWithFd, but I see you're following qemuDomainAttachHostPCIDevice
+ VIR_FREE(vhostfdName); + VIR_FREE(devstr); + return 0; + + exit_monitor: + ignore_value(qemuDomainObjExitMonitor(driver, vm)); + + cleanup: + if (teardowncgroup && qemuTeardownHostdevCgroup(vm, hostdev) < 0) + VIR_WARN("Unable to remove host device cgroup ACL on hotplug fail"); + if (teardownlabel && + virSecurityManagerRestoreHostdevLabel(driver->securityManager, + vm->def, hostdev, NULL) < 0) + VIR_WARN("Unable to restore host device labelling on hotplug fail"); + if (releaseaddr) + qemuDomainReleaseDeviceAddress(vm, hostdev->info, NULL); + + VIR_FORCE_CLOSE(vhostfd); + VIR_FREE(vhostfdName); + VIR_FREE(devstr); + return ret; +} +
int qemuDomainAttachHostDevice(virConnectPtr conn, @@ -2198,6 +2311,11 @@ qemuDomainAttachHostDevice(virConnectPtr conn, goto error; break;
+ case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: + if (qemuDomainAttachHostSCSIHostDevice(driver, vm, hostdev) < 0) + goto error; + break; + default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("hostdev subsys type '%s' not supported"), @@ -3960,6 +4078,31 @@ qemuDomainDetachHostSCSIDevice(virQEMUDriverPtr driver, }
static int +qemuDomainDetachHostSCSIHostDevice(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainHostdevDefPtr detach) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + int ret = -1; + + if (!detach->info->alias) { + virReportError(VIR_ERR_OPERATION_FAILED, + "%s", _("device cannot be detached without a device alias")); + return -1; + } + + qemuDomainMarkDeviceForRemoval(vm, detach->info); + + qemuDomainObjEnterMonitor(driver, vm); + ret = qemuMonitorDelDevice(priv->mon, detach->info->alias); + + if (qemuDomainObjExitMonitor(driver, vm) < 0) + return -1; + + return ret; +} + +static int qemuDomainDetachThisHostDevice(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainHostdevDefPtr detach) @@ -3981,6 +4124,9 @@ qemuDomainDetachThisHostDevice(virQEMUDriverPtr driver, case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI: ret = qemuDomainDetachHostSCSIDevice(driver, vm, detach); break; + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST: + ret = qemuDomainDetachHostSCSIHostDevice(driver, vm, detach); + break; default: virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("hostdev subsys type '%s' not supported"), @@ -4058,6 +4204,8 @@ int qemuDomainDetachHostDevice(virQEMUDriverPtr driver,
Ah - and this is where that pesky virDomainHostdevFind causes the need for the domain_conf.c code above...
} break; } + case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_HOST:
But we need an error here otherwise we get a generic failed with some error message. John
+ break; default: virReportError(VIR_ERR_INTERNAL_ERROR, _("unexpected hostdev type %d"), subsys->type);

The qemuxml2argv test was cloned from hostdev-scsi-virtio-scsi Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com> --- .../qemuxml2argv-hostdev-scsi-vhost-scsi.args | 24 ++++++++++++++++ .../qemuxml2argv-hostdev-scsi-vhost-scsi.xml | 33 ++++++++++++++++++++++ tests/qemuxml2argvmock.c | 12 ++++++++ tests/qemuxml2argvtest.c | 3 ++ 4 files changed, 72 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.xml diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.args b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.args new file mode 100644 index 0000000..5cd2939 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.args @@ -0,0 +1,24 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/home/test \ +USER=test \ +LOGNAME=test \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu \ +-name QEMUGuest2 \ +-S \ +-M pc \ +-m 214 \ +-smp 1,sockets=1,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9466-926a-d65c16db1809 \ +-nographic \ +-nodefaults \ +-monitor unix:/tmp/lib/domain--1-QEMUGuest2/monitor.sock,server,nowait \ +-no-acpi \ +-boot c \ +-device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 \ +-usb \ +-drive file=/dev/HostVG/QEMUGuest2,format=raw,if=none,id=drive-ide0-0-0 \ +-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \ +-device vhost-scsi-pci,wwpn=naa.5123456789abcde0,vhostfd=3,id=hostdev0 \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.xml b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.xml new file mode 100644 index 0000000..5b7cc02 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.xml @@ -0,0 +1,33 @@ +<domain type='qemu'> + <name>QEMUGuest2</name> + <uuid>c7a5fdbd-edaf-9466-926a-d65c16db1809</uuid> + <memory unit='KiB'>219100</memory> + <currentMemory unit='KiB'>219100</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <source dev='/dev/HostVG/QEMUGuest2'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='scsi' index='0' model='virtio-scsi'/> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <hostdev mode='subsystem' type='scsi_host'> + <source protocol='vhost' wwpn='naa.5123456789abcde0'/> + </hostdev> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/qemuxml2argvmock.c b/tests/qemuxml2argvmock.c index 78a224b..568dcc0 100644 --- a/tests/qemuxml2argvmock.c +++ b/tests/qemuxml2argvmock.c @@ -107,6 +107,18 @@ virSCSIDeviceGetSgName(const char *sysfs_prefix ATTRIBUTE_UNUSED, } int +virSCSIOpenVhost(int *vhostfd, + size_t *vhostfdSize) +{ + size_t i; + + for (i = 0; i < *vhostfdSize; i++) + vhostfd[i] = STDERR_FILENO + 1 + i; + + return 0; +} + +int virNetDevTapCreate(char **ifname, const char *tunpath ATTRIBUTE_UNUSED, int *tapfd, diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 39abe72..99c0465 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1783,6 +1783,9 @@ mymain(void) DO_TEST("hostdev-scsi-virtio-iscsi-auth", QEMU_CAPS_VIRTIO_SCSI, QEMU_CAPS_VIRTIO_SCSI, QEMU_CAPS_DEVICE_SCSI_GENERIC); + DO_TEST("hostdev-scsi-vhost-scsi", + QEMU_CAPS_VIRTIO_SCSI, QEMU_CAPS_DEVICE_VHOST_SCSI, + QEMU_CAPS_DEVICE_SCSI_GENERIC); DO_TEST("mlock-on", QEMU_CAPS_REALTIME_MLOCK); DO_TEST_FAILURE("mlock-on", NONE); -- 1.9.1

On 09/06/2016 08:58 AM, Eric Farman wrote:
The qemuxml2argv test was cloned from hostdev-scsi-virtio-scsi
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com> --- .../qemuxml2argv-hostdev-scsi-vhost-scsi.args | 24 ++++++++++++++++ .../qemuxml2argv-hostdev-scsi-vhost-scsi.xml | 33 ++++++++++++++++++++++ tests/qemuxml2argvmock.c | 12 ++++++++ tests/qemuxml2argvtest.c | 3 ++ 4 files changed, 72 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.xml
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.args b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.args new file mode 100644 index 0000000..5cd2939 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.args @@ -0,0 +1,24 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/home/test \ +USER=test \ +LOGNAME=test \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu \ +-name QEMUGuest2 \ +-S \ +-M pc \ +-m 214 \ +-smp 1,sockets=1,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9466-926a-d65c16db1809 \ +-nographic \ +-nodefaults \ +-monitor unix:/tmp/lib/domain--1-QEMUGuest2/monitor.sock,server,nowait \ +-no-acpi \ +-boot c \ +-device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 \ +-usb \ +-drive file=/dev/HostVG/QEMUGuest2,format=raw,if=none,id=drive-ide0-0-0 \ +-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \ +-device vhost-scsi-pci,wwpn=naa.5123456789abcde0,vhostfd=3,id=hostdev0 \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.xml b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.xml new file mode 100644 index 0000000..5b7cc02
This hunk would be with the qemu command line patch
--- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-scsi-vhost-scsi.xml @@ -0,0 +1,33 @@ +<domain type='qemu'> + <name>QEMUGuest2</name> + <uuid>c7a5fdbd-edaf-9466-926a-d65c16db1809</uuid> + <memory unit='KiB'>219100</memory> + <currentMemory unit='KiB'>219100</currentMemory> + <vcpu placement='static'>1</vcpu> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <source dev='/dev/HostVG/QEMUGuest2'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='scsi' index='0' model='virtio-scsi'/> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <hostdev mode='subsystem' type='scsi_host'> + <source protocol='vhost' wwpn='naa.5123456789abcde0'/> + </hostdev> + <memballoon model='virtio'/> + </devices> +</domain>
This hunk would be with the domain xml patch (along with the dataout file)
diff --git a/tests/qemuxml2argvmock.c b/tests/qemuxml2argvmock.c index 78a224b..568dcc0 100644 --- a/tests/qemuxml2argvmock.c +++ b/tests/qemuxml2argvmock.c @@ -107,6 +107,18 @@ virSCSIDeviceGetSgName(const char *sysfs_prefix ATTRIBUTE_UNUSED, }
int +virSCSIOpenVhost(int *vhostfd, + size_t *vhostfdSize) +{ + size_t i; + + for (i = 0; i < *vhostfdSize; i++) + vhostfd[i] = STDERR_FILENO + 1 + i; + + return 0; +} + +int
I think this hunk belongs with the code that added the function...
virNetDevTapCreate(char **ifname, const char *tunpath ATTRIBUTE_UNUSED, int *tapfd, diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 39abe72..99c0465 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1783,6 +1783,9 @@ mymain(void) DO_TEST("hostdev-scsi-virtio-iscsi-auth", QEMU_CAPS_VIRTIO_SCSI, QEMU_CAPS_VIRTIO_SCSI,
Interesting and existing - QEMU_CAPS_VIRTIO_SCSI listed twice... That's a separate patch though - I'll send that soon.
QEMU_CAPS_DEVICE_SCSI_GENERIC); + DO_TEST("hostdev-scsi-vhost-scsi", + QEMU_CAPS_VIRTIO_SCSI, QEMU_CAPS_DEVICE_VHOST_SCSI, + QEMU_CAPS_DEVICE_SCSI_GENERIC);
This hunk would go with the qemu command line patch... John
DO_TEST("mlock-on", QEMU_CAPS_REALTIME_MLOCK); DO_TEST_FAILURE("mlock-on", NONE);
participants (2)
-
Eric Farman
-
John Ferlan