On Fri, Jul 25, 2014 at 02:58:58PM -0700, Nicholas A. Bellinger wrote:
A vhost-scsi controller instance doesn't require the extra virtio-scsi
disk args, at least not in order to boot QEMU proper.
The configuration of vhost-scsi WWPNs and their associated LUNs is done
using a configfs based control plane provided by the in-kernel target,
for which all in-kernel drivers share common code within
drivers/target/target_core_fabric_configfs.c.
Configfs provides reference counting for data structures within
vhost-scsi itself, and also inter-module reference counting between LIO
backend devices under /sys/kernel/config/target/core/$HBA/$DEV/, and
vhost-scsi LUN export.
The virtio-scsi LLD Host:Channel:Target:LUN disk locations in the guest
are based upon what is populated using configfs groups + symlinks
under /sys/kernel/config/target/vhost/$WWPN/$TPGT/lun/$LUN.
The rtslib library + targetcli shell are the preferred (and friendliest)
way for driving the creation of vhost-scsi controllers + LUN exports.
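
As a rough illustration of the layout described above, the two configfs
locations can be composed as below. This is only a sketch of the path
structure, not of rtslib or targetcli; the function names and the example
values (WWPN, TPGT, LUN, HBA and device names) are hypothetical, and nothing
here touches the filesystem.

```python
from pathlib import PurePosixPath

# Root of the in-kernel target's configfs tree, as given in the mail.
CONFIGFS_TARGET = PurePosixPath("/sys/kernel/config/target")


def backend_dev_path(hba: str, dev: str) -> PurePosixPath:
    """Path of an LIO backend device: .../core/$HBA/$DEV (hypothetical helper)."""
    return CONFIGFS_TARGET / "core" / hba / dev


def vhost_lun_path(wwpn: str, tpgt: str, lun: str) -> PurePosixPath:
    """Path of a vhost-scsi LUN export: .../vhost/$WWPN/$TPGT/lun/$LUN (hypothetical helper)."""
    return CONFIGFS_TARGET / "vhost" / wwpn / tpgt / "lun" / lun


if __name__ == "__main__":
    # All values below are made-up placeholders, not taken from a real host.
    print(backend_dev_path("fileio_0", "disk0"))
    print(vhost_lun_path("naa.0000000000000001", "tpgt_1", "lun_0"))
```

The LUN export is tied to its backend device by a symlink placed inside the
LUN directory, which is what gives the inter-module reference counting
mentioned above.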
Hmm, now I see why vhost-scsi is so crippled in terms of features compared
to virtio-scsi. Almost everything about it is completely opaque to QEMU.
Portraying it as equivalent to virtio-scsi, only faster, is really rather
misleading / confusing :-(
Since QEMU doesn't get to configure the attached LUNs we lose any ability
to do block I/O measurement or throttling, all the drive-mirror block job
functionality, LUN hotplug/unplug, control of what happens on LUN read
or write errors, disk image formats and more besides. I don't see any
viable way to address any of that via QEMU even if we wanted to fix it.
If it is also opaque to libvirt, then it makes it impossible to actually
use this feature via the libvirt API unless you also have a side-channel
giving you root access to the host to configure configfs :-( We can also
no longer apply disk locking / lease acquisition per LUN to prevent the
same disk image being used by two VMs at the same time, and lose the
SELinux/sVirt isolation of guest from disks. It would be possible to
partially address much of this by making libvirt itself responsible for
all of the HBA configuration in configfs, but that is a major amount of
work to undertake. I also wonder whether, if QEMU is placed in a cgroup
with the blkio controller attached, I/O to vhost-scsi will be correctly
attributed and controlled by the blkio controller.
As long as everything about the LUN configuration is completely opaque
to libvirt & QEMU, I don't think that representing vhost-scsi as a
<controller> really makes any sense. The <controller> stuff in libvirt
only exists in the first place in order to have somewhere to hang the
<disk> configs off. Using <controller> would only make sense if the
patch were to properly support the corresponding <disk> attachments
and take full ownership of configuring things via configfs. I'm not
sure that is worth the effort or ongoing maintenance cost though
given that we're expecting virtio-scsi to be able to match vhost-scsi
for performance, so there won't be much compelling reason to sacrifice
so many QEMU features and use vhost-scsi.
As proposed, this patch is really doing something more akin to SCSI HBA
passthrough from host to guest, which would be something that's more
appropriate for the <hostdev> configuration data. That would make it clear
that the device is completely opaque to libvirt/QEMU from a functional
configuration POV. We currently have a <hostdev> 'scsi' feature, but
that is about passthrough of individual LUNs, so we'd have to invent
new configuration schema for 'scsi_host' <hostdev> type. This seems like
the most viable approach to supporting this feature right now.
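
To give a feel for what such a schema might look like, here is a purely
hypothetical sketch that generates one possible XML shape. No such schema
exists yet; the 'wwpn' attribute on <source> and the helper name are
assumptions made up for illustration, and only the 'scsi_host' type name
comes from the suggestion above.

```python
import xml.etree.ElementTree as ET


def scsi_host_hostdev_xml(wwpn: str) -> str:
    """Build a hypothetical <hostdev> element for a vhost-scsi HBA.

    The element layout is an illustrative assumption, not an existing
    libvirt schema: the whole HBA is identified only by its WWPN, with
    no per-LUN <disk> configuration at all.
    """
    hostdev = ET.Element("hostdev", {"mode": "subsystem", "type": "scsi_host"})
    # Hypothetical attribute: identify the vhost-scsi target by WWPN.
    ET.SubElement(hostdev, "source", {"wwpn": wwpn})
    return ET.tostring(hostdev, encoding="unicode")


if __name__ == "__main__":
    # Placeholder WWPN, not a real target.
    print(scsi_host_hostdev_xml("naa.0000000000000001"))
```

The point of the shape is that nothing below the HBA is described, which
matches the fact that the LUNs are invisible to libvirt and QEMU.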
Regards,
Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|