[libvirt] [PATCH 0/2] qemu: add new disk type='lun' for bus='virtio'

These two patches are in response to CVE-2011-4127: http://seclists.org/oss-sec/2011/q4/536 Once the kernel security fix and corresponding qemu mitigation patch are in place, access to SG_IO commands from qemu guests will be disabled by default. This patch series provides a way to explicitly enable such support when it is required. In a discussion just before sending this patch series, Paolo Bonzini wondered if rather than the xml syntax being what's proposed here: <disk type='block' device='disk' dev='/dev/sda'> <!-- SG_IO off --> <disk type='lun' device='disk' dev='/dev/sda'> <!-- SG_IO on --> maybe it should instead be: <disk type='block' device='disk' dev='/dev/sda'> <!-- SG_IO off --> <disk type='block' device='lun' dev='/dev/sda'> <!-- SG_IO on --> I guess it partly depends on whether we would ever want to turn on SG_IO for a disk with device='cdrom|floppy' vs. if we would ever want to turn it on for type='file|dir|network'. Opinions?

This patch adds two capabilities flags to deal with various aspects of supporting SG_IO commands on virtio-blk-pci devices: QEMU_CAPS_VIRTIO_BLK_SCSI set if -device virtio-blk-pci accepts the scsi="on|off" option When present, this is on by default, but can be set to off to disable SG_IO functions. QEMU_CAPS_VIRTIO_BLK_SG_IO set if SG_IO commands are supported in the virtio-blk-pci driver (present since qemu 0.11 according to a qemu developer, if I understood correctly) --- src/qemu/qemu_capabilities.c | 8 ++++++++ src/qemu/qemu_capabilities.h | 3 +++ 2 files changed, 11 insertions(+), 0 deletions(-) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 43c7578..6842cfe 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -144,6 +144,9 @@ VIR_ENUM_IMPL(qemuCaps, QEMU_CAPS_LAST, "ich9-ahci", "no-acpi", "fsdev-readonly", + + "virtio-blk-pci.scsi", + "blk-sg-io", ); struct qemu_feature_flags { @@ -1149,6 +1152,9 @@ qemuCapsComputeCmdFlags(const char *help, if (version >= 10000) qemuCapsSet(flags, QEMU_CAPS_0_10); + if (version >= 11000) + qemuCapsSet(flags, QEMU_CAPS_VIRTIO_BLK_SG_IO); + /* While JSON mode was available in 0.12.0, it was too * incomplete to contemplate using. The 0.13.0 release * is good enough to use, even though it lacks one or @@ -1386,6 +1392,8 @@ qemuCapsParseDeviceStr(const char *str, virBitmapPtr flags) qemuCapsSet(flags, QEMU_CAPS_VIRTIO_BLK_EVENT_IDX); if (strstr(str, "virtio-net-pci.event_idx")) qemuCapsSet(flags, QEMU_CAPS_VIRTIO_NET_EVENT_IDX); + if (strstr(str, "virtio-blk-pci.scsi")) + qemuCapsSet(flags, QEMU_CAPS_VIRTIO_BLK_SCSI); return 0; } diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index c759baf..d47177c 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -118,6 +118,9 @@ enum qemuCapsFlags { QEMU_CAPS_NO_ACPI = 78, /* -no-acpi */ QEMU_CAPS_FSDEV_READONLY =79, /* -fsdev readonly supported */ + QEMU_CAPS_VIRTIO_BLK_SCSI = 80, /* virtio-blk-pci.scsi */ + QEMU_CAPS_VIRTIO_BLK_SG_IO = 81, /* support for SG_IO commands, reportedly added in 0.11 */ + QEMU_CAPS_LAST, /* this must always be the last item */ }; -- 1.7.7.4

On 12/22/2011 11:39 AM, Laine Stump wrote:
This patch adds two capabilities flags to deal with various aspects of supporting SG_IO commands on virtio-blk-pci devices:
QEMU_CAPS_VIRTIO_BLK_SCSI set if -device virtio-blk-pci accepts the scsi="on|off" option When present, this is on by default, but can be set to off to disable SG_IO functions.
QEMU_CAPS_VIRTIO_BLK_SG_IO set if SG_IO commands are supported in the virtio-blk-pci driver (present since qemu 0.11 according to a qemu developer, if I understood correctly) --- src/qemu/qemu_capabilities.c | 8 ++++++++ src/qemu/qemu_capabilities.h | 3 +++ 2 files changed, 11 insertions(+), 0 deletions(-)
ACK. This patch is uncontroversial, even while we debate on the proper XML representation to use in patch 2/2.
+++ b/src/qemu/qemu_capabilities.h @@ -118,6 +118,9 @@ enum qemuCapsFlags { QEMU_CAPS_NO_ACPI = 78, /* -no-acpi */ QEMU_CAPS_FSDEV_READONLY =79, /* -fsdev readonly supported */
While you are touching this area of code, would you mind inserting the missing space after '=' on this line?
+ QEMU_CAPS_VIRTIO_BLK_SCSI = 80, /* virtio-blk-pci.scsi */ + QEMU_CAPS_VIRTIO_BLK_SG_IO = 81, /* support for SG_IO commands, reportedly added in 0.11 */ + QEMU_CAPS_LAST, /* this must always be the last item */ };
-- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

As a result of CVE-2011-4127 (see: http://seclists.org/oss-sec/2011/q4/536), qemu (and the kernel) now disable all SG_IO commands on disks via qemu by default. To continue providing this capability to those who specifically ask for it (and hopefully understand the risks), a new libvirt disk type has been created. This new disk type is identical to disk type='block', except that: 1) It is only allowed if bus='virtio' and the qemu version is "new enough" to support it ("new enough" == qemu 0.11 or better), otherwise the domain will fail to start and a CONFIG_UNSUPPORTED error will be logged). 2) The option "scsi=on" will be added to the -device arg to allow SG_IO commands (if type!='lun', "scsi=off" will be added to the -device arg so that SG_IO commands are forbidden). Guests which continue to use disk type='block' will no longer be able to use SG_IO commands on the disk; those that have their disk type changed to type='lun' will still be able to use SG_IO commands. --- docs/formatdomain.html.in | 26 ++++++---- docs/schemas/domaincommon.rng | 16 ++++++ src/conf/domain_conf.c | 5 ++- src/conf/domain_conf.h | 1 + src/locking/domain_lock.c | 1 + src/qemu/qemu_command.c | 25 +++++++++ src/qemu/qemu_driver.c | 4 +- tests/qemuhelptest.c | 24 ++++++--- .../qemuxml2argv-boot-complex-bootindex.args | 4 +- .../qemuxml2argv-boot-complex.args | 4 +- .../qemuxml2argvdata/qemuxml2argv-boot-order.args | 2 +- .../qemuxml2argv-disk-ioeventfd.args | 2 +- .../qemuxml2argvdata/qemuxml2argv-disk-order.args | 4 +- .../qemuxml2argv-encrypted-disk.args | 2 +- tests/qemuxml2argvdata/qemuxml2argv-event_idx.args | 2 +- .../qemuxml2argvdata/qemuxml2argv-virtio-lun.args | 11 ++++ tests/qemuxml2argvdata/qemuxml2argv-virtio-lun.xml | 57 ++++++++++++++++++++ tests/qemuxml2argvtest.c | 22 ++++++-- tests/qemuxml2xmltest.c | 1 + 19 files changed, 178 insertions(+), 35 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-virtio-lun.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-virtio-lun.xml diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 9cf0f12..4245708 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -996,7 +996,7 @@ <dt><code>disk</code></dt> <dd>The <code>disk</code> element is the main container for describing disks. The <code>type</code> attribute is either "file", - "block", "dir", or "network" + "block", "dir", "network", or "lun" and refers to the underlying source for the disk. The optional <code>device</code> attribute indicates how the disk is to be exposed to the guest OS. Possible values for this attribute are "floppy", "disk" @@ -1016,20 +1016,26 @@ generally does not make sense. <span class="since">Since 0.0.3; "device" attribute since 0.1.4; "network" attribute since 0.8.7; "snapshot" since - 0.9.5</span></dd> + 0.9.5; "lun" since 0.9.9</span></dd> <dt><code>source</code></dt> <dd>If the disk <code>type</code> is "file", then the <code>file</code> attribute specifies the fully-qualified path to the file holding the disk. If the disk - <code>type</code> is "block", then the <code>dev</code> + <code>type</code> is "block" or "lun", then the <code>dev</code> attribute specifies the path to the host device to serve as - the disk. If the disk <code>type</code> is "dir", then the - <code>dir</code> attribute specifies the fully-qualified path - to the directory to use as the disk. If the disk <code>type</code> - is "network", then the <code>protocol</code> attribute specifies - the protocol to access to the requested image; possible values - are "nbd", "rbd", and "sheepdog". If the <code>protocol</code> - attribute is "rbd" or "sheepdog", an additional + the disk ("lun" is only valid if the <code>target + bus="virtio"</code>, and behaves identically to "block", + except that generic SCSI commands from the guest are accepted + in the case of "lun" - also note that <b>you should only use + type='lun' for actual raw devices, never for individual partitions, + or LVM partitions</b>). If the disk <code>type</code> is + "dir", then the <code>dir</code> attribute specifies the + fully-qualified path to the directory to use as the disk. If + the disk <code>type</code> is "network", then + the <code>protocol</code> attribute specifies the protocol to + access to the requested image; possible values are "nbd", + "rbd", and "sheepdog". If the <code>protocol</code> attribute + is "rbd" or "sheepdog", an additional attribute <code>name</code> is mandatory to specify which image to be used. When the disk <code>type</code> is "network", the <code>source</code> may have zero or diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 553a6f0..e629bcd 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -779,6 +779,22 @@ </group> <group> <attribute name="type"> + <value>lun</value> + </attribute> + <interleave> + <optional> + <element name="source"> + <attribute name="dev"> + <ref name="absFilePath"/> + </attribute> + <empty/> + </element> + </optional> + <ref name="diskspec"/> + </interleave> + </group> + <group> + <attribute name="type"> <value>dir</value> </attribute> <interleave> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 2897b4a..9cc4a57 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -152,7 +152,8 @@ VIR_ENUM_IMPL(virDomainDisk, VIR_DOMAIN_DISK_TYPE_LAST, "block", "file", "dir", - "network") + "network", + "lun") VIR_ENUM_IMPL(virDomainDiskDevice, VIR_DOMAIN_DISK_DEVICE_LAST, "disk", @@ -2662,6 +2663,7 @@ virDomainDiskDefParseXML(virCapsPtr caps, startupPolicy = virXMLPropString(cur, "startupPolicy"); break; case VIR_DOMAIN_DISK_TYPE_BLOCK: + case VIR_DOMAIN_DISK_TYPE_LUN: source = virXMLPropString(cur, "dev"); break; case VIR_DOMAIN_DISK_TYPE_DIR: @@ -9859,6 +9861,7 @@ virDomainDiskDefFormat(virBufferPtr buf, virBufferAsprintf(buf, "/>\n"); break; case VIR_DOMAIN_DISK_TYPE_BLOCK: + case VIR_DOMAIN_DISK_TYPE_LUN: virBufferEscapeString(buf, " <source dev='%s'/>\n", def->src); break; diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 1f6e442..a63b2f5 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -202,6 +202,7 @@ enum virDomainDiskType { VIR_DOMAIN_DISK_TYPE_FILE, VIR_DOMAIN_DISK_TYPE_DIR, VIR_DOMAIN_DISK_TYPE_NETWORK, + VIR_DOMAIN_DISK_TYPE_LUN, VIR_DOMAIN_DISK_TYPE_LAST }; diff --git a/src/locking/domain_lock.c b/src/locking/domain_lock.c index de1937c..ecc7b04 100644 --- a/src/locking/domain_lock.c +++ b/src/locking/domain_lock.c @@ -74,6 +74,7 @@ static int virDomainLockManagerAddDisk(virLockManagerPtr lock, return 0; if (!(disk->type == VIR_DOMAIN_DISK_TYPE_BLOCK || + disk->type == VIR_DOMAIN_DISK_TYPE_LUN || disk->type == VIR_DOMAIN_DISK_TYPE_FILE || disk->type == VIR_DOMAIN_DISK_TYPE_DIR)) return 0; diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index ea1b763..0431d8c 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -2022,6 +2022,23 @@ qemuBuildDriveDevStr(virDomainDiskDefPtr disk, goto error; } + if (disk->type == VIR_DOMAIN_DISK_TYPE_LUN) { + /* make sure that both the bus and the qemu support + * type='lun' (SG_IO). + */ + if (disk->bus != VIR_DOMAIN_DISK_BUS_VIRTIO) { + qemuReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("disk type='lun' is not supported for bus='%s'"), + bus); + goto error; + } + if (!qemuCapsGet(qemuCaps, QEMU_CAPS_VIRTIO_BLK_SG_IO)) { + qemuReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("disk type='lun' is not supported by this QEMU")); + goto error; + } + } + switch (disk->bus) { case VIR_DOMAIN_DISK_BUS_IDE: virBufferAddLit(&opt, "ide-drive"); @@ -2050,6 +2067,14 @@ qemuBuildDriveDevStr(virDomainDiskDefPtr disk, virBufferAsprintf(&opt, ",event_idx=%s", virDomainVirtioEventIdxTypeToString(disk->event_idx)); } + if (qemuCapsGet(qemuCaps, QEMU_CAPS_VIRTIO_BLK_SCSI)) { + /* if sg_io is true but the scsi option isn't supported, + * that means it's just always on in this version of qemu. + */ + virBufferAsprintf(&opt, ",scsi=%s", + (disk->type == VIR_DOMAIN_DISK_TYPE_LUN) + ? "on" : "off"); + } if (qemuBuildDeviceAddressStr(&opt, &disk->info, qemuCaps) < 0) goto error; break; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index c908135..2911522 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8184,7 +8184,8 @@ static int qemuDomainGetBlockInfo(virDomainPtr dom, /* ..but if guest is running & not using raw disk format and on a block device, then query highest allocated extent from QEMU */ - if (disk->type == VIR_DOMAIN_DISK_TYPE_BLOCK && + if ((disk->type == VIR_DOMAIN_DISK_TYPE_BLOCK || + disk->type == VIR_DOMAIN_DISK_TYPE_LUN) && format != VIR_STORAGE_FILE_RAW && S_ISBLK(sb.st_mode) && virDomainObjIsActive(vm)) { @@ -11013,6 +11014,7 @@ qemuDiskPathToAlias(virDomainObjPtr vm, const char *path) disk = vm->def->disks[i]; if (disk->type != VIR_DOMAIN_DISK_TYPE_BLOCK && + disk->type != VIR_DOMAIN_DISK_TYPE_LUN && disk->type != VIR_DOMAIN_DISK_TYPE_FILE) goto cleanup; diff --git a/tests/qemuhelptest.c b/tests/qemuhelptest.c index 60155e7..1ef0d9b 100644 --- a/tests/qemuhelptest.c +++ b/tests/qemuhelptest.c @@ -329,7 +329,8 @@ mymain(void) QEMU_CAPS_DRIVE_AIO, QEMU_CAPS_NO_SHUTDOWN, QEMU_CAPS_PCI_ROMBAR, - QEMU_CAPS_NO_ACPI); + QEMU_CAPS_NO_ACPI, + QEMU_CAPS_VIRTIO_BLK_SG_IO); DO_TEST("qemu-kvm-0.12.1.2-rhel60", 12001, 1, 0, QEMU_CAPS_VNC_COLON, QEMU_CAPS_NO_REBOOT, @@ -376,7 +377,8 @@ mymain(void) QEMU_CAPS_USB_HUB, QEMU_CAPS_NO_SHUTDOWN, QEMU_CAPS_PCI_ROMBAR, - QEMU_CAPS_NO_ACPI); + QEMU_CAPS_NO_ACPI, + QEMU_CAPS_VIRTIO_BLK_SG_IO); DO_TEST("qemu-kvm-0.12.3", 12003, 1, 0, QEMU_CAPS_VNC_COLON, QEMU_CAPS_NO_REBOOT, @@ -416,7 +418,8 @@ mymain(void) QEMU_CAPS_DRIVE_AIO, QEMU_CAPS_NO_SHUTDOWN, QEMU_CAPS_PCI_ROMBAR, - QEMU_CAPS_NO_ACPI); + QEMU_CAPS_NO_ACPI, + QEMU_CAPS_VIRTIO_BLK_SG_IO); DO_TEST("qemu-kvm-0.13.0", 13000, 1, 0, QEMU_CAPS_VNC_COLON, QEMU_CAPS_NO_REBOOT, @@ -472,7 +475,8 @@ mymain(void) QEMU_CAPS_USB_HUB, QEMU_CAPS_NO_SHUTDOWN, QEMU_CAPS_PCI_ROMBAR, - QEMU_CAPS_NO_ACPI); + QEMU_CAPS_NO_ACPI, + QEMU_CAPS_VIRTIO_BLK_SG_IO); DO_TEST("qemu-kvm-0.12.1.2-rhel61", 12001, 1, 0, QEMU_CAPS_VNC_COLON, QEMU_CAPS_NO_REBOOT, @@ -524,7 +528,9 @@ mymain(void) QEMU_CAPS_USB_HUB, QEMU_CAPS_NO_SHUTDOWN, QEMU_CAPS_PCI_ROMBAR, - QEMU_CAPS_NO_ACPI); + QEMU_CAPS_NO_ACPI, + QEMU_CAPS_VIRTIO_BLK_SCSI, + QEMU_CAPS_VIRTIO_BLK_SG_IO); DO_TEST("qemu-kvm-0.12.1.2-rhel62-beta", 12001, 1, 0, QEMU_CAPS_VNC_COLON, QEMU_CAPS_NO_REBOOT, @@ -584,7 +590,9 @@ mymain(void) QEMU_CAPS_USB_HUB, QEMU_CAPS_NO_SHUTDOWN, QEMU_CAPS_PCI_ROMBAR, - QEMU_CAPS_NO_ACPI); + QEMU_CAPS_NO_ACPI, + QEMU_CAPS_VIRTIO_BLK_SCSI, + QEMU_CAPS_VIRTIO_BLK_SG_IO); DO_TEST("qemu-1.0", 1000000, 0, 0, QEMU_CAPS_VNC_COLON, QEMU_CAPS_NO_REBOOT, @@ -648,7 +656,9 @@ mymain(void) QEMU_CAPS_PCI_ROMBAR, QEMU_CAPS_ICH9_AHCI, QEMU_CAPS_NO_ACPI, - QEMU_CAPS_FSDEV_READONLY); + QEMU_CAPS_FSDEV_READONLY, + QEMU_CAPS_VIRTIO_BLK_SCSI, + QEMU_CAPS_VIRTIO_BLK_SG_IO); return ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE; } diff --git a/tests/qemuxml2argvdata/qemuxml2argv-boot-complex-bootindex.args b/tests/qemuxml2argvdata/qemuxml2argv-boot-complex-bootindex.args index ae0b2b1..ed00c64 100644 --- a/tests/qemuxml2argvdata/qemuxml2argv-boot-complex-bootindex.args +++ b/tests/qemuxml2argvdata/qemuxml2argv-boot-complex-bootindex.args @@ -9,9 +9,9 @@ LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test \ -monitor unix:/tmp/test-monitor,server,nowait \ -no-acpi \ -drive file=/tmp/vda.img,if=none,id=drive-virtio-disk0 \ --device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=3 \ +-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=3 \ -drive file=/tmp/vdb.img,if=none,id=drive-virtio-disk1 \ --device virtio-blk-pci,bus=pci.0,addr=0x6,drive=drive-virtio-disk1,id=virtio-disk1 \ +-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk1,id=virtio-disk1 \ -drive file=/dev/HostVG/hda,if=none,id=drive-ide0-0-0 \ -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \ -drive file=/dev/HostVG/hdb,if=none,id=drive-ide0-0-1 \ diff --git a/tests/qemuxml2argvdata/qemuxml2argv-boot-complex.args b/tests/qemuxml2argvdata/qemuxml2argv-boot-complex.args index c8d32ec..cffb8ad 100644 --- a/tests/qemuxml2argvdata/qemuxml2argv-boot-complex.args +++ b/tests/qemuxml2argvdata/qemuxml2argv-boot-complex.args @@ -10,9 +10,9 @@ LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test \ -no-acpi \ -boot dnca \ -drive file=/tmp/vda.img,if=none,id=drive-virtio-disk0,boot=on \ --device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 \ +-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 \ -drive file=/tmp/vdb.img,if=none,id=drive-virtio-disk1 \ --device virtio-blk-pci,bus=pci.0,addr=0x6,drive=drive-virtio-disk1,id=virtio-disk1 \ +-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk1,id=virtio-disk1 \ -drive file=/dev/HostVG/hda,if=none,id=drive-ide0-0-0 \ -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \ -drive file=/dev/HostVG/hdb,if=none,id=drive-ide0-0-1 \ diff --git a/tests/qemuxml2argvdata/qemuxml2argv-boot-order.args b/tests/qemuxml2argvdata/qemuxml2argv-boot-order.args index 14367b1..9220987 100644 --- a/tests/qemuxml2argvdata/qemuxml2argv-boot-order.args +++ b/tests/qemuxml2argvdata/qemuxml2argv-boot-order.args @@ -12,7 +12,7 @@ LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test /usr/bin/qemu \ -drive file=/root/boot.iso,if=none,media=cdrom,id=drive-ide0-1-0 \ -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1 \ -drive file=sheepdog:example.org:6000:image,if=none,id=drive-virtio-disk0 \ --device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=3 \ +-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=3 \ -drive file=/dev/null,if=none,id=drive-fdc0-0-1 \ -global isa-fdc.driveB=drive-fdc0-0-1 \ -global isa-fdc.bootindexB=4 \ diff --git a/tests/qemuxml2argvdata/qemuxml2argv-disk-ioeventfd.args b/tests/qemuxml2argvdata/qemuxml2argv-disk-ioeventfd.args index c512f15..2f4e7fd 100644 --- a/tests/qemuxml2argvdata/qemuxml2argv-disk-ioeventfd.args +++ b/tests/qemuxml2argvdata/qemuxml2argv-disk-ioeventfd.args @@ -3,7 +3,7 @@ LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \ -monitor unix:/tmp/test-monitor,server,nowait -no-acpi \ -boot dc -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 \ -drive file=/var/lib/libvirt/images/f14.img,if=none,id=drive-virtio-disk0 \ --device virtio-blk-pci,ioeventfd=on,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 \ +-device virtio-blk-pci,ioeventfd=on,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 \ -drive file=/var/lib/libvirt/Fedora-14-x86_64-Live-KDE.iso,if=none,media=cdrom,id=drive-ide0-1-0 \ -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \ -device virtio-net-pci,tx=bh,ioeventfd=off,vlan=0,id=net0,mac=52:54:00:e5:48:58,bus=pci.0,addr=0x3 \ diff --git a/tests/qemuxml2argvdata/qemuxml2argv-disk-order.args b/tests/qemuxml2argvdata/qemuxml2argv-disk-order.args index 6483e5e..0643072 100644 --- a/tests/qemuxml2argvdata/qemuxml2argv-disk-order.args +++ b/tests/qemuxml2argvdata/qemuxml2argv-disk-order.args @@ -13,8 +13,8 @@ LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test /usr/bin/qemu \ -drive file=/dev/HostVG/QEMUGuest2,if=none,media=cdrom,id=drive-ide0-1-0 \ -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \ -drive file=/tmp/data.img,if=none,id=drive-virtio-disk0 \ --device virtio-blk-pci,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0 \ +-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0 \ -drive file=/tmp/logs.img,if=none,id=drive-virtio-disk1 \ --device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 \ +-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 \ -usb \ -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 diff --git a/tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.args b/tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.args index c6634fd..022f6ad 100644 --- a/tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.args +++ b/tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.args @@ -5,6 +5,6 @@ encryptdisk -uuid 496898a6-e6ff-f7c8-5dc2-3cf410945ee9 -nographic -nodefconfig \ path=//var/lib/libvirt/qemu/encryptdisk.monitor,server,nowait -mon \ chardev=monitor,mode=readline -rtc base=utc -no-acpi -boot c -drive \ file=/storage/guest_disks/encryptdisk,if=none,id=drive-virtio-disk0,boot=on,\ -format=qcow2 -device virtio-blk-pci,bus=pci.0,addr=0x4,\ +format=qcow2 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,\ drive=drive-virtio-disk0,id=virtio-disk0 -usb -device virtio-balloon-pci,\ id=balloon0,bus=pci.0,addr=0x3 diff --git a/tests/qemuxml2argvdata/qemuxml2argv-event_idx.args b/tests/qemuxml2argvdata/qemuxml2argv-event_idx.args index f6ebb60..a506274 100644 --- a/tests/qemuxml2argvdata/qemuxml2argv-event_idx.args +++ b/tests/qemuxml2argvdata/qemuxml2argv-event_idx.args @@ -3,7 +3,7 @@ LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \ -monitor unix:/tmp/test-monitor,server,nowait -no-acpi \ -boot dc -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 \ -drive file=/var/lib/libvirt/images/f14.img,if=none,id=drive-virtio-disk0 \ --device virtio-blk-pci,event_idx=on,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 \ +-device virtio-blk-pci,event_idx=on,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 \ -drive file=/var/lib/libvirt/Fedora-14-x86_64-Live-KDE.iso,if=none,media=cdrom,id=drive-ide0-1-0 \ -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \ -device virtio-net-pci,event_idx=off,vlan=0,id=net0,mac=52:54:00:e5:48:58,bus=pci.0,addr=0x3 \ diff --git a/tests/qemuxml2argvdata/qemuxml2argv-virtio-lun.args b/tests/qemuxml2argvdata/qemuxml2argv-virtio-lun.args new file mode 100644 index 0000000..5df9182 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-virtio-lun.args @@ -0,0 +1,11 @@ +LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \ +/usr/bin/qemu -S -M pc-0.13 -m 1024 -smp 1 -nodefaults \ +-monitor unix:/tmp/test-monitor,server,nowait -no-acpi \ +-boot dc -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 \ +-drive file=/dev/sdfake,if=none,id=drive-virtio-disk0 \ +-device virtio-blk-pci,scsi=on,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 \ +-drive file=/var/lib/libvirt/Fedora-14-x86_64-Live-KDE.iso,if=none,media=cdrom,id=drive-ide0-1-0 \ +-device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \ +-device virtio-net-pci,vlan=0,id=net0,mac=52:54:00:e5:48:58,bus=pci.0,addr=0x3 \ +-net user,vlan=0,name=hostnet0 -serial pty -usb -vnc 127.0.0.1:-809 -std-vga \ +-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 diff --git a/tests/qemuxml2argvdata/qemuxml2argv-virtio-lun.xml b/tests/qemuxml2argvdata/qemuxml2argv-virtio-lun.xml new file mode 100644 index 0000000..999954e --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-virtio-lun.xml @@ -0,0 +1,57 @@ +<domain type='qemu'> + <name>test</name> + <uuid>bba65c0e-c049-934f-b6aa-4e2c0582acdf</uuid> + <memory>1048576</memory> + <currentMemory>1048576</currentMemory> + <vcpu>1</vcpu> + <os> + <type arch='x86_64' machine='pc-0.13'>hvm</type> + <boot dev='cdrom'/> + <boot dev='hd'/> + <bootmenu enable='yes'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>restart</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='lun' device='disk'> + <driver name='qemu' type='qcow2'/> + <source dev='/dev/sdfake'/> + <target dev='vda' bus='virtio'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </disk> + <disk type='file' device='cdrom'> + <driver name='qemu' type='raw'/> + <source file='/var/lib/libvirt/Fedora-14-x86_64-Live-KDE.iso'/> + <target dev='hdc' bus='ide'/> + <readonly/> + <address type='drive' controller='0' bus='1' unit='0'/> + </disk> + <controller type='virtio-serial' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> + </controller> + <controller type='ide' index='0'/> + <interface type='user'> + <mac address='52:54:00:e5:48:58'/> + <model type='virtio'/> + <driver name='vhost' event_idx='off'/> + </interface> + <serial type='pty'> + <target port='0'/> + </serial> + <console type='pty'> + <target type='serial' port='0'/> + </console> + <input type='mouse' bus='ps2'/> + <graphics type='vnc' port='5091' autoport='no' listen='127.0.0.1'> + <listen type='address' address='127.0.0.1'/> + </graphics> + <video> + <model type='vga' vram='9216' heads='1'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> + </video> + <memballoon model='virtio'/> + </devices> +</domain> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index e1221eb..eec0187 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -342,12 +342,15 @@ mymain(void) QEMU_CAPS_BOOT_MENU, QEMU_CAPS_DEVICE, QEMU_CAPS_DRIVE, QEMU_CAPS_BOOTINDEX); DO_TEST("boot-order", false, - QEMU_CAPS_BOOTINDEX, QEMU_CAPS_DRIVE, QEMU_CAPS_DEVICE); + QEMU_CAPS_BOOTINDEX, QEMU_CAPS_DRIVE, QEMU_CAPS_DEVICE, + QEMU_CAPS_VIRTIO_BLK_SCSI, QEMU_CAPS_VIRTIO_BLK_SG_IO); DO_TEST("boot-complex", false, - QEMU_CAPS_DEVICE, QEMU_CAPS_DRIVE, QEMU_CAPS_DRIVE_BOOT); + QEMU_CAPS_DEVICE, QEMU_CAPS_DRIVE, QEMU_CAPS_DRIVE_BOOT, + QEMU_CAPS_VIRTIO_BLK_SCSI, QEMU_CAPS_VIRTIO_BLK_SG_IO); DO_TEST("boot-complex-bootindex", false, QEMU_CAPS_DEVICE, QEMU_CAPS_DRIVE, QEMU_CAPS_DRIVE_BOOT, - QEMU_CAPS_BOOTINDEX); + QEMU_CAPS_BOOTINDEX, + QEMU_CAPS_VIRTIO_BLK_SCSI, QEMU_CAPS_VIRTIO_BLK_SG_IO); DO_TEST("bootloader", true, QEMU_CAPS_DOMID); DO_TEST("bios", false, QEMU_CAPS_DEVICE, QEMU_CAPS_SGA); DO_TEST("clock-utc", false, NONE); @@ -365,7 +368,8 @@ mymain(void) DO_TEST("disk-many", false, NONE); DO_TEST("disk-virtio", false, QEMU_CAPS_DRIVE, QEMU_CAPS_DRIVE_BOOT); DO_TEST("disk-order", false, - QEMU_CAPS_DRIVE, QEMU_CAPS_DEVICE, QEMU_CAPS_DRIVE_BOOT); + QEMU_CAPS_DRIVE, QEMU_CAPS_DEVICE, QEMU_CAPS_DRIVE_BOOT, + QEMU_CAPS_VIRTIO_BLK_SCSI, QEMU_CAPS_VIRTIO_BLK_SG_IO); DO_TEST("disk-xenvbd", false, QEMU_CAPS_DRIVE, QEMU_CAPS_DRIVE_BOOT); DO_TEST("disk-drive-boot-disk", false, QEMU_CAPS_DRIVE, QEMU_CAPS_DRIVE_BOOT); @@ -433,14 +437,20 @@ mymain(void) QEMU_CAPS_DRIVE_CACHE_V2, QEMU_CAPS_DRIVE_FORMAT); DO_TEST("disk-ioeventfd", false, QEMU_CAPS_DRIVE, QEMU_CAPS_VIRTIO_IOEVENTFD, - QEMU_CAPS_VIRTIO_TX_ALG, QEMU_CAPS_DEVICE); + QEMU_CAPS_VIRTIO_TX_ALG, QEMU_CAPS_DEVICE, + QEMU_CAPS_VIRTIO_BLK_SCSI, QEMU_CAPS_VIRTIO_BLK_SG_IO); DO_TEST("disk-snapshot", false, QEMU_CAPS_DRIVE, QEMU_CAPS_DRIVE_CACHE_V2, QEMU_CAPS_DRIVE_FORMAT); DO_TEST("event_idx", false, QEMU_CAPS_DRIVE, QEMU_CAPS_VIRTIO_BLK_EVENT_IDX, QEMU_CAPS_VIRTIO_NET_EVENT_IDX, - QEMU_CAPS_DEVICE); + QEMU_CAPS_DEVICE, + QEMU_CAPS_VIRTIO_BLK_SCSI, QEMU_CAPS_VIRTIO_BLK_SG_IO); + DO_TEST("virtio-lun", false, + QEMU_CAPS_DRIVE, + QEMU_CAPS_DEVICE, + QEMU_CAPS_VIRTIO_BLK_SCSI, QEMU_CAPS_VIRTIO_BLK_SG_IO); DO_TEST("graphics-vnc", false, NONE); DO_TEST("graphics-vnc-socket", false, NONE); diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 35bfdce..96edd00 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -190,6 +190,7 @@ mymain(void) DO_TEST("smp"); DO_TEST("lease"); DO_TEST("event_idx"); + DO_TEST("virtio-lun"); DO_TEST("usb-redir"); DO_TEST("blkdeviotune"); -- 1.7.7.4

On 12/22/2011 11:39 AM, Laine Stump wrote:
As a result of CVE-2011-4127 (see: http://seclists.org/oss-sec/2011/q4/536), qemu (and the kernel) now disable all SG_IO commands on disks via qemu by default. To continue providing this capability to those who specifically ask for it (and hopefully understand the risks), a new libvirt disk type has been created.
This new disk type is identical to disk type='block', except that:
1) It is only allowed if bus='virtio' and the qemu version is "new enough" to support it ("new enough" == qemu 0.11 or better), otherwise the domain will fail to start and a CONFIG_UNSUPPORTED error will be logged).
2) The option "scsi=on" will be added to the -device arg to allow SG_IO commands (if type!='lun', "scsi=off" will be added to the -device arg so that SG_IO commands are forbidden).
Guests which continue to use disk type='block' will no longer be able to use SG_IO commands on the disk; those that have their disk type changed to type='lun' will still be able to use SG_IO commands.
<dd>If the disk <code>type</code> is "file", then the <code>file</code> attribute specifies the fully-qualified path to the file holding the disk. If the disk - <code>type</code> is "block", then the <code>dev</code> + <code>type</code> is "block" or "lun", then the <code>dev</code> attribute specifies the path to the host device to serve as - the disk. If the disk <code>type</code> is "dir", then the - <code>dir</code> attribute specifies the fully-qualified path - to the directory to use as the disk. If the disk <code>type</code> - is "network", then the <code>protocol</code> attribute specifies - the protocol to access to the requested image; possible values - are "nbd", "rbd", and "sheepdog". If the <code>protocol</code> - attribute is "rbd" or "sheepdog", an additional + the disk ("lun" is only valid if the <code>target + bus="virtio"</code>, and behaves identically to "block", + except that generic SCSI commands from the guest are accepted + in the case of "lun" - also note that <b>you should only use + type='lun' for actual raw devices, never for individual partitions, + or LVM partitions</b>). If the disk <code>type</code> is
s/individual partitions, or/individual partitions or/
+++ b/docs/schemas/domaincommon.rng @@ -779,6 +779,22 @@ </group> <group> <attribute name="type"> + <value>lun</value> + </attribute> + <interleave> + <optional> + <element name="source"> + <attribute name="dev"> + <ref name="absFilePath"/> + </attribute> + <empty/> + </element> + </optional> + <ref name="diskspec"/> + </interleave> + </group> + <group> + <attribute name="type"> <value>dir</value>
Is it worth compressing this? <group> <attribute name='type'> <choice> <value>block</value> <value>lun</value> </choice> </attribute> ... existing block stuff What you have looks reasonable; of course, it all changes if we go with a different XML representation. ACK to this code, if we agree on the approach. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On Thu, Dec 22, 2011 at 01:39:30PM -0500, Laine Stump wrote:
These two patches are in response to CVE-2011-4127:
http://seclists.org/oss-sec/2011/q4/536
Once the kernel security fix and corresponding qemu mitigation patch are in place, access to SG_IO commands from qemu guests will be disabled by default. This patch series provides a way to explicitly enable such support when it is required.
In a discussion just before sending this patch series, Paolo Bonzini wondered if rather than the xml syntax being what's proposed here:
<disk type='block' device='disk' dev='/dev/sda'> <!-- SG_IO off --> <disk type='lun' device='disk' dev='/dev/sda'> <!-- SG_IO on -->
maybe it should instead be:
<disk type='block' device='disk' dev='/dev/sda'> <!-- SG_IO off --> <disk type='block' device='lun' dev='/dev/sda'> <!-- SG_IO on -->
I guess it partly depends on whether we would ever want to turn on SG_IO for a disk with device='cdrom|floppy' vs. if we would ever want to turn it on for type='file|dir|network'.
Opinions?
The 'type' attribute refers to how the host emulator deals with the disk. The 'device' attribute refers to what type of device hardware is exposed to the guest. What we're doing here is controlling whether the host emulator allows SG_IO. The guest visible device hardware has not changed at all. Thus using the 'type' attribute is the correct approach. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 12/23/2011 12:15 AM, Daniel P. Berrange wrote:
The 'type' attribute refers to how the host emulator deals with the disk.
The 'device' attribute refers to what type of device hardware is exposed to the guest.
What we're doing here is controlling whether the host emulator allows SG_IO. The guest visible device hardware has not changed at all. Thus using the 'type' attribute is the correct approach.
You could also say that the host is still accessing the device in the same way (via /dev/sdX). The guest visible device changes because in one case it allows SG_IO, in the other it doesn't, even if the host device is the same. Paolo

On 12/23/2011 01:42 AM, Paolo Bonzini wrote:
On 12/23/2011 12:15 AM, Daniel P. Berrange wrote:
The 'type' attribute refers to how the host emulator deals with the disk.
The 'device' attribute refers to what type of device hardware is exposed to the guest.
What we're doing here is controlling whether the host emulator allows SG_IO. The guest visible device hardware has not changed at all. Thus using the 'type' attribute is the correct approach.
You could also say that the host is still accessing the device in the same way (via /dev/sdX). The guest visible device changes because in one case it allows SG_IO, in the other it doesn't, even if the host device is the same.
I also tend to agree that 1) this is a guest-visible change in behavior for the same host resource, and 2) Paolo's proposals for virtio-scsi LUN pass-through present device='lun'. https://www.redhat.com/archives/libvir-list/2011-December/msg01013.html Or do we mix the two concepts? That is, type='lun' is used to control SG_IO enabling, while device='lun' is used to control LUN pass-through, and you can even have the combination type='lun' device='lun'? -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 12/22/2011 11:39 AM, Laine Stump wrote:
These two patches are in response to CVE-2011-4127:
http://seclists.org/oss-sec/2011/q4/536
Once the kernel security fix and corresponding qemu mitigation patch are in place, access to SG_IO commands from qemu guests will be disabled by default. This patch series provides a way to explicitly enable such support when it is required.
Given that this helps mitigate a CVE, I think we want to include this in 0.9.9 (another reason for an rc2 build shortly). I'll go ahead and review these patches, but I'm still not sure whether we have consensus on whether to use type='lun' or device='lun'. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
participants (4)
-
Daniel P. Berrange
-
Eric Blake
-
Laine Stump
-
Paolo Bonzini