[libvirt] [PATCH 0/4] Introduce support for virtio-blk-pci iothreads

Introduce iothreads support to libvirt. These will be used to facilitate adding an iothread attribute to a support disk which will enable having a dedicated event loop thread for the disk. IOThreads are a QEMU feature recently added (2.1) as a replacement for the virtio-blk data plane functionality that's been in tech preview since 1.4. Followup patches will add API's in order to list the IOThreads and eventually be able to assign IOThreads to specific CPU's if so desired. This set of patches should cover at least the bare minimum in order to allow modifying domain XML in order to use the feature. John Ferlan (4): domain_conf: Introduce iothreads XML qemu: Add support for iothreads domain_conf: Add support for iothreads in disk definition qemu: Allow use of iothreads for disk definitions docs/formatdomain.html.in | 34 ++++++++++++++ docs/schemas/domaincommon.rng | 25 +++++++++++ src/conf/domain_conf.c | 52 +++++++++++++++++++++- src/conf/domain_conf.h | 4 ++ src/qemu/qemu_capabilities.c | 2 + src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 48 ++++++++++++++++++++ src/qemu/qemu_command.h | 5 +++ src/qemu/qemu_hotplug.c | 11 +++++ .../qemuxml2argv-iothreads-disk.args | 17 +++++++ .../qemuxml2argv-iothreads-disk.xml | 40 +++++++++++++++++ tests/qemuxml2argvdata/qemuxml2argv-iothreads.args | 8 ++++ tests/qemuxml2argvdata/qemuxml2argv-iothreads.xml | 29 ++++++++++++ tests/qemuxml2argvtest.c | 4 ++ tests/qemuxml2xmltest.c | 2 + 15 files changed, 281 insertions(+), 1 deletion(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads.xml -- 1.9.3

Introduce XML to allowing adding iothreads to the domain. These can be used by virtio-blk-pci devices in order to assign a specific thread to handle the workload for the device. The iothreads are the official implementation of the virtio-blk Data Plane that's been in tech preview for QEMU. Signed-off-by: John Ferlan <jferlan@redhat.com> --- docs/formatdomain.html.in | 26 ++++++++++++++++++++++++++ docs/schemas/domaincommon.rng | 12 ++++++++++++ src/conf/domain_conf.c | 28 ++++++++++++++++++++++++++++ src/conf/domain_conf.h | 2 ++ 4 files changed, 68 insertions(+) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 59127bb..0fe10f4 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -470,6 +470,32 @@ </dd> </dl> + <h3><a name="elementsIOThreadsAllocation">IOThreads Allocation</a></h3> + <p> + IOThreads are a QEMU feature that will allow supported disk + devices to be configured to use a dedicated event loop thread + to handle block I/O requests. + <span class="since">Since 1.2.8 (QEMU only)</span> + </p> + +<pre> +<domain> + ... + <iothreads>4</iothreads> + ... +</domain> +</pre> + + <dl> + <dt><code>iothreads</code></dt> + <dd> + The content of this optional element defines the number + of IOThreads to be assigned to the domain for use by + virtio-blk-pci target storage devices. There should be + only 1 or 2 IOThreads per host CPU and 1 IOThread per + supported device. + </dd> + </dl> <h3><a name="elementsCPUTuning">CPU Tuning</a></h3> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 9a89dd8..b4ac483 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -632,6 +632,12 @@ </optional> <optional> + <element name="iothreads"> + <ref name="countIOThreads"/> + </element> + </optional> + + <optional> <ref name="blkiotune"/> </optional> @@ -4747,6 +4753,12 @@ <param name="minInclusive">1</param> </data> </define> + <define name="countIOThreads"> + <data type="unsignedInt"> + <param name="pattern">[0-9]+</param> + <param name="minInclusive">0</param> + </data> + </define> <define name="vcpuid"> <data type="unsignedShort"> <param name="pattern">[0-9]+</param> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 22a7f7e..671c41c 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -11953,6 +11953,23 @@ virDomainDefParseXML(xmlDocPtr xml, } } + /* Optional - iothreads */ + n = virXPathULong("string(./iothreads[1])", ctxt, &count); + if (n == -2) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("iothreads count must be an integer")); + goto error; + } else if (n < 0) { + def->iothreads = 0; + } else { + if ((unsigned int) count != count) { + virReportError(VIR_ERR_XML_ERROR, + _("invalid iothreads count '%lu'"), count); + goto error; + } + def->iothreads = count; + } + /* Extract cpu tunables. */ if ((n = virXPathULong("string(./cputune/shares[1])", ctxt, &def->cputune.shares)) < -1) { @@ -14461,6 +14478,14 @@ virDomainDefCheckABIStability(virDomainDefPtr src, goto error; } + if (src->iothreads != dst->iothreads) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Target domain iothreads count %u does not " + "match source %u"), + dst->iothreads, src->iothreads); + goto error; + } + if (STRNEQ(src->os.type, dst->os.type)) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("Target domain OS type %s does not match source %s"), @@ -17874,6 +17899,9 @@ virDomainDefFormatInternal(virDomainDefPtr def, virBufferAsprintf(buf, " current='%u'", def->vcpus); virBufferAsprintf(buf, ">%u</vcpu>\n", def->maxvcpus); + if (def->iothreads > 0) + virBufferAsprintf(buf, "<iothreads>%u</iothreads>\n", def->iothreads); + if (def->cputune.sharesSpecified || (def->cputune.nvcpupin && !virDomainIsAllVcpupinInherited(def)) || def->cputune.period || def->cputune.quota || diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 36ccf10..705ce32 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -1908,6 +1908,8 @@ struct _virDomainDef { int placement_mode; virBitmapPtr cpumask; + unsigned int iothreads; + struct { unsigned long shares; bool sharesSpecified; -- 1.9.3

On Mon, Aug 25, 2014 at 08:38:06PM -0400, John Ferlan wrote:
Introduce XML to allowing adding iothreads to the domain. These can be used by virtio-blk-pci devices in order to assign a specific thread to handle the workload for the device. The iothreads are the official implementation of the virtio-blk Data Plane that's been in tech preview for QEMU.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- docs/formatdomain.html.in | 26 ++++++++++++++++++++++++++ docs/schemas/domaincommon.rng | 12 ++++++++++++ src/conf/domain_conf.c | 28 ++++++++++++++++++++++++++++ src/conf/domain_conf.h | 2 ++ 4 files changed, 68 insertions(+)
[...]
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 9a89dd8..b4ac483 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -632,6 +632,12 @@ </optional>
<optional> + <element name="iothreads"> + <ref name="countIOThreads"/>
What's the difference between this and using ref name="unsignedInt" directly?
+ </element> + </optional> + + <optional> <ref name="blkiotune"/> </optional>
@@ -4747,6 +4753,12 @@ <param name="minInclusive">1</param> </data> </define> + <define name="countIOThreads"> + <data type="unsignedInt"> + <param name="pattern">[0-9]+</param> + <param name="minInclusive">0</param> + </data> + </define> <define name="vcpuid"> <data type="unsignedShort"> <param name="pattern">[0-9]+</param> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 22a7f7e..671c41c 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -11953,6 +11953,23 @@ virDomainDefParseXML(xmlDocPtr xml, } }
+ /* Optional - iothreads */ + n = virXPathULong("string(./iothreads[1])", ctxt, &count); + if (n == -2) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("iothreads count must be an integer")); + goto error; + } else if (n < 0) { + def->iothreads = 0; + } else { + if ((unsigned int) count != count) {
Instead of this machinery, it would be more straightforward to just do (example written by hand, not tested): tmp = virXPathString("string(./iothreads[1])", ctxt); if (tmp && virStrToLong_uip(tmp, NULL, 0, &def->iothreads) < 0) virReportError(VIR_ERR_XML_ERROR, _("invalid iothreads count '%s'"), tmp); Martin

On 08/26/2014 01:03 AM, Martin Kletzander wrote:
On Mon, Aug 25, 2014 at 08:38:06PM -0400, John Ferlan wrote:
Introduce XML to allowing adding iothreads to the domain. These can be used by virtio-blk-pci devices in order to assign a specific thread to handle the workload for the device. The iothreads are the official implementation of the virtio-blk Data Plane that's been in tech preview for QEMU.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- docs/formatdomain.html.in | 26 ++++++++++++++++++++++++++ docs/schemas/domaincommon.rng | 12 ++++++++++++ src/conf/domain_conf.c | 28 ++++++++++++++++++++++++++++ src/conf/domain_conf.h | 2 ++ 4 files changed, 68 insertions(+)
[...]
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 9a89dd8..b4ac483 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -632,6 +632,12 @@ </optional>
<optional> + <element name="iothreads"> + <ref name="countIOThreads"/>
What's the difference between this and using ref name="unsignedInt" directly?
I (more or less) modeled (eg, cut, copy, paste) after the vcpu element which as you'll note is "countCPU" for both current and maxvcpus. Unlike the vcpu, I went with an 'int' instead of a 'short' although technically I cannot imagine more than a short number of threads or disks assigned to any domain! Also unlike the vcpu argument a "0" is a perfectly valid value.
+ </element> + </optional> + + <optional> <ref name="blkiotune"/> </optional>
@@ -4747,6 +4753,12 @@ <param name="minInclusive">1</param> </data> </define> + <define name="countIOThreads"> + <data type="unsignedInt"> + <param name="pattern">[0-9]+</param> + <param name="minInclusive">0</param> + </data> + </define> <define name="vcpuid"> <data type="unsignedShort"> <param name="pattern">[0-9]+</param> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 22a7f7e..671c41c 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -11953,6 +11953,23 @@ virDomainDefParseXML(xmlDocPtr xml, } }
+ /* Optional - iothreads */ + n = virXPathULong("string(./iothreads[1])", ctxt, &count); + if (n == -2) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("iothreads count must be an integer")); + goto error; + } else if (n < 0) { + def->iothreads = 0; + } else { + if ((unsigned int) count != count) {
Instead of this machinery, it would be more straightforward to just do (example written by hand, not tested):
tmp = virXPathString("string(./iothreads[1])", ctxt); if (tmp && virStrToLong_uip(tmp, NULL, 0, &def->iothreads) < 0) virReportError(VIR_ERR_XML_ERROR, _("invalid iothreads count '%s'"), tmp);
See my above explanation and then look about 10-20 lines before iothreads and see vcpu/current and maxvcpus. The power of the visual suggestion of what code around area I was changing was doing was far greater than thinking hmm.. should this code use the virStrToLong calls. Unlike of course when I added the iothread to the disk property which does use the function. Maybe in a different patch it'd be worth working through the various other calls to virXPathU* and seeing if they too could use the virStrToLong* interfaces. I'll adjust both - no problem. John

On 08/25/2014 06:38 PM, John Ferlan wrote:
Introduce XML to allowing adding iothreads to the domain. These can be used by virtio-blk-pci devices in order to assign a specific thread to handle the workload for the device. The iothreads are the official implementation of the virtio-blk Data Plane that's been in tech preview for QEMU.
Signed-off-by: John Ferlan <jferlan@redhat.com> ---
@@ -4747,6 +4753,12 @@ <param name="minInclusive">1</param> </data> </define> + <define name="countIOThreads"> + <data type="unsignedInt"> + <param name="pattern">[0-9]+</param> + <param name="minInclusive">0</param> + </data>
Isn't minInclusive 0 of an unsignedInt redundant? It only matters for an integer that must be larger than zero, so you could omit a line. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Add a new capability to ensure the iothreads feature exists for the qemu emulator being run - requires the "query-iothreads" QMP command. Using the domain XML add correspoding command argument in order to generate the threads. The iothreads will use a name space "libvirtIothread#" where, the future patch to add support for using an iothread to a disk definition to merely define which of the available threads to use. Add tests to ensure the xml/argv processing is correct. Note that no change was made to qemuargv2xmltest.c as processing the -object element would require knowing more than just iothreads. Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 13 ++++++++++ tests/qemuxml2argvdata/qemuxml2argv-iothreads.args | 8 ++++++ tests/qemuxml2argvdata/qemuxml2argv-iothreads.xml | 29 ++++++++++++++++++++++ tests/qemuxml2argvtest.c | 2 ++ tests/qemuxml2xmltest.c | 1 + 7 files changed, 56 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads.xml diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 410086b..b331be7 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -268,6 +268,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "rtc-reset-reinjection", "splash-timeout", /* 175 */ + "query-iothreads", ); @@ -1430,6 +1431,7 @@ struct virQEMUCapsStringFlags virQEMUCapsCommands[] = { { "nbd-server-start", QEMU_CAPS_NBD_SERVER }, { "change-backing-file", QEMU_CAPS_CHANGE_BACKING_FILE }, { "rtc-reset-reinjection", QEMU_CAPS_RTC_RESET_REINJECTION }, + { "query-iothreads", QEMU_CAPS_OBJECT_IOTHREAD}, }; struct virQEMUCapsStringFlags virQEMUCapsEvents[] = { diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index 48a4eaa..0980c00 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -215,6 +215,7 @@ typedef enum { QEMU_CAPS_OBJECT_USB_AUDIO = 173, /* usb-audio device support */ QEMU_CAPS_RTC_RESET_REINJECTION = 174, /* rtc-reset-reinjection monitor command */ QEMU_CAPS_SPLASH_TIMEOUT = 175, /* -boot splash-time */ + QEMU_CAPS_OBJECT_IOTHREAD = 176, /* -object iothread */ QEMU_CAPS_LAST, /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 6dac9d3..287a3f3 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -7510,6 +7510,19 @@ qemuBuildCommandLine(virConnectPtr conn, virCommandAddArg(cmd, smp); VIR_FREE(smp); + if (def->iothreads > 0 && + virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_IOTHREAD)) { + /* Create named iothread objects starting with 1. These may be used + * by a disk definition which will associate to an iothread by + * supplying a value of 1 up to the number of iothreads available + * (since 0 would indicate to not use the feature). + */ + for (i = 1; i <= def->iothreads; i++) { + virCommandAddArg(cmd, "-object"); + virCommandAddArgFormat(cmd, "iothread,id=libvirtIothread%zu", i); + } + } + if (def->cpu && def->cpu->ncells) if (qemuBuildNumaArgStr(cfg, def, cmd, qemuCaps) < 0) goto error; diff --git a/tests/qemuxml2argvdata/qemuxml2argv-iothreads.args b/tests/qemuxml2argvdata/qemuxml2argv-iothreads.args new file mode 100644 index 0000000..ca0c174 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-iothreads.args @@ -0,0 +1,8 @@ +LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \ +/usr/bin/qemu -S -M \ +pc -m 214 -smp 2 \ +-object iothread,id=libvirtIothread1 \ +-object iothread,id=libvirtIothread2 \ +-nographic -monitor \ +unix:/tmp/test-monitor,server,nowait -no-acpi -boot c -usb -hda \ +/dev/HostVG/QEMUGuest1 -net none -serial none -parallel none diff --git a/tests/qemuxml2argvdata/qemuxml2argv-iothreads.xml b/tests/qemuxml2argvdata/qemuxml2argv-iothreads.xml new file mode 100644 index 0000000..12a92e7 --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-iothreads.xml @@ -0,0 +1,29 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>2</vcpu> + <iothreads>2</iothreads> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest1'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 7407272..0e3f60a 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1182,6 +1182,8 @@ mymain(void) DO_TEST("smp", QEMU_CAPS_SMP_TOPOLOGY); + DO_TEST("iothreads", QEMU_CAPS_OBJECT_IOTHREAD); + DO_TEST("cpu-topology1", QEMU_CAPS_SMP_TOPOLOGY); DO_TEST("cpu-topology2", QEMU_CAPS_SMP_TOPOLOGY); DO_TEST("cpu-topology3", NONE); diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 79cf59f..47314e3 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -298,6 +298,7 @@ mymain(void) DO_TEST("cputune-zero-shares"); DO_TEST("smp"); + DO_TEST("iothreads"); DO_TEST("lease"); DO_TEST("event_idx"); DO_TEST("vhost_queues"); -- 1.9.3

On Mon, Aug 25, 2014 at 08:38:07PM -0400, John Ferlan wrote:
Add a new capability to ensure the iothreads feature exists for the qemu emulator being run - requires the "query-iothreads" QMP command. Using the domain XML add correspoding command argument in order to generate the threads. The iothreads will use a name space "libvirtIothread#" where, the future patch to add support for using an iothread to a disk definition to merely define which of the available threads to use.
Add tests to ensure the xml/argv processing is correct. Note that no change was made to qemuargv2xmltest.c as processing the -object element would require knowing more than just iothreads.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 13 ++++++++++ tests/qemuxml2argvdata/qemuxml2argv-iothreads.args | 8 ++++++ tests/qemuxml2argvdata/qemuxml2argv-iothreads.xml | 29 ++++++++++++++++++++++ tests/qemuxml2argvtest.c | 2 ++ tests/qemuxml2xmltest.c | 1 + 7 files changed, 56 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads.xml
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 410086b..b331be7 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -268,6 +268,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "rtc-reset-reinjection",
"splash-timeout", /* 175 */ + "query-iothreads",
Why "query-" when the capability is _OBJECT_? Or is this just a typo? </bikeshed>
);
@@ -1430,6 +1431,7 @@ struct virQEMUCapsStringFlags virQEMUCapsCommands[] = { { "nbd-server-start", QEMU_CAPS_NBD_SERVER }, { "change-backing-file", QEMU_CAPS_CHANGE_BACKING_FILE }, { "rtc-reset-reinjection", QEMU_CAPS_RTC_RESET_REINJECTION }, + { "query-iothreads", QEMU_CAPS_OBJECT_IOTHREAD},
We have virQEMUCapsObjectTypes[] where you can just stick the name of the object you want to test, and not rely on a related command to probe for it.
};
struct virQEMUCapsStringFlags virQEMUCapsEvents[] = { diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index 48a4eaa..0980c00 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -215,6 +215,7 @@ typedef enum { QEMU_CAPS_OBJECT_USB_AUDIO = 173, /* usb-audio device support */ QEMU_CAPS_RTC_RESET_REINJECTION = 174, /* rtc-reset-reinjection monitor command */ QEMU_CAPS_SPLASH_TIMEOUT = 175, /* -boot splash-time */ + QEMU_CAPS_OBJECT_IOTHREAD = 176, /* -object iothread */
QEMU_CAPS_LAST, /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 6dac9d3..287a3f3 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -7510,6 +7510,19 @@ qemuBuildCommandLine(virConnectPtr conn, virCommandAddArg(cmd, smp); VIR_FREE(smp);
+ if (def->iothreads > 0 && + virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_IOTHREAD)) { + /* Create named iothread objects starting with 1. These may be used + * by a disk definition which will associate to an iothread by + * supplying a value of 1 up to the number of iothreads available + * (since 0 would indicate to not use the feature). + */ + for (i = 1; i <= def->iothreads; i++) { + virCommandAddArg(cmd, "-object"); + virCommandAddArgFormat(cmd, "iothread,id=libvirtIothread%zu", i);
I don't see we would use 'libvirt*' naming for any other IDs, 'iothread%zu' would be enough, I guess (and the command-line wouldn't be so long as well). Martin

On 08/26/2014 01:03 AM, Martin Kletzander wrote:
On Mon, Aug 25, 2014 at 08:38:07PM -0400, John Ferlan wrote:
Add a new capability to ensure the iothreads feature exists for the qemu emulator being run - requires the "query-iothreads" QMP command. Using the domain XML add correspoding command argument in order to generate the threads. The iothreads will use a name space "libvirtIothread#" where, the future patch to add support for using an iothread to a disk definition to merely define which of the available threads to use.
Add tests to ensure the xml/argv processing is correct. Note that no change was made to qemuargv2xmltest.c as processing the -object element would require knowing more than just iothreads.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 13 ++++++++++ tests/qemuxml2argvdata/qemuxml2argv-iothreads.args | 8 ++++++ tests/qemuxml2argvdata/qemuxml2argv-iothreads.xml | 29 ++++++++++++++++++++++ tests/qemuxml2argvtest.c | 2 ++ tests/qemuxml2xmltest.c | 1 + 7 files changed, 56 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads.xml
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 410086b..b331be7 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -268,6 +268,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "rtc-reset-reinjection",
"splash-timeout", /* 175 */ + "query-iothreads",
Why "query-" when the capability is _OBJECT_? Or is this just a typo? </bikeshed>
hmm... so how is this different then say perhaps QEMU_CAPS_OBJECT_RNG_RANDOM, QEMU_CAPS_OBJECT_MEMORY_RAM or QEMU_CAPS_OBJECT_MEMORY_FILE which each eventually uses "-object" to build it's part of the command line
);
@@ -1430,6 +1431,7 @@ struct virQEMUCapsStringFlags virQEMUCapsCommands[] = { { "nbd-server-start", QEMU_CAPS_NBD_SERVER }, { "change-backing-file", QEMU_CAPS_CHANGE_BACKING_FILE }, { "rtc-reset-reinjection", QEMU_CAPS_RTC_RESET_REINJECTION }, + { "query-iothreads", QEMU_CAPS_OBJECT_IOTHREAD},
We have virQEMUCapsObjectTypes[] where you can just stick the name of the object you want to test, and not rely on a related command to probe for it.
Hmm.. I wasn't completely sure where to put this - it's not overly clear which of the various capabilities structures should be used for what reason. Although I guess since the above named flags are in ObjectTypes I suppose I should have used it. I'll move it.
};
struct virQEMUCapsStringFlags virQEMUCapsEvents[] = { diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index 48a4eaa..0980c00 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -215,6 +215,7 @@ typedef enum { QEMU_CAPS_OBJECT_USB_AUDIO = 173, /* usb-audio device support */ QEMU_CAPS_RTC_RESET_REINJECTION = 174, /* rtc-reset-reinjection monitor command */ QEMU_CAPS_SPLASH_TIMEOUT = 175, /* -boot splash-time */ + QEMU_CAPS_OBJECT_IOTHREAD = 176, /* -object iothread */
QEMU_CAPS_LAST, /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 6dac9d3..287a3f3 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -7510,6 +7510,19 @@ qemuBuildCommandLine(virConnectPtr conn, virCommandAddArg(cmd, smp); VIR_FREE(smp);
+ if (def->iothreads > 0 && + virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_IOTHREAD)) { + /* Create named iothread objects starting with 1. These may be used + * by a disk definition which will associate to an iothread by + * supplying a value of 1 up to the number of iothreads available + * (since 0 would indicate to not use the feature). + */ + for (i = 1; i <= def->iothreads; i++) { + virCommandAddArg(cmd, "-object"); + virCommandAddArgFormat(cmd, "iothread,id=libvirtIothread%zu", i);
I don't see we would use 'libvirt*' naming for any other IDs, 'iothread%zu' would be enough, I guess (and the command-line wouldn't be so long as well).
Using 'libvirt' as the prefix for "iothread" was more of a convention to make sure it was clear what created it. I can go with the shorter name and initially had done so. John

Add a new disk "driver" attribute "iothread" to be parsed as the thread number for the disk to use. In order to more easily facilitate the usage and configuration of the iothread, a "zero" for the attribute indicates iothreads are not supported for the device and a positive value indicates the specific thread to try and use. Signed-off-by: John Ferlan <jferlan@redhat.com> --- docs/formatdomain.html.in | 8 ++++++++ docs/schemas/domaincommon.rng | 13 +++++++++++++ src/conf/domain_conf.c | 24 +++++++++++++++++++++++- src/conf/domain_conf.h | 2 ++ 4 files changed, 46 insertions(+), 1 deletion(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 0fe10f4..ea0ca83 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -2141,6 +2141,14 @@ (ignore the discard request). <span class='since'>Since 1.0.6 (QEMU and KVM only)</span> </li> + <li> + The optional <code>iothread</code> attribute will assign the + disk to an IOThread as defined by the range for the domain + <a href="#elementsIOThreadsAllocation"><code>iothreads</code></a> + value. Each device must use a unique IOThread and threads will + be numbered from 1 to the domain iothreads value. + <span class='since'>Since 1.2.8 (QEMU only)</span> + </li> </ul> </dd> <dt><code>boot</code></dt> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index b4ac483..40891e4 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -1572,6 +1572,9 @@ <optional> <ref name="discard"/> </optional> + <optional> + <ref name="driverIOThread"/> + </optional> <empty/> </element> </define> @@ -1659,6 +1662,11 @@ </choice> </attribute> </define> + <define name="driverIOThread"> + <attribute name='iothread'> + <ref name="iothreadsid"/> + </attribute> + </define> <define name="controller"> <element name="controller"> <attribute name="index"> @@ -4759,6 +4767,11 @@ <param name="minInclusive">0</param> </data> </define> + <define name="iothreadsid"> + <data type="unsignedInt"> + <param name="pattern">[0-9]+</param> + </data> + </define> <define name="vcpuid"> <data type="unsignedShort"> <param name="pattern">[0-9]+</param> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 671c41c..b15f279 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -5399,6 +5399,7 @@ virDomainDiskDefParseXML(virDomainXMLOptionPtr xmlopt, char *ioeventfd = NULL; char *event_idx = NULL; char *copy_on_read = NULL; + char *driverIOThread = NULL; char *devaddr = NULL; virStorageEncryptionPtr encryption = NULL; char *serial = NULL; @@ -5547,6 +5548,7 @@ virDomainDiskDefParseXML(virDomainXMLOptionPtr xmlopt, event_idx = virXMLPropString(cur, "event_idx"); copy_on_read = virXMLPropString(cur, "copy_on_read"); discard = virXMLPropString(cur, "discard"); + driverIOThread = virXMLPropString(cur, "iothread"); } else if (!def->mirror && xmlStrEqual(cur->name, BAD_CAST "mirror") && !(flags & VIR_DOMAIN_XML_INACTIVE)) { @@ -6080,6 +6082,15 @@ virDomainDiskDefParseXML(virDomainXMLOptionPtr xmlopt, } } + if (driverIOThread) { + if (virStrToLong_uip(driverIOThread, NULL, 10, &def->iothread) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid iothread attribute in disk driver " + "element: %s"), driverIOThread); + goto error; + } + } + if (devaddr) { if (virDomainParseLegacyDeviceAddress(devaddr, &def->info.addr.pci) < 0) { @@ -6180,6 +6191,7 @@ virDomainDiskDefParseXML(virDomainXMLOptionPtr xmlopt, VIR_FREE(event_idx); VIR_FREE(copy_on_read); VIR_FREE(discard); + VIR_FREE(driverIOThread); VIR_FREE(devaddr); VIR_FREE(serial); virStorageEncryptionFree(encryption); @@ -11968,6 +11980,14 @@ virDomainDefParseXML(xmlDocPtr xml, goto error; } def->iothreads = count; + + /* Create a bitmap for inuse threads - noting that entries are + * numbered 1..def->iothreads since 0 (zero) iothreads means + * nothing and assigning a disk to an IOThread requires at least a + * thread# > 0 since a zero would indicate no IOThread for the disk + */ + if (!(def->iothreadmap = virBitmapNew(def->iothreads+1))) + goto error; } /* Extract cpu tunables. */ @@ -15538,7 +15558,7 @@ virDomainDiskDefFormat(virBufferPtr buf, if (def->src->driverName || def->src->format > 0 || def->cachemode || def->error_policy || def->rerror_policy || def->iomode || def->ioeventfd || def->event_idx || def->copy_on_read || - def->discard) { + def->discard || def->iothread > 0) { virBufferAddLit(buf, "<driver"); if (def->src->driverName) virBufferAsprintf(buf, " name='%s'", def->src->driverName); @@ -15561,6 +15581,8 @@ virDomainDiskDefFormat(virBufferPtr buf, virBufferAsprintf(buf, " copy_on_read='%s'", copy_on_read); if (def->discard) virBufferAsprintf(buf, " discard='%s'", discard); + if (def->iothread > 0) + virBufferAsprintf(buf, " iothread='%u'", def->iothread); virBufferAddLit(buf, "/>\n"); } diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 705ce32..1bd989d 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -667,6 +667,7 @@ struct _virDomainDiskDef { int rawio; /* no = 0, yes = 1 */ int sgio; /* enum virDomainDeviceSGIO */ int discard; /* enum virDomainDiskDiscard */ + unsigned int iothread; /* unused = 0, > 0 specific thread # */ }; @@ -1909,6 +1910,7 @@ struct _virDomainDef { virBitmapPtr cpumask; unsigned int iothreads; + virBitmapPtr iothreadmap; struct { unsigned long shares; -- 1.9.3

On Mon, Aug 25, 2014 at 08:38:08PM -0400, John Ferlan wrote:
Add a new disk "driver" attribute "iothread" to be parsed as the thread number for the disk to use. In order to more easily facilitate the usage and configuration of the iothread, a "zero" for the attribute indicates iothreads are not supported for the device and a positive value indicates the specific thread to try and use.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- docs/formatdomain.html.in | 8 ++++++++ docs/schemas/domaincommon.rng | 13 +++++++++++++ src/conf/domain_conf.c | 24 +++++++++++++++++++++++- src/conf/domain_conf.h | 2 ++ 4 files changed, 46 insertions(+), 1 deletion(-)
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index b4ac483..40891e4 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -1572,6 +1572,9 @@ <optional> <ref name="discard"/> </optional> + <optional> + <ref name="driverIOThread"/> + </optional> <empty/> </element> </define> @@ -1659,6 +1662,11 @@ </choice> </attribute> </define> + <define name="driverIOThread"> + <attribute name='iothread'> + <ref name="iothreadsid"/>
Again, what's the difference to just ref name="unsignedInt"?
+ </attribute> + </define> <define name="controller"> <element name="controller"> <attribute name="index"> @@ -4759,6 +4767,11 @@ <param name="minInclusive">0</param> </data> </define> + <define name="iothreadsid"> + <data type="unsignedInt"> + <param name="pattern">[0-9]+</param> + </data> + </define> <define name="vcpuid"> <data type="unsignedShort"> <param name="pattern">[0-9]+</param> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 671c41c..b15f279 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c [...] @@ -11968,6 +11980,14 @@ virDomainDefParseXML(xmlDocPtr xml, goto error; } def->iothreads = count; + + /* Create a bitmap for inuse threads - noting that entries are + * numbered 1..def->iothreads since 0 (zero) iothreads means + * nothing and assigning a disk to an IOThread requires at least a + * thread# > 0 since a zero would indicate no IOThread for the disk + */ + if (!(def->iothreadmap = virBitmapNew(def->iothreads+1))) + goto error;
virBitmapFree(def->iothreadmap) is missing in virDomainDefFree(). Martin

On 08/26/2014 01:03 AM, Martin Kletzander wrote:
On Mon, Aug 25, 2014 at 08:38:08PM -0400, John Ferlan wrote:
Add a new disk "driver" attribute "iothread" to be parsed as the thread number for the disk to use. In order to more easily facilitate the usage and configuration of the iothread, a "zero" for the attribute indicates iothreads are not supported for the device and a positive value indicates the specific thread to try and use.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- docs/formatdomain.html.in | 8 ++++++++ docs/schemas/domaincommon.rng | 13 +++++++++++++ src/conf/domain_conf.c | 24 +++++++++++++++++++++++- src/conf/domain_conf.h | 2 ++ 4 files changed, 46 insertions(+), 1 deletion(-)
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index b4ac483..40891e4 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -1572,6 +1572,9 @@ <optional> <ref name="discard"/> </optional> + <optional> + <ref name="driverIOThread"/> + </optional> <empty/> </element> </define> @@ -1659,6 +1662,11 @@ </choice> </attribute> </define> + <define name="driverIOThread"> + <attribute name='iothread'> + <ref name="iothreadsid"/>
Again, what's the difference to just ref name="unsignedInt"?
See .1 response... I'll change it.
+ </attribute> + </define> <define name="controller"> <element name="controller"> <attribute name="index"> @@ -4759,6 +4767,11 @@ <param name="minInclusive">0</param> </data> </define> + <define name="iothreadsid"> + <data type="unsignedInt"> + <param name="pattern">[0-9]+</param> + </data> + </define> <define name="vcpuid"> <data type="unsignedShort"> <param name="pattern">[0-9]+</param> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 671c41c..b15f279 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c [...] @@ -11968,6 +11980,14 @@ virDomainDefParseXML(xmlDocPtr xml, goto error; } def->iothreads = count; + + /* Create a bitmap for inuse threads - noting that entries are + * numbered 1..def->iothreads since 0 (zero) iothreads means + * nothing and assigning a disk to an IOThread requires at least a + * thread# > 0 since a zero would indicate no IOThread for the disk + */ + if (!(def->iothreadmap = virBitmapNew(def->iothreads+1))) + goto error;
virBitmapFree(def->iothreadmap) is missing in virDomainDefFree().
Hmm.. good catch - now it *was* there in one of my buffers at some point in time. I *know* I typed it! John

On 08/26/2014 02:38 AM, John Ferlan wrote:
Add a new disk "driver" attribute "iothread" to be parsed as the thread number for the disk to use. In order to more easily facilitate the usage and configuration of the iothread, a "zero" for the attribute indicates iothreads are not supported for the device and a positive value indicates the specific thread to try and use.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- docs/formatdomain.html.in | 8 ++++++++ docs/schemas/domaincommon.rng | 13 +++++++++++++ src/conf/domain_conf.c | 24 +++++++++++++++++++++++- src/conf/domain_conf.h | 2 ++ 4 files changed, 46 insertions(+), 1 deletion(-)
@@ -11968,6 +11980,14 @@ virDomainDefParseXML(xmlDocPtr xml, goto error; } def->iothreads = count; + + /* Create a bitmap for inuse threads - noting that entries are + * numbered 1..def->iothreads since 0 (zero) iothreads means + * nothing and assigning a disk to an IOThread requires at least a + * thread# > 0 since a zero would indicate no IOThread for the disk + */ + if (!(def->iothreadmap = virBitmapNew(def->iothreads+1))) + goto error; }
Do you plan to do multiple disks on a single iothread in the followup patches? Or is that a configuration that doesn't make sense? Jan

On 08/26/2014 12:04 PM, Ján Tomko wrote:
On 08/26/2014 02:38 AM, John Ferlan wrote:
Add a new disk "driver" attribute "iothread" to be parsed as the thread number for the disk to use. In order to more easily facilitate the usage and configuration of the iothread, a "zero" for the attribute indicates iothreads are not supported for the device and a positive value indicates the specific thread to try and use.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- docs/formatdomain.html.in | 8 ++++++++ docs/schemas/domaincommon.rng | 13 +++++++++++++ src/conf/domain_conf.c | 24 +++++++++++++++++++++++- src/conf/domain_conf.h | 2 ++ 4 files changed, 46 insertions(+), 1 deletion(-)
@@ -11968,6 +11980,14 @@ virDomainDefParseXML(xmlDocPtr xml, goto error; } def->iothreads = count; + + /* Create a bitmap for inuse threads - noting that entries are + * numbered 1..def->iothreads since 0 (zero) iothreads means + * nothing and assigning a disk to an IOThread requires at least a + * thread# > 0 since a zero would indicate no IOThread for the disk + */ + if (!(def->iothreadmap = virBitmapNew(def->iothreads+1))) + goto error; }
Do you plan to do multiple disks on a single iothread in the followup patches? Or is that a configuration that doesn't make sense?
I would say it doesn't make sense and from the bz I have the comment is "2. 1 IOThread per device" (bz 1101574 if you care to read although it's mostly private comments though which is why I didn't put it in the patches). John

On 08/26/2014 06:42 PM, John Ferlan wrote:
On 08/26/2014 12:04 PM, Ján Tomko wrote:
On 08/26/2014 02:38 AM, John Ferlan wrote:
Add a new disk "driver" attribute "iothread" to be parsed as the thread number for the disk to use. In order to more easily facilitate the usage and configuration of the iothread, a "zero" for the attribute indicates iothreads are not supported for the device and a positive value indicates the specific thread to try and use.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- docs/formatdomain.html.in | 8 ++++++++ docs/schemas/domaincommon.rng | 13 +++++++++++++ src/conf/domain_conf.c | 24 +++++++++++++++++++++++- src/conf/domain_conf.h | 2 ++ 4 files changed, 46 insertions(+), 1 deletion(-)
@@ -11968,6 +11980,14 @@ virDomainDefParseXML(xmlDocPtr xml, goto error; } def->iothreads = count; + + /* Create a bitmap for inuse threads - noting that entries are + * numbered 1..def->iothreads since 0 (zero) iothreads means + * nothing and assigning a disk to an IOThread requires at least a + * thread# > 0 since a zero would indicate no IOThread for the disk + */ + if (!(def->iothreadmap = virBitmapNew(def->iothreads+1))) + goto error; }
Do you plan to do multiple disks on a single iothread in the followup patches? Or is that a configuration that doesn't make sense?
I would say it doesn't make sense and from the bz I have the comment is "2. 1 IOThread per device" (bz 1101574 if you care to read although it's mostly private comments though which is why I didn't put it in the patches).
I was wondering what's up with all the private comments. Hopefully I can share the bit you're quoting: There are basically two ways to use IOThreads: 1. 1 or maybe 2 IOThreads per host CPU 2. 1 IOThread per device For usage 1), can you just add some <iothreads> to the domain XML and they will automatically be used, or do you need to 'bind' the specific disk devices to them? Jan

On 08/26/2014 01:51 PM, Ján Tomko wrote:
On 08/26/2014 06:42 PM, John Ferlan wrote:
On 08/26/2014 12:04 PM, Ján Tomko wrote:
On 08/26/2014 02:38 AM, John Ferlan wrote:
Add a new disk "driver" attribute "iothread" to be parsed as the thread number for the disk to use. In order to more easily facilitate the usage and configuration of the iothread, a "zero" for the attribute indicates iothreads are not supported for the device and a positive value indicates the specific thread to try and use.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- docs/formatdomain.html.in | 8 ++++++++ docs/schemas/domaincommon.rng | 13 +++++++++++++ src/conf/domain_conf.c | 24 +++++++++++++++++++++++- src/conf/domain_conf.h | 2 ++ 4 files changed, 46 insertions(+), 1 deletion(-)
@@ -11968,6 +11980,14 @@ virDomainDefParseXML(xmlDocPtr xml, goto error; } def->iothreads = count; + + /* Create a bitmap for inuse threads - noting that entries are + * numbered 1..def->iothreads since 0 (zero) iothreads means + * nothing and assigning a disk to an IOThread requires at least a + * thread# > 0 since a zero would indicate no IOThread for the disk + */ + if (!(def->iothreadmap = virBitmapNew(def->iothreads+1))) + goto error; }
Do you plan to do multiple disks on a single iothread in the followup patches? Or is that a configuration that doesn't make sense?
I would say it doesn't make sense and from the bz I have the comment is "2. 1 IOThread per device" (bz 1101574 if you care to read although it's mostly private comments though which is why I didn't put it in the patches).
I was wondering what's up with all the private comments.
Hopefully I can share the bit you're quoting: There are basically two ways to use IOThreads: 1. 1 or maybe 2 IOThreads per host CPU 2. 1 IOThread per device
For usage 1), can you just add some <iothreads> to the domain XML and they will automatically be used, or do you need to 'bind' the specific disk devices to them?
Binding of disks would be "specific" to "pick" which thread to use. Although I do see a perhaps veiled point = "iothread=yes|no" would work too where the "next" available could/would be picked. I like the 0 better because if it's 0, then it's not written out to the config file. The one part that still doesn't quite make sense how it will be linked is the desired feature to bind an iothread to a specific (set of) cpu(s). There's nothing in how iothreads are created/sent to qemu that would allow that yet as far as I can tell. The other part that makes me a bit nervous on this is there's no set upper bound for iothreads to create. So if by mistake I fat finger 100 to 1000 or 10000, I'm in for a world of hurt when the domain starts. There's only an advisory 1 or 2 iothreads per host CPU. Trying not to overthink or overengineer this feature, but I've been burned by similar a feature in a previous project. John

On Tue, Aug 26, 2014 at 1:38 AM, John Ferlan <jferlan@redhat.com> wrote:
+ value. Each device must use a unique IOThread and threads will + be numbered from 1 to the domain iothreads value.
It is reasonable for multiple devices to share the same IOThread. For example, if you define just 1 IOThread and bind all virtio-blk devices to it you have a similar setup as the QEMU (no IOThreads) with the big exception that the QEMU global mutex is not needed for virtio-blk I/O request processing. This can already result in a performance improvement and may make sense where there are many disks but few host CPUs. Stefan

On Wed, Aug 27, 2014 at 05:21:39PM +0100, Stefan Hajnoczi wrote:
On Tue, Aug 26, 2014 at 1:38 AM, John Ferlan <jferlan@redhat.com> wrote:
+ value. Each device must use a unique IOThread and threads will + be numbered from 1 to the domain iothreads value.
It is reasonable for multiple devices to share the same IOThread.
For example, if you define just 1 IOThread and bind all virtio-blk devices to it you have a similar setup as the QEMU (no IOThreads) with the big exception that the QEMU global mutex is not needed for virtio-blk I/O request processing. This can already result in a performance improvement and may make sense where there are many disks but few host CPUs.
This makes sense. And it's good that John was strict about this because we will have no problem to relax the checking. One question, though. Is it also OK to hotplug a disk that is about to share the iothread with some other, aleady plugged, disk? I assume it is. Martin

For virtio-blk-pci disks with the disk iothread attribute that are running the correct emulator, add the "iothread=libvirtIothread#" to the -device command line in order to enable iothreads for the disk. This code will check both the "start" and "hotplug" paths for the capability, whether the iothreadsmap has been defined, and whether there's an available iothread to be used for the disk. Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_command.c | 35 +++++++++++++++++++ src/qemu/qemu_command.h | 5 +++ src/qemu/qemu_hotplug.c | 11 ++++++ .../qemuxml2argv-iothreads-disk.args | 17 +++++++++ .../qemuxml2argv-iothreads-disk.xml | 40 ++++++++++++++++++++++ tests/qemuxml2argvtest.c | 2 ++ tests/qemuxml2xmltest.c | 1 + 7 files changed, 111 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 287a3f3..740b6ec 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3862,6 +3862,9 @@ qemuBuildDriveDevStr(virDomainDefPtr def, virBufferAddLit(&opt, "virtio-blk-device"); } else { virBufferAddLit(&opt, "virtio-blk-pci"); + if (disk->iothread > 0) + virBufferAsprintf(&opt, ",iothread=libvirtIothread%u", + disk->iothread); } qemuBuildIoEventFdStr(&opt, disk->ioeventfd, qemuCaps); if (disk->event_idx && @@ -6614,6 +6617,31 @@ qemuBuildNumaArgStr(virQEMUDriverConfigPtr cfg, } +bool +qemuDomainIothreadsAvailable(virDomainDefPtr def, + virQEMUCapsPtr qemuCaps, + virDomainDiskDefPtr disk) +{ + bool inuse; + const char *src = virDomainDiskGetSource(disk); + + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_IOTHREAD) || + !def->iothreadmap) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("IOThreads not supported for this QEMU " + "for disk '%s'"), src); + return false; + } + if (virBitmapGetBit(def->iothreadmap, disk->iothread, &inuse) || inuse) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("IOThread '%u' for disk '%s' is already " + "being used"), disk->iothread, src); + return false; + } + return true; +} + + static int qemuBuildGraphicsVNCCommandLine(virQEMUDriverConfigPtr cfg, virCommandPtr cmd, @@ -8055,6 +8083,13 @@ qemuBuildCommandLine(virConnectPtr conn, disk->src->driverName, disk->src->path); goto error; } + + /* Check iothreads relationship */ + if (disk->iothread > 0) { + if (!qemuDomainIothreadsAvailable(def, qemuCaps, disk)) + goto error; + ignore_value(virBitmapSetBit(def->iothreadmap, disk->iothread)); + } } if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE)) { diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index 633ff71..bec6c14 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -81,6 +81,11 @@ virCommandPtr qemuBuildCommandLine(virConnectPtr conn, bool forXMLToArgv) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(11); +bool +qemuDomainIothreadsAvailable(virDomainDefPtr def, + virQEMUCapsPtr qemuCaps, + virDomainDiskDefPtr disk); + /* Generate '-device' string for chardev device */ int qemuBuildChrDeviceStr(char **deviceStr, diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index a364c52..720220d 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -335,6 +335,11 @@ qemuDomainAttachVirtioDiskDevice(virConnectPtr conn, } } + if (disk->iothread > 0) { + if (!qemuDomainIothreadsAvailable(vm->def, priv->qemuCaps, disk)) + goto cleanup; + } + if (qemuDomainPrepareDisk(driver, vm, disk, NULL, false) < 0) goto cleanup; @@ -396,6 +401,9 @@ qemuDomainAttachVirtioDiskDevice(virConnectPtr conn, if (ret < 0) goto error; + if (disk->iothread && vm->def->iothreadmap) + ignore_value(virBitmapSetBit(vm->def->iothreadmap, disk->iothread)); + virDomainDiskInsertPreAlloced(vm->def, disk); cleanup: @@ -2539,6 +2547,9 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, } } + if (disk->iothread && vm->def->iothreadmap) + ignore_value(virBitmapClearBit(vm->def->iothreadmap, disk->iothread)); + qemuDomainReleaseDeviceAddress(vm, &disk->info, src); if (virSecurityManagerRestoreDiskLabel(driver->securityManager, diff --git a/tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.args b/tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.args new file mode 100644 index 0000000..693cb4c --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.args @@ -0,0 +1,17 @@ +LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \ +/usr/bin/qemu -S -M \ +pc -m 214 -smp 2 \ +-object iothread,id=libvirtIothread1 \ +-object iothread,id=libvirtIothread2 \ +-nographic -nodefaults -monitor \ +unix:/tmp/test-monitor,server,nowait -no-acpi -boot c -usb \ +-drive file=/dev/HostVG/QEMUGuest1,if=none,id=drive-ide0-0-0 \ +-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \ +-drive file=/var/lib/libvirt/images/iothrtest1.img,if=none,\ +id=drive-virtio-disk1 \ +-device virtio-blk-pci,iothread=libvirtIothread1,bus=pci.0,addr=0x4,\ +drive=drive-virtio-disk1,id=virtio-disk1 \ +-drive file=/var/lib/libvirt/images/iothrtest2.img,if=none,\ +id=drive-virtio-disk2 \ +-device virtio-blk-pci,iothread=libvirtIothread2,bus=pci.0,addr=0x3,\ +drive=drive-virtio-disk2,id=virtio-disk2 diff --git a/tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml b/tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml new file mode 100644 index 0000000..72122fb --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml @@ -0,0 +1,40 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>2</vcpu> + <iothreads>2</iothreads> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/> + <source dev='/dev/HostVG/QEMUGuest1'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <disk type='file' device='disk'> + <driver name='qemu' type='raw' iothread='1'/> + <source file='/var/lib/libvirt/images/iothrtest1.img'/> + <target dev='vdb' bus='virtio'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </disk> + <disk type='file' device='disk'> + <driver name='qemu' type='raw' iothread='2'/> + <source file='/var/lib/libvirt/images/iothrtest2.img'/> + <target dev='vdc' bus='virtio'/> + </disk> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 0e3f60a..0bb282d 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1183,6 +1183,8 @@ mymain(void) DO_TEST("smp", QEMU_CAPS_SMP_TOPOLOGY); DO_TEST("iothreads", QEMU_CAPS_OBJECT_IOTHREAD); + DO_TEST("iothreads-disk", QEMU_CAPS_OBJECT_IOTHREAD, QEMU_CAPS_DEVICE, + QEMU_CAPS_DRIVE); DO_TEST("cpu-topology1", QEMU_CAPS_SMP_TOPOLOGY); DO_TEST("cpu-topology2", QEMU_CAPS_SMP_TOPOLOGY); diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 47314e3..bc345c9 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -299,6 +299,7 @@ mymain(void) DO_TEST("smp"); DO_TEST("iothreads"); + DO_TEST("iothreads-disk"); DO_TEST("lease"); DO_TEST("event_idx"); DO_TEST("vhost_queues"); -- 1.9.3

On Mon, Aug 25, 2014 at 08:38:09PM -0400, John Ferlan wrote:
For virtio-blk-pci disks with the disk iothread attribute that are running the correct emulator, add the "iothread=libvirtIothread#" to the -device command line in order to enable iothreads for the disk.
This code will check both the "start" and "hotplug" paths for the capability, whether the iothreadsmap has been defined, and whether there's an available iothread to be used for the disk.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_command.c | 35 +++++++++++++++++++ src/qemu/qemu_command.h | 5 +++ src/qemu/qemu_hotplug.c | 11 ++++++ .../qemuxml2argv-iothreads-disk.args | 17 +++++++++ .../qemuxml2argv-iothreads-disk.xml | 40 ++++++++++++++++++++++ tests/qemuxml2argvtest.c | 2 ++ tests/qemuxml2xmltest.c | 1 + 7 files changed, 111 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 287a3f3..740b6ec 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3862,6 +3862,9 @@ qemuBuildDriveDevStr(virDomainDefPtr def, virBufferAddLit(&opt, "virtio-blk-device"); } else { virBufferAddLit(&opt, "virtio-blk-pci"); + if (disk->iothread > 0) + virBufferAsprintf(&opt, ",iothread=libvirtIothread%u", + disk->iothread);
You are using the def->iothread only to format it with virtio disks, but ... [1]
} qemuBuildIoEventFdStr(&opt, disk->ioeventfd, qemuCaps); if (disk->event_idx && @@ -6614,6 +6617,31 @@ qemuBuildNumaArgStr(virQEMUDriverConfigPtr cfg, }
+bool +qemuDomainIothreadsAvailable(virDomainDefPtr def, + virQEMUCapsPtr qemuCaps, + virDomainDiskDefPtr disk) +{ + bool inuse; + const char *src = virDomainDiskGetSource(disk); + + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_IOTHREAD) || + !def->iothreadmap) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("IOThreads not supported for this QEMU " + "for disk '%s'"), src); + return false; + } + if (virBitmapGetBit(def->iothreadmap, disk->iothread, &inuse) || inuse) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("IOThread '%u' for disk '%s' is already " + "being used"), disk->iothread, src);
Some disks may not have source set (empty cdrom drive), so it should probably be NULLSTR(src), but I think ""IOThread '%u' used multiple times" is more than enough. This leads me to a code simplification idea. If you checked QEMU_CAPS_OBJECT_IOTHREAD when creating the iothread objects it's enough to make virBitmapGetBit() handle NULL bitmap carefully and then just virBitmapGetBit() the bit the same way. And if the message does not output the disk name (src), this whole function can be thrown away and instead of doing: if (!qemuDomainIothreadsAvailable(def, qemuCaps, disk)) goto error; you'd do: if (virBitmapGetBit(def->iothreadmap, disk->iothread, &tmp) || tmp) virReportError(...); Just an idea, though, not a requirement ;) If you keep it in a separate function, though, I'd suggest checking the range and reporting specific error, because when testing I got for example this error: "IOThread '3' for disk '(null)' is already being used" even though it clearly wasn't; I just had only 1 or 2 I/O threads created.
+ return false; + } + return true; +} + + static int qemuBuildGraphicsVNCCommandLine(virQEMUDriverConfigPtr cfg, virCommandPtr cmd, @@ -8055,6 +8083,13 @@ qemuBuildCommandLine(virConnectPtr conn, disk->src->driverName, disk->src->path); goto error; } + + /* Check iothreads relationship */ + if (disk->iothread > 0) { + if (!qemuDomainIothreadsAvailable(def, qemuCaps, disk)) + goto error; + ignore_value(virBitmapSetBit(def->iothreadmap, disk->iothread)); + } }
[1] ... here you check it for all disks. And it's kept in the domain definition for all disks as well. Maybe removing it from unsupported disks and checking(+building) the iothreadmap could be done in qemuDomainDefPostParse().
if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE)) { diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index 633ff71..bec6c14 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -81,6 +81,11 @@ virCommandPtr qemuBuildCommandLine(virConnectPtr conn, bool forXMLToArgv) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(11);
+bool +qemuDomainIothreadsAvailable(virDomainDefPtr def, + virQEMUCapsPtr qemuCaps, + virDomainDiskDefPtr disk); + /* Generate '-device' string for chardev device */ int qemuBuildChrDeviceStr(char **deviceStr, diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index a364c52..720220d 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -335,6 +335,11 @@ qemuDomainAttachVirtioDiskDevice(virConnectPtr conn, } }
+ if (disk->iothread > 0) { + if (!qemuDomainIothreadsAvailable(vm->def, priv->qemuCaps, disk)) + goto cleanup; + } + if (qemuDomainPrepareDisk(driver, vm, disk, NULL, false) < 0) goto cleanup;
@@ -396,6 +401,9 @@ qemuDomainAttachVirtioDiskDevice(virConnectPtr conn, if (ret < 0) goto error;
+ if (disk->iothread && vm->def->iothreadmap) + ignore_value(virBitmapSetBit(vm->def->iothreadmap, disk->iothread)); + virDomainDiskInsertPreAlloced(vm->def, disk);
cleanup:
From the names of the functions I see in these two hunks I guess this is ignored for non-virtio disks, which is a difference to cold-plugging non-virtio disks. The PostParse() modification I suggest above would take care of that.
@@ -2539,6 +2547,9 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, } }
+ if (disk->iothread && vm->def->iothreadmap) + ignore_value(virBitmapClearBit(vm->def->iothreadmap, disk->iothread)); +
Making virBitmapClearBit() handle NULL bitmaps would clean this up to be unconditional (even without the check for disk->iothread) ;)
qemuDomainReleaseDeviceAddress(vm, &disk->info, src);
if (virSecurityManagerRestoreDiskLabel(driver->securityManager,
[...]
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml b/tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml new file mode 100644 index 0000000..72122fb --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml @@ -0,0 +1,40 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>2</vcpu> + <iothreads>2</iothreads> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/>
Adding the iothread='x' in here and making a copy without it in tests/qemuxml2xmloutdata/ you could do ... [2]
+ <source dev='/dev/HostVG/QEMUGuest1'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <disk type='file' device='disk'> + <driver name='qemu' type='raw' iothread='1'/> + <source file='/var/lib/libvirt/images/iothrtest1.img'/> + <target dev='vdb' bus='virtio'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </disk> + <disk type='file' device='disk'> + <driver name='qemu' type='raw' iothread='2'/> + <source file='/var/lib/libvirt/images/iothrtest2.img'/> + <target dev='vdc' bus='virtio'/> + </disk> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 0e3f60a..0bb282d 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1183,6 +1183,8 @@ mymain(void) DO_TEST("smp", QEMU_CAPS_SMP_TOPOLOGY);
DO_TEST("iothreads", QEMU_CAPS_OBJECT_IOTHREAD); + DO_TEST("iothreads-disk", QEMU_CAPS_OBJECT_IOTHREAD, QEMU_CAPS_DEVICE, + QEMU_CAPS_DRIVE);
DO_TEST("cpu-topology1", QEMU_CAPS_SMP_TOPOLOGY); DO_TEST("cpu-topology2", QEMU_CAPS_SMP_TOPOLOGY); diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 47314e3..bc345c9 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -299,6 +299,7 @@ mymain(void)
DO_TEST("smp"); DO_TEST("iothreads"); + DO_TEST("iothreads-disk");
[2] ... DO_TEST_DIFFERENT() in here to test what I meant up there. Martin P.S.: Other than that it looks fine and I'm looking forward to v2!

On 08/26/2014 01:03 AM, Martin Kletzander wrote:
On Mon, Aug 25, 2014 at 08:38:09PM -0400, John Ferlan wrote:
For virtio-blk-pci disks with the disk iothread attribute that are running the correct emulator, add the "iothread=libvirtIothread#" to the -device command line in order to enable iothreads for the disk.
This code will check both the "start" and "hotplug" paths for the capability, whether the iothreadsmap has been defined, and whether there's an available iothread to be used for the disk.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_command.c | 35 +++++++++++++++++++ src/qemu/qemu_command.h | 5 +++ src/qemu/qemu_hotplug.c | 11 ++++++ .../qemuxml2argv-iothreads-disk.args | 17 +++++++++ .../qemuxml2argv-iothreads-disk.xml | 40 ++++++++++++++++++++++ tests/qemuxml2argvtest.c | 2 ++ tests/qemuxml2xmltest.c | 1 + 7 files changed, 111 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 287a3f3..740b6ec 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3862,6 +3862,9 @@ qemuBuildDriveDevStr(virDomainDefPtr def, virBufferAddLit(&opt, "virtio-blk-device"); } else { virBufferAddLit(&opt, "virtio-blk-pci"); + if (disk->iothread > 0) + virBufferAsprintf(&opt, ",iothread=libvirtIothread%u", + disk->iothread);
You are using the def->iothread only to format it with virtio disks, but ... [1]
} qemuBuildIoEventFdStr(&opt, disk->ioeventfd, qemuCaps); if (disk->event_idx && @@ -6614,6 +6617,31 @@ qemuBuildNumaArgStr(virQEMUDriverConfigPtr cfg, }
+bool +qemuDomainIothreadsAvailable(virDomainDefPtr def, + virQEMUCapsPtr qemuCaps, + virDomainDiskDefPtr disk) +{ + bool inuse; + const char *src = virDomainDiskGetSource(disk); + + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_IOTHREAD) || + !def->iothreadmap) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("IOThreads not supported for this QEMU " + "for disk '%s'"), src); + return false; + } + if (virBitmapGetBit(def->iothreadmap, disk->iothread, &inuse) || inuse) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("IOThread '%u' for disk '%s' is already " + "being used"), disk->iothread, src);
Some disks may not have source set (empty cdrom drive), so it should probably be NULLSTR(src), but I think ""IOThread '%u' used multiple times" is more than enough.
OK - fair enough - although I suspect a cdrom wasn't an expected device for an iothread... My thinking was since/if we do have src available it's "nicer" to let the user know which disk/src is failing - although yes, finding multiple "iothread='#'" is fairly simple too...
This leads me to a code simplification idea. If you checked QEMU_CAPS_OBJECT_IOTHREAD when creating the iothread objects it's enough to make virBitmapGetBit() handle NULL bitmap carefully and then just virBitmapGetBit() the bit the same way. And if the message does not output the disk name (src), this whole function can be thrown away and instead of doing:
if (!qemuDomainIothreadsAvailable(def, qemuCaps, disk)) goto error;
you'd do:
if (virBitmapGetBit(def->iothreadmap, disk->iothread, &tmp) || tmp) virReportError(...);
Just an idea, though, not a requirement ;) If you keep it in a separate function, though, I'd suggest checking the range and reporting specific error, because when testing I got for example this error:
"IOThread '3' for disk '(null)' is already being used"
even though it clearly wasn't; I just had only 1 or 2 I/O threads created.
I'll look into your ideas here after the caffeine has coursed through my veins for a little while longer. I wasn't quite sure the "best" place to make the checks. It had to be after all the parsing was done, but before the command was generated - the domain [device] post parse checking code wasn't on my radar of places to look... I'm also not sure an iothread can support (at this time) "all" the various disk types/formats, but figured at the very least I had to start 'somewhere'. The problem with using a "definitive" list is adding support for 'newer' types in the future requires multiple changes; however, being more relaxed with what is allowed could cause unforeseen problems when the vm is started.
+ return false; + } + return true; +} + + static int qemuBuildGraphicsVNCCommandLine(virQEMUDriverConfigPtr cfg, virCommandPtr cmd, @@ -8055,6 +8083,13 @@ qemuBuildCommandLine(virConnectPtr conn, disk->src->driverName, disk->src->path); goto error; } + + /* Check iothreads relationship */ + if (disk->iothread > 0) { + if (!qemuDomainIothreadsAvailable(def, qemuCaps, disk)) + goto error; + ignore_value(virBitmapSetBit(def->iothreadmap, disk->iothread)); + } }
[1] ... here you check it for all disks. And it's kept in the domain definition for all disks as well. Maybe removing it from unsupported disks and checking(+building) the iothreadmap could be done in qemuDomainDefPostParse().
Yes, it's in a loop checking all ndisks for the initial start, but would only check the disk if the iothread was set for the disk - which in my mind is a deliberate action. For the hotplug virtio case - if the map isn't there, but iothread was "somehow" set that (to me) meant someone was trying to add an iothread to the disk, but none existed or were available, thus cause failure. But I'm guessing perhaps the post parse will be a better place - so again - I'll look into using it.
if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE)) { diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index 633ff71..bec6c14 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -81,6 +81,11 @@ virCommandPtr qemuBuildCommandLine(virConnectPtr conn, bool forXMLToArgv) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(11);
+bool +qemuDomainIothreadsAvailable(virDomainDefPtr def, + virQEMUCapsPtr qemuCaps, + virDomainDiskDefPtr disk); + /* Generate '-device' string for chardev device */ int qemuBuildChrDeviceStr(char **deviceStr, diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index a364c52..720220d 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -335,6 +335,11 @@ qemuDomainAttachVirtioDiskDevice(virConnectPtr conn, } }
+ if (disk->iothread > 0) { + if (!qemuDomainIothreadsAvailable(vm->def, priv->qemuCaps, disk)) + goto cleanup; + } + if (qemuDomainPrepareDisk(driver, vm, disk, NULL, false) < 0) goto cleanup;
@@ -396,6 +401,9 @@ qemuDomainAttachVirtioDiskDevice(virConnectPtr conn, if (ret < 0) goto error;
+ if (disk->iothread && vm->def->iothreadmap) + ignore_value(virBitmapSetBit(vm->def->iothreadmap, disk->iothread)); + virDomainDiskInsertPreAlloced(vm->def, disk);
cleanup:
From the names of the functions I see in these two hunks I guess this is ignored for non-virtio disks, which is a difference to cold-plugging non-virtio disks. The PostParse() modification I suggest above would take care of that.
Right support is supposed to be for "virtio-blk-pci" devices only...
@@ -2539,6 +2547,9 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, } }
+ if (disk->iothread && vm->def->iothreadmap) + ignore_value(virBitmapClearBit(vm->def->iothreadmap, disk->iothread)); +
Making virBitmapClearBit() handle NULL bitmaps would clean this up to be unconditional (even without the check for disk->iothread) ;)
Right - since bit 0 is meaningless anyway for this purpose I suppose it's a tossup whether clearing for "every" disk is any more/less of a "performance penalty" than making the check for available and then doing the clear.
qemuDomainReleaseDeviceAddress(vm, &disk->info, src);
if (virSecurityManagerRestoreDiskLabel(driver->securityManager,
[...]
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml b/tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml new file mode 100644 index 0000000..72122fb --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml @@ -0,0 +1,40 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>2</vcpu> + <iothreads>2</iothreads> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/>
Adding the iothread='x' in here and making a copy without it in tests/qemuxml2xmloutdata/ you could do ... [2]
Not sure I quite understand what you're driving at. This disk uses bus 'ide' which isn't a supported option. Sure I could "change" it, but then what does DO_TEST_DIFFERENT provide?
+ <source dev='/dev/HostVG/QEMUGuest1'/> + <target dev='hda' bus='ide'/> + <address type='drive' controller='0' bus='0' target='0' unit='0'/> + </disk> + <disk type='file' device='disk'> + <driver name='qemu' type='raw' iothread='1'/> + <source file='/var/lib/libvirt/images/iothrtest1.img'/> + <target dev='vdb' bus='virtio'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> + </disk> + <disk type='file' device='disk'> + <driver name='qemu' type='raw' iothread='2'/> + <source file='/var/lib/libvirt/images/iothrtest2.img'/> + <target dev='vdc' bus='virtio'/> + </disk> + <controller type='usb' index='0'/> + <controller type='ide' index='0'/> + <controller type='pci' index='0' model='pci-root'/> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index 0e3f60a..0bb282d 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -1183,6 +1183,8 @@ mymain(void) DO_TEST("smp", QEMU_CAPS_SMP_TOPOLOGY);
DO_TEST("iothreads", QEMU_CAPS_OBJECT_IOTHREAD); + DO_TEST("iothreads-disk", QEMU_CAPS_OBJECT_IOTHREAD, QEMU_CAPS_DEVICE, + QEMU_CAPS_DRIVE);
DO_TEST("cpu-topology1", QEMU_CAPS_SMP_TOPOLOGY); DO_TEST("cpu-topology2", QEMU_CAPS_SMP_TOPOLOGY); diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index 47314e3..bc345c9 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -299,6 +299,7 @@ mymain(void)
DO_TEST("smp"); DO_TEST("iothreads"); + DO_TEST("iothreads-disk");
[2] ... DO_TEST_DIFFERENT() in here to test what I meant up there.
Martin
P.S.: Other than that it looks fine and I'm looking forward to v2!
Thanks for the quick review... John

On Tue, Aug 26, 2014 at 07:34:16AM -0400, John Ferlan wrote:
On 08/26/2014 01:03 AM, Martin Kletzander wrote:
On Mon, Aug 25, 2014 at 08:38:09PM -0400, John Ferlan wrote:
For virtio-blk-pci disks with the disk iothread attribute that are running the correct emulator, add the "iothread=libvirtIothread#" to the -device command line in order to enable iothreads for the disk.
This code will check both the "start" and "hotplug" paths for the capability, whether the iothreadsmap has been defined, and whether there's an available iothread to be used for the disk.
Signed-off-by: John Ferlan <jferlan@redhat.com> --- src/qemu/qemu_command.c | 35 +++++++++++++++++++ src/qemu/qemu_command.h | 5 +++ src/qemu/qemu_hotplug.c | 11 ++++++ .../qemuxml2argv-iothreads-disk.args | 17 +++++++++ .../qemuxml2argv-iothreads-disk.xml | 40 ++++++++++++++++++++++ tests/qemuxml2argvtest.c | 2 ++ tests/qemuxml2xmltest.c | 1 + 7 files changed, 111 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 287a3f3..740b6ec 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3862,6 +3862,9 @@ qemuBuildDriveDevStr(virDomainDefPtr def, virBufferAddLit(&opt, "virtio-blk-device"); } else { virBufferAddLit(&opt, "virtio-blk-pci"); + if (disk->iothread > 0) + virBufferAsprintf(&opt, ",iothread=libvirtIothread%u", + disk->iothread);
You are using the def->iothread only to format it with virtio disks, but ... [1]
} qemuBuildIoEventFdStr(&opt, disk->ioeventfd, qemuCaps); if (disk->event_idx && @@ -6614,6 +6617,31 @@ qemuBuildNumaArgStr(virQEMUDriverConfigPtr cfg, }
+bool +qemuDomainIothreadsAvailable(virDomainDefPtr def, + virQEMUCapsPtr qemuCaps, + virDomainDiskDefPtr disk) +{ + bool inuse; + const char *src = virDomainDiskGetSource(disk); + + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_IOTHREAD) || + !def->iothreadmap) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("IOThreads not supported for this QEMU " + "for disk '%s'"), src); + return false; + } + if (virBitmapGetBit(def->iothreadmap, disk->iothread, &inuse) || inuse) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("IOThread '%u' for disk '%s' is already " + "being used"), disk->iothread, src);
Some disks may not have source set (empty cdrom drive), so it should probably be NULLSTR(src), but I think ""IOThread '%u' used multiple times" is more than enough.
OK - fair enough - although I suspect a cdrom wasn't an expected device for an iothread... My thinking was since/if we do have src available it's "nicer" to let the user know which disk/src is failing - although yes, finding multiple "iothread='#'" is fairly simple too...
As I said, NULLSTR(src) is fine, too if you want, that wasn't some strict requirement to get rid of that.
This leads me to a code simplification idea. If you checked QEMU_CAPS_OBJECT_IOTHREAD when creating the iothread objects it's enough to make virBitmapGetBit() handle NULL bitmap carefully and then just virBitmapGetBit() the bit the same way. And if the message does not output the disk name (src), this whole function can be thrown away and instead of doing:
if (!qemuDomainIothreadsAvailable(def, qemuCaps, disk)) goto error;
you'd do:
if (virBitmapGetBit(def->iothreadmap, disk->iothread, &tmp) || tmp) virReportError(...);
Just an idea, though, not a requirement ;) If you keep it in a separate function, though, I'd suggest checking the range and reporting specific error, because when testing I got for example this error:
"IOThread '3' for disk '(null)' is already being used"
even though it clearly wasn't; I just had only 1 or 2 I/O threads created.
I'll look into your ideas here after the caffeine has coursed through my veins for a little while longer. I wasn't quite sure the "best" place to make the checks. It had to be after all the parsing was done, but before the command was generated - the domain [device] post parse checking code wasn't on my radar of places to look...
I'm also not sure an iothread can support (at this time) "all" the various disk types/formats, but figured at the very least I had to start 'somewhere'. The problem with using a "definitive" list is adding support for 'newer' types in the future requires multiple changes; however, being more relaxed with what is allowed could cause unforeseen problems when the vm is started.
It's definitely better to allow it for one disk type where it does work, is allowed, etc. and disable for everything else as there is no problem with relaxing the condition in future. However, making it more strict would not be very nice to users ;)
+ return false; + } + return true; +} + + static int qemuBuildGraphicsVNCCommandLine(virQEMUDriverConfigPtr cfg, virCommandPtr cmd, @@ -8055,6 +8083,13 @@ qemuBuildCommandLine(virConnectPtr conn, disk->src->driverName, disk->src->path); goto error; } + + /* Check iothreads relationship */ + if (disk->iothread > 0) { + if (!qemuDomainIothreadsAvailable(def, qemuCaps, disk)) + goto error; + ignore_value(virBitmapSetBit(def->iothreadmap, disk->iothread)); + } }
[1] ... here you check it for all disks. And it's kept in the domain definition for all disks as well. Maybe removing it from unsupported disks and checking(+building) the iothreadmap could be done in qemuDomainDefPostParse().
Yes, it's in a loop checking all ndisks for the initial start, but would only check the disk if the iothread was set for the disk - which in my mind is a deliberate action. For the hotplug virtio case - if the map isn't there, but iothread was "somehow" set that (to me) meant someone was trying to add an iothread to the disk, but none existed or were available, thus cause failure.
Yes, you're right. I wasn't saying it's bad, it was more of a question whether it was intended :)
But I'm guessing perhaps the post parse will be a better place - so again - I'll look into using it.
if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE)) { diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index 633ff71..bec6c14 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -81,6 +81,11 @@ virCommandPtr qemuBuildCommandLine(virConnectPtr conn, bool forXMLToArgv) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(11);
+bool +qemuDomainIothreadsAvailable(virDomainDefPtr def, + virQEMUCapsPtr qemuCaps, + virDomainDiskDefPtr disk); + /* Generate '-device' string for chardev device */ int qemuBuildChrDeviceStr(char **deviceStr, diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index a364c52..720220d 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -335,6 +335,11 @@ qemuDomainAttachVirtioDiskDevice(virConnectPtr conn, } }
+ if (disk->iothread > 0) { + if (!qemuDomainIothreadsAvailable(vm->def, priv->qemuCaps, disk)) + goto cleanup; + } + if (qemuDomainPrepareDisk(driver, vm, disk, NULL, false) < 0) goto cleanup;
@@ -396,6 +401,9 @@ qemuDomainAttachVirtioDiskDevice(virConnectPtr conn, if (ret < 0) goto error;
+ if (disk->iothread && vm->def->iothreadmap) + ignore_value(virBitmapSetBit(vm->def->iothreadmap, disk->iothread)); + virDomainDiskInsertPreAlloced(vm->def, disk);
cleanup:
From the names of the functions I see in these two hunks I guess this is ignored for non-virtio disks, which is a difference to cold-plugging non-virtio disks. The PostParse() modification I suggest above would take care of that.
Right support is supposed to be for "virtio-blk-pci" devices only...
@@ -2539,6 +2547,9 @@ qemuDomainRemoveDiskDevice(virQEMUDriverPtr driver, } }
+ if (disk->iothread && vm->def->iothreadmap) + ignore_value(virBitmapClearBit(vm->def->iothreadmap, disk->iothread)); +
Making virBitmapClearBit() handle NULL bitmaps would clean this up to be unconditional (even without the check for disk->iothread) ;)
Right - since bit 0 is meaningless anyway for this purpose I suppose it's a tossup whether clearing for "every" disk is any more/less of a "performance penalty" than making the check for available and then doing the clear.
qemuDomainReleaseDeviceAddress(vm, &disk->info, src);
if (virSecurityManagerRestoreDiskLabel(driver->securityManager,
[...]
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml b/tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml new file mode 100644 index 0000000..72122fb --- /dev/null +++ b/tests/qemuxml2argvdata/qemuxml2argv-iothreads-disk.xml @@ -0,0 +1,40 @@ +<domain type='qemu'> + <name>QEMUGuest1</name> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> + <memory unit='KiB'>219136</memory> + <currentMemory unit='KiB'>219136</currentMemory> + <vcpu placement='static'>2</vcpu> + <iothreads>2</iothreads> + <os> + <type arch='i686' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu</emulator> + <disk type='block' device='disk'> + <driver name='qemu' type='raw'/>
Adding the iothread='x' in here and making a copy without it in tests/qemuxml2xmloutdata/ you could do ... [2]
Not sure I quite understand what you're driving at. This disk uses bus 'ide' which isn't a supported option. Sure I could "change" it, but then what does DO_TEST_DIFFERENT provide?
Wow, I'm reading my comment third time and I kinda don't understand what I meant either :-D Sorry for that, but I guess I wanted to show either how it didn't work with target bud='ide' or the error I had. Not sure though :) Martin

On 08/26/2014 01:03 AM, Martin Kletzander wrote:
On Mon, Aug 25, 2014 at 08:38:09PM -0400, John Ferlan wrote:
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 287a3f3..740b6ec 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3862,6 +3862,9 @@ qemuBuildDriveDevStr(virDomainDefPtr def, virBufferAddLit(&opt, "virtio-blk-device"); } else { virBufferAddLit(&opt, "virtio-blk-pci"); + if (disk->iothread > 0) + virBufferAsprintf(&opt, ",iothread=libvirtIothread%u", + disk->iothread);
You are using the def->iothread only to format it with virtio disks, but ... [1]
} qemuBuildIoEventFdStr(&opt, disk->ioeventfd, qemuCaps); if (disk->event_idx && @@ -6614,6 +6617,31 @@ qemuBuildNumaArgStr(virQEMUDriverConfigPtr cfg, }
+bool +qemuDomainIothreadsAvailable(virDomainDefPtr def, + virQEMUCapsPtr qemuCaps, + virDomainDiskDefPtr disk) +{ + bool inuse; + const char *src = virDomainDiskGetSource(disk); + + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_IOTHREAD) || + !def->iothreadmap) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("IOThreads not supported for this QEMU " + "for disk '%s'"), src); + return false; + } + if (virBitmapGetBit(def->iothreadmap, disk->iothread, &inuse) || inuse) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("IOThread '%u' for disk '%s' is already " + "being used"), disk->iothread, src);
Some disks may not have source set (empty cdrom drive), so it should probably be NULLSTR(src), but I think ""IOThread '%u' used multiple times" is more than enough.
This leads me to a code simplification idea. If you checked QEMU_CAPS_OBJECT_IOTHREAD when creating the iothread objects it's enough to make virBitmapGetBit() handle NULL bitmap carefully and then just virBitmapGetBit() the bit the same way. And if the message does not output the disk name (src), this whole function can be thrown away and instead of doing:
if (!qemuDomainIothreadsAvailable(def, qemuCaps, disk)) goto error;
you'd do:
if (virBitmapGetBit(def->iothreadmap, disk->iothread, &tmp) || tmp) virReportError(...);
I looked at the above code simplification idea and I wasn't able to see what you were driving at. Maybe it's my current code myopia... There's a ditty about not being able to see the forest through the trees that may apply for me. Maybe I'm missing something obvious - I dunno... Beyond the obvious of if any other code passes a NULL Bitmap, the existing code will fail miserably... Modifying GetBit/SetBit/ClearBit in order to handle a NULL iothreadmap doesn't help the case where iothread == 0 which I think could be the more common case. I've adjusted where the checks are made to within the BuildDriveDevStr code. That way both the hotplug and regular startup will make the same check.
Just an idea, though, not a requirement ;) If you keep it in a separate function, though, I'd suggest checking the range and reporting specific error, because when testing I got for example this error:
"IOThread '3' for disk '(null)' is already being used"
even though it clearly wasn't; I just had only 1 or 2 I/O threads created.
+ return false; + } + return true; +} + + static int qemuBuildGraphicsVNCCommandLine(virQEMUDriverConfigPtr cfg, virCommandPtr cmd, @@ -8055,6 +8083,13 @@ qemuBuildCommandLine(virConnectPtr conn, disk->src->driverName, disk->src->path); goto error; } + + /* Check iothreads relationship */ + if (disk->iothread > 0) { + if (!qemuDomainIothreadsAvailable(def, qemuCaps, disk)) + goto error; + ignore_value(virBitmapSetBit(def->iothreadmap, disk->iothread)); + } }
[1] ... here you check it for all disks. And it's kept in the domain definition for all disks as well. Maybe removing it from unsupported disks and checking(+building) the iothreadmap could be done in qemuDomainDefPostParse().
I'm not sure using a PostParse will work properly, especially in the hotplug case where the disk xml is not added to the configuration yet - the live add is attempted first and if it succeeds, then the bitmap would need to be updated. If the configuration is to be saved, then the on-disk config is updated. I assume that for hotplug the domain def passed along is the "live" one that already has the map filled in for any disks that were properly vetted at startup time. Without the "live" map being updated, one could add 'n' live disks to the same IOThread and that has seemingly nothing to do with PostParse - although I may have misfollowed things. The other part of PostParse that's difficult is that the DevicePostParse would probably be the place to check whether the device itself is valid because while 'virtio' bus could be set, when does 'pci' get set? I also note qemuDomainDefPostParse() uses virCapsPtr and not virQEMUCapsPtr and it's seemingly tasked to primarily add things that will be needed by the guest due to "other" found devices in the guest XML and not to peruse through the ndisks looking to fill in a bitmap or check for all the right configuration setting options for iothreads. Just wasn't clear enough whether it "should be" used for this purpose. I'm not against using it - just wasn't completely sure.
if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE)) { diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h index 633ff71..bec6c14 100644 --- a/src/qemu/qemu_command.h +++ b/src/qemu/qemu_command.h @@ -81,6 +81,11 @@ virCommandPtr qemuBuildCommandLine(virConnectPtr conn, bool forXMLToArgv) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(11);
+bool +qemuDomainIothreadsAvailable(virDomainDefPtr def, + virQEMUCapsPtr qemuCaps, + virDomainDiskDefPtr disk); + /* Generate '-device' string for chardev device */ int qemuBuildChrDeviceStr(char **deviceStr, diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index a364c52..720220d 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -335,6 +335,11 @@ qemuDomainAttachVirtioDiskDevice(virConnectPtr conn, } }
+ if (disk->iothread > 0) { + if (!qemuDomainIothreadsAvailable(vm->def, priv->qemuCaps, disk)) + goto cleanup; + } + if (qemuDomainPrepareDisk(driver, vm, disk, NULL, false) < 0) goto cleanup;
@@ -396,6 +401,9 @@ qemuDomainAttachVirtioDiskDevice(virConnectPtr conn, if (ret < 0) goto error;
+ if (disk->iothread && vm->def->iothreadmap) + ignore_value(virBitmapSetBit(vm->def->iothreadmap, disk->iothread)); + virDomainDiskInsertPreAlloced(vm->def, disk);
cleanup:
From the names of the functions I see in these two hunks I guess this is ignored for non-virtio disks, which is a difference to cold-plugging non-virtio disks. The PostParse() modification I suggest above would take care of that.
So v2 is coming to a mailbox near you soon... John

On Tue, Aug 26, 2014 at 06:03:46PM -0400, John Ferlan wrote:
On 08/26/2014 01:03 AM, Martin Kletzander wrote:
On Mon, Aug 25, 2014 at 08:38:09PM -0400, John Ferlan wrote:
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 287a3f3..740b6ec 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3862,6 +3862,9 @@ qemuBuildDriveDevStr(virDomainDefPtr def, virBufferAddLit(&opt, "virtio-blk-device"); } else { virBufferAddLit(&opt, "virtio-blk-pci"); + if (disk->iothread > 0) + virBufferAsprintf(&opt, ",iothread=libvirtIothread%u", + disk->iothread);
You are using the def->iothread only to format it with virtio disks, but ... [1]
} qemuBuildIoEventFdStr(&opt, disk->ioeventfd, qemuCaps); if (disk->event_idx && @@ -6614,6 +6617,31 @@ qemuBuildNumaArgStr(virQEMUDriverConfigPtr cfg, }
+bool +qemuDomainIothreadsAvailable(virDomainDefPtr def, + virQEMUCapsPtr qemuCaps, + virDomainDiskDefPtr disk) +{ + bool inuse; + const char *src = virDomainDiskGetSource(disk); + + if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_IOTHREAD) || + !def->iothreadmap) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("IOThreads not supported for this QEMU " + "for disk '%s'"), src); + return false; + } + if (virBitmapGetBit(def->iothreadmap, disk->iothread, &inuse) || inuse) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("IOThread '%u' for disk '%s' is already " + "being used"), disk->iothread, src);
Some disks may not have source set (empty cdrom drive), so it should probably be NULLSTR(src), but I think ""IOThread '%u' used multiple times" is more than enough.
This leads me to a code simplification idea. If you checked QEMU_CAPS_OBJECT_IOTHREAD when creating the iothread objects it's enough to make virBitmapGetBit() handle NULL bitmap carefully and then just virBitmapGetBit() the bit the same way. And if the message does not output the disk name (src), this whole function can be thrown away and instead of doing:
if (!qemuDomainIothreadsAvailable(def, qemuCaps, disk)) goto error;
you'd do:
if (virBitmapGetBit(def->iothreadmap, disk->iothread, &tmp) || tmp) virReportError(...);
I looked at the above code simplification idea and I wasn't able to see what you were driving at. Maybe it's my current code myopia... There's a ditty about not being able to see the forest through the trees that may apply for me. Maybe I'm missing something obvious - I dunno...
Beyond the obvious of if any other code passes a NULL Bitmap, the existing code will fail miserably... Modifying GetBit/SetBit/ClearBit in order to handle a NULL iothreadmap doesn't help the case where iothread == 0 which I think could be the more common case.
Oh, yes, I missed a case, but anyway, nothing is wrong with reporting it when building the command-line. Looking back at my ideas, it seems like a copy-paste from insomnia-driven programming book.
I've adjusted where the checks are made to within the BuildDriveDevStr code. That way both the hotplug and regular startup will make the same check.
Great, that way it's still in building a command-line, but applies for hotplug too.
Just an idea, though, not a requirement ;) If you keep it in a separate function, though, I'd suggest checking the range and reporting specific error, because when testing I got for example this error:
"IOThread '3' for disk '(null)' is already being used"
even though it clearly wasn't; I just had only 1 or 2 I/O threads created.
+ return false; + } + return true; +} + + static int qemuBuildGraphicsVNCCommandLine(virQEMUDriverConfigPtr cfg, virCommandPtr cmd, @@ -8055,6 +8083,13 @@ qemuBuildCommandLine(virConnectPtr conn, disk->src->driverName, disk->src->path); goto error; } + + /* Check iothreads relationship */ + if (disk->iothread > 0) { + if (!qemuDomainIothreadsAvailable(def, qemuCaps, disk)) + goto error; + ignore_value(virBitmapSetBit(def->iothreadmap, disk->iothread)); + } }
[1] ... here you check it for all disks. And it's kept in the domain definition for all disks as well. Maybe removing it from unsupported disks and checking(+building) the iothreadmap could be done in qemuDomainDefPostParse().
I'm not sure using a PostParse will work properly, especially in the hotplug case where the disk xml is not added to the configuration yet - the live add is attempted first and if it succeeds, then the bitmap would need to be updated. If the configuration is to be saved, then the on-disk config is updated. I assume that for hotplug the domain def passed along is the "live" one that already has the map filled in for any disks that were properly vetted at startup time. Without the "live" map being updated, one could add 'n' live disks to the same IOThread and that has seemingly nothing to do with PostParse - although I may have misfollowed things.
The other part of PostParse that's difficult is that the DevicePostParse would probably be the place to check whether the device itself is valid because while 'virtio' bus could be set, when does 'pci' get set?
I also note qemuDomainDefPostParse() uses virCapsPtr and not virQEMUCapsPtr and it's seemingly tasked to primarily add things that will be needed by the guest due to "other" found devices in the guest XML and not to peruse through the ndisks looking to fill in a bitmap or check for all the right configuration setting options for iothreads.
Just wasn't clear enough whether it "should be" used for this purpose. I'm not against using it - just wasn't completely sure.
Yes, PostParse shouldn't be doing that, the iothreadmap is easier to be created when starting the machine. Then it's clear that live machines have iothreadmap and persistent definitions don't. I was just curious whether it would make sense to error out earlier (when defining the XML) rather then later (when starting the machine). Unfortunately I covered that simple piece of information in a crusty hardly readable shell. [...]
So v2 is coming to a mailbox near you soon...
Looking forward! Martin
participants (5)
-
Eric Blake
-
John Ferlan
-
Ján Tomko
-
Martin Kletzander
-
Stefan Hajnoczi