On 03/25/2015 11:37 AM, Peter Krempa wrote:
On Wed, Mar 25, 2015 at 16:29:22 +0100, Ján Tomko wrote:
> On Thu, Mar 19, 2015 at 01:08:22PM -0400, John Ferlan wrote:
>> Add virDomainIOThreadsSet to allow setting the number of IOThreads for
>> a domain.
>>
>> The API supports updating both the --live domain and the --config data.
>>
>> Signed-off-by: John Ferlan <jferlan(a)redhat.com>
>> ---
>> include/libvirt/libvirt-domain.h | 3 ++
>> src/driver-hypervisor.h | 6 ++++
>> src/libvirt-domain.c | 60 ++++++++++++++++++++++++++++++++++++++++
>> src/libvirt_public.syms | 1 +
>> 4 files changed, 70 insertions(+)
>>
>
> This one has been siting there for a while, I'll think out loud just to
> get a discussion going.
>
>> --- a/src/libvirt-domain.c
>> +++ b/src/libvirt-domain.c
>> @@ -8042,6 +8042,66 @@ virDomainPinIOThread(virDomainPtr domain,
>>
>>
>> /**
>> + * virDomainSetIOThreads:
>> + * @domain: pointer to domain object
>> + * @niothreads: the new number of IOThreads for this domain
>> + * @flags: bitwise-OR of virDomainModificationImpact
>> + *
>> + * Dynamically change the number of IOThreads used by the domain.
>> + * Note that this call faiy fail if the underlying virtualization hypervisor
>> + * does not support it or if growing the number is arbitrarily limited.
>> + * This function may require privileged access to the hypervisor.
>> + *
>> + * @flags may include VIR_DOMAIN_AFFECT_LIVE to affect a running
>> + * domain (which may fail if domain is not active), or
>> + * VIR_DOMAIN_AFFECT_CONFIG to affect the next boot via the XML
>> + * description of the domain. Both flags may be set.
>> + * If neither flag is specified (that is, @flags is VIR_DOMAIN_AFFECT_CURRENT),
>> + * then an inactive domain modifies persistent setup, while an active domain
>> + * is hypervisor-dependent on whether just live or both live and persistent
>> + * state is changed.
>> + *
>> + * Not all hypervisors can support all flag combinations.
>> + *
>> + * Returns 0 in case of success, -1 in case of failure.
>> + */
>> +int
>> +virDomainSetIOThreads(virDomainPtr domain,
>> + unsigned int niothreads,
>> + unsigned int flags)
>> +{
>
> This seems ok for the use case where we just give a domain a few
> iothreads and all the disks will share them.
>
> But if you go for one iothread per disk, creating holes in iothread
> indexes could make sense - we allow unplugging disks from the middle.
>
> I can't think of a case where hotplugging a thread with id=6 would be
> useful if there are only 3 threads.
>
> Maybe there should be separate APIs for adding and deleting threads?
I have to agree here. Separate API for adding and removing (or perhaps
just a parameter to switch?) specific threads is the way to go. Why?
Well this function looks like it was copied from virDomainSetVcpus which
I'm currently planning to replace with specific vcpu hotplug api.
The reasons there are the same as Jan pointed out just translated into
the context of vCPUS. At the point where there is no other mapping or
relationship of belonging the API is fine. This was true for vCPUs in
the pre-NUMA era. With NUMA, (or disks IO executed by iothreads in this
case) you have a relationship where a thread (or vCPU) belongs to a dis
(or NUMA node) and you need to be able to select them individually.
hmmm... well yes, the model is the Vcpu algorithms and specifically
SetVcpus. Allowing unordered IOThread id's will require some rework of
other algorithms.
Talking through this ...
If I start with 0 IOThreads and want 2, is it easier to "SetIOThreads"
once or "AddIOThread" twice?
Similarly, if I have 2 IOThreads and want 0, is it easier to
SetIOThreads once or "DelIOThread" twice?
If I start with 2 IOThreads, I add 2 IOThreads, then want to remove
IOThread 3, then does that mean this has similar logic to Vcpus where I
have to introduce a 'current' value. Then at (re)startup, I have to
determine which IOThreads are assigned disks in order to determine which
of the 3 current to start. The add iothread to disk logic would need
adjustment to know that IOThread#3 doesn't exist. It's easy to start one
in the middle with its own number, I just wonder the 'real world' use
case (not talking the varying array of test matrix patterns that need to
be validated). Hence the KISS method of managing IOThreads.
If I add an IOThread to a configuration with 3 current and 4 total, I
reuse the currently empty entry as the place to add (and current goes
away). Same setup, if I added 2 threads, then current goes away, I fill
in the 3rd entry, and add a 5th entry.
Moving a disk from one IOThread to another is certainly different than
moving NUMA nodes and vCPUs around. An IOThread is just a qemu
object... a disk is assigned to use that object... In order to stop
using that object, the disk needs to be unplugged. Sure we can choose
any disk and thus logically should be able to choose any IOThread. But
it's still a process to unplug, "unnecessarily" remove IOThread, and
hutplug to a different IOThread. How many real world customers will do
that? I don't know. My view is someone is more likely to add
IOThreads, add their disks to them, and be done. I see removing disks,
removing iothreads, etc. as mostly a testing exercise.
In some ways I wonder if we're over-engineering this? IOThreads are
different than Vcpus, but having a starting place is good. I'm open to
ideas and suggestions, but I also am concerned the "rules" become more
complex introducing more chances for bugs/errors.
John