On 07/21/2011 09:21 AM, Daniel P. Berrange wrote:
> On Thu, Jul 21, 2011 at 08:44:35AM -0500, Anthony Liguori wrote:
>> On 07/21/2011 08:34 AM, Daniel P. Berrange wrote:
>>> On Thu, Jul 21, 2011 at 07:54:05AM -0500, Adam Litke wrote:
>>>> Added Anthony to give him the opportunity to address the finer points of
>>>> this one especially with respect to the qemu IO thread(s).
>>>>
>>>> This feature is really about capping the compute performance of a VM
>>>> such that we get consistent top end performance. Yes, qemu has non-VCPU
>>>> threads that this patch set doesn't govern, but that's the point. We
>>>> are not attempting to throttle IO or device emulation with this feature.
>>>> It's true that an IO-intensive guest may consume more host resources
>>>> than a compute-intensive guest, but they should still have equal top-end
>>>> CPU performance when viewed from the guest's perspective.
>>>
>>> I could be misunderstanding what you're trying to achieve
>>> here, so perhaps we should consider an example.
>>>
>>> - A machine has 4 physical CPUs
>>> - There are 4 guests on the machine
>>> - Each guest has 2 virtual CPUs
>>>
>>> So we've overcommitted the host CPU resources x2 here.
>>>
>>> Let's say that we want to use this feature to ensure consistent
>>> top end performance of every guest, splitting the host's pCPU
>>> resources evenly across all guests, so each guest is ensured
>>> 1 pCPU worth of CPU time overall.
>>>
>>> This patch lets you do this by assigning caps per VCPU. So
>>> in this example, each VCPU cgroup would have to be configured
>>> to cap the VCPUs at 50% of a single pCPU.
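For concreteness, assuming the cgroup v1 CFS bandwidth files
cpu.cfs_period_us / cpu.cfs_quota_us and an illustrative
/sys/fs/cgroup/cpu/guest1/vcpuN layout (not libvirt's real paths),
that 50% per-VCPU cap would look roughly like:

  # cap each vCPU thread at half of one pCPU: quota = period / 2
  echo 100000 > /sys/fs/cgroup/cpu/guest1/vcpu0/cpu.cfs_period_us
  echo 50000  > /sys/fs/cgroup/cpu/guest1/vcpu0/cpu.cfs_quota_us
  echo 100000 > /sys/fs/cgroup/cpu/guest1/vcpu1/cpu.cfs_period_us
  echo 50000  > /sys/fs/cgroup/cpu/guest1/vcpu1/cpu.cfs_quota_us

With two vCPUs at 50000/100000 each, the vCPU threads together can
burn at most one pCPU's worth of time per period.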
>>>
>>> This leaves the other QEMU threads uncapped / unaccounted
>>> for. If any one guest causes non-trivial compute load in
>>> a non-VCPU thread, this can/will impact the top-end compute
>>> performance of all the other guests on the machine.
>>
>> But this is not undesirable behavior. You're mixing up consistency
>> and top end performance. They are totally different things and I
>> think most consumers of capping really only care about the latter.
>>
>>>
>>> If we did caps per VM, then you could set the VM cgroup
>>> such that the VM as a whole had 100% of a single pCPU.
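For comparison, with the same assumed cgroup files a per-VM cap of one
pCPU would be a single quota on the guest's top-level cgroup, covering
vCPU, emulator and IO threads alike:

  # cap the whole guest (all its threads combined) at 100% of one pCPU
  echo 100000 > /sys/fs/cgroup/cpu/guest1/cpu.cfs_period_us
  echo 100000 > /sys/fs/cgroup/cpu/guest1/cpu.cfs_quota_us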
>>
>> Consistent performance is very hard to achieve. The desire is to
>> cap performance not just within a box, but also across multiple
>> different boxes with potentially different versions of KVM. The I/O
>> threads are basically hypervisor overhead. That's going to change
>> over time.
>>
>>> If a guest is 100% compute bound, it can use its full
>>> 100% of a pCPU allocation in vCPU threads. If any other
>>> guest is consuming CPU time in a non-VCPU thread, it cannot
>>> impact the top end compute performance of VCPU threads in
>>> the other guests.
>>>
>>> A per-VM cap would, however, mean a guest with 2 vCPUs
>>> could have unequal scheduling, where one vCPU claimed 75%
>>> of the pCPU and the other vCPU got left with only 25%.
>>>
>>> So AFAICT, per-VM cgroups is better for ensuring top
>>> end compute performance of a guest as a whole, but
>>> per-VCPU cgroups can ensure consistent top end performance
>>> across vCPUs within a guest.
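Put side by side, the two schemes differ only in where the quota sits
in the (assumed, simplified) cgroup tree:

  per-VM cap:
    guest1/cpu.cfs_quota_us         # one quota shared by every guest thread

  per-VCPU caps:
    guest1/vcpu0/cpu.cfs_quota_us   # one quota per vCPU thread
    guest1/vcpu1/cpu.cfs_quota_us   # (non-vCPU threads stay uncapped)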
>>
>> And the latter is the primary use case as far as I can tell.
>>
>> If I'm a user, I'll be very confused if I have a 4 VCPU guest, run 1
>> instance of specint, and get X as the result. I then run 4
>> instances of specint and get X as the result. My expectation is to
>> get 4X as the result because I'm running 4 instances.
> I don't see why doing limits per-VM vs per-VCPU has an impact
> on performance when adding extra guests. It would certainly
> change behaviour if adding extra vCPUs to a guest though.
My use of "instance" was referring to a copy of specint. IOW,
4-VCPU guest:

$ ./specint
score - 490
$ ./specint --threads=4
score - 490

That would confuse a user. The expectation is:

$ ./specint --threads=4
score - 1960
Regards,
Anthony Liguori
> Daniel