----- Original Message -----
From: "Radim Krčmář" <rkrcmar(a)redhat.com>
To: libvir-list(a)redhat.com
Cc: "Daniel P. Berrange" <berrange(a)redhat.com>, "Andrew
Theurer" <atheurer(a)redhat.com>
Sent: Thursday, August 14, 2014 9:25:05 AM
Subject: Suboptimal default cpu Cgroup
Hello,
by default, libvirt with KVM creates a Cgroup hierarchy in 'cpu,cpuacct'
[1], with 'shares' set to 1024 on every level. This raises two points:
1) Every VM is given an equal amount of CPU time. [2]
($CG/machine.slice/*/shares = 1024)
This means that smaller or less loaded guests are given an advantage.
2) All VMs combined are given 1024 shares. [3]
($CG/machine.slice/shares)
This is made even worse on RHEL7, by sched_autogroup_enabled = 0, so
every other process in the system is given the same amount of CPU as
all VMs combined.
It does not seem to be possible to tune shares and get a good general
behavior, so the best solution I can see is to disable the cpu cgroup
and let users do it when needed. (Keeping all tasks in $CG/tasks.)
Could we set each VM's shares to nr_vcpu * 1024, and the shares for
$CG/machine.slice to the sum of all VMs' shares?
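The suggestion above could be sketched roughly as follows. This is only an illustration of the arithmetic, not libvirt code: the function name, the guest names, and the VCPU counts are all made up for the example.

```python
# Illustrative only: proposed_shares and the guest names are hypothetical,
# not part of libvirt or any cgroup API.
DEFAULT_SHARES = 1024  # the value currently set on every level


def proposed_shares(vcpus_per_vm):
    """Return (per-VM shares, machine.slice shares) under the proposal.

    Each VM gets nr_vcpu * 1024; machine.slice gets the sum over all VMs.
    """
    per_vm = {name: n * DEFAULT_SHARES for name, n in vcpus_per_vm.items()}
    return per_vm, sum(per_vm.values())


per_vm, slice_shares = proposed_shares({"guest-a": 2, "guest-b": 4})
print(per_vm)        # {'guest-a': 2048, 'guest-b': 4096}
print(slice_shares)  # 6144
```

Note that under this scheme the machine.slice value would have to be recomputed whenever a guest starts, stops, or changes its VCPU count.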
Do we want cgroups in the default at all?
(Is OpenStack dealing with these quirks?)
Thanks.
---
1: machine.slice/
machine-qemu\x2d${name}.scope/
{emulator,vcpu*}/
2: To reproduce, run two guests with > 1 VCPU and execute two spinners
on the first and one on the second.
The result will be a 50%/50% CPU split between the guests; 66%/33%
seems more natural, but the current behavior could still be considered
a feature.
3: Run a guest with $n VCPUs and $n spinners in it, and $n spinners in
the host
- RHEL7: 1/($n + 1) of the CPU for the guest -- I'd expect 50%/50%.
- Upstream: 50%/50% between guest and host because of autogrouping;
if you run $n more spinners in the host, it will still be 50%/50%,
instead of seemingly more fair 33%/66%. (And you can run spinners
from different groups, so it would be the same as in RHEL7 then.)
And it also works the other way: if the host has $n CPUs, then
$n/2 tasks in the host suffice to minimize the VMs' performance,
regardless of the number of running VCPUs.
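The RHEL7 number in footnote 3 can be checked with a simplified proportional-share model. This is a sketch under stated assumptions, not a scheduler simulation: it assumes every CPU is fully contended, that each host spinner is a nice-0 task with weight 1024 competing directly with the machine.slice group, and it ignores per-CPU load balancing. The function name and the choice of n = 3 are arbitrary.

```python
from fractions import Fraction


def cpu_split(weights):
    """Return each entity's fraction of a fully contended CPU pool."""
    total = sum(weights.values())
    return {name: Fraction(w, total) for name, w in weights.items()}


n = 3  # guest VCPUs, and also the number of host spinners (footnote 3)

# RHEL7 case: machine.slice (1024) competes with n host tasks of weight 1024.
host = {f"host-spinner-{i}": 1024 for i in range(n)}
rhel7 = cpu_split({"machine.slice": 1024, **host})

# Hypothetical alternative: machine.slice weighted by the guest's VCPU count.
scaled = cpu_split({"machine.slice": n * 1024, **host})

print(rhel7["machine.slice"])   # 1/4, i.e. 1/($n + 1)
print(scaled["machine.slice"])  # 1/2
```

In this model, scaling the machine.slice weight by the VCPU count gives the 50%/50% split that footnote 3 expects.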