
----- Original Message -----
From: "Radim Krčmář" <rkrcmar@redhat.com>
To: libvir-list@redhat.com
Cc: "Daniel P. Berrange" <berrange@redhat.com>, "Andrew Theurer" <atheurer@redhat.com>
Sent: Thursday, August 14, 2014 9:25:05 AM
Subject: Suboptimal default cpu Cgroup

Hello,
by default, libvirt with KVM creates a Cgroup hierarchy in 'cpu,cpuacct' [1], with 'shares' set to 1024 on every level. This raises two points:
1) Every VM is given an equal amount of CPU time, regardless of its size. [2] ($CG/machine.slice/*/shares = 1024)
This means that smaller or less loaded guests are given an advantage.
2) All VMs combined are given 1024 shares. [3] ($CG/machine.slice/shares)
This is made even worse on RHEL7, where sched_autogroup_enabled = 0, so every other process in the system is given the same amount of CPU as all VMs combined.
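To check the values described in the two points above on a given host, a minimal sketch along these lines could be used. It assumes cgroup v1 with the cpu controller mounted at /sys/fs/cgroup/cpu,cpuacct (standing in for $CG) and reads the standard cpu.shares file at each level; the function name read_shares is only illustrative.

```python
from pathlib import Path

# Assumption: cgroup v1 with the cpu controller mounted here (this plays the role of $CG).
DEFAULT_CG = Path("/sys/fs/cgroup/cpu,cpuacct")


def read_shares(cg_root=DEFAULT_CG, slice_name="machine.slice"):
    """Return {relative cgroup path: cpu.shares value} for the slice and everything below it."""
    cg_root = Path(cg_root)
    slice_dir = cg_root / slice_name
    if not slice_dir.is_dir():
        # No such slice on this host (or a different cgroup layout).
        return {}
    result = {}
    # rglob also matches the slice's own cpu.shares and those of nested groups
    # such as the per-VM scope and its emulator/vcpu* children.
    for shares_file in sorted(slice_dir.rglob("cpu.shares")):
        rel = shares_file.parent.relative_to(cg_root)
        result[str(rel)] = int(shares_file.read_text().strip())
    return result


if __name__ == "__main__":
    for path, shares in read_shares().items():
        print(f"{shares:6d}  {path}")
```

With the defaults described above, every printed line would be expected to show 1024.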
It does not seem to be possible to tune shares and get good general behavior, so the best solution I can see is to disable the cpu cgroup and let users enable it when needed. (Keeping all tasks in $CG/tasks.)
Could we have each VM's shares be nr_vcpu * 1024, and the shares for $CG/machine.slice be the sum of all VMs' shares?
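As a rough illustration of that suggestion, the arithmetic could look like the sketch below. The VM names and VCPU counts are made up, and proposed_shares is a hypothetical helper, not something libvirt provides.

```python
BASE_SHARES = 1024  # the value currently used on every level


def proposed_shares(vcpus_per_vm):
    """Return ({vm: nr_vcpu * 1024}, sum of all VM shares for machine.slice)."""
    per_vm = {name: nr_vcpu * BASE_SHARES for name, nr_vcpu in vcpus_per_vm.items()}
    slice_total = sum(per_vm.values())
    return per_vm, slice_total


# Hypothetical guests: one with 4 VCPUs, one with 1 VCPU.
per_vm, slice_total = proposed_shares({"vm-a": 4, "vm-b": 1})
print(per_vm)       # {'vm-a': 4096, 'vm-b': 1024}
print(slice_total)  # 5120
```

Under this scheme the weight of machine.slice would grow with the number of VCPUs configured, instead of staying at 1024 no matter how many guests run.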
Do we want cgroups in the default configuration at all? (Is OpenStack dealing with these quirks?)
Thanks.
---
1: machine.slice/
     machine-qemu\\x2d${name}.scope/
       {emulator,vcpu*}/
2: To reproduce, run two guests with more than one VCPU each, then execute two spinners in the first and one in the second. The result will be a 50%/50% CPU assignment between the guests; 66%/33% seems more natural, but it could still be considered a feature.
3: Run a guest with $n VCPUs and $n spinners in it, and $n spinners in the host.
   - RHEL7: the guest gets 1/($n + 1) of the CPU; I'd expect 50%/50%.
   - Upstream: 50%/50% between guest and host because of autogrouping; if you run $n more spinners in the host, it will still be 50%/50%, instead of the seemingly fairer 33%/66%. (And you can run spinners from different groups, so it would be the same as in RHEL7 then.)
   And it also works the other way: if the host has $n CPUs, then $n/2 tasks in the host suffice to minimize the VMs' performance, regardless of the number of running VCPUs.
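The numbers in footnotes 2 and 3 can be reproduced with a simplified model of weighted sharing, sketched below. It assumes that all listed entities are CPU-bound and compete at the same level for the same CPUs, and that a nice-0 task sitting directly in the root group carries the same weight (1024) as a group with 1024 shares. The helper names are illustrative only, and the real scheduler is more involved than this.

```python
from fractions import Fraction


def split(weights):
    """Divide CPU time among sibling entities in proportion to their weights."""
    total = sum(weights.values())
    return {name: Fraction(w, total) for name, w in weights.items()}


# Footnote 2: both guests have 1024 shares, so the split ignores their load.
by_group = split({"guest1": 1024, "guest2": 1024})
# What a per-spinner split (two spinners vs. one) would give instead.
by_task = split({"guest1": 2 * 1024, "guest2": 1 * 1024})


def guest_fraction_no_autogroup(n):
    """Footnote 3, RHEL7 case: n host spinners in the root group vs. machine.slice."""
    weights = {f"host-spinner-{i}": 1024 for i in range(n)}
    weights["machine.slice"] = 1024
    return split(weights)["machine.slice"]


def guest_fraction_autogroup():
    """Footnote 3, upstream case: all host spinners share one autogroup."""
    return split({"autogroup": 1024, "machine.slice": 1024})["machine.slice"]


print(by_group["guest1"], by_task["guest1"])  # 1/2 2/3
print(guest_fraction_no_autogroup(4))         # 1/5
print(guest_fraction_autogroup())             # 1/2
```

In this model the guest's fraction without autogrouping is 1/($n + 1), while with autogrouping it stays at 1/2 however many spinners the single host group contains.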