
On Thu, Jul 21, 2011 at 08:44:35AM -0500, Anthony Liguori wrote:
On 07/21/2011 08:34 AM, Daniel P. Berrange wrote:
On Thu, Jul 21, 2011 at 07:54:05AM -0500, Adam Litke wrote:
Added Anthony to give him the opportunity to address the finer points of this one especially with respect to the qemu IO thread(s).
This feature is really about capping the compute performance of a VM such that we get consistent top end performance. Yes, qemu has non-VCPU threads that this patch set doesn't govern, but that's the point. We are not attempting to throttle IO or device emulation with this feature. It's true that an IO-intensive guest may consume more host resources than a compute intensive guest, but they should still have equal top-end CPU performance when viewed from the guest's perspective.
I could be misunderstanding what you're trying to achieve here, so perhaps we should consider an example.
 - A machine has 4 physical CPUs
 - There are 4 guests on the machine
 - Each guest has 2 virtual CPUs
So we've overcommitted the host CPU resources 2x here.
Let's say that we want to use this feature to ensure consistent top end performance of every guest, splitting the host pCPU resources evenly across all guests, so each guest is ensured 1 pCPU worth of CPU time overall.
This patch lets you do this by assigning caps per VCPU. So in this example, each VCPU cgroup would have to be configured to cap the VCPUs at 50% of a single pCPU.
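The arithmetic behind that 50% figure can be sketched as below. This is only an illustration: the helper names are hypothetical, and the translation into a quota assumes the caps are expressed as a CFS bandwidth quota over a 100 ms period, which is an assumption rather than something stated in this mail.

```python
def per_vcpu_cap(host_pcpus, guests, vcpus_per_guest):
    """Fraction of one pCPU each vCPU may use when host CPU time
    is split evenly across guests, then evenly across each guest's vCPUs."""
    per_guest = host_pcpus / guests        # pCPUs' worth of time per guest
    return per_guest / vcpus_per_guest     # share of a single pCPU per vCPU


def cfs_quota_us(cap_fraction, period_us=100_000):
    """Translate a cap into a quota in microseconds.
    The 100 ms period is an assumed value, not taken from the patch set."""
    return int(cap_fraction * period_us)


# The example from this mail: 4 pCPUs, 4 guests, 2 vCPUs per guest.
cap = per_vcpu_cap(host_pcpus=4, guests=4, vcpus_per_guest=2)
print(cap)                # 0.5
print(cfs_quota_us(cap))  # 50000
```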
This leaves the other QEMU threads uncapped / unaccounted for. If any one guest causes non-trivial compute load in a non-VCPU thread, this can/will impact the top-end compute performance of all the other guests on the machine.
But this is not undesirable behavior. You're mixing up consistency and top end performance. They are totally different things and I think most consumers of capping really only care about the latter.
If we did caps per VM, then you could set the VM cgroup such that the VM as a whole had 100% of a single pCPU.
Consistent performance is very hard to achieve. The desire is to cap performance not just within a box, but also across multiple different boxes with potentially different versions of KVM. The I/O threads are basically hypervisor overhead. That's going to change over time.
If a guest is 100% compute bound, it can use its full 100% of a pCPU allocation in vCPU threads. If any other guest is causing CPU time in a non-VCPU thread, it cannot impact the top end compute performance of VCPU threads in the other guests.
A per-VM cap would, however, mean a guest with 2 vCPUs could have unequal scheduling, where one vCPU claimed 75% of the pCPU and the other vCPU got left with only 25%.
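To make the 75%/25% case concrete, here is a toy check of which usage patterns each kind of cap permits. It is a simplified model with hypothetical function names, not a description of how the scheduler actually enforces limits.

```python
def allowed_per_vm(usage, vm_cap):
    """A per-VM cap only bounds the sum across the guest's vCPUs."""
    return sum(usage) <= vm_cap


def allowed_per_vcpu(usage, vcpu_cap):
    """A per-vCPU cap bounds every vCPU individually."""
    return all(u <= vcpu_cap for u in usage)


skewed = [0.75, 0.25]  # fractions of one pCPU used by vCPU 0 and vCPU 1

print(allowed_per_vm(skewed, vm_cap=1.0))      # True: the skew is permitted
print(allowed_per_vcpu(skewed, vcpu_cap=0.5))  # False: vCPU 0 exceeds its cap
```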
So AFAICT, per-VM cgroups are better for ensuring top end compute performance of a guest as a whole, but per-VCPU cgroups can ensure consistent top end performance across vCPUs within a guest.
And the latter is the primary use case as far as I can tell.
If I'm a user, I'll be very confused if I have a 4 VCPU guest, run 1 instance of specint, and get X as the result. I then run 4 instances of specint and get X as the result. My expectation is to get 4X as the result because I'm running 4 instances.
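The scaling expectation in the specint example can be sketched with a toy throughput model. The function names and cap values are illustrative assumptions, and the model ignores contention, I/O threads and scheduler behaviour; it only shows how the two capping schemes differ as more vCPUs become busy.

```python
def throughput_per_vcpu_cap(busy_vcpus, cap_per_vcpu):
    """Each busy vCPU runs up to its own cap, independently of the others."""
    return busy_vcpus * cap_per_vcpu


def throughput_per_vm_cap(busy_vcpus, vm_cap, max_per_vcpu=1.0):
    """All busy vCPUs draw from one shared budget; a single vCPU
    can never use more than one pCPU."""
    return min(busy_vcpus * max_per_vcpu, vm_cap)


# 4 vCPU guest, 1 benchmark instance vs 4 instances (assumed cap values).
per_vcpu_scaling = (throughput_per_vcpu_cap(4, 0.25)
                    / throughput_per_vcpu_cap(1, 0.25))
per_vm_scaling = (throughput_per_vm_cap(4, 1.0)
                  / throughput_per_vm_cap(1, 1.0))

print(per_vcpu_scaling)  # 4.0: four instances yield 4X
print(per_vm_scaling)    # 1.0: four instances still yield X
```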
I don't see why doing limits at per-VM vs per-VCPU has an impact on performance when adding extra guests. It would certainly change behaviour if adding extra vCPUs to a guest though.

Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|