On Tue, Jun 12, 2012 at 04:50:37PM +0400, Andrey Korolyov wrote:
You're partially right: hardware RAID does offload the CPU, but it still
needs a number of interrupts proportional to the load, and the same goes
for server NICs with hardware queues. They help move data outside the
CPU, so the CPU spends less time servicing peripherals, but not that much
less. In my experience, any VM (Xen or QEMU) should be pinned to a
different set of cores than those pointed to by smp_affinity for the
rx/tx queues and other peripherals. If not, then depending on overall
system load you may even get a freeze (my experience with a crappy LSI
MegaSAS controller back in the 2.6.18 days), and almost every time a
network benchmark loses a large portion of its throughput. Another point
when doing such tuning: use your NUMA topology to get the best
performance, e.g. do NOT pin NIC interrupts to cores on the neighbouring
socket, and do not assign ten cores of one VM to six real cores mixed
between two NUMA nodes.
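
For anyone who wants to check this on their own host, the relevant facts
are all exposed through procfs/sysfs. Below is a minimal sketch, assuming
a Linux host; the interface name "eth0" is only a placeholder, and the
script only reports what it finds, it changes nothing:

#!/usr/bin/env python
# Minimal sketch: show which NUMA node a NIC sits on, which CPUs belong
# to each node, and which CPUs its interrupts are currently allowed on,
# so guest vCPUs can be pinned to other cores on the same node.
# Assumes a Linux host with procfs/sysfs; "eth0" is only a placeholder.
import glob, os, sys

nic = sys.argv[1] if len(sys.argv) > 1 else "eth0"

# NUMA node of the NIC's PCI device (-1 means unknown / not NUMA-aware)
node_file = "/sys/class/net/%s/device/numa_node" % nic
if os.path.exists(node_file):
    print("%s is on NUMA node %s" % (nic, open(node_file).read().strip()))

# CPUs belonging to each NUMA node
for node in sorted(glob.glob("/sys/devices/system/node/node[0-9]*")):
    cpus = open(os.path.join(node, "cpulist")).read().strip()
    print("%s: cpus %s" % (os.path.basename(node), cpus))

# IRQs whose action name mentions the NIC (e.g. eth0-rx-0) and their masks
for line in open("/proc/interrupts"):
    if nic in line:
        irq = line.split(":")[0].strip()
        if irq.isdigit():
            mask = open("/proc/irq/%s/smp_affinity" % irq).read().strip()
            print("irq %s (%s): smp_affinity=%s"
                  % (irq, line.split()[-1], mask))

If the masks returned for the NIC's queues overlap the cores a guest is
pinned to, that is the situation described above.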
Thanks Andrey. That gives me a lot of things to look up. Not that I'm asking
for answers here; just noting it in case anyone does get around to writing
comprehensive docs on this. How do I learn the NUMA topology? Where do I see
where smp_affinity points? With VMs each configured to have only one virtual
core (as they are in my case), to what degree are these complications less in
play? What is the method for pinning NIC interrupts? Again, I'm not suggesting
such answers belong on this list. But if anyone were to write a book on this
stuff, those of us coming to virtualized systems from physical ones won't
necessarily have ever been concerned with such questions. The existing
introductory texts on libvirt, KVM and QEMU don't, IIRC, mention these
things.
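
For the archives: numactl --hardware (or lscpu) prints the node-to-CPU
layout, /proc/interrupts shows which IRQs fire on which CPUs, and each
/proc/irq/<N>/smp_affinity file holds the CPU mask an interrupt is
allowed to run on; pinning an interrupt is just writing a new mask there.
A minimal sketch of that step, assuming a Linux host, run as root, with
placeholder IRQ and CPU numbers taken from /proc/interrupts:

#!/usr/bin/env python
# Minimal sketch of pinning one interrupt: write a hex CPU mask to
# /proc/irq/<N>/smp_affinity (needs root). The IRQ and CPU numbers are
# placeholders; take the real ones from /proc/interrupts on your host.
import sys

irq = int(sys.argv[1])    # e.g. the NIC's rx queue interrupt
cpu = int(sys.argv[2])    # e.g. a core on the same NUMA node as the NIC

mask = 1 << cpu           # smp_affinity is a bitmask of allowed CPUs
with open("/proc/irq/%d/smp_affinity" % irq, "w") as f:
    f.write("%x\n" % mask)
print("irq %d restricted to cpu %d (mask %x)" % (irq, cpu, mask))

The guest side is the same idea: for a single-vCPU domain,
virsh vcpupin <domain> 0 <cpu> (or a <vcpupin> element under <cputune>
in the domain XML) keeps vCPU 0 on a core of your choosing.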
Best,
Whit