On Wed, Jun 13, 2007 at 10:40:40AM -0500, Ryan Harper wrote:
Hello all,
Hello Ryan,
I wanted to start a discussion on how we might get libvirt to be able to
probe the NUMA topology of Xen and Linux (for QEMU/KVM). In Xen, I've
recently posted patches for exporting topology into the [1]physinfo
hypercall, as well as adding a [2]hypercall to probe the Xen heap. I
believe the topology and memory info is already available in Linux.
With these, we have enough information to be able to write some simple
policy above libvirt that can create guests in a NUMA-aware fashion.
I'd like to suggest the following for discussion:
(1) A function to discover topology
(2) A function to check available memory
(3) Specifying which cpus to use prior to domain start
Okay, but let's start by defining the scope a bit. Historically, NUMA
has explored various paths, and I assume we are going to work within a rather
small subset of what NUMA (Non-Uniform Memory Access) has meant over time.
I assume the following; tell me if I'm wrong:
- we are just considering memory and processor affinity
- the topology, i.e. the affinity between the processors and the various
memory areas, is fixed and the kind of mapping is rather simple
To get into more specifics:
- we will need to expand the model of libvirt
http://libvirt.org/intro.html
to split the Node resources into separate sets containing processors
and memory areas which are highly connected together (assuming the
model is a simple partition of the resources between the equivalent
of sub-Nodes)
- the function (2) would for a given processor tell how much of its memory
is already allocated (to existing running or paused domains)
Right? Is the partition model sufficient for the architectures?
If yes, then we will need a new definition and terminology for those sub-Nodes.
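To make the partition model concrete, here is a rough sketch of how the sub-Nodes and function (2) could look. Everything in it is an assumption made for illustration: the name `Cell`, its fields, the kilobyte units and the helper names are hypothetical and are not part of libvirt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical sub-Node: a set of processors plus the memory close to them."""
    cell_id: int
    cpus: frozenset      # physical CPU numbers belonging to this cell
    total_kb: int        # memory attached to this cell
    allocated_kb: int    # memory already given to running or paused domains

    @property
    def free_kb(self):
        return self.total_kb - self.allocated_kb


def is_partition(cells, all_cpus):
    """True if every CPU of the Node belongs to exactly one cell."""
    seen = set()
    for cell in cells:
        if seen & cell.cpus:      # a CPU listed in two cells breaks the model
            return False
        seen |= cell.cpus
    return seen == set(all_cpus)


def memory_for_cpu(cells, cpu):
    """Sketch of function (2): (allocated, free) memory of the cell owning `cpu`."""
    for cell in cells:
        if cpu in cell.cpus:
            return cell.allocated_kb, cell.free_kb
    raise ValueError(f"CPU {cpu} is not in any cell")


# Made-up example Node: 4 CPUs split into two cells of 4 GB each.
cells = [
    Cell(0, frozenset({0, 1}), total_kb=4194304, allocated_kb=1048576),
    Cell(1, frozenset({2, 3}), total_kb=4194304, allocated_kb=3145728),
]
```

With this example, `memory_for_cpu(cells, 2)` reports the numbers for cell 1. If the simple partition assumption does not hold on some architecture, `is_partition` is the check that would fail.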
For (3), we already have support for pinning the domain's virtual CPUs to
physical CPUs, but I guess that's not sufficient because you want this to be
activated from the definition of the domain:
http://libvirt.org/html/libvirt-libvirt.html#virDomainPinVcpu
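As a reminder of what the existing pinning call takes: virDomainPinVcpu() expects a byte array bitmap (cpumap) plus its length, with one bit per physical CPU, least significant bit first within each byte. The sketch below only builds such a bitmap and does not call libvirt; the helper names `cpumap_len` and `build_cpumap` are mine, chosen to mirror the VIR_CPU_MAPLEN and VIR_USE_CPU macros from the C header.

```python
def cpumap_len(ncpus):
    """Bytes needed to hold one bit per physical CPU (mirrors VIR_CPU_MAPLEN)."""
    return (ncpus + 7) // 8


def build_cpumap(cpus, ncpus):
    """Return a bitmap with one bit set for each physical CPU in `cpus`."""
    cpumap = bytearray(cpumap_len(ncpus))
    for cpu in cpus:
        if not 0 <= cpu < ncpus:
            raise ValueError(f"CPU {cpu} is outside 0..{ncpus - 1}")
        # Same bit layout as VIR_USE_CPU: byte cpu/8, bit cpu%8.
        cpumap[cpu // 8] |= 1 << (cpu % 8)
    return bytes(cpumap)
```

For instance, on a 16-CPU host, `build_cpumap({2, 3}, 16)` sets bits 2 and 3 of the first byte and leaves the second byte empty.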
So the XML format would have to be extended to allow specifying the subset
of processors the domain is supposed to start on:
http://libvirt.org/format.html
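One possible shape for that extension, purely as a strawman: an extra attribute on the `<vcpu>` element listing the allowed physical CPUs. The attribute name `cpuset` and the "0-1,4" range syntax are assumptions for the sake of discussion, not something the current format defines. A minimal parser for it could look like this:

```python
import xml.etree.ElementTree as ET


def parse_cpuset(spec):
    """Expand a string such as '0-1,4' into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError(f"bad range: {part}")
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(part))
    return cpus


def requested_cpus(domain_xml):
    """Return the CPU set asked for in the domain XML, or None if unspecified."""
    vcpu = ET.fromstring(domain_xml).find("vcpu")
    if vcpu is None or vcpu.get("cpuset") is None:
        return None   # nothing specified: placement is left to the hypervisor
    return parse_cpuset(vcpu.get("cpuset"))


# Hypothetical domain description using the strawman attribute.
DOMAIN_XML = """
<domain type='xen'>
  <name>guest1</name>
  <memory>524288</memory>
  <vcpu cpuset='0-1,4'>2</vcpu>
</domain>
"""
```

Here `requested_cpus(DOMAIN_XML)` yields CPUs 0, 1 and 4, while a description without the attribute yields `None`, so existing domain definitions keep their current meaning.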
I would assume that if nothing is specified, the underlying Hypervisor
(in libvirt terminology; in practice that could be a Linux kernel) will
by default try to do the optimal placement by itself, i.e. (3) is only
useful if you want to override the default behaviour.
Please correct me if I'm wrong,
Daniel
--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard       | virtualization library http://libvirt.org/
veillard(a)redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/  | Rpmfind RPM search engine http://rpmfind.net/