From: Wim ten Have <wim.ten.have(a)oracle.com>
This patch extends guest domain administration adding support to
advertise node sibling distances when configuring NUMA guests also
referred to as vNUMA (Virtual NUMA).
NUMA (Non-Uniform Memory Access), a method of configuring a cluster
of nodes within a single multiprocessing system such that it shares
processor local memory amongst others improving performance and the
ability of the system to be expanded.
A NUMA system could be illustrated as shown below. Within this 4-NODE
system, every socket is equipped with its own distinct memory and some
with I/O. Access to memory or I/O on remote nodes is only possible
through the "Interconnect". This results in different performance for
local and remote resources.
In contrast to NUMA we recognize the flat SMP system where no concept
of local or remote resource exists. The disadvantage of high socket
count SMP systems is that the shared bus can easily become a performance
bottleneck under high activity.
+-------------+-------+ +-------+-------------+
|NODE0| | | | | |NODE3|
| | CPU00 | CPU03 | | CPU12 | CPU15 | |
| | | | | | | |
| Mem +--- Socket0 ---<-------->--- Socket3 ---+ Mem |
| | | | | | | |
+-----+ CPU01 | CPU02 | | CPU13 | CPU14 | |
| I/O | | | | | | |
+-----+-------^-------+ +-------^-------+-----+
| |
| Interconnect |
| |
+-------------v-------+ +-------v-------------+
|NODE1| | | | | |NODE2|
| | CPU04 | CPU07 | | CPU08 | CPU11 | |
| | | | | | | |
| Mem +--- Socket1 ---<-------->--- Socket2 ---+ Mem |
| | | | | | | |
+-----+ CPU05 | CPU06 | | CPU09 | CPU10 | |
| I/O | | | | | | |
+-----+-------+-------+ +-------+-------+-----+
NUMA adds an intermediate level of memory shared amongst a few cores
per socket as illustrated above, so that data accesses do not have to
travel over a single bus.
Unfortunately the way NUMA does this adds its own limitations. This,
as visualized in the illustration above, happens when data is stored in
memory associated with Socket2 and is accessed by a CPU (core) in Socket0.
The processors use the "Interconnect" path to access resource on other
nodes. These "Interconnect" hops add data access delays. It is therefore
in our interest to describe the relative distances between nodes.
The relative distances between nodes are described in the system's SLIT
(System Locality Distance Information Table) which is part of the ACPI
(Advanced Configuration and Power Interface) specification.
On Linux systems the SLIT detail can be listed with help of the
'numactl -H' command. The above guest would show the following output.
[root@f25 ~]# numactl -H
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3
node 0 size: 2007 MB
node 0 free: 1931 MB
node 1 cpus: 4 5 6 7
node 1 size: 1951 MB
node 1 free: 1902 MB
node 2 cpus: 8 9 10 11
node 2 size: 1998 MB
node 2 free: 1910 MB
node 3 cpus: 12 13 14 15
node 3 size: 2015 MB
node 3 free: 1907 MB
node distances:
node 0 1 2 3
0: 10 21 31 21
1: 21 10 21 31
2: 31 21 10 21
3: 21 31 21 10
These patches extend core libvirt's XML description of NUMA cells to
include NUMA distance information and propagate it to Xen guests via
libxl. Recently qemu landed support for constructing the SLIT since
commit 0f203430dd ("numa: Allow setting NUMA distance for different NUMA
nodes"). The core libvirt extensions in this patch set could be used to
propagate NUMA distances to qemu quests in the future.
Wim ten Have (4):
numa: describe siblings distances within cells
xenconfig: add domxml conversions for xen-xl
libxl: vnuma support
xlconfigtest: add tests for numa cell sibling distances
docs/formatdomain.html.in | 63 +++-
docs/schemas/basictypes.rng | 7 +
docs/schemas/cputypes.rng | 18 ++
src/conf/numa_conf.c | 359 ++++++++++++++++++++-
src/conf/numa_conf.h | 20 ++
src/libvirt_private.syms | 5 +
src/libxl/libxl_conf.c | 121 +++++++
src/libxl/libxl_driver.c | 3 +-
src/xenconfig/xen_xl.c | 333 +++++++++++++++++++
.../test-fullvirt-vnuma-autocomplete.cfg | 26 ++
.../test-fullvirt-vnuma-autocomplete.xml | 85 +++++
.../test-fullvirt-vnuma-nodistances.cfg | 26 ++
.../test-fullvirt-vnuma-nodistances.xml | 53 +++
.../test-fullvirt-vnuma-partialdist.cfg | 26 ++
.../test-fullvirt-vnuma-partialdist.xml | 60 ++++
tests/xlconfigdata/test-fullvirt-vnuma.cfg | 26 ++
tests/xlconfigdata/test-fullvirt-vnuma.xml | 81 +++++
tests/xlconfigtest.c | 6 +
18 files changed, 1313 insertions(+), 5 deletions(-)
create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-autocomplete.cfg
create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-autocomplete.xml
create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-nodistances.cfg
create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-nodistances.xml
create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-partialdist.cfg
create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-partialdist.xml
create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma.cfg
create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma.xml
--
2.13.6