Hi,
Lately we’ve been puzzled by the nodeinfo returned for AMD 63xx based platforms, so I
checked out libvirt from Git and looked at the test cases in tests/nodeinfodata.
Currently you have 3 AMD test cases there:
linux-x86_64-test3 :
AMD 6172 2.1 GHz (MCM, 12 cores per package/socket, 2 numa nodes per package/socket, 1
thread/core)
4 Sockets, 8 NUMA nodes, 48 CPU cores total.
linux-x86_64-test7 :
AMD 6174 2.2 GHz (MCM, 12 cores per package/socket, 2 numa nodes per package/socket, 1
thread/core)
2 Sockets, 4 NUMA nodes, 24 CPU cores total.
linux-x86_64-test8 :
AMD 6282 SE 2.6 GHz (MCM, 8 CU(core) per package/socket, 2 numa nodes per package/socket,
2 threads/CU(core))
4 Sockets, 8 NUMA nodes, 64 CPU cores total.
However, I believe the “expected” output for each of these is wrong:
% ~/libvirt/tests/nodeinfodata$ cat linux-x86_64-test{3,7,8}.expected
CPUs: 48/48, MHz: 2100, Nodes: 8, Sockets: 1, Cores: 6, Threads: 1
CPUs: 24/24, MHz: 2200, Nodes: 1, Sockets: 1, Cores: 24, Threads: 1
CPUs: 64/64, MHz: 2593, Nodes: 1, Sockets: 1, Cores: 64, Threads: 1
In my opinion it should have been :
CPUs: 48/48, MHz: 2100, Nodes: 8, Sockets: 4, Cores: 12, Threads: 1
CPUs: 24/24, MHz: 2200, Nodes: 4, Sockets: 2, Cores: 12, Threads: 1
CPUs: 64/64, MHz: 2593, Nodes: 8, Sockets: 4, Cores: 8, Threads: 2
At least that’s more in line with what tools like “lscpu” and “lstopo” would report.
For example, I have a dual-socket AMD 6376 based system. lscpu reports :
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 4
Vendor ID: AuthenticAMD
CPU family: 21
Model: 2
Stepping: 0
CPU MHz: 2299.820
BogoMIPS: 4600.03
Virtualization: AMD-V
L1d cache: 16K
L1i cache: 64K
L2 cache: 2048K
L3 cache: 6144K
NUMA node0 CPU(s): 0-7
NUMA node1 CPU(s): 8-15
NUMA node2 CPU(s): 16-23
NUMA node3 CPU(s): 24-31
virsh nodeinfo, however, reports:
CPU model: x86_64
CPU(s): 32
CPU frequency: 2299 MHz
CPU socket(s): 1
Core(s) per socket: 32
Thread(s) per core: 1
NUMA cell(s): 1
Memory size: 49448656 KiB
Looking more closely at the code in src/nodeinfo.c:linuxNodeInfoCPUPopulate(), I believe it
makes some false assumptions.
When iterating over the NUMA nodes, it expects that a NUMA node can contain more than one
socket, whereas on AMD systems at least it’s the other way around (a socket can contain more
than one NUMA node). Hence, the topology check at the end fails and libvirt falls back to
a flat topology:
    if ((nodeinfo->nodes *
         nodeinfo->sockets *
         nodeinfo->cores *
         nodeinfo->threads) != (nodeinfo->cpus + offline)) {
        nodeinfo->nodes = 1;
        nodeinfo->sockets = 1;
        nodeinfo->cores = nodeinfo->cpus + offline;
        nodeinfo->threads = 1;
    }
If instead the “fallback” mechanism, which iterates over
/sys/devices/system/cpu/cpu%d to find sockets/cores/threads, is used *and* you leave
nodeinfo->nodes out of the equation, then it all makes sense. Iterating over the entries in
/sys/devices/system/node/node%d would then only be used to find nodeinfo->nodes.
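To make the suggestion concrete, here is a minimal sketch of the check with nodeinfo->nodes
left out of the product. Note that struct topo and the two function names are hypothetical
stand-ins of mine, not libvirt code. Leaving the node count untouched when falling back to
the flat topology is also just my assumption about what would be sensible.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant nodeinfo fields; the field
 * names mirror the ones used in the snippet quoted above. */
struct topo {
    unsigned int cpus;     /* online CPUs */
    unsigned int nodes;    /* NUMA nodes, counted separately from sysfs */
    unsigned int sockets;  /* sockets in the whole machine */
    unsigned int cores;    /* cores per socket */
    unsigned int threads;  /* threads per core */
};

/* Proposed check: the NUMA node count is not part of the product. */
static bool topo_is_consistent(const struct topo *t, unsigned int offline)
{
    return t->sockets * t->cores * t->threads == t->cpus + offline;
}

/* Fall back to a flat topology only when the CPU-derived values do not
 * add up. Assumption: nodes is kept as counted, since it is independent. */
static void topo_validate(struct topo *t, unsigned int offline)
{
    if (!topo_is_consistent(t, offline)) {
        t->sockets = 1;
        t->cores = t->cpus + offline;
        t->threads = 1;
    }
}
```

With the values lscpu reports for my dual-socket 6376 system (2 sockets, 8 cores, 2 threads,
32 CPUs) this check passes, and so do the three expected lines I proposed above
(4*12*1 = 48, 2*12*1 = 24, 4*8*2 = 64).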
I have limited insight into how this would work on say an S390 system, so if my
assumptions here are completely wrong please do tell :)
Cheers,
--
Steffen Persvold
Chief Architect NumaChip, Numascale AS
Tel: +47 23 16 71 88 Fax: +47 23 16 71 80 Skype: spersvold