On Fri, 2019-02-08 at 14:14 +0100, Jiri Denemark wrote:
> On Fri, Feb 08, 2019 at 11:03:34 -0200, Daniel Henrique Barboza wrote:
>> Some devices creates empty (= cpu-less) NUMA nodes to host
>> its memory. This results in topologies where the following
>> sanity rule does not apply as is:
>>
>> nodes * sockets * cores * threads = total_cpus
>>
>> As a result, a call to 'virsh nodeinfo' will return the default
>> value (1) to nodes, sockets and threads, while cores defaults
>> to the total_cpus value. For example, in a Power9 host that has
>> 160 total cpus, 4 cpu-less NUMA nodes, 2 populated NUMA nodes,
>> 1 socket per populated node, 20 cores per socket and 4 threads
>> per socket:
>>
>> $ virsh nodeinfo
>> CPU model: ppc64le
>> CPU(s): 160
>> CPU frequency: 3783 MHz
>> CPU socket(s): 1
>> Core(s) per socket: 160
>> Thread(s) per core: 1
>> NUMA cell(s): 1
>> Memory size: 535981376 KiB
>>
>> This patch adjusts virHostCPUGetInfoPopulateLinux to count the
>> cpu-less NUMA nodes and discard them in the sanity rule, changing
>> it to:
>>
>> (nodes - empty_nodes) * sockets * cores * threads = total_cpus
>>
>> And with this new rule, virsh nodeinfo will return the
>> appropriate info for those topologies, without changing the
>> behavior for any other scenario it was previously working.
>>
>> This is the resulting output of nodeinfo after this patch in the
>> same Power9 host mentioned above:
>>
>> $ virsh nodeinfo
>> CPU model: ppc64le
>> CPU(s): 160
>> CPU frequency: 3783 MHz
>> CPU socket(s): 1
>> Core(s) per socket: 20
>> Thread(s) per core: 4
>> NUMA cell(s): 6
>> Memory size: 535981376 KiB
> This change would break backward compatibility as we have the following
> public macro in libvirt-host.h:
>
> # define VIR_NODEINFO_MAXCPUS(nodeinfo) \
> ((nodeinfo).nodes*(nodeinfo).sockets*(nodeinfo).cores*(nodeinfo).threads)
>
> Anyway, the virNodeInfo structure is just not flexible enough to deal
> with all possible topologies and users are advised to look at the host
> capabilities XML to get a proper view of the host CPU topology.
Correct, and it's even documented as such:
https://libvirt.org/html/libvirt-libvirt-host.html#virNodeInfo
So, as long as the number of CPUs is reported correctly (which it
seems to be) and 'virsh capabilities' doesn't report incorrect
information in the <topology> element, then there's nothing to be
fixed - except for applications that still use virNodeInfo ;)
Fair enough. Guess it's better to leave this one alone then.
ps: is virNodeInfo somewhat deprecated for showing CPU topology then?
Thanks,
DHB