On Wed, Jan 16, 2013 at 05:06:21PM -0300, Amador Pahim wrote:
On 01/16/2013 04:30 PM, Daniel P. Berrange wrote:
>On Wed, Jan 16, 2013 at 02:15:37PM -0500, Peter Krempa wrote:
>>----- Original Message -----
>>From: Daniel P. Berrange <berrange@redhat.com>
>>To: Peter Krempa <pkrempa@redhat.com>
>>Cc: Jiri Denemark <jdenemar@redhat.com>, Amador Pahim <apahim@redhat.com>, libvirt-list@redhat.com, dougsland@redhat.com
>>Sent: Wed, 16 Jan 2013 13:39:28 -0500 (EST)
>>Subject: Re: [libvirt] [RFC] Data in the <topology> element in the capabilities XML
>>
>>On Wed, Jan 16, 2013 at 07:31:02PM +0100, Peter Krempa wrote:
>>>On 01/16/13 19:11, Daniel P. Berrange wrote:
>>>>On Wed, Jan 16, 2013 at 05:28:57PM +0100, Peter Krempa wrote:
>>>>>Hi everybody,
>>>>>
>>>>>a while ago there was a discussion about changing the data that is
>>>>>returned in the <topology> sub-element:
>>>>>
>>>>><capabilities>
>>>>>  <host>
>>>>>    <cpu>
>>>>>      <arch>x86_64</arch>
>>>>>      <model>SandyBridge</model>
>>>>>      <vendor>Intel</vendor>
>>>>>      <topology sockets='1' cores='2' threads='2'/>
>>>>>
>>>>>
>>>>>The data provided here is currently taken from the nodeinfo
>>>>>detection code and is therefore really wrong when the fallback
>>>>>mechanisms are used.
>>>>>
>>>>>To get a useful count, the user has to multiply the data by the
>>>>>number of NUMA nodes in the host. When the fallback detection code
>>>>>is used for nodeinfo, the NUMA node count used to get the CPU count
>>>>>should be 1 instead of the actual number.
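
(A worked example of that multiplication, with hypothetical numbers: for
the <topology sockets='1' cores='2' threads='2'/> shown above on a host
with 4 NUMA nodes, the element itself multiplies out to 1 x 2 x 2 = 4
logical CPUs, while the real total is 4 nodes x 4 = 16.)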
>>>>>
>>>>>As Jiri proposed, I think we should change this output to use
>>>>>separate detection code that does not take NUMA nodes into account
>>>>>and instead provides the data the way the "lscpu" command does.
>>>>>
>>>>>This change will make the data provided by the element standalone
>>>>>and also usable in guest XMLs to mirror the host's topology.
>>>>Well, there are two parts which need to be considered here: what we
>>>>report in the host capabilities, and how you configure the guest XML.
>>>>
>>>> From a historical compatibility POV I don't think we should be
>>>>changing the host capabilities at all. Simply document that 'sockets'
>>>>is treated as sockets-per-node everywhere, and that it is wrong on
>>>>machines where a socket can internally contain multiple NUMA nodes.
>>>I'm also somewhat concerned about changing this output, for
>>>historical reasons.
>>>>Apps should be using the separate NUMA <topology> data in the
>>>>capabilities instead of the CPU <topology> data, to get accurate
>>>>CPU counts.
>>> From the NUMA <topology> the management apps can't tell whether a
>>>CPU is a core or a thread. For example, oVirt/VDSM bases its
>>>decisions on this information.
>>Then, we should add information to the NUMA topology XML to indicate
>>which of the child <cpu> elements are sibling cores or threads.
>>
>>Perhaps add 'socket_id' and 'core_id' attributes to every <cpu>.
>
>>In this case, we will also need to add thread sibling and perhaps
>>even core sibling information to allow reliable detection.
>The combination of core_id/socket_id lets you determine that. If two
><cpu> elements have the same socket_id, they are cores or threads
>within the same socket. If two <cpu> elements have the same socket_id
>and core_id, they are threads within the same core.
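
As a rough illustration of that pairwise rule, a minimal Python sketch
(the (cpu_id, socket_id, core_id) tuples here are made-up input, not
any libvirt API):

# Classify two CPUs by the rule above, given (cpu_id, socket_id, core_id) tuples.
def classify(cpu_a, cpu_b):
    (_, sock_a, core_a), (_, sock_b, core_b) = cpu_a, cpu_b
    if sock_a != sock_b:
        return "different sockets"
    if core_a != core_b:
        return "separate cores within the same socket"
    return "threads within the same core"

cpus = [(0, 0, 0), (1, 0, 0), (2, 0, 1), (3, 0, 1)]
print(classify(cpus[0], cpus[1]))   # threads within the same core
print(classify(cpus[0], cpus[2]))   # separate cores within the same socket
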
Not true for the AMD Magny-Cours 6100 series, where different cores can
share the same physical_id and core_id even though they are not threads.
These processors have two NUMA nodes inside the same "package" (aka
socket), and both nodes share the same core ID set. Annoying.
I don't believe there's a problem with that. This example XML shows a
machine with 4 NUMA nodes and 2 sockets; each socket spans 2 NUMA nodes,
and each node contains 2 cores with 2 threads per core, giving 16
logical CPUs:
<topology>
  <cells num='4'>
    <cell id='0'>
      <cpus num='4'>
        <cpu id='0' socket_id='0' core_id='0'/>
        <cpu id='1' socket_id='0' core_id='0'/>
        <cpu id='2' socket_id='0' core_id='1'/>
        <cpu id='3' socket_id='0' core_id='1'/>
      </cpus>
    </cell>
    <cell id='1'>
      <cpus num='4'>
        <cpu id='4' socket_id='0' core_id='0'/>
        <cpu id='5' socket_id='0' core_id='0'/>
        <cpu id='6' socket_id='0' core_id='1'/>
        <cpu id='7' socket_id='0' core_id='1'/>
      </cpus>
    </cell>
    <cell id='2'>
      <cpus num='4'>
        <cpu id='8' socket_id='1' core_id='0'/>
        <cpu id='9' socket_id='1' core_id='0'/>
        <cpu id='10' socket_id='1' core_id='1'/>
        <cpu id='11' socket_id='1' core_id='1'/>
      </cpus>
    </cell>
    <cell id='3'>
      <cpus num='4'>
        <cpu id='12' socket_id='1' core_id='0'/>
        <cpu id='13' socket_id='1' core_id='0'/>
        <cpu id='14' socket_id='1' core_id='1'/>
        <cpu id='15' socket_id='1' core_id='1'/>
      </cpus>
    </cell>
  </cells>
</topology>
I believe there's enough info there to determine all the co-location
aspects of all the sockets/cores/threads involved.
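
To make that concrete, here is a rough Python sketch (not anything
libvirt ships) of how a management app might derive the sibling sets
from such XML; keying the thread groups on the containing <cell> as well
as socket_id/core_id is an assumption about how the repeated
Magny-Cours-style core IDs would be told apart:

import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed copy of the example above (first cell only) so the sketch
# runs standalone; a real app would read the full capabilities XML.
XML = """
<topology>
  <cells num='1'>
    <cell id='0'>
      <cpus num='4'>
        <cpu id='0' socket_id='0' core_id='0'/>
        <cpu id='1' socket_id='0' core_id='0'/>
        <cpu id='2' socket_id='0' core_id='1'/>
        <cpu id='3' socket_id='0' core_id='1'/>
      </cpus>
    </cell>
  </cells>
</topology>
"""

topo = ET.fromstring(XML)
threads = defaultdict(list)   # (cell, socket_id, core_id) -> thread sibling CPU ids
sockets = defaultdict(set)    # socket_id -> all logical CPU ids in that socket

for cell in topo.findall('./cells/cell'):
    for cpu in cell.findall('./cpus/cpu'):
        key = (cell.get('id'), cpu.get('socket_id'), cpu.get('core_id'))
        threads[key].append(cpu.get('id'))
        sockets[cpu.get('socket_id')].add(cpu.get('id'))

print(dict(threads))   # {('0', '0', '0'): ['0', '1'], ('0', '0', '1'): ['2', '3']}
print(dict(sockets))   # {'0': {'0', '1', '2', '3'}}

With the full 4-cell XML above, socket '0' would come out with 8 logical
CPUs spread across cells 0 and 1, even though both cells reuse core IDs
0 and 1.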
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|