I think numad will probably work best with just the #vcpus and the #MBs
of memory in the guest as the requested job size parameters. Sorry for
lack of clarity here...
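To make that concrete, here is a rough sketch of how the guest's own numbers could be turned into the advisory query. The helper name is made up for illustration, and it assumes the guest memory is given in KiB (as in the <memory>524288</memory> example quoted below) and that numad wants MB, so treat it as a starting point rather than what libvirt should literally do:

```python
def numad_advisory_args(vcpus, memory_kib):
    """Build argv for the advisory query "numad -w ncpus:memory_amount".

    Hypothetical helper: assumes memory_kib is the guest memory in KiB
    and that numad expects the memory amount in MB.
    """
    if vcpus < 1 or memory_kib < 1:
        raise ValueError("vcpus and memory_kib must be positive")
    # Round up to whole MB so a tiny guest never asks for 0 MB.
    memory_mb = -(-memory_kib // 1024)
    return ["numad", "-w", f"{vcpus}:{memory_mb}"]


# A guest with 4 vCPUs and 524288 KiB of memory.
print(numad_advisory_args(4, 524288))
```

The resulting list could be handed to whatever runs the command. Nothing here parses the reply from numad.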
Numad should work -- pending bugs -- with any numbers passed. If the
requested parameters are bigger than actual physical resources
available, numad is supposed to just return all the nodes in the system
-- so the effective recommendation in that case would be "use the entire
system".
If the requested resources are a subset of the system, numad is supposed
to return a recommended subset of the system nodes to use for the
process -- based on the current amount of free memory and idle CPUs on
the various nodes.
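As a toy model of the two cases above (this is NOT numad's actual algorithm, just an illustration of the intended behavior), assume each node reports its idle CPUs and free MB, and treat the sum of those as the capacity of the system. The Node type, the recommend_nodes name, and the greedy ordering are all assumptions made for the example:

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: int
    idle_cpus: int
    free_mb: int


def recommend_nodes(nodes, ncpus, memory_mb):
    """Return a sorted list of node IDs recommended for the request."""
    total_cpus = sum(n.idle_cpus for n in nodes)
    total_mb = sum(n.free_mb for n in nodes)
    # Request is bigger than the system: "use the entire system".
    if ncpus > total_cpus or memory_mb > total_mb:
        return sorted(n.node_id for n in nodes)
    # Otherwise take the nodes with the most idle CPUs (then free memory)
    # first, and stop as soon as the request is covered.
    ranked = sorted(nodes, key=lambda n: (n.idle_cpus, n.free_mb), reverse=True)
    chosen, cpus, mb = [], 0, 0
    for n in ranked:
        chosen.append(n.node_id)
        cpus += n.idle_cpus
        mb += n.free_mb
        if cpus >= ncpus and mb >= memory_mb:
            break
    return sorted(chosen)


nodes = [Node(0, 2, 1024), Node(1, 8, 8192), Node(2, 4, 2048)]
print(recommend_nodes(nodes, 4, 512))   # request fits within the system
print(recommend_nodes(nodes, 64, 512))  # request exceeds the system
```

The point is only that an oversized request degrades to "all nodes" instead of failing, while a small request gets a subset chosen from the currently least-loaded nodes.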
On 03/01/2012 02:31 PM, Dave Allan wrote:
On Wed, Feb 29, 2012 at 06:29:55AM -0500, Bill Burns wrote:
> On 02/28/2012 11:34 PM, Osier Yang wrote:
>> On 02/29/2012 12:40 AM, Daniel P. Berrange wrote:
>>> On Tue, Feb 28, 2012 at 11:33:03AM -0500, Dave Allan wrote:
>>>> On Tue, Feb 28, 2012 at 10:10:50PM +0800, Osier Yang wrote:
>>>>> numad is a user-level daemon that monitors NUMA topology and
>>>>> processes resource consumption to facilitate good NUMA resource
>>>>> alignment of applications/virtual machines to improve performance
>>>>> and minimize cost of remote memory latencies. It provides a
>>>>> pre-placement advisory interface, so significant processes can
>>>>> be pre-bound to nodes with sufficient available resources.
>>>>>
>>>>> More details: http://fedoraproject.org/wiki/Features/numad
>>>>>
>>>>> "numad -w ncpus:memory_amount" is the advisory interface numad
>>>>> provides currently.
>>>>>
>>>>> This patch adds the support by introducing new XML like:
>>>>> <numatune>
>>>>> <cpu required_cpus="4" required_memory="524288"/>
>>>>> </numatune>
>>>>
>>>> Isn't the usual case going to be the vcpus and memory in the guest?
>>>> IMO we should default to passing those numbers to numad if
>>>> required_cpus and required_memory are not provided explicitly.
>>>
>>> Indeed, why would you want to specify anything different? At
>>> first glance my reaction was to just skip the XML and call numad
>>> internally, automatically, with the guest's configured allocation.
>>>
>>
>> Here "required_cpus" stands for the number of physical CPUs,
>> which numad will use to choose the proper nodeset. So from a
>> semantics point of view, it's different from <vcpus>4</vcpus>.
>> I can imagine two problems if we reuse the vCPUs number for
>> numad's use:
>>
>> 1) Suppose there are 16 pCPUs, but the specified vCPUs number
>> is "64". I'm not sure if numad will work properly in this case,
>> but isn't it a bad use case? :-)
>>
>> 2) Suppose there are 128 pCPUs, but the specified vCPUs number
>> is "2". numad will definitely work, but is that the result the
>> user wants to see? It's not good for performance.
>>
>> The basic thought is that we provide the interface, and how to
>> configure the provided XML for good performance is then up to the
>> end user. If we mix the two different semantics and do things
>> silently in the code, I can imagine there will be performance problems.
>>
>> The "required_memory" could be omitted though; we can reuse
>> "<memory>524288</memory>", but I'm not sure if it's good to
>> always pass a "memory amount" on the numad command line. It may
>> not be good in some cases. @Bill(s), correct me if I'm wrong. :-)
>>
>> Perhaps we could have a bool attribute then, such as:
>>
>> <cpu required_cpus="4" required_memory="yes|no"/>
>>
>
> Please keep Bill Gray on this thread. He is the author of numad and
> is the best person to answer the above questions.
Bill (Gray),
Can you weigh in here?
Dave
> Bill
>
>> Regards,
>> Osier
>