On 09/10/2018 02:10 PM, Wang, Huaqiang wrote:
> -----Original Message-----
> From: John Ferlan [mailto:jferlan@redhat.com]
> Sent: Wednesday, September 5, 2018 11:00 PM
> To: Wang, Huaqiang <huaqiang.wang(a)intel.com>; libvir-list(a)redhat.com
> Cc: Feng, Shaohe <shaohe.feng(a)intel.com>; Niu, Bing
<bing.niu(a)intel.com>;
> Ding, Jian-feng <jian-feng.ding(a)intel.com>; Zang, Rui
<rui.zang(a)intel.com>
> Subject: Re: [libvirt] [PATCH 06/10] util: Introduce resctrl monitor for CMT
>
>
>
> On 08/27/2018 07:23 AM, Wang Huaqiang wrote:
>> 'virResctrlAllocMon' denotes a resctrl monitor reporting the resource
>> consumption information.
>>
>> This patch introduced the interfaces for resctrl monitor.
>>
>> Relationship of 'resctrl allocation' and 'resctrl monitor':
>> 1. resctrl monitor monitors resources (cache or memory bandwidth) of
>> particular allocation.
>> 2. resctrl allocation may refer to the 'default' allocation if no
>> dedicated resource 'control' applied to it. The 'default'
allocation
>> enjoys remaining resource that not allocated.
>> 3. resctrl monitor belongs to 'default' allocation if no
'cachetune'
>> specified in XML file.
>> 4. one resctrl allocation may have several monitors. It is also
>> permitted that there is no resctrl monitor associated with an
>> allocation.
>>
>> Key data structures:
>>
>> + struct _virResctrlAllocMon {
>> + char *id;
>> + char *path;
>> + };
>>
>> struct _virResctrlAlloc {
>> virObject parent;
>>
>> @@ -276,6 +289,12 @@ struct _virResctrlAlloc {
>> virResctrlAllocMemBWPtr mem_bw;
>> + virResctrlAllocMonPtr *monitors;
>> + size_t nmonitors;
>> }
>>
>> Signed-off-by: Wang Huaqiang <huaqiang.wang(a)intel.com>
>> ---
>> src/libvirt_private.syms | 6 +
>> src/util/virresctrl.c | 361
> ++++++++++++++++++++++++++++++++++++++++++++++-
>> src/util/virresctrl.h | 31 ++++
>> 3 files changed, 394 insertions(+), 4 deletions(-)
>>
>
> Similar to the previous patch - there's a bit too much going on for just one
patch
> here.
>
> Introducing "default_group" and "monitors". This needs some
separation.
>
Will split this patch into serval patches.
The 'default_group' will be a separate patch.
> I am not going to look at this patch. I think you really need to describe this
> default_group concept quite a bit more as you'd be totally changing the
> meaning of alloc->path by "removing" everything after the
> SYSFS_RESCTRL_PATH. What's the purpose of alloc->path then in this
> environment?
This is a bug.
'default_group' is not allowed to be removed in libvirt. It is created
by the resource control file system at the time of mounting, can only be
removed at the time of file system unmounting by the system.
In next version virResctrlAllocRemove, a check will be made to determine if
it is removing root directory /sys/fs/resctrl, if yes, report an error.
Like I've said before - just be sure to properly separate things. Fixing
existing issues in the middle of new code changes is just one of those
areas that makes reviewing difficult.
Following paragraph explains the 'default_group' and some relevant concepts.
I'll also write these content into cover of patch series.
In much less detail I hope... Not sure I can properly page in the rest.
I read it, but I certainly cannot internalize it.
John
The resctrl default group is described initially by Kernel document
'intel_rdt_ui.txt', with the link
"https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt".
The default group is the root directory of the resource control file system,
that is, the directory of '/sys/fs/resctrl/'. The default group is created
immediately after the mounting of resctrl file system, and owns all the tasks
and make full use of all resources.
In these patches, the libvirt virresctrl 'default_group' is introduced by
extending the scope of current 'resource allocation'. virResctrlAlloc is
presenting a 'resource allocation'. 'default_group' is also called
'default allocation'.
The 'default allocation' represents the root directory of resource control
file system. The main difference to a non-default allocation is the
'default group' is created after fs mounting, and owns all system tasks, while
non-default allocation need to create through 'mdkir' and explicit operation
of filling PID into the tasks file.
The main purpose for introducing 'default allocation' is creating monitors
that for non-default allocation to save hardware resources.
Think about the case: a KVM virtual machine with 1 working vcpu, we want to
monitor the cache utilization of the vcpu but don't apply any resource
limitaion on vcpu process. We have two possible solutions:
1. Create a new "resource allocation" through making a directory under
/sys/fs/resctrl/, assuming the allocation is /sys/fs/resource/p0,
and put the host process PID for the vcpu into file
/fys/fs/resource/p0/tasks. Then Get the cache utilization information through
file /fys/fs/resource/p0/mon_data/llc_occupancy.
2. Do not create an extra allocation, but create a resource monitor under
default allocation by creating an sub-directory under
'/sys/fs/resctrl/mon_group', then put the vcpu PID into the monitor's
sub-directory's tasks file. Get the cache utilization information through this
monitor's llc_occupancy file.
Comparing with slotion1, solution2 uses less hardware resource by saving one
CLOSID. Normally we have much more RMIDs than CLOSIDs, for CPU E5-2699v4, the
number of CLOSID is 16, while RMID number is 176.
To support 'default_group' the domain's xml configuration file need to be
changed:
The 'default allocation' has the similar behavior with original libvirt defined
'resource allocation' for creating a monitor group, getting resource
consumption data from a monitor, as well as the task assignment operations.
The 'default allocation' could be looked as a special 'resource
allocation'.
Libvirt treats virResctrlAlloc (sometimes called resource allocation)
as the representing of resctrl allocation, each resctrl allocation has the
capability to report the resource consumption information of involved
tasks, through files 'llc_occupancy', 'mbm_total_bytes' and
'mbm_local_bytes' under the directory of allocation.
virResctrlAllocMon (sometimes called resource monitor) is introduced to
represent virResctrlAlloc's role for resource consumption monitoring.
One or more resource monitors could be created to monitor the
resource utilization for a small set of tasks of current allocation.
This explains why the 'monitor' is being put under 'cachetune' element
in
domain's XML configuration file.
<cputune>
<cachetune vcpus='0-1'>
<cache id='0' level='3' type='code'
size='7680' unit='KiB'/>
<cache id='1' level='3' type='data'
size='3840' unit='KiB'/>
+ <monitor vcpus='0-1'/>
+ <monitor vcpus='0'/>
</cachetune>
</cputune>
The 'defautl_group' is resource allocation shared by all system tasks that do
not have specified resource limitation. In existing libvirt policy no resource
limitation is allowed to put on it. so I need to generate configuration such as
<cachetune vcpus='3'>
+ <monitor vcpus='3'/>
</cachetune>
In the implementation, this monitor for monitoring domain's vcpu 3 is
created under 'default allocation', default allocation is the directory
/sys/fs/resctrl, which is set up at time of mounting.
And above is the reason for why element 'cache' is changed being
optional in RNG file.
<element name="cachetune">
<attribute name="vcpus">
<ref name='cpuset'/>
</attribute>
- <oneOrMore>
+ <zeroOrMore>
<element name="cache">
<attribute name="id">
<ref name='unsignedInt'/>
</attribute>
<attribute name="level">
<ref name='unsignedInt'/>
</attribute>
<attribute name="type">
<choice>
<value>both</value>
<value>code</value>
<value>data</value>
</choice>
</attribute>
<attribute name="size">
<ref name='unsignedLong'/>
</attribute>
<optional>
<attribute name='unit'>
<ref name='unit'/>
</attribute>
</optional>
</element>
- </oneOrMore>
+ </zeroOrMore>
<zeroOrMore>
<element name="monitor">
<attribute name="vcpus">
<ref name='cpuset'/>
</attribute>
</element>
</zeroOrMore>
</element>
Thanks for review.
Huaqiang
[...]