On 10/15/18 11:25 AM, Wang, Huaqiang wrote:
On 10/13/2018 6:29 AM, John Ferlan wrote:
>
> On 10/12/18 3:10 AM, Wang, Huaqiang wrote:
>>> -----Original Message-----
[...]
>>> IOW: What is cache_occupancy measuring? Each cache? The entire
>>> thing? If there's no cache elements, then what?
>> cache_occupancy is measured per cache bank. An Intel two-socket Xeon
>> CPU is treated as having two cache banks, one cache bank per socket.
>> The typical output for each monitor in this case is:
>>
>> cpu.cache.0.name=vcpus_1
>> cpu.cache.0.vcpus=1
>> cpu.cache.0.bank.count=2           <--- 2 cache banks
>> cpu.cache.0.bank.0.id=0            <--- bank.id.0 cache_occupancy
>> cpu.cache.0.bank.0.bytes=9371648    _|
>> cpu.cache.0.bank.1.id=1            <--- bank.id.1 cache_occupancy
>> cpu.cache.0.bank.1.bytes=1081344    _|
>>
>> If you want to know the total cache occupancy for the VM vcpu threads
>> of this monitor, you need to add them up.
>>
> So if you have:
>
> <monitor... vcpus=0-1>
>
> what do you get in output for cache_occupancy? 0 + 1?
Yes. The output is the sum over the two vcpus.
For cache bank 0:
vcpus_0-1.bank.0.bytes = vcpus_0.bank.0.bytes + vcpus_1.bank.0.bytes
For cache bank 1:
vcpus_0-1.bank.1.bytes = vcpus_0.bank.1.bytes + vcpus_1.bank.1.bytes
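
To illustrate the arithmetic, here is a minimal Python sketch of the
per-bank sum and of the total across banks. The helper names are
hypothetical and the byte counts are sample values borrowed from the
outputs in this thread; this is not libvirt code.

```python
# Hypothetical helpers; not part of libvirt or resctrl.
def sum_per_bank(*monitors):
    """Add the occupancy of several monitors, bank by bank."""
    total = {}
    for banks in monitors:
        for bank_id, nbytes in banks.items():
            total[bank_id] = total.get(bank_id, 0) + nbytes
    return total


def total_occupancy(banks):
    """Total occupancy of one monitor: add up all of its cache banks."""
    return sum(banks.values())


# Sample per-bank byte counts (bank id -> bytes), for illustration only.
vcpus_0 = {0: 9371648, 1: 1081344}
vcpus_1 = {0: 630784, 1: 10452992}

combined = sum_per_bank(vcpus_0, vcpus_1)   # what a 'vcpus_0-1' monitor would sum to
print(combined)
print(total_occupancy(combined))
```

In practice the readings of separate monitors are sampled at slightly
different moments, so real numbers may not add up exactly.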
>
>>> I honestly think this just needs to be simplified as much as possible.
"I honestly think this just needs to be simplified as much as possible."
I reconsidered your comment (quoted in the line above). Do you mean that
the XML configuration for 'monitor' needs to be simplified as well?
This is/was a comment regarding default stuff which you are removing.
My view is that, even after removing the 'default monitor' and
'default allocation' concepts, the XML configuration for monitors (of
type 'all', 'm-to-n' and 'one-to-one') still needs this kind of
arrangement.
For example, suppose a VM has 4 vcpus. vcpus 0 and 1 run a
cache-sensitive workload and want to hold private L3 cache. There is no
specific requirement for the remaining vcpus, but their cache usage
still needs to be monitored.
Then we could create a cache allocation for vcpus 0 and 1, together with
a monitor that reports the cache these two vcpus actually use. For
vcpus 2 and 3, we create only a monitor.
The XML configuration is as follows (the general rules are unchanged
from my previous examples):
<cachetune vcpus='0-1'>
  <cache id='0' level='3' type='both' size='3' unit='MiB'/>
  <cache id='1' level='3' type='both' size='3' unit='MiB'/>
  <monitor level='3' vcpus='0-1'/>
</cachetune>
<cachetune vcpus='2-3'>
  <monitor level='3' vcpus='2-3'/>
</cachetune>
Any suggestion from you is welcome.
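
For readers who want to check such a configuration mechanically, here is
a minimal Python sketch that parses the two <cachetune> entries and lists
which vcpus each monitor covers and whether an allocation is requested.
The <cputune> wrapper is only there to give the parser a single root
element, and the function names are hypothetical; libvirt's own parser is
written in C and does considerably more validation.

```python
import xml.etree.ElementTree as ET

# Wrapper element added only so the snippet has a single XML root.
XML = """
<cputune>
  <cachetune vcpus='0-1'>
    <cache id='0' level='3' type='both' size='3' unit='MiB'/>
    <cache id='1' level='3' type='both' size='3' unit='MiB'/>
    <monitor level='3' vcpus='0-1'/>
  </cachetune>
  <cachetune vcpus='2-3'>
    <monitor level='3' vcpus='2-3'/>
  </cachetune>
</cputune>
"""


def expand_vcpus(spec):
    """Expand a vcpu list such as '0-4,8' into a sorted list of ids."""
    ids = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            ids.update(range(int(lo), int(hi) + 1))
        else:
            ids.add(int(part))
    return sorted(ids)


def summarize(xml_text):
    """Return one dict per <cachetune>: its vcpus, allocation flag, monitors."""
    result = []
    for tune in ET.fromstring(xml_text).iter("cachetune"):
        result.append({
            "vcpus": expand_vcpus(tune.get("vcpus")),
            # No <cache> child means the default schemata applies.
            "has_allocation": tune.find("cache") is not None,
            "monitors": [expand_vcpus(m.get("vcpus"))
                         for m in tune.findall("monitor")],
        })
    return result


for entry in summarize(XML):
    print(entry)
```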
I'm not sure what the question is and I'm not sure it matters at this
point. If you only create an allocation for any <cachetune> or
<memorytune> entry, then that's all that'll be reported which is what I
was trying to point out. It's not that something else may or may not
exist, it's what gets reported and can be queried via the XML.
>>> When you monitor specific vcpus within a cachetune, then you get what?
>> In this case, the monitor you created only monitors the specific vcpus
>> you added for monitor.
>>
>> The following two configurations match your scenario. In both, the
>> only monitor tracks the cache usage of the vcpu 2 thread.
>>
>> <cachetune vcpus='2-4'>
>>   <cache id='0' level='3' type='both' size='3' unit='MiB'/>
>>   <cache id='1' level='3' type='both' size='3' unit='MiB'/>
>>   <monitor level='3' vcpus='2'/>
>> </cachetune>
>>
>> <cachetune vcpus='2-4'>
>>   <monitor level='3' vcpus='2'/>
>> </cachetune>
>>
> Perhaps my question was mistyped or misinterpreted. In the above top
> example, if we have <monitor ... vcpus='2-4'>, then do the values in
> <cache> have any impact on the calculation as opposed to if they weren't
> there?
Perhaps I still do not understand you well ...
If the <cache> entries exist and vcpus 2-4 demand much more cache than
the allocation can offer, they will significantly influence the monitor
output. If the cache the allocation offers is much bigger than what
vcpus 2-4 actually use, the influence will be tiny.
In the other case, where there are no 'cache' entries (as in the second
example), the output is still influenced by the cache that the
'allocation' offers. The difference from the first example is that the
top example uses the cache resources allocated by its own allocation,
while the second example uses the allocation defined in
/sys/fs/resctrl/schemata, and that cache is shared by multiple system
tasks.
The question was related to how <monitor> is defined and trying to
further describe my feeling that default was necessary.
>
>>
>>> If the cachetune has no specific cache entries, you get what?
>> If there is no cache entry in the cachetune, it will still get the
>> vcpu threads' cache utilization information per cache bank.
>> No cache entry in the cachetune means it uses the cache allocation
>> policy of the default cache allocation, which is the file
>> /sys/fs/resctrl/schemata.
>>
>> If valid cache entries are provided in the cachetune, an allocation is
>> created for the threads of the vcpus listed in the <cachetune> 'vcpus'
>> attribute. Supposing the allocation is the directory
>> /sys/fs/resctrl/p0, the cache resource limit is then applied to these
>> threads.
>>
>> The monitor does not care whether or not the vcpu threads are limited
>> to a certain amount of cache lines. It only reports the amount of
>> cache that has been accessed.
>>
>>> If you monitor multiple vcpus within a cachetune then you get what?
>>> (An aggregation of all?).
>> Yes.
>> Suppose you have this vcpus setting for <cachetune>:
>> <cachetune vcpus='0-4,8' ..../>
>>
>> and you choose to monitor the cache usage of vcpus 0, 3 and 8. You
>> then create the following monitor entry inside the cachetune entry,
>> and the monitor output gives you the aggregate cache occupancy for the
>> threads of vcpus 0, 3 and 8.
>>
>> <cachetune vcpus='0-4,8'>
>>   <monitor level='3' vcpus='0,3,8'/>
>> </cachetune>
>>
>>> This whole default and specific description doesn't make sense.
>> Sorry for confusing you. I'll try to refine the descriptions.
>>
> In this last case if you also had
>
> <monitor level='3' vcpus='4'/>
> <monitor level='3' vcpus='0-4,8'/>
>
> then I'd expect that the values output in "0-4,8" to match those that I
> could add by myself with "4" and "0-3,8". True?
Yes.
and this essentially solidifies the point I was making above.
>
> Is it apparent yet why I'm saying mentioning default just confuses
> things? If so, I'm not sure what else I can do to explain.
Agreed, 'default xxx' is confusing.
But I hope you understand that a monitor with the same vcpu list as the
allocation is created along with the allocation, whether or not you
define a <monitor> in <cachetune> with the same 'vcpus' setting as the
allocation in the XML configuration. This is the behavior of the kernel
resctrl fs.
To get the cache utilization information for the whole allocation,
enabling this system-created monitor is the most economical way in terms
of saving RMIDs.
Sure, one cannot have too many monitors because there are limitations.
[...]
>> I forget to free it. Will be added.
>>
> Again, Coverity
Thank you again. Hope someday I can hold the power of Coverity ...
It's nice to have, but it has its own issues. Getting to know what's a
real issue and what's a false positive takes a while. I'm sure there's other
code analyzers out there.
[...]
As stated in the prior paragraph, I will remove 'default monitor' and
'default allocation' and clean up the code and comments.
Am I missing anything?
I hope not, it's time consuming to read/comprehend everything. I see the
need to post more because it doesn't necessarily make sense without
understanding the future, but long series mean long reviews and long
reviews mean more questions and more questions mean deeper responses in
the mail list. In the long run I hope we get something acceptable to be
used by/for libvirt to describe/summarize the depths that is CAT. I
think we're getting closer that's for sure.
BTW, I find that the 'virsh domstats --cpu-total' output for monitors,
introduced in patch 18, is not good enough.
The current output is:
"
Domain: 'ubuntu16.04-base'
cpu.cache.monitor.count=2
cpu.cache.0.name=vcpus_0
cpu.cache.0.vcpus=0
cpu.cache.0.bank.count=2
cpu.cache.0.bank.0.id=0
cpu.cache.0.bank.0.bytes=9371648
cpu.cache.0.bank.1.id=1
cpu.cache.0.bank.1.bytes=1081344
cpu.cache.1.name=vcpus_3
cpu.cache.1.vcpus=3
cpu.cache.1.bank.count=2
cpu.cache.1.bank.0.id=0
cpu.cache.1.bank.0.bytes=630784
cpu.cache.1.bank.1.id=1
cpu.cache.1.bank.1.bytes=10452992
"
I may change the output to the following, adding 'monitor' to each line:
Domain: 'ubuntu16.04-base'
cpu.cache.monitor.count=2
cpu.cache.monitor.0.name=vcpus_0
cpu.cache.monitor.0.vcpus=0
cpu.cache.monitor.0.bank.count=2
cpu.cache.monitor.0.bank.0.id=0
cpu.cache.monitor.0.bank.0.bytes=9371648
cpu.cache.monitor.0.bank.1.id=1
cpu.cache.monitor.0.bank.1.bytes=1081344
cpu.cache.monitor.1.name=vcpus_3
cpu.cache.monitor.1.vcpus=3
cpu.cache.monitor.1.bank.count=2
cpu.cache.monitor.1.bank.0.id=0
cpu.cache.monitor.1.bank.0.bytes=630784
cpu.cache.monitor.1.bank.1.id=1
cpu.cache.monitor.1.bank.1.bytes=10452992
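
One practical effect of the 'monitor' prefix is that a consumer can pick
out monitor records with a single key pattern. The following minimal
Python sketch parses the proposed output into a nested dictionary. The
function name and the dictionary layout are assumptions made for
illustration, and the key format is only the proposal above, not a
committed interface.

```python
import re

# Proposed output format from this mail (not a committed interface).
SAMPLE = """\
cpu.cache.monitor.count=2
cpu.cache.monitor.0.name=vcpus_0
cpu.cache.monitor.0.vcpus=0
cpu.cache.monitor.0.bank.count=2
cpu.cache.monitor.0.bank.0.id=0
cpu.cache.monitor.0.bank.0.bytes=9371648
cpu.cache.monitor.0.bank.1.id=1
cpu.cache.monitor.0.bank.1.bytes=1081344
cpu.cache.monitor.1.name=vcpus_3
cpu.cache.monitor.1.vcpus=3
cpu.cache.monitor.1.bank.count=2
cpu.cache.monitor.1.bank.0.id=0
cpu.cache.monitor.1.bank.0.bytes=630784
cpu.cache.monitor.1.bank.1.id=1
cpu.cache.monitor.1.bank.1.bytes=10452992
"""

MONITOR_KEY = re.compile(r"cpu\.cache\.monitor\.(\d+)\.(.+)")
BANK_FIELD = re.compile(r"bank\.(\d+)\.(id|bytes)")


def parse_monitors(text):
    """Hypothetical parser: monitor index -> {name, vcpus, banks}."""
    monitors = {}
    for line in text.splitlines():
        key, _, value = line.strip().partition("=")
        m = MONITOR_KEY.fullmatch(key)
        if not m:
            continue  # e.g. 'cpu.cache.monitor.count' or unrelated stats
        index, field = int(m.group(1)), m.group(2)
        monitor = monitors.setdefault(index, {"banks": {}})
        bank = BANK_FIELD.fullmatch(field)
        if bank:
            slot = monitor["banks"].setdefault(int(bank.group(1)), {})
            slot[bank.group(2)] = int(value)
        elif field != "bank.count":
            monitor[field] = value  # 'name' and 'vcpus' stay as strings
    return monitors


for index, monitor in sorted(parse_monitors(SAMPLE).items()):
    total = sum(b["bytes"] for b in monitor["banks"].values())
    print(index, monitor["name"], monitor["vcpus"], total)
```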
Please take this change into consideration when you review patch 18.
Some day we'll get there.
John
BTW: Next week is KVM Forum - so that usually means less activity on
this list and less time for reviews.
[...]