On 10/13/2018 6:29 AM, John Ferlan wrote:
On 10/12/18 3:10 AM, Wang, Huaqiang wrote:
>> -----Original Message-----
>> From: John Ferlan [mailto:jferlan@redhat.com]
>> Sent: Thursday, October 11, 2018 4:58 AM
>> To: Wang, Huaqiang <huaqiang.wang(a)intel.com>; libvir-list(a)redhat.com
>> Cc: Feng, Shaohe <shaohe.feng(a)intel.com>; Niu, Bing
<bing.niu(a)intel.com>;
>> Ding, Jian-feng <jian-feng.ding(a)intel.com>; Zang, Rui
<rui.zang(a)intel.com>
>> Subject: Re: [libvirt] [PATCHv5 13/19] conf: Add resctrl monitor configuration
>>
>>
>>
>> On 10/9/18 6:30 AM, Wang Huaqiang wrote:
>>> Introducing <monitor> element under <cachetune> to represent a
cache
>>> monitor.
>>>
>>> Supports two kind of monitors, which are, monitor under default
>>> allocation or monitor under particular allocation.
>> I still don't see what the big difference is with singling out default.
>> Does it really matter - what advantage is there. Having a monitor for 'n'
vcpus is
>> a choice.
>>
>>
>>> Monitor supervises the cache or memory bandwidth usage for
>> At this point memory bandwidth is not in play...
> Yes. Will remove the memory BW related words.
>
>>> interested vcpu thread set, if the vcpu thread set is belong to some
>>> resctrl allocation, then the monitor will be created under this
>>> allocation, that is, creating a resctrl monitoring group directory
>>> under the directory of '@alloc->path/mon_group'. Otherwise, the
>>> monitor will be created under default allocation.
>> This is becoming increasing difficult to describe/decipher - makes me wonder
>> who would really use it.
> Monitor in default allocation could be commonly used when we want to
> observe the resource usage for VM vcpu threads that are not specified
> with any dedicated resource limitation through CAT or MBA technology.
>
> And another common case is it will be used if CAT/MBA is not supported
> but CMT/MBM is enabled in libvirt.
>
> The reason I have introduced the monitor in previous patch and further
> introducing the monitor in default allocation in this patch is the monitor
> n two different allocations has different behavior.
>
> Let's focus on CAT and CMT technology in my below lines, since MBA and
> MBM are very similar cases to them.
>
> As we know, before the CAT technology was introduced, any process in Linux
> is sharing the L3 cache along with all other active processes. After
> CAT is enabled in libvirt, it has the capability to apply cache isolation
> and assign dedicated amount of cache to some key VM vcpu threads.
>
> If we want to observe the real L3 cache usage information, then we
> need the help of monitor.
>
> == Monitor usage case 1: monitor in default allocation ==
>
> If you want to get the cache utilization data before applying any cache
> isolation to the VM vcpu threads, you need to create a monitor in the
> default allocation, because you haven't create any cache allocation.
>
> == Monitor usage case 2: monitor in non-default allocation ==
>
> If you have created a cache allocation for specific VM vcpu treads
> and not sure how many cache-lines these VM vcpu threads are really used,
> you need to create a monitor under the this allocation to get real cache
> usage information.
>
> If you find your VM vcpu threads only used a small part of cache that
> have allocated, you might think about to reduce its allocation.
>
> == Usage for default monitor and non-default monitor ==
>
> Since we have introduced the 'default monitor' and 'non-default
monitor'
> concepts in previous patches, and now, you can monitor the cache usage
> for all VM vcpu threads that added to this allocation, and also you has
> the capability to monitor a subset of vcpu list of this allocation.
>
> Without 'default monitor', it is impossible to get the cache usage for
> all vcpu threads in the allocation and at the same time get the cache
> usage for some highly interested vcpu threads of allocation.
"Default monitor" is in name only. It's just "a" monitor for all
the
threads in cachetune (or memorytune eventually); whereas, a "non default
monitor" is "a" monitor for specific vcpus within the range in cachetune.
Agree.
As stated in reply of patch10, 'default monitor' and will be removed,
and related
code will be cleaned.
Thus your description is that a monitor can be a monitor all vcpus or a
monitor for some subset of of all vcpus. A <monitor> describes which
vcpus of a cachetune (or memorytune) can be monitored - there are 3
options - "all", "m-to-n", or "one-to-one".
Thanks for suggestion. Then no things will be mentioned as 'default',
code and document
will be cleaned accordingly.
What they relate to in the filesystem is the magic of the code of paths
to the data. What data structures are called or how this is described in
docs just doesn't seem to need to make a delineation.
Agree.
> == Monitor usage case 3: allocating dedicated cache and monitoring its
> usage of one VM, and getting its influence over another VM==
>
> Think about the scenario that there are two VMs, it is a known information
> that one VM is very cache sensitive and don't want to share cache with
> other workloads, and for another VM, we have no knowledge about cache
> requirement, but it is required to monitor the cache usage for the two
> VMs.
>
> With the concepts introduced until now. We need to create an allocation
> and for this VM, then create a default-monitor in this allocation for
> monitoring. For another VM, it is required to create a non-default
> monitor under default allocation.
>
>
> With introduced concepts, the allocation, the default allocation, the monitor
> and the default monitor, it is possible to fulfill requirement all scenarios.
I think it's far easier to describe 2 things rather than 4.
Agree. Will be done. Only 'allocation' and 'monitor'.
> The creation of monitor does not have too much flexibility, as I
stated
> in my reply of v5patch0's review comments, the monitors need to follow
> below rules:
>
> 1. In each <cachetune> entry more than one monitors could be specified.
> 2. In each <cachetune> entry up to one allocation could be specified.
> 3. The allocation is using the vcpu list specified in <cachetune> attribute
> 'vcpus'.
An allocation is either from the list of vcpus for a cachetune or a
memorytune if cachetune for the listed vcpus in a memorytune doesn't
exist (and yes, doesn't conflict with any vcpus in any found cachetune).
If the domain was created with 4 vcpus, then is "theoretically" the
default allocation for 4 vcpus. Allocations beyond that allow one to
slice and dice *as long as* there is no overlap with the definitions.
I don't think "default" or "hidden" really needs to be described
or
called out, it just is. Maybe I'm oversimplifying.
After going through the code, I think the 'default allocation' and
'default monitor'
could be removed, as I stated, without any lose of functionality
currently implemented.
> 4. A monitor has the same vcpu list as allocation is allowed, and this
> monitor is allocation's default monitor.
> 5. A monitor has a subset vcpu list of allocation is allowed.
> 6. For non-default monitors, any vcpu list overlap is not permitted.
> And also, the number of monitors could be generated is subject to
> the hardware RMID numbers.
>
> About the behavior of underlying '/sys/fs/resctrl' file system, you can
> get more detail from this link:
>
https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt
>
>>> For default allocation monitor, it will have such kind of XML layout:
>>>
>>> <cachetune vcpus='1'>
>>> <monitor level=3 vcpus='1'/>
>>> </cachetune>
>>>
>>> For other type monitor, the XML layout will be something like:
>>>
>>> <cachetune vcpus='2-4'>
>>> <cache id='0' level='3' type='both'
size='3' unit='MiB'/>
>>> <cache id='1' level='3' type='both'
size='3' unit='MiB'/>
>>> <monitor level=3 vcpus='2'/>
>>> </cachetune>
>>>
>> Since all we get is the "cache_occupancy" that would seem to me to be
most
>> important when there is a <cache> bank, right? So what would the first (or
your
>> default style) collect/display. Or does cache not really matter.
> <cache> does not matter for monitor.
> Even in case of no CAT in system, the <cache> entry will never be shown
> under <cachetune>.
>
> If valid <cache> entries exist, means allocating some part of cache-lines to
> VM vcpu threads specified in attribute 'vcpus'.
> If there is no <cache> entry existed in <cachetune>, it does not mean
these
> vcpu threads does not use any cache resource, it means it will use the
> cache resource specified in default allocation. Normally the default
> allocation's cache resource is shared by a lot of cache insensitive workloads.
>
Right it's not overriding anything. It's just using the values for all
vcpus present, but that's an implementation detail of the underlying
technology which it feels like could be "papered over".
>> IOW: What is cache_occupancy measuring? Each cache? The entire thing? If
>> there's no cache elements, then what?
> cache_occupancy is measuring based on cache bank. For Intel 2 socket xeon CPU,
> it is considered as two cache banks, one cache bank per socket. The typical
> output for each monitor of this case is:
>
> cpu.cache.0.name=vcpus_1
> cpu.cache.0.vcpus=1
> cpu.cache.0.bank.count=2 <--- 2 cache banks
> cpu.cache.0.bank.0.id=0 <--- bank.id.0 cache_occypancy
> cpu.cache.0.bank.0.bytes=9371648 _|
> cpu.cache.0.bank.1.id=1 <--- bank.id.1 cache_occypancy
> cpu.cache.0.bank.1.bytes=1081344 _|
>
> If you want to know the total cache occupancy for VM vcpu threads of this
> monitor, you need to add them up.
>
So if you have:
<monitor... vcpus=0-1>
what do you get in output for cache_occupancy? 0 + 1?
Yes. Output is sum of two vcpus.
for cache bank 0
vcpus_0-1.bank.0.bytes = vcpus_0.bank.0.bytes + vcpus_1.bank.0.bytes
for cache bank 1
vcpus_0-1.bank.1.bytes = vcpus_0.bank.1.bytes + vcpus_1.bank.1.bytes
>> I honestly think this just needs to be simplified as much as possible.
"I honestly think this just needs to be simplified as much as possible."
I reconsidered your comment ( in above line), do you mean the XML
configuration for 'monitor' need to be simplified also?
What I think is, even after the removal of 'default monitor' and
'default allocation' concepts, the XML
configuration for monitors (with type 'all', 'm-to-n', 'one to
one')
still need such kind of arrangement.
Take an example, a VM has 4 vcpus, vcpu 0 and 1 run cache sensitive
workload, and wants to hold
private L3 caches, and there is no specific requirement for left vcpus
but still need a monitoring on
the cache usage.
Then we could create an cache allocation for vcpu 0 and 1 as well as a
monitor on getting the
actual cache that these two vcpus used. For vcpu 2 and 3, create a
monitor for it.
The XML configurations are: (no change in general rules comparing to my
previous examples)
<cachetune vcpus='0-1'>
<cache id='0' level='3' type='both' size='3'
unit='MiB'/>
<cache id='1' level='3' type='both' size='3'
unit='MiB'/>
<monitor level=3 vcpus='0-1'/>
</cachetune>
<cachetune vcpus='2-3'>
<monitor level=3 vcpus='2-3'/>
</cachetune>
Any suggestion from you is welcome.
>> When you monitor specific vcpus within a cachetune, then you
get what?
> In this case, the monitor you created only monitors the specific vcpus
> you added for monitor.
>
> Following two configurations satisfy your scenario, and the only monitor
> will detect the cache usage of thread of vcpu 2.
>
> <cachetune vcpus='2-4'>
> <cache id='0' level='3' type='both'
size='3' unit='MiB'/>
> <cache id='1' level='3' type='both'
size='3' unit='MiB'/>
> <monitor level=3 vcpus='2'/>
> </cachetune>
>
> <cachetune vcpus='2-4'>
> <monitor level=3 vcpus='2'/>
> </cachetune>
>
Perhaps my question was mistyped or misinterpreted. In the above top
example, if we have <monitor ... vcpus='2-4'>, then do the values in
<cache> have any impact on the calculation as opposed to if they weren't
there?
I perhaps still not understand you well ...
There will have significant influence for the output of monitor if
<cache> entry
exist and if vcpu2-4 demands much more caches that allocation can offer;
If the
cache that the allocation offers is much bigger than vcpu2-4 actually
used, the
influence will be tiny.
But in another case, that, if there is no 'cache' entries, just showing
in the second
example, it still influenced by the cache that the 'allocation' offers.
Its difference
with the first example is: the top example is using the cache resources
allocated
by the allocation of itself, while the second example uses the
allocation of resources
defined in /sys/fs/resctrl/schemata, and this cache is shared by
multiple system tasks.
>
>> If the cachetune has no specific cache entries, you get what?
> If no cache entry in cachetune, it will also get vcpu threads' cache
> utilization information based on cache bank.
> No cache entry specified for the cachetune, means it will use the cache
> allocating policy of default cache allocation, which is file
> /sys/fs/resctrl/schemata.
>
> If valid cache entries are provided in cachetune, then an allocation will
> be created for the threads of vcpu listed in <cachetune> 'vcpus'
> attribute. Supposing the allocation is the directory /sys/fs/resctrl/p0,
> then the cache resource limitation was applied on these threads.
>
> For monitor, it does not care if vcpu threads are allowed or not alloowed to
> access a limit amount of cache-lines. Monitor only reports the amount of
> cache has been accesses.
>
>> If you monitor
>> multiple vcpus within a cachetune then you get what? (?An aggregation of all?).
> Yes.
> supposing you have this vcpus setting for <cachetune>
> <cachetune vcpus='0-4,8' ..../>
>
> and you choose to monitor the cache usage for vcpu 0,3,8, then you create
> following monitor entry inside the cachetune entry, with the output of
> monitor, you will get an aggregative cache occupancy information for threads
> of vcpu 0,3,8.
>
> <cachetune vcpus='0-4,8'/>
> <monitor level='3' vcpus='0,3,8'/>
> </cachetune>
>
>> This whole default and specific description doesn't make sense.
> Sorry for make you confused, I'll try to refine the descriptions.
>
In this last case if you also had
<monitor level='3', vcpus='4'/>
<monitor level='3', vcpus='0-4,8'/>
then I'd expect that the values output in "0-4,8" to match those that I
could add by myself with "4" and "0-3,8". True?
Yes.
Is it apparent yet why I'm saying mentioning default just confuses
things? If so, I'm not sure what else I can do to explain.
Agree with the conclusion that 'default xxx' is a confusing things.
But hope you understand that, a monitor has same vcpu list with the
allocation is
created along with the creation of allocation, no matter you defined a
<monitor>
in <cachetune> and has a same 'vcpus' setting with allocation in the XML
configuration
or not. This is the behavior of kernel resctrl fs.
To get the cache utilization information for whole allocation, enable
this system created
monitor is most economic way in terms of saving RMID.
>>> Signed-off-by: Wang Huaqiang
<huaqiang.wang(a)intel.com>
>>> ---
>>> docs/formatdomain.html.in | 26 +++
>>> docs/schemas/domaincommon.rng | 10 +
>>> src/conf/domain_conf.c | 217
++++++++++++++++++++-
>>> src/conf/domain_conf.h | 11 ++
>>> tests/genericxml2xmlindata/cachetune-cdp.xml | 3 +
>>> .../cachetune-colliding-monitor.xml | 30 +++
>>> tests/genericxml2xmlindata/cachetune-small.xml | 7 +
>>> tests/genericxml2xmltest.c | 2 +
>>> 8 files changed, 301 insertions(+), 5 deletions(-) create mode
>>> 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
>>>
>>> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
>>> index b1651e3..2fd665c 100644
>>> --- a/docs/formatdomain.html.in
>>> +++ b/docs/formatdomain.html.in
>>> @@ -759,6 +759,12 @@
>>> <cachetune vcpus='0-3'>
>>> <cache id='0' level='3' type='both'
size='3' unit='MiB'/>
>>> <cache id='1' level='3' type='both'
size='3' unit='MiB'/>
>>> + <monitor level='3' vcpus='1'/>
>>> + <monitor level='3' vcpus='0-3'/>
>>> + </cachetune>
>>> + <cachetune vcpus='4-5'>
>>> + <monitor level='3' vcpus='4'/>
>>> + <monitor level='3' vcpus='5'/>
>>> </cachetune>
>>> <memorytune vcpus='0-3'>
>>> <node id='0' bandwidth='60'/> @@ -978,6
+984,26 @@
>>> </dd>
>>> </dl>
>>> </dd>
>>> + <dt><code>monitor</code></dt>
>>> + <dd>
>>> + The optional element <code>monitor</code> creates
the cache
>>> + monitor(s) for current cache allocation and has the following
>>> + required attributes:
>>> + <dl>
>>> + <dt><code>level</code></dt>
>>> + <dd>
>>> + Host cache level the monitor belongs to.
>>> + </dd>
>>> + <dt><code>vcpus</code></dt>
>>> + <dd>
>>> + vCPU list the monitor applies to. A monitor's vCPU list
>>> + can only be the member(s) of the vCPU list of associating
>>> + allocation. The default monitor has the same vCPU list as
the
>>> + associating allocation. For non-default monitors, there
>>> + are no vCPU overlap permitted.
>>> + </dd>
>>> + </dl>
>>> + </dd>
>>> </dl>
>>> </dd>
>>>
>>> diff --git a/docs/schemas/domaincommon.rng
>>> b/docs/schemas/domaincommon.rng index 5c533d6..7ce49d3 100644
>>> --- a/docs/schemas/domaincommon.rng
>>> +++ b/docs/schemas/domaincommon.rng
>>> @@ -981,6 +981,16 @@
>>> </optional>
>>> </element>
>>> </zeroOrMore>
>>> + <zeroOrMore>
>>> + <element name="monitor">
>>> + <attribute name="level">
>>> + <ref name='unsignedInt'/>
>>> + </attribute>
>>> + <attribute name="vcpus">
>>> + <ref name='cpuset'/>
>>> + </attribute>
>>> + </element>
>>> + </zeroOrMore>
>>> </element>
>>> </zeroOrMore>
>>> <zeroOrMore>
>>> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index
>>> 9a514a6..4f4604f 100644
>>> --- a/src/conf/domain_conf.c
>>> +++ b/src/conf/domain_conf.c
>>> @@ -2955,13 +2955,30 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr
>>> loader)
>>>
>>>
>>> static void
>>> +virDomainResctrlMonDefFree(virDomainResctrlMonDefPtr domresmon) {
>>> + if (!domresmon)
>>> + return;
>>> +
>>> + virBitmapFree(domresmon->vcpus);
>>> + virObjectUnref(domresmon->instance);
>> VIR_FREE(domresmon);
> I forget to free monitor itself. Will be fixed.
>
Coverity found this!
Still thank you.
>>> +}
>>> +
>>> +
>>> +static void
>>> virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl) {
>>> + size_t i = 0;
>>> +
>>> if (!resctrl)
>>> return;
>>>
>>> + for (i = 0; i < resctrl->nmonitors; i++)
>>> + virDomainResctrlMonDefFree(resctrl->monitors[i]);
>>> +
>>> virObjectUnref(resctrl->alloc);
>>> virBitmapFree(resctrl->vcpus);
>>> + VIR_FREE(resctrl->monitors);
>>> VIR_FREE(resctrl);
>>> }
>>>
>>> @@ -18919,6 +18936,154 @@
>> virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt,
>>> return ret;
>>> }
>>>
>> Two blank lines
> OK.
>
>>> +/* Checking if the monitor's vcpus is conflicted with existing
>>> +allocation
>>> + * and monitors.
>>> + *
>>> + * Returns 1 if @vcpus equals to @resctrl->vcpus, means it is a
>>> +default
>>> + * monitor. Returns - 1 if a conflict found. Returns 0 if no conflict
>>> +and
>>> + * @vcpus is not equal to @resctrl->vcpus.
>>> + * */
>>> +static int
>>> +virDomainResctrlMonValidateVcpu(virDomainResctrlDefPtr resctrl,
>>> + virBitmapPtr vcpus)
>> This should be *ValidateVcpus.
> OK.
>
>>> +{
>>> + size_t i = 0;
>>> + int vcpu = -1;
>>> +
>>> + if (virBitmapIsAllClear(vcpus)) {
>>> + virReportError(VIR_ERR_INVALID_ARG, "%s",
>>> + _("vcpus is empty"));
>>> + return -1;
>>> + }
>>> +
>>> + while ((vcpu = virBitmapNextSetBit(vcpus, vcpu)) >= 0) {
>>> + if (!virBitmapIsBitSet(resctrl->vcpus, vcpu)) {
>>> + virReportError(VIR_ERR_INVALID_ARG, "%s",
>>> + _("Monitor vcpus conflicts with
allocation"));
>>> + return -1;
>>> + }
>>> + }
>>> +
>>> + if (resctrl->alloc && virBitmapEqual(vcpus,
resctrl->vcpus))
>> The ->alloc check is confusing in light of having a monitor as a child of
>> cachetune.
> I am looking cachetune as an entity with the capability to do the performance
> tuning tasks by a better cache resource arrangement.
>
> In a general performance tuning taks, first we need to get to know the
> resource utilization information and then applies some resource arrangement,
> such as cache isolation or allocation. @resctrl->alloc represents the
> role for resource arranging, @resctrl->monitors are for the performance
> detecting/monitoring part.
>
> Performance monitoring belongs to the scope of performance tuning, just like
> the part doing the resource limitation. Based on this understanding, I
> combined @alloc and @monitors, and let them as a child of @resctrl.
>
> If you still think it is strange to put @monitors under @resctrl, then I
> have to change it, creating a data structure at the level of @def->resctrls,
> then the new _virDomainDef structure would be:
I don't believe this is what I was saying.
> struct _virDomainDef {
> ...
> virDomainResctrlDefPtr *resctrls;
> size_t nresctrls;
>
> virDomainResctrlMonDefPtr *monitors;
> size_t nmonitors;
> ...
> }
>
> It seems the "virDomainResctrlDefPtr" should be refactored by reducing its
> scope that the name implies.
>
> Further, even have refactored like this, I have to add a lot of code to maintain
> the relationship between @resctrls->vcpus and each one of @monitors->vcpus.
>
>>> + return 1;
>>> +
>>> + for (i = 0; i < resctrl->nmonitors; i++) {
>>> + if (virBitmapEqual(resctrl->vcpus,
resctrl->monitors[i]->vcpus))
>>> + continue;
>>> +
>>> + if (virBitmapOverlaps(vcpus, resctrl->monitors[i]->vcpus)) {
>>> + virReportError(VIR_ERR_INVALID_ARG, "%s",
>>> + _("Monitor vcpus conflicts with
>>> + monitors"));
>>> +
>>> + return -1;
>>> + }
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +
>>> +static int
>>> +virDomainResctrlMonDefParse(virDomainDefPtr def,
>>> + xmlXPathContextPtr ctxt,
>>> + xmlNodePtr node,
>>> + virResctrlMonitorType tag,
>>> + virDomainResctrlDefPtr resctrl) {
>>> + virDomainResctrlMonDefPtr domresmon = NULL;
>>> + xmlNodePtr oldnode = ctxt->node;
>>> + xmlNodePtr *nodes = NULL;
>>> + unsigned int level = 0;
>>> + char * tmp = NULL;
>>> + char * id = NULL;
>>> + size_t i = 0;
>>> + int n = 0;
>>> + int rv = -1;
>>> + int ret = -1;
>>> +
>>> + ctxt->node = node;
>>> +
>>> + if ((n = virXPathNodeSet("./monitor", ctxt, &nodes)) <
0) {
>>> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>>> + _("Cannot extract monitor nodes"));
>>> + goto cleanup;
>>> + }
>>> +
>>> + for (i = 0; i < n; i++) {
>>> +
>>> + if (VIR_ALLOC(domresmon) < 0)
>>> + goto cleanup;
>>> +
>>> + domresmon->tag = tag;
>>> +
>>> + domresmon->instance = virResctrlMonitorNew();
>>> + if (!domresmon->instance) {
>>> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>>> + _("Could not create monitor"));
>>> + goto cleanup;
>>> + }
>>> +
>>> + if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) {
>>> + tmp = virXMLPropString(nodes[i], "level");
>>> + if (!tmp) {
>>> + virReportError(VIR_ERR_XML_ERROR, "%s",
>>> + _("Missing monitor attribute
'level'"));
>>> + goto cleanup;
>>> + }
>>> +
>>> + if (virStrToLong_uip(tmp, NULL, 10, &level) < 0) {
>>> + virReportError(VIR_ERR_XML_ERROR,
>>> + _("Invalid monitor attribute
'level' value '%s'"),
>>> + tmp);
>>> + goto cleanup;
>>> + }
>>> +
>>> + if (virResctrlMonitorSetCacheLevel(domresmon->instance,
level) < 0)
>>> + goto cleanup;
>>> +
>>> + VIR_FREE(tmp);
>>> + }
>>> +
>>> + if (virDomainResctrlParseVcpus(def, nodes[i],
&domresmon->vcpus) < 0)
>>> + goto cleanup;
>>> +
>>> + rv = virDomainResctrlMonValidateVcpu(resctrl,
>>> + domresmon->vcpus);
>>> +
>>> + /* If monitor's vcpu list is identical to allocation's vcpu
list,
>>> + * set as default monitor */
>>> + if (rv == 1 && resctrl->alloc)
>> I'm still not seeing the need for default...
> As I stated, default monitor, monitor, and monitor in default allocation has
> different character and behavior in resctrl, they are necessary.
> If we don't support default monitor, we could not monitor cache usage of whole
> allocation and also monitor some the cache usage of a subset of allocation
> vcpus at the same time.
> If we don't support default allocation, we will have to create many
> duplicated allocations that has same cache settings, that is sharing same
> cache resources.
> In that case, we'll lose the flexibility that kernel resctrl fs provided.
>
>> FWIW: The resctrl->alloc check is
>> unnecessary since the only way rv == 1 is if it's there.
> You are right, it is double checked, changes to
>
> if (rv == 1)
> virResctrlMonitorSetDefault(domresmon->instance);
>
>>> + virResctrlMonitorSetDefault(domresmon->instance);
>>> + else if (rv < 0)
>>> + goto cleanup;
>>> +
>>> + if (!(tmp = virBitmapFormat(domresmon->vcpus)))
>>> + goto cleanup;
>>> +
>>> + if (virAsprintf(&id, "vcpus_%s", tmp) < 0)
>>> + goto cleanup;
>>> +
>>> + if (virResctrlMonitorSetID(domresmon->instance, id) < 0)
>>> + goto cleanup;
>>> +
>>> + if (VIR_APPEND_ELEMENT(resctrl->monitors,
>>> + resctrl->nmonitors,
>>> + domresmon) < 0)
>>> + goto cleanup;
>>> +
>>> + VIR_FREE(id);
>>> + VIR_FREE(tmp);
>>> + domresmon = NULL;
>>> + }
>>> +
>>> + ret = 0;
>>> + cleanup:
>>> + ctxt->node = oldnode;
>>> + VIR_FREE(id);
>>> + VIR_FREE(tmp);
>>> + virDomainResctrlMonDefFree(domresmon);
>> VIR_FREE(nodes);
> I forget to free it. Will be added.
>
Again, Coverity
Thank you again. Hope someday I can hold the power of Coverity ...
>>> + return ret;
>>> +}
>>> +
>>>
>>> static virDomainResctrlDefPtr
>>> virDomainResctrlNew(virResctrlAllocPtr alloc, @@ -19041,15 +19206,20
>>> @@ virDomainCachetuneDefParse(virDomainDefPtr def,
>>> }
>>> }
>>>
>>> - if (virResctrlAllocIsEmpty(alloc)) {
>>> - ret = 0;
>>> - goto cleanup;
>>> - }
>>> -
>>> resctrl = virDomainResctrlNew(alloc, vcpus);
>> So if @alloc == NULL (which is one of the reasons AllocIsEmpty is true) means
>> that we'll be virObjectRef(NULL) in ResctrlNew(). In this case @alloc could
be
>> NULL if there were no <cache> entries, IIUC.
> Your understanding is right.
>
>> I'm starting to see a real downside to a resctrl->alloc == NULL. I really
don't
>> want to continue to see churn on the internal hierarchy though.
> The downside, what you mean is the 'churn on the internal hierarchy though'?
>
The downside is it makes no sense to have resctrl->alloc == NULL if a
<cachetune> or <memorytune> element exists.
Yes. virresctrl.c could determine a monitor is a default one by checking
@resctrl->alloc.
If there's a <cachetune> or a <memorytune>, then a resctrl->alloc is
created for the listed vcpus. Everything builds off of that.
The <monitor> elements are children of cachetune (and I assume
eventually) memorytune. The fact that the filesystem has this other
"default" thing means what when a cachetune or memorytune element is
found - it's a child of the default thing
>> However, if I look at how it could be filled in within the context of this
function,
>> the virDomainResctrlVcpuMatch call and subsequent possible allocation of
>> @alloc would seemingly be possible outside the context of whether specific
>> <cache> entries existed.
> If no <cache> entry, then 'n = virXPathNodeSet("./cache", ctxt,
&nodes)'
> statement will assign n to 0. If n is 0, then virDomainResctrlVcpuMatch will
> not be called and its subsequent steps for creating @alloc will not be
> executed.
>
>> Probably could just do away with the term default
>> allocation - all it seems to be is an allocation without <cache> elements,
but it
>> can have <monitor> elements.
> Correct, this is how code works.
>
>> If someone places a <cachetune> without <cache> and without
<monitor>, so
>> what - who cares. Probably doesn't do much other than limit other
<cachetune>
>> (and perhaps <memorytune>) elements.
> Yes.
>
>>> if (!resctrl)
>>> goto cleanup;
>>>
>>> + if (virDomainResctrlMonDefParse(def, ctxt, node,
>>> + VIR_RESCTRL_MONITOR_TYPE_CACHE,
>>> + resctrl) < 0)
>>> + goto cleanup;
>>> +
>>> + if (virResctrlAllocIsEmpty(alloc) && !resctrl->nmonitors) {
>>> + ret = 0;
>>> + goto cleanup;
>>> + }
>>> +
>> Moving the AllocIsEmpty check should be separate.
> Got. Will be done.
>
>> I'm losing steam, but the next couple of patches had Coverity issues, so I
figured
>> I'll note that before going back to read all the comments you've posted
today
>> while I was reviewing without trying to go back.
>>
>> John
> I use a lot of paragraph in introducing the usage of 'default monitor',
> 'default allocation' and 'monitor in non-default allocation', and
states that
> these 'default' are necessary for purpose of saving RMIDs, removing
> allocation duplications and keeping the flexibility of monitor that kernel
> interface provided. Hope I tell them clearly.
Yes, lots of text, perhaps way too much. Still I'm no closer to figuring
out what the need for this default wording/mechnism really is.
John
As stated in prior paragraph. Will remove 'default monitor' and
'default allocation' and
make cleaning for code and comments.
Do I miss anything?
BTW, I find the 'virsh domstats --cpu-total' output for monitors,
introduced in patch18,
is not good enough.
current is
"
Domain: 'ubuntu16.04-base'
cpu.cache.monitor.count=2
cpu.cache.0.name=vcpus_0
cpu.cache.0.vcpus=0
cpu.cache.0.bank.count=2
cpu.cache.0.bank.0.id=0
cpu.cache.0.bank.0.bytes=9371648
cpu.cache.0.bank.1.id=1
cpu.cache.0.bank.1.bytes=1081344
cpu.cache.1.name=vcpus_3
cpu.cache.1.vcpus=3
cpu.cache.1.bank.count=2
cpu.cache.1.bank.0.id=0
cpu.cache.1.bank.0.bytes=630784
cpu.cache.1.bank.1.id=1
cpu.cache.1.bank.1.bytes=10452992
"
I may change the output to following by adding 'monitor' for each line:
Domain: 'ubuntu16.04-base'
cpu.cache.monitor.count=2
cpu.cache.monitor.0.name=vcpus_0
cpu.cache.monitor.0.vcpus=0
cpu.cache.monitor.0.bank.count=2
cpu.cache.monitor.0.bank.0.id=0
cpu.cache.monitor.0.bank.0.bytes=9371648
cpu.cache.monitor.0.bank.1.id=1
cpu.cache.monitor.0.bank.1.bytes=1081344
cpu.cache.monitor.1.name=vcpus_3
cpu.cache.monitor.1.vcpus=3
cpu.cache.monitor.1.bank.count=2
cpu.cache.monitor.1.bank.0.id=0
cpu.cache.monitor.1.bank.0.bytes=630784
cpu.cache.monitor.1.bank.1.id=1
cpu.cache.monitor.1.bank.1.bytes=10452992
Please take this change in consideration when you make review for patch 18.
Thanks for review.
Huaqiang
> Thanks for review.
> Huaqiang
>
>
>>> if (virDomainResctrlAppend(def, node, resctrl, flags) < 0)
>>> goto cleanup;
>>>
>>> @@ -27085,12 +27255,42 @@
>> virDomainCachetuneDefFormatHelper(unsigned
>>> int level,
>>>
>>>
>>> static int
>>> +virDomainResctrlMonDefFormatHelper(virDomainResctrlMonDefPtr
>> domresmon,
>>> + virResctrlMonitorType tag,
>>> + virBufferPtr buf) {
>>> +IIUC. char *vcpus = NULL;
>>> + unsigned int level = 0;
>>> +
>>> + if (domresmon->tag != tag)
>>> + return 0;
>>> +
>>> + virBufferAddLit(buf, "<monitor ");
>>> +
>>> + if (tag == VIR_RESCTRL_MONITOR_TYPE_CACHE) {
>>> + level = virResctrlMonitorGetCacheLevel(domresmon->instance);
>>> + virBufferAsprintf(buf, "level='%u' ", level);
>>> + }
>>> +
>>> + vcpus = virBitmapFormat(domresmon->vcpus);
>>> + if (!vcpus)
>>> + return -1;
>>> +
>>> + virBufferAsprintf(buf, "vcpus='%s'/>\n", vcpus);
>>> +
>>> + VIR_FREE(vcpus);
>>> + return 0;
>>> +}
>>> +
>>> +
>>> +static int
>>> virDomainCachetuneDefFormat(virBufferPtr buf,
>>> virDomainResctrlDefPtr resctrl,
>>> unsigned int flags) {
>>> virBuffer childrenBuf = VIR_BUFFER_INITIALIZER;
>>> char *vcpus = NULL;
>>> + size_t i = 0;
>>> int ret = -1;
>>>
>>> virBufferSetChildIndent(&childrenBuf, buf); @@ -27099,6 +27299,13
>>> @@ virDomainCachetuneDefFormat(virBufferPtr buf,
>>> &childrenBuf) < 0)
>>> goto cleanup;
>>>
>>> + for (i = 0; i < resctrl->nmonitors; i ++) {
>>> + if (virDomainResctrlMonDefFormatHelper(resctrl->monitors[i],
>>> +
VIR_RESCTRL_MONITOR_TYPE_CACHE,
>>> + &childrenBuf) < 0)
>>> + goto cleanup;
>>> + }
>>> +
>>> if (virBufferCheckError(&childrenBuf) < 0)
>>> goto cleanup;
>>>
>>> diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index
>>> e30a4b2..60f6464 100644
>>> --- a/src/conf/domain_conf.h
>>> +++ b/src/conf/domain_conf.h
>>> @@ -2236,12 +2236,23 @@ struct _virDomainCputune { };
>>>
>>>
>>> +typedef struct _virDomainResctrlMonDef virDomainResctrlMonDef;
>>> +typedef virDomainResctrlMonDef *virDomainResctrlMonDefPtr; struct
>>> +_virDomainResctrlMonDef {
>>> + virBitmapPtr vcpus;
>>> + virResctrlMonitorType tag;
>>> + virResctrlMonitorPtr instance;
>>> +};
>>> +
>>> typedef struct _virDomainResctrlDef virDomainResctrlDef; typedef
>>> virDomainResctrlDef *virDomainResctrlDefPtr;
>>>
>>> struct _virDomainResctrlDef {
>>> virBitmapPtr vcpus;
>>> virResctrlAllocPtr alloc;
>>> +
>>> + virDomainResctrlMonDefPtr *monitors;
>>> + size_t nmonitors;
>>> };
>>>
>>>
>>> diff --git a/tests/genericxml2xmlindata/cachetune-cdp.xml
>>> b/tests/genericxml2xmlindata/cachetune-cdp.xml
>>> index 9718f06..9f4c139 100644
>>> --- a/tests/genericxml2xmlindata/cachetune-cdp.xml
>>> +++ b/tests/genericxml2xmlindata/cachetune-cdp.xml
>>> @@ -8,9 +8,12 @@
>>> <cachetune vcpus='0-1'>
>>> <cache id='0' level='3' type='code'
size='7680' unit='KiB'/>
>>> <cache id='1' level='3' type='data'
size='3840' unit='KiB'/>
>>> + <monitor level='3' vcpus='0'/>
>>> + <monitor level='3' vcpus='1'/>
>>> </cachetune>
>>> <cachetune vcpus='2'>
>>> <cache id='1' level='3' type='code'
size='6' unit='MiB'/>
>>> + <monitor level='3' vcpus='2'/>
>>> </cachetune>
>>> <cachetune vcpus='3'>
>>> <cache id='1' level='3' type='data'
size='6912' unit='KiB'/>
>>> diff --git
>>> a/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
>>> b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
>>> new file mode 100644
>>> index 0000000..d481fb5
>>> --- /dev/null
>>> +++ b/tests/genericxml2xmlindata/cachetune-colliding-monitor.xml
>>> @@ -0,0 +1,30 @@
>>> +<domain type='qemu'>
>>> + <name>QEMUGuest1</name>
>>> + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
>>> + <memory unit='KiB'>219136</memory>
>>> + <currentMemory unit='KiB'>219136</currentMemory>
>>> + <vcpu placement='static'>4</vcpu>
>>> + <cputune>
>>> + <cachetune vcpus='0-1'>
>>> + <cache id='0' level='3' type='both'
size='768' unit='KiB'/>
>>> + <monitor level='3' vcpus='2'/>
>>> + </cachetune>
>>> + </cputune>
>>> + <os>
>>> + <type arch='i686' machine='pc'>hvm</type>
>>> + <boot dev='hd'/>
>>> + </os>
>>> + <clock offset='utc'/>
>>> + <on_poweroff>destroy</on_poweroff>
>>> + <on_reboot>restart</on_reboot>
>>> + <on_crash>destroy</on_crash>
>>> + <devices>
>>> + <emulator>/usr/bin/qemu-system-i686</emulator>
>>> + <controller type='usb' index='0'/>
>>> + <controller type='ide' index='0'/>
>>> + <controller type='pci' index='0'
model='pci-root'/>
>>> + <input type='mouse' bus='ps2'/>
>>> + <input type='keyboard' bus='ps2'/>
>>> + <memballoon model='virtio'/>
>>> + </devices>
>>> +</domain>
>>> diff --git a/tests/genericxml2xmlindata/cachetune-small.xml
>>> b/tests/genericxml2xmlindata/cachetune-small.xml
>>> index ab2d9cf..748be08 100644
>>> --- a/tests/genericxml2xmlindata/cachetune-small.xml
>>> +++ b/tests/genericxml2xmlindata/cachetune-small.xml
>>> @@ -7,6 +7,13 @@
>>> <cputune>
>>> <cachetune vcpus='0-1'>
>>> <cache id='0' level='3' type='both'
size='768' unit='KiB'/>
>>> + <monitor level='3' vcpus='0'/>
>>> + <monitor level='3' vcpus='1'/>
>>> + <monitor level='3' vcpus='0-1'/>
>>> + </cachetune>
>>> + <cachetune vcpus='2-3'>
>>> + <monitor level='3' vcpus='2'/>
>>> + <monitor level='3' vcpus='3'/>
>>> </cachetune>
>>> </cputune>
>>> <os>
>>> diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c
>>> index fa941f0..4393d44 100644
>>> --- a/tests/genericxml2xmltest.c
>>> +++ b/tests/genericxml2xmltest.c
>>> @@ -137,6 +137,8 @@ mymain(void)
>>> TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
>>> DO_TEST_FULL("cachetune-colliding-types", false, true,
>>> TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
>>> + DO_TEST_FULL("cachetune-colliding-monitor", false, true,
>>> + TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
>>> DO_TEST("memorytune");
>>> DO_TEST_FULL("memorytune-colliding-allocs", false, true,
>>> TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
>>>